
Meta's new family of Llama 4 AI language models arrived suddenly over the weekend, released by the parent company of Facebook, Instagram, WhatsApp and Quest VR.
All three models boast massive context windows, the amount of information an AI language model can handle in a single input/output exchange.
But following the surprise announcement and public release of two of those models for download and use on Saturday, among them the higher-parameter Llama 4 Maverick, the reaction from the AI community on social media has been far from glowing.
Llama 4 sparks confusion and criticism among AI users
An unverified post on the North American Chinese-language community forum 1Point3Acres, which made its way to the r/LocalLlama subreddit on Reddit, claimed to be from a Meta researcher alleging that the model performed poorly on third-party benchmarks and that leadership suggested blending "test sets from various benchmarks during the post-training process, aiming to meet the targets across various metrics and produce a 'presentable' result."
The post was met with skepticism from the community about its authenticity, and a VentureBeat email to a Meta spokesperson has not yet received a reply.
But other users found reasons to doubt the benchmarks regardless.
"At this point, I highly suspect Meta botched something in the released weights … if not, they should fire everyone who worked on it and then use the money to acquire Nous," commented @cto_junior on X, in reference to an independent user test showing Llama 4 Maverick's poor performance (16%) on a benchmark known as aider polyglot, which runs a model through 225 coding tasks. That's well below the performance of older models like DeepSeek V3 and Claude 3.7 Sonnet.
Questioning the 10-million-token context window Meta advertises for Llama 4 Scout, AI PhD and author Andriy Burkov wrote on X in part that: "The declared 10M context is virtual because no model was trained on prompts longer than 256k tokens. This means that if you send it more than 256k tokens, you will get low-quality output most of the time."
Also on the r/LocalLlama subreddit, user Dr_Karminski wrote "I'm incredibly disappointed with Llama-4," and demonstrated its poor performance compared to DeepSeek's non-reasoning V3 model on coding tasks such as simulating bouncing balls.
Former Meta researcher and current AI2 (Allen Institute for Artificial Intelligence) Senior Research Scientist Nathan Lambert took to his Interconnects Substack blog on Monday to point out that a benchmark comparison Meta posted to its own Llama download site, pitting Llama 4 Maverick against other models on the third-party head-to-head comparison tool LMArena, aka Chatbot Arena, actually used a different version of Llama 4 Maverick than the one the company has made publicly available: one optimized for conversation.

As Lambert wrote: "Sneaky. The results below are fake, and it is a major slight to Meta's community to not release the model they used to create their major marketing push."
Lambert did note that while this particular arena model was "tanking the technical reputation of the release because its character is juvenile," including lots of emojis and frivolous emotive dialogue, "the actual model on other hosting providers is quite smart and has a reasonable tone!"
In response to the torrent of criticism and accusations of benchmark gaming, Meta VP and head of GenAI Ahmad Al-Dahle took to X to state:
"We're glad to start getting Llama 4 in all your hands. We're already hearing lots of great results people are getting with these models.
That said, we're also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were ready, we expect it will take several days for all the public implementations to get dialed in. We'll keep working through our bug fixes and onboarding partners.
We've also heard claims that we trained on test sets; that's simply not true and we would never do that. Our best understanding is that the variable quality people are seeing is due to the need to stabilize implementations.
We believe the Llama 4 models are a significant advancement, and we look forward to working with the community to unlock their value."
However, the response was met with further complaints of poor performance and calls for more information, such as more technical documentation on how the Llama 4 models were designed and trained, as well as questions about why this release in particular, compared to all prior Llama releases, was so riddled with issues.
The criticism also arrives on the heels of Meta VP Joelle Pineau, who worked in the adjacent Fundamental AI Research (FAIR) organization, announcing her departure from the company on LinkedIn last week with "nothing but praise and deep appreciation for each of my managers." Pineau, it should be noted, also promoted the release of the Llama 4 model family this weekend.
Llama 4 continues to roll out to other inference providers with mixed results, but it is safe to say the initial release of the model family has not been warmly received by the AI community.
And the upcoming Meta LlamaCon on April 29, the first celebration and gathering for third-party developers of the model family, will likely provide plenty more fodder for discussion. We'll be tracking it all; stay tuned.