Meta execs were surprised to beat OpenAI’s GPT-4 internally, court filings reveal

Executives and researchers leading Meta’s AI efforts were amazed to beat OpenAI’s GPT-4 model while developing Llama 3, according to internal messages revealed in court on Tuesday in one of the company’s ongoing AI copyright cases, Kadrey v. Meta.

“Honestly… Our goal should be GPT-4,” said Meta’s VP of Generative AI, Ahmad Al-Dahle, in an October 2023 message to Meta researcher Hugo Touvron. “We have 64k GPUs coming! We need to learn how to create the boundary and win this race.

Although Meta has released open AI models, the company’s AI leaders are more focused on beating competitors who don’t typically release their model weights, such as Anthropic and OpenAI, and instead gate them behind an API. Meta executives and researchers hold Anthropic’s Claude and OpenAI’s GPT-4 as a gold standard to work towards.

The French AI startup Mistral, one of Meta’s biggest open competitors, was mentioned several times in internal messages, but the tone was dismissive.

“Mistral is peanuts for us,” said Al-Dahle in a message. “We should have done better,” he said later.

Tech companies are racing to outdo each other with the latest AI models these days, but these court filings reveal just how competitive Meta’s AI leaders really are – and it seems still there. At several points in the message exchange, Meta’s AI leads talked about how they were “very aggressive” in getting the right data to train Llama; at one point, an exec even said “Llama 3 is literally all I care about,” in a message to coworkers.

Prosecutors in this case say Meta executives sometimes cut short their mad rush to ship AI models, training copyrighted books in the process.

Touvron noted in a message that the mix of datasets used for Llama 2 was “bad,” and discussed how Meta could use a better mix of data sources to improve Llama. 3. Touvron and Al-Dahle then talked about clearing the way to use the LibGen dataset, which contains copyrighted works from Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education.

“Do we have the right data there (?)” said Al-Dahle. “Is there anything you want to use but can’t for some stupid reason?”

Meta CEO Mark Zuckerberg previously said that he was trying to close the performance gap between Llama’s AI models and closed models from OpenAI, Google, and others. Internal messages reveal intense pressure within the company to do so.

“This year, Llama 3 competes with the most advanced models and leads in some areas,” Zuckerberg said in a LETTERS from July 2024. “Starting next year, we expect future Llama models to be the most advanced in the industry.”

When Meta finally Llama 3 was released in April 2024the open AI model competes with leading closed models from Google, OpenAI, and Anthropic, and better open options from Mistral. However, the data that Meta uses to train its models — data that Zuckerberg reportedly gave the green light to use, despite its copyright status — faces scrutiny in several ongoing lawsuits.

Source link