Meta Secretly Trained Its AI on a Notorious Russian ‘Shadow Library,’ Newly Unredacted Court Documents Reveal


“Meta treated the so-called ‘public availability’ of shadow data as a get-out-of-jail-free card, even though Meta’s internal records show every relevant Meta decision-maker, up to and including its CEO, Mark Zuckerberg, knew LibGen was ‘a dataset we knew was pirated,’” the plaintiffs said in the motion. (Originally filed in late 2024, the motion was a request to file a third amended complaint.)

In addition to the plaintiffs’ briefs, another filing was unredacted in response to Chhabria’s order: Meta’s opposition to the motion to file an amended complaint. It argued that the authors’ attempts to add claims to the case were an “eleventh-hour gamble based on a false and inflammatory premise,” and denied that Meta had delayed revealing important discovery information. Instead, Meta argued that it first disclosed to the plaintiffs that it was using a LibGen dataset in July 2024. (Because most of the discovery materials remain confidential, it was difficult for WIRED to confirm that claim.)

Meta’s argument rests on its claim that the plaintiffs already knew about the use of LibGen and should not be given more time to file a third amended complaint when they had ample time to do so before the end of discovery in December 2024. “Plaintiffs were aware of Meta’s download and use of LibGen and other alleged ‘shadow libraries’ since mid-July 2024,” the tech giant’s lawyers argue.

In November 2023, Chhabria granted Meta’s motion to dismiss certain claims in the lawsuit, including the claim that Meta’s alleged use of the authors’ work to train its AI violated the Digital Millennium Copyright Act (DMCA), a US law introduced in 1998 to prevent people from selling or copying copyrighted works on the internet. At that time, the judge agreed with Meta’s position that the plaintiffs had not provided sufficient evidence to prove that the company removed so-called “copyright management information” (CMI), such as the author’s name and work title.

The unredacted documents argue that the plaintiffs should be allowed to amend their complaint, saying that the information revealed by Meta is evidence that a DMCA claim is warranted. They also say the discovery process unearthed reasons to add new allegations. “Meta, through a corporate representative testifying on November 20, 2024, now admits under oath to uploading (aka ‘seeding’) pirated files containing the works of the Plaintiffs on ‘torrent’ sites,” the motion states. (Seeding is when torrented files are shared with other peers after they are downloaded.)

“This torrenting activity makes Meta itself a distributor of the same pirated copyrighted material that it also downloads for use in commercially available AI models,” one of the newly unredacted documents claims, alleging that Meta, in other words, had not just used copyrighted material without permission but also distributed it.

LibGen, an archive of books uploaded to the internet that originated in Russia around 2008, is one of the largest and most controversial “shadow libraries” in the world. In 2015, a judge in New York ordered a preliminary injunction against the site, a measure designed in theory to temporarily shut down the archive, but its anonymous administrators simply switched its domain. In September 2024, another judge in New York ordered LibGen to pay $30 million to rights holders for violating their copyrights, despite not knowing who actually runs the piracy hub.

Meta’s discovery woes in this case are far from over. In the same order, Chhabria warned the tech giant against any overly broad redaction requests in the future: “If Meta resubmits an unreasonably broad sealing request, all materials will simply be unsealed,” he wrote.


