Meta is Stealing Your Book To Train Their AI: A Discussion on AI, Pirated Books, & What You Can Do // The writing community has been in shambles since it was recently revealed that Meta used pirated books to train its AI model, Llama 3. Meta wanted its model to compete with ChatGPT, but it needed data to train it. How do you get data fast and for free? Meta decided the best place to go was LibGen, one of the largest book-pirating websites out there, with 7.5 million books in its database. This raises so many questions. Is using copyrighted work to train AI legal? How can you prevent your book from being pirated? Is any type of AI ethical? Is a book you’ve written on LibGen? If Meta used your book to train its AI, what can you do as an author? This video only scratches the surface of the issue, but let’s get to talking.

Links/Articles Mentioned:

Search LibGen, the Pirated-Books Database That Meta Used to Train AI: https://www.theatlantic.com/technology/archive/2025/03/search-libgen-data-set/682094/

Send a Letter to AI Companies Telling Them They Do Not Have the Right to Use Your Work: https://actionnetwork.org/letters/authors-guild-author-letters-to-ai-companies/

Reach out to Emily D. Baker to cover this case: https://www.youtube.com/@TheEmilyDBaker/

The Unbelievable Scale of AI’s Pirated-Books Problem: https://www.theatlantic.com/technology/archive/2025/03/libgen-meta-openai/682093/

DOES TRAINING AI VIOLATE COPYRIGHT LAW?: https://btlj.org/wp-content/uploads/2023/02/0003-36-4Quang.pdf

Meta’s Massive AI Training Book Heist: What Authors Need to Know: https://authorsguild.org/news/meta-libgen-ai-training-book-heist-what-authors-need-to-know/

US Copyright Page: https://copyright-agency.org/