Adobe Sued Over AI Training Data Linked to Pirated Books
Could Adobe’s AI ambitions be derailed by a copyright lawsuit?
Adobe’s aggressive pivot into generative AI is facing a significant legal roadblock. A proposed class-action lawsuit, filed by author Elizabeth Lyon, alleges that Adobe illegally used pirated books—specifically her own works—to train its SlimLM AI model for mobile devices.
Lyon’s complaint hinges on the lineage of the training data. Adobe claims SlimLM was trained on SlimPajama-627B, an open-source dataset released by Cerebras. However, the lawsuit argues this dataset is a derivative of RedPajama, which explicitly includes the “Books3” dataset. Books3 is a notorious collection of over 191,000 pirated books frequently cited in AI litigation.
The legal filing contends that Lyon’s guidebooks were likely carried, via a processed subset of that data, into the corpus used to train Adobe’s model. This mirrors a wider industry problem, in which AI models are trained, knowingly or not, on infringing material.
This lawsuit places Adobe alongside major tech giants. Apple and Salesforce also face litigation citing RedPajama’s inclusion of protected works. The precedent is shifting rapidly; recently, AI firm Anthropic settled a similar suit for $1.5 billion, signaling a new era of financial accountability for data acquisition.
For Adobe, the timing is delicate. As the company integrates Firefly and other AI tools deeply into its Creative Cloud ecosystem, the reliability of its data pipeline is under scrutiny.
This case underscores the complex legal frontier of AI development. As courts assess whether “fair use” covers training on pirated materials, tech companies must rethink how they source data. For creators like Lyon, it is a critical stand for intellectual property rights in the digital age.