silversurfer
Thread author
Book authors are suing Nvidia, alleging that the chipmaker's AI platform NeMo—used to power customized chatbots—was trained on a controversial dataset that illegally copied and distributed their books without their consent.
In a proposed class action, novelists Abdi Nazemian (Like a Love Story), Brian Keene (Ghost Walk), and Stewart O’Nan (Last Night at the Lobster) argued that Nvidia should pay damages and destroy all copies of the Books3 dataset used to power NeMo large language models (LLMs).
The Books3 dataset, the novelists argued, copied "all of Bibliotik," a shadow library of approximately 196,640 pirated books. Initially shared through the AI community Hugging Face, the Books3 dataset today "is defunct and no longer accessible due to reported copyright infringement," the Hugging Face website says.