Out from the shadows: developing effective copyright laws for AI training datasets and shadow libraries
2026
Details
Title
Out from the shadows: developing effective copyright laws for AI training datasets and shadow libraries
Material type
Journal article
Description
1 online resource (pages 22–35)
Digital object identifier
Abstract
While there has been extensive consideration of whether generative artificial intelligence (GAI) developers infringe copyright when scraping online works to train their systems, there has been limited analysis of the legal status and treatment of the underlying shadow libraries, which facilitate such infringement. ‘Shadow libraries’ are vast online repositories of textual content, the majority of which are copyrighted works that have been compiled without permission. The objective of this paper is to develop effective copyright laws for AI datasets that calibrate the protection of author rights with the support of GAI innovation. The paper begins by developing a three-step test (author incentives, access to works and public interest) for determining what constitutes ‘effective’ copyright law in the AI era. Applying this three-step test to the selected case of Australian copyright law, the paper advocates for the implementation of a modified version of the European Union’s copyright exception for text and data mining to provide GAI developers with broader legitimate access to works, with opt-out provisions preserving author rights. Additionally, a more clearly defined requirement that copied works be ‘lawfully accessible’ would operate to exclude pirated works being scraped from shadow libraries. The paper further recommends accompanying mandatory disclosure obligations to assist rightsholders to identify and prosecute extractions of their works from shadow libraries for AI training. It is hoped that this multifaceted and nuanced suite of recommended reforms will provide insights to law and policy makers globally who are seeking to design copyright laws for AI datasets.
Series
Intellectual Property Law & Practice, 21, 1, 2026
Related resources
Published
Oxford, UK : Oxford University Press, 2026.
Language(s)
eng
Copyright information
https://academic.oup.com/pages/using-the-content/citation