
Training AI models and extraterritorial application of copyright rules

The extraterritorial application of copyright law has become a hot topic in AI. 

The UK’s ongoing consultation on Copyright and AI has a section entitled ‘Treatment of models trained in other jurisdictions’ which states “we want to encourage AI developers operating in the UK to comply with UK law on AI model training, even if their models are trained in other countries” (emphasis added). The consultation seeks views on this although it stops short of advancing a recommendation or preferred approach at this stage.

In what might be described as a legislative landgrab, the House of Lords has proposed an addition to the Data (Use and Access) Bill [HL] which would require the operators of both web crawlers and general purpose AI (GPAI) models to comply with UK copyright law “regardless of the jurisdiction in which the copyright-relevant acts relating to the pre-training, development and operation of those web crawlers and general-purpose AI models take place”. This addition was partly the result of very well-mobilised right holder lobbying in the UK. Whether the addition will materialise into law is open to question, particularly as it pre-empts the ongoing consultation.

The position under the EU AI Act is slightly more advanced, with that Act being in force. However, even under the EU AI Act, the notion of extraterritorial application of copyright law is beset with challenges and the issue is still being played out in the EU’s draft General Purpose AI Code of Practice. 

Article 53(1)(c) of the EU AI Act requires general purpose AI (GPAI) model providers to put in place a policy to comply with EU copyright law. Recital 101 clarifies that this includes, in particular, complying with any reservation of rights under Article 4(3) of the DSM Directive. A reservation of rights is where a copyright holder signals (by machine-readable means) that they do not wish to have their works copied or extracted for text and data mining, which according to the EU AI Act includes scraping data for the purpose of training GPAI models (see recital 105). There is a separate debate as to precisely how a copyright holder can implement such a reservation – see here.
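
To make the idea of a machine-readable signal concrete, the short Python sketch below shows how a crawler might check a website's robots.txt file before collecting content. It rests on assumptions that are not settled law: robots.txt is only one of the mechanisms commonly discussed, and whether it amounts to a valid reservation under Article 4(3) is part of the debate mentioned above. The crawler name "ExampleAIBot", the example.com URL and the helper function may_crawl are hypothetical and used purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that a right holder might publish.
# "ExampleAIBot" is an invented crawler name, not a real product.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url.

    Assumption: a Disallow rule is treated here as a reservation of rights.
    Whether that is legally sufficient is not something this sketch decides.
    """
    parser = RobotFileParser()
    # Parse the rules from text so that the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print("ExampleAIBot allowed:", may_crawl(ROBOTS_TXT, "ExampleAIBot", url))
    print("OtherBot allowed:", may_crawl(ROBOTS_TXT, "OtherBot", url))
```

In this sketch the site blocks the named AI crawler while leaving other crawlers unaffected, which illustrates why the choice and granularity of the signalling mechanism matters to right holders.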

On the face of it, requiring a GPAI model provider to put in place a policy to comply with EU copyright law seems uncontroversial.  However,  Recital 106 to the EU AI Act suggests that this is required “regardless of the jurisdiction in which the copyright-relevant acts underpinning the training of those general-purpose AI models take place”. In theory, this means that if a US-based GPAI model provider who trains their model in the US wishes to place their model on the EU market, they are required to comply with EU copyright law. 

There are a number of difficulties associated with this, and they are difficulties that the UK will also have to grapple with under the ongoing consultation.

(1) Copyright law generally applies on a territorial basis

The underlying rationale behind recital 106 is sound insofar as it is seeking to regulate product liability and ensure that EU GPAI model providers are not placed at a competitive disadvantage. However, the acts which it seeks to regulate occur outside of the EU. Copyright applies in a territorial manner – this is a fundamental principle of copyright law. If, for example, I carry out copyright-relevant acts (e.g. copying) in the US in the course of training the model, then typically US copyright law applies to those acts. EU copyright law does not apply merely because the model has subsequently been made available in the EU.

The suggestion in recital 106 that EU copyright law should apply extraterritorially is therefore controversial.

(2) Article 53(1)(c) only applies to GPAI model providers

Article 53(1)(c) of the EU AI Act regulates only GPAI model providers. It does not impose these requirements on other entities in the supply chain. Take, for example, LAION, which creates data sets that are used by others to develop and deploy AI models (such as Stable Diffusion). LAION would carry out the acts of reproduction and extraction necessary in the course of creating the training data sets (such as LAION-5B) but would not be regulated under the EU AI Act as a GPAI model provider. In other words, Article 53(1)(c) may have a limited effect.

(3) Recital 106 is not an operative part of the EU AI Act

Recitals to EU legislation essentially serve to explain the substantive obligations/purposes contained in the regulations/articles themselves. However, recital 106 arguably goes further than simply explaining or even embellishing Article 53(1)(c). There is no reference in Article 53(1)(c) to the extraterritorial application of EU copyright law, so recital 106 appears to introduce an additional substantive requirement. That is not the purpose of a recital, which is not legally binding in any event.

These challenges raise the question: can EU copyright law really be applied extraterritorially? The logical answer is probably not.

There may, however, be scope for quasi-regulation through the back door. The second draft of the EU’s General Purpose AI Code of Practice requires GPAI model providers to put in place “an internal policy” to comply with EU law for all development phases, including data collection, training, testing, and placing on the market. It also requires GPAI model providers to “commit to undertaking copyright due diligence” when contracting with third parties to acquire datasets to train their model – that is, to obtain assurances from these parties about their compliance with EU copyright law. There are also commitments to ensuring lawful access to copyright-protected content and to refrain from crawling websites making available copyright infringing content. 

The Code of Practice is, however, not legislation. It is “a guiding document for providers of general-purpose AI models in demonstrating compliance with the AI Act along the full life cycle of the models”. These commitments in the Code of Practice implicitly recognise the difficulties around extraterritoriality in the EU AI Act highlighted above. 

The challenges associated with extraterritoriality are a paradigm example of trying to fit the square pegs of territorial copyright laws into the round hole of a global marketplace. It is fair to say that the UK legislators have their hands full with this.

