The licensing of content for AI model training is on the rise. Amid reports that developers are facing a scarcity of content to train AI models, and in light of ongoing litigation and legislative developments in this area, developers are increasingly looking to enter into licensing deals with content providers for AI model training (see our previous article on this here). However, the practice of licensing content for AI training is a developing area and it is not without its complexities – both for developers and content providers. Licences need to be tailored to reflect the nature of AI technology, the AI model under consideration, and the content being licensed. Some key areas to consider are:
- What content is within scope? While a developer may wish to secure access to a broad range of content to train an AI model, content providers will want to set tight parameters around what content is and is not included, and to reserve their rights in any excluded content. The parties should consider whether a licence will be limited to existing content, or whether it will cover updates and future works, and what mechanisms will apply for delivering these.
- What rights are granted to use content, and on what basis? Permitted uses should be clearly defined. For example, in addition to initial use for model training, will the developer be permitted to use content for validation and testing, and for ongoing improvement and fine-tuning of a model? For how long will these rights apply, and will any restrictions apply to commercial use? Exclusivity, territorial scope and sub-licensing should also be addressed.
- Assurances regarding content – Developers will want to include assurances that the content provider has the rights to license content. In turn, this may require providers to check the terms of any contracts under which they obtained rights to content, to ensure that they have the necessary rights – this can be a tricky area, especially where contracts were entered into before the widespread adoption of AI. The situation may be particularly complex where rights have been acquired from contractors or other independent third parties.
- Acknowledgements – Depending on the nature of the content provided, this can be a key area of concern for content providers. They may wish to ensure that their contribution (or that of any underlying creators) is acknowledged in relation to the model and its outputs – for example by crediting the content’s source, or by providing a link to a licensor’s website. Any such stipulations would need to be set out in the agreement.
- Restrictions on use – A content provider may wish to include restrictions on content being modified, and prohibitions on content being reproduced in outputs of a model. However, developers may push back against these sorts of provisions – for example, they may need to amend content as part of preprocessing it for model training, and they may be concerned about inadvertently reproducing content. Licence terms will need to deal with these points flexibly, in a way that meets both parties’ concerns.
- What payments and payment structure will apply? The value of any deal will of course be one of the most challenging areas for the parties to consider, and the sums will vary hugely depending on the content being provided (and whether this is static or evolving) and the parties involved. The parties will also need to consider what payment structures will be appropriate – will payments be made upfront, or on an ongoing basis, and will royalties or revenue sharing arrangements apply?
- Consequences of termination – what happens to content? Content providers will want to include obligations not to use content following termination, and to delete or return material provided. However, a developer may find it difficult to sign up to this where it has already used content to train a model, and it is likely to want to continue using any model that was trained on the content prior to termination. This will be a key point in any negotiation.
The licensing of content for AI model training is a dynamic area of practice. We look forward to seeing how the licensing market continues to develop as AI technologies advance.