Content Licensing for Training AI: How Licensing Deals Look
As more companies develop AI models for generating video, the need for high-quality video content is growing rapidly. These models require vast amounts of video data, sometimes millions of hours, to train effectively. Some developers still scrape content from the internet, but scraped footage doesn’t always offer the best quality. As a result, interest is growing in licensing video from film and TV archives, especially content that isn’t easily accessible online.
Although major Hollywood studios haven’t yet fully embraced licensing their content for AI training, smaller distributors around the world are starting to make deals. These smaller players see an opportunity to license their films, TV shows, and other video content to AI companies. One such company, Calliope Networks, has been actively building a large collection of video content for this purpose. Calliope now has over 17,000 hours of content from more than 10,000 titles, which they’re preparing to license to AI developers.
When licensing content for AI training, it’s crucial to ensure that the video files are of high quality and diverse enough to be useful for various training purposes. Companies like Calliope Networks focus on creating diverse datasets that include different locations, objects, and activities. They also ensure that all content is at least in HD quality, which is important for the effectiveness of AI models.
The cost of licensing high-quality film and TV content can be significantly higher than licensing stock footage. For example, Calliope charges $6.25 per minute for HD content, with additional fees for 4K or 3D content. This pricing reflects what clean, high-quality data that isn’t readily available online is worth to AI companies seeking reliable training data.
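To put the quoted rate in perspective, the short Python sketch below works through the arithmetic. It assumes, purely for illustration, that the $6.25 per minute HD rate applies as a flat rate to every minute licensed, with no volume discounts; the article does not say how deals are actually structured, and the 4K and 3D surcharges are left out because no figures are given for them. The function name is hypothetical.

```python
from decimal import Decimal

# HD rate quoted in the article, in US dollars per minute of content.
HD_RATE_PER_MINUTE = Decimal("6.25")


def hd_license_cost(hours):
    """Return the cost in USD of licensing `hours` of HD content.

    Assumption (not from the article): a flat per-minute rate with
    no volume discounts and no 4K or 3D surcharges.
    """
    minutes = Decimal(str(hours)) * 60
    return minutes * HD_RATE_PER_MINUTE


# One hour of HD content at the quoted rate.
print(f"1 hour: ${hd_license_cost(1):,.2f}")

# The full catalog size mentioned in the article (17,000 hours).
print(f"17,000 hours: ${hd_license_cost(17000):,.2f}")
```

Under these assumptions, one hour of HD content costs $375, and the full 17,000-hour catalog (1,020,000 minutes) would come to about $6.4 million. Real deals may be priced quite differently.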
Despite these efforts, there are still concerns among content owners about licensing their material for AI training. Some fear that by doing so, they might be contributing to the development of technology that could eventually replace them in the market. However, advocates for licensing argue that it provides a way for content creators to be fairly compensated and to have some control over how their content is used in AI development.
In parallel with these emerging licensing deals, many copyright owners have brought lawsuits against generative AI developers such as OpenAI, and those cases are moving slowly through the courts. In the meantime, copyright owners are operating on the assumption that AI companies will need to obtain licenses for the content they use in training their models or in generating outputs. The Copyright Clearance Center’s recent announcement that it is including some AI rights in its Annual Copyright License for corporations is one sign of growing efforts toward this goal.
The landscape for AI content licensing is becoming increasingly complex. More and more media companies are reaching individual license agreements with AI companies, while several startups have emerged to aggregate content into large collections that AI platforms can license through one-stop arrangements, known as blanket licenses. Last month, these startups even formed a trade association, the Dataset Providers Alliance, to better organize and advocate for their interests.
However, the growing volume of licensing activity could also pose challenges. It will likely take years before the ongoing lawsuits provide clarity on the legal rules for copyright in the AI era. In the meantime, both courts and Congress may consider how easy it is for AI companies to obtain licenses when determining whether licensing is required.
AI companies often prefer to license “all” content of a given type, whether it’s text, images, music, or video, similar to how subscription services like Spotify operate. The challenge is keeping licensing straightforward rather than fragmented: too many individual deals could push AI companies to keep using unlicensed content, as some have done in the past.
In response to this potential fragmentation, startups like Calliope Networks are working to create comprehensive content libraries that AI companies can easily license. Meanwhile, other entities like the Copyright Clearance Center are expanding their licensing frameworks to include AI usage, hoping to establish clear and efficient licensing models that will benefit both content owners and AI developers.
Ultimately, content owners face three main choices: license their content to AI companies, sue for copyright infringement, or withhold their content and risk losing out in a rapidly evolving market. The decision isn’t easy, but as AI technology continues to advance, it is becoming increasingly consequential for the future of the creative industries.