Theodyx Business

The AI Copyright Reckoning: Inside the Cases Reshaping Media

The lawsuits and statutes deciding whether AI is built on permission or appropriation are now the most important business story in media.

Theodyx Editorial

For two years the artificial intelligence industry operated on a quiet assumption: that the open web was free training material, and that scraping it to build commercial models was a defensible form of fair use. That assumption is now being tested in courtrooms and legislatures on both sides of the Atlantic, and the answers arriving through early 2026 are reshaping the economics of media. The question is no longer whether AI changes content businesses. It is who gets paid when it does.

The case that anchors everything

The defining fight remains The New York Times's suit against OpenAI and Microsoft, filed at the end of 2023. The Times alleged that its journalism was copied at scale to train large language models, and that those models can in some cases reproduce its articles closely enough to substitute for the original. OpenAI's defense rests on fair use and on the argument that training is a transformative, non-expressive process. The case has survived early motions and moved into a contentious discovery phase, with disputes over how much training data and model output each side must disclose.

The Times is not alone. A widening field of publishers, authors, visual artists, and other rights holders has brought parallel claims, and the litigation has clustered into coordinated proceedings to manage the volume. What unites them is a single legal hinge: whether ingesting copyrighted work to train a model is fair use, or whether it is infringement at industrial scale.

The decisive variable is turning out to be not what a model does, but how it was fed.

Mixed signals, one emerging pattern

Rulings through 2025 did not deliver a clean verdict. In two closely watched decisions involving AI developers and book authors, judges signaled real sympathy for the argument that training itself can be transformative. But the same opinions drew a hard line around acquisition: a model trained on lawfully obtained material sits on far stronger ground than one built on pirated libraries or circumvented paywalls. The provenance of the data, in other words, may matter more than the abstraction of the training.

That distinction has enormous commercial consequences. It rewards companies that can document a clean chain of rights and punishes those that treated the internet as a no-cost commons. It also reframes the debate from a binary—legal or illegal—into a spectrum priced by risk. For media owners, that spectrum is the opening. If lawful acquisition is the safe harbor, then a license is the product.

From lawsuit to license

The most important development is not any single ruling but the parallel track running alongside the courts: the deal market. Rather than wait years for precedent, a growing roster of major publishers and platforms has signed paid licensing arrangements with AI developers, covering both archival training and live access to content. The pattern echoes how the music industry eventually monetized streaming after years of litigation—settlement and licensing, not abstention, became the business model.

The music labels are now running the same playbook in a sharper key. The major recording companies sued leading AI music-generation startups in 2024, alleging mass copying of recordings to train systems that produce competing songs. By late 2025, reporting indicated those disputes were moving toward licensing frameworks rather than scorched-earth outcomes, an acknowledgment that a negotiated rate beats an uncertain judgment for both sides. For creators, this is the critical signal: the value of a catalog now includes its worth as training data, a revenue line that did not exist three years ago.

For publishers: archives become recurring-revenue assets, not just SEO inventory.
For labels and studios: licensing terms now must contemplate synthetic outputs, not only reproduction.
For independent creators: leverage depends entirely on whether you own your rights and can prove provenance—the case for owning your audience and first-party channels has never been stronger.

The other front: voice, face, and the deepfake statutes

Copyright is only half the reckoning. A parallel body of law—the right of publicity—governs the commercial use of a person's name, image, voice, and likeness, and it is moving faster than the copyright cases. Tennessee's ELVIS Act, effective in 2024, extended that state's protections explicitly to cover AI-generated voice clones, and other states have advanced their own measures. At the federal level, proposals such as the NO FAKES Act would create a nationwide right against unauthorized AI replicas of voice and likeness, with carve-outs intended to protect parody and news.

This matters for media because it protects something copyright never fully did: identity itself. A creator's voice or face is not a copyrighted work, but it is increasingly the most valuable, most clonable asset they own. The same statutes that shield a recording artist from an unauthorized vocal clone also give an influencer recourse against a deepfaked endorsement. For brands and agencies, the exposure runs the other way—deploying a synthetic likeness without airtight consent is becoming a fast route to liability.

What the emerging rules mean for operators

The contours of the new regime are now visible, even before the landmark cases conclude. Three shifts deserve planning attention.

First, provenance becomes infrastructure. Rights metadata, content authentication, and clean licensing trails are migrating from legal afterthought to operational requirement. The media companies that can prove what they own—and license it cleanly—will command a premium that those with murky catalogs cannot.

Second, training data is an asset class. The deal market is establishing reference prices for content as AI fuel. That reprices archives, catalogs, and even niche creator libraries. Diligence on any media acquisition now has to value AI-licensing optionality. Platform dependence also cuts both ways, a theme we explored in "The Rented Algorithm."

Third, the regulatory map is fragmented and global. The EU's AI Act imposes transparency obligations on providers of general-purpose models, including a public summary of the content used for training, while US law develops case by case and state by state. Operators building or licensing AI products face a patchwork, and the safe posture is to design for the strictest regime, not the most permissive.

The reckoning is not a single verdict; it is a repricing. The era of treating creative work as free training fuel is closing, and what replaces it is a market—messy, contested, and lucrative—in which permission is the product and provenance is the moat. The media businesses that internalize that early will not merely survive the transition. They will set its terms.