In the din of excitement about the artificial intelligence revolution, a more subtle but intriguing debate is now unfolding among journalists and media executives worldwide: If AI systems are being trained on our journalism, shouldn’t we be compensated for that work? You might call it the “Does AI owe for news?” debate.
The proposed solution now gaining traction in policy and press circles is known as “statutory licensing,” under which AI companies would be required to pay news publishers when AI models are trained on their articles. Once a fringe notion, it has gained steam in recent months in legislative and industry circles alike.
So why now? Nearly all generative AI models are trained on massive amounts of content from the web, billions of pages of text scraped from the internet. Alongside blogs and scholarly articles, journalism plays a significant role in that mix. News stories, investigations and analysis pieces (in short, the daily output of reporters) are part of the data that AI models learn from to improve their ability to explain concepts and draw connections.
But for news organizations, that dynamic feels a bit one-sided.
Consider this: A journalist may work for weeks or months reporting and gathering information, conducting interviews, fact-checking and writing a story. An editor edits, a lawyer vets, and the process takes significant time and resources. Then an AI model trains on that work and produces something similar in a matter of seconds. And the news organization that produced the original journalism is not paid a penny.
You can see why publishers might raise an eyebrow.
And this is not a theoretical exercise; it’s already playing out in the courts. One of the most high-profile examples is a lawsuit filed by The New York Times against OpenAI and Microsoft, alleging that the newspaper’s reporting was used without permission to train AI models. It’s shaping up to be one of the defining copyright cases of the AI age.
Proponents of the licensing proposal point out that we have been here before. The rise of streaming upended the music industry until it settled into a model where streaming services pay royalties to artists and rights holders every time a song is played. Some advocates believe AI could follow a similar pattern, with companies paying into a system that disburses payments to publishers whose content was used to train models.
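For illustration only, the pooled-royalty idea might be sketched like this: a fixed payment pool split among publishers in proportion to how much of their text appeared in a training corpus. The publisher names, figures, and the token-count metric are all invented assumptions; no actual licensing scheme described in this debate works this way.

```python
# Hypothetical sketch of a pro-rata royalty pool, loosely modeled on how
# streaming services split revenue. All names and numbers are invented.

def split_royalty_pool(pool: float, token_counts: dict[str, int]) -> dict[str, float]:
    """Divide a fixed payment pool among publishers in proportion to
    how many tokens of their text appeared in a training corpus."""
    total = sum(token_counts.values())
    return {pub: pool * count / total for pub, count in token_counts.items()}

# Two hypothetical outlets contributing 4M and 1M tokens to a corpus.
payouts = split_royalty_pool(100_000.0, {"DailyHerald": 4_000_000, "CityTribune": 1_000_000})
# DailyHerald receives 80,000.0; CityTribune receives 20,000.0
```

Even this toy version assumes the hard part is already solved: knowing how much each outlet’s content actually contributed to a model, which is precisely the measurement problem discussed below.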
That sounds nice. But it’s not that easy.
The biggest challenge is determining what content actually informed an AI model. With music, it’s simple to count how many times a song is played. AI training offers no such meter: models ingest billions of documents at once, weaving them into patterns and probabilities, which makes it hard to quantify the value of any single article to the process. Academic researchers who focus on AI transparency and training data have only recently begun to explore the question.
On the other side, the tech companies that develop AI say that new rules for public data would hamper innovation. They argue that AI models learn the way humans do, by reading widely and drawing information from a vast array of sources. In their view, the internet has always operated like a public library.
But that comparison doesn’t sit right with critics. When a human reads 10,000 articles, they don’t become a computer program that can answer queries for millions of people in seconds. AI does. And that’s what has news organizations spooked. Governments are beginning to take notice, and some countries have already tested policies to rebalance the relationship between tech companies and journalism.
Last year, Australia launched a plan to force tech companies to negotiate payments with news publishers. It was polarizing at the time, but it showed that governments are willing to intervene if they believe the media ecosystem is at risk. And the stakes are high. The news industry has struggled financially for years. Ad revenue flowed to giant tech platforms, local newspapers closed or consolidated, and many outlets are still testing subscription models.
Now AI is here, and some publishers worry that it could siphon off even more readers from their sites. Imagine a scenario: someone asks an AI assistant to summarize a complex news story, and the AI responds with a polished summary. That’s convenient, but the reader may never click through to the newsroom that produced the original report. That’s what this argument is really about. If AI companies benefit from journalism, should they help pay for it?
Some people believe the answer is simple. Others believe that payments could stifle innovation, or spark messy legal battles. For now, the debate is ongoing. Policymakers are exploring options, news organizations are advocating for protections, and AI companies are navigating a shifting legal landscape.
But one thing seems increasingly clear: The era when AI companies could train on the internet’s news archives without scrutiny is probably coming to an end. And however this fight plays out, it will likely shape the future relationship between journalism and artificial intelligence for years to come.