Sensemaking for a plural world

Perspective Map

Generative AI and Intellectual Property: What Each Position Is Protecting

March 2026

Sarah is an illustrator who has been making a living doing commercial work for a decade. Last year she discovered that her distinctive style — laboriously developed across hundreds of hours of paid and unpaid work — had been replicated almost exactly by an image-generation model that had been trained, without her knowledge or consent, on images scraped from her public portfolio. When she contacted the company, she received a form letter explaining that training on publicly available internet data falls within fair use doctrine. She does not accept this. She knows what it cost her to develop that style, and she knows that the company now sells it for a fraction of what she charges. Marcus is a researcher at an AI lab who believes that Sarah's frustration is understandable but that her legal theory is wrong. He argues that training a model on images is not meaningfully different from a human artist studying other artists' work, that copyright protects specific expressions rather than styles, and that restricting AI training data would be like restricting human artists from going to art school. He thinks the hard version of her position — that training without consent violates copyright — would make scientific research, search engines, and digital libraries impossible if applied consistently.

Elena is a legal scholar who thinks both Sarah and Marcus are too focused on the immediate battle and missing the deeper structural question. Copyright was designed to give creators a limited monopoly as an incentive to produce work, not to give them perpetual control over every downstream use of their ideas. But she also worries that the current legal framework, designed for individual human copying, is genuinely inadequate for a technology that can ingest billions of works simultaneously and produce outputs that compete directly with the people whose work trained it. She thinks the question is not who wins this lawsuit but what kind of IP system we want for the next century. And James, a labor organizer who works with writers and journalists, thinks all three of them are arguing about property when the real question is labor. Generative AI was built on work done by millions of people — not just famous illustrators whose style is identifiable but countless anonymous writers, coders, and artists who contributed to the dataset. Those people are not named in any lawsuit. Their work generated the value. They will receive nothing.

These four people are not disagreeing primarily about copyright law. They are disagreeing about what creative work is, what creators owe society, what society owes creators, and who is entitled to capture the value that collective human expression generates when it is transformed into a product. The generative AI and IP debate looks like a technical legal dispute about training data. Underneath it are harder questions: Is learning from prior work categorically different when done at machine scale? Does copyright protect expression or creative labor? And when a technology is built on everyone's contributions, who gets to own it?

What copyright holders and creators are protecting

The creator-protection position begins from a claim about consent. Sarah did not agree to have her work used to train a commercial product. The fact that she made her work publicly accessible — on a portfolio site, on Instagram, on ArtStation — was not an invitation for it to be scraped into a training dataset and used to produce competing outputs at industrial scale. The argument is not primarily philosophical; it is about what "public" means. Publishing work publicly has never before meant licensing it for unlimited commercial extraction. Hanging a photograph in a public gallery does not authorize a print shop to reproduce and sell it at will. The internet's openness was premised on certain baseline assumptions about what sharing work there did and did not permit.

They are protecting the economic relationship between creative work and creative livelihood. The core of this position is not that AI is wrong in some abstract sense but that it breaks the chain between making something distinctive and being compensated for that distinctiveness. Sarah's style is marketable because it is hers — because a client who wants it has to come to her. When a model can replicate it on demand, that economic relationship disappears. Copyright maximalists are defending not just a legal framework but the material conditions under which professional creative work is sustainable. Behind the legal filings of the Authors Guild, the litigation from Getty Images and music publishers, and the opt-out coalitions of illustrators and photographers is a simpler claim: we deserve to be asked.

They are protecting the principle that scale does not change the moral character of an act. The fair use argument made by AI companies rests partly on the claim that training a model on data is not "copying" in the relevant legal sense because the model does not store and reproduce the training data — it learns from it and generates new outputs. Creators dispute both the technical description and the moral logic. On the technical claim: models do store information about their training data in ways that can be extracted through adversarial prompting, and outputs can closely resemble specific training examples. On the moral logic: even if an AI's processing of a work is technically different from photocopying it, the consequence — a commercial product that competes with the original creator — is the same. The fact that this is done simultaneously to millions of works, rather than one at a time, does not improve the situation; it makes it worse.

They are protecting the right to control the uses of work in which one's identity is expressed. For many creators, the objection to AI training is not only economic but expressive. A novelist whose prose style, personal history, and decades of craft have been absorbed into a model that can now write "in her style" without her permission experiences this as something more than a licensing problem. Intellectual property law has always struggled with this — the tension between copyright's economic logic and the more personal moral rights traditions that European copyright law (but, outside narrow exceptions, not American law) explicitly recognizes. The moral rights position — that creators have non-economic interests in their work's integrity and attribution — is not well served by the current US legal framework, but it names something real about why many creators experience AI training as a form of violation that goes beyond theft.

What AI developers and fair-use advocates are protecting

The developer position does not dispute that generative AI is trained on human-created work, or that this creates genuine disruption for some creators. It disputes the claim that this constitutes copyright infringement and argues that the policy implications of treating it as such would be seriously harmful. The legal foundation is fair use doctrine, particularly the transformative use test established in Campbell v. Acuff-Rose Music (1994): whether a use transforms the original material by adding new expression, meaning, or message and whether it serves a different function. AI developers argue that training a model on works to produce a general-purpose creative tool is transformative in exactly this sense — the model is not a repository of the training data but a different kind of thing that learned from it.

They are protecting the principle that learning from prior work is what creative and intellectual progress requires. The analogy to human learning is imperfect but not trivial. A novelist reads thousands of novels before writing one; a programmer studies thousands of programs before writing their own; a painter spends years copying masters in museums. None of this requires licensing. The fair use advocates argue that treating machine learning as categorically different from human learning because of scale imports a legal distinction that has no precedent and no clear principled basis. If reading and learning from publicly available work is permitted, why should it matter whether the reader is a human graduate student or a machine? The counter-argument — that it matters because of the commercial scale — is itself contested: commercial use is a fair use factor but not a determinative one.

They are protecting the possibility of building general-purpose knowledge infrastructure. The position is not only about AI companies' commercial interests. It encompasses search engines (which crawl and index content without individual consent), digital archives (which copy works to preserve them), computational research (which uses large datasets to study language, culture, and history), and scientific work (which trains models on research data). A legal doctrine that makes training on publicly available data presumptively infringing would reach all of these activities. Researchers who use large text corpora to study linguistic change, historians who use digitized archives, and public interest organizations that build tools for navigating government records would all be affected by a ruling that training data requires consent and compensation.

They are protecting broadly accessible innovation against its concentration among legacy rights-holders. One underappreciated dimension of the copyright maximalist position is who benefits if training data requires licensing. The answer is largely: existing large rights-holders. Major publishers, record labels, and media conglomerates that hold large back catalogs would become the gatekeepers of what AI can be trained on. Independent artists, whose work is dispersed and unlicensed, might gain formal rights but practically lack the leverage to enforce them. The developer position sometimes argues that the licensing regime the copyright maximalists want would entrench media consolidation rather than protect individual creators — that the practical beneficiaries of training data licenses would be Disney, News Corp, and the major labels, not the illustrators and novelists making the public case for consent.

What open knowledge advocates are protecting

The open knowledge position approaches this debate from a different direction. Its primary concern is not who wins the current litigation but whether the legal and policy frameworks that emerge will entrench or loosen the barriers to accessing and building on human knowledge. Advocates in this tradition — drawing on the work of Lawrence Lessig, the Creative Commons movement, and open access scholarship — have spent decades arguing that copyright has expanded far beyond its original purpose and that excessive IP protection harms the public interest. They see the generative AI debate as a new front in an old struggle.

They are protecting the public domain as a genuine commons rather than a shrinking residue of expired copyrights. Copyright terms in the United States have been extended repeatedly — from fourteen years (renewable once) at the founding to the current life-plus-seventy-years standard, driven largely by lobbying from entertainment companies seeking to prevent Mickey Mouse and other valuable properties from entering the public domain. Open knowledge advocates argue that this expansion has been systematically bad for culture, innovation, and education — that works should enter the public domain much sooner than they do, and that the legal framework governing AI training is another battleground in this larger war. The danger they see is not only that AI companies get an unfair advantage but that the response to AI — mandatory licensing, consent requirements, new enforcement mechanisms — becomes an occasion to extend and entrench IP maximalism.

They are protecting the distinction between expression and style as a principled limit on IP protection. Copyright has always protected specific expressions, not ideas, styles, or techniques. You cannot copyright a genre, a narrative structure, a photographic technique, or a compositional approach. This limitation is not an accident or a loophole — it is a deliberate policy choice made on the theory that allowing people to own styles and ideas would balkanize creative culture and allow incumbents to tax all subsequent creativity in a genre they pioneered. Open knowledge advocates argue that creator demands for protection against AI style replication, if taken seriously, would require abandoning this foundational distinction — that protecting Sarah's style would require courts to define and enforce rights in an area that copyright law has always treated as unownable.

They are protecting access to AI tools for people who are not professional creative workers. A rarely discussed dimension of this debate is the question of who generative AI serves. For professional illustrators and authors, AI tools represent competitive threats. For students, hobbyists, people in non-English-speaking countries, people with disabilities that limit their ability to draw or write, and people who cannot afford professional creative services, these tools represent something different: unprecedented access to creative capacity. A licensing regime that makes training data expensive concentrates AI capabilities in large companies. A more open regime distributes them more widely. Open knowledge advocates argue that the creator-protection position, while emotionally sympathetic, would deliver its benefits primarily to professional creators while foreclosing access for everyone else.

What structural and labor critics are protecting

James's position starts from a different question: where did the value come from? Generative AI models exist because hundreds of millions of people produced text, images, code, and other creative work that was publicly accessible on the internet. The models are extraordinarily capable because of the scale and quality of that collective contribution. The structural critic notes that essentially none of the economic value captured by AI companies flows back to the people whose work made it possible — and argues that the current legal debate is framed too narrowly to address this. Whether fair use applies to training data is a legal question. Who captures the value of the resulting technology is a political economy question, and the two are not the same.

They are protecting the claim that creative and intellectual labor deserves a stake in the wealth it generates. The comparison that structural critics sometimes make is to early internet platforms. Millions of people produced content for Facebook, Twitter, and YouTube; those platforms became extraordinarily valuable; the people whose content made them valuable received nothing except the use of the platform itself. The structural critique of generative AI is that the same pattern is repeating at greater scale and speed: the value produced by collective human expression is being captured by a small number of companies, and the legal frameworks being contested are not designed to prevent this. The remedy that structural critics tend to favor is not copyright expansion but something more novel — collective licensing arrangements, revenue-sharing pools, or new institutions for distributing AI-generated value back to the people whose work built the systems.

They are protecting workers who are displaced rather than credited. The legal cases brought by well-known authors and illustrators have public visibility. The situation of anonymous data workers is less visible. Researchers have documented that AI training involves not only the passive ingestion of scraped data but the active labor of human annotators — often low-wage workers in the Global South — who label data, filter harmful content, and evaluate model outputs through tasks that are repetitive, poorly compensated, and psychologically damaging. The structural critique encompasses both the original creators whose work was ingested without consent and the invisible labor force that made the models safe enough to deploy. Neither group is a significant beneficiary of the wealth being created.

They are protecting a different theory of what copyright is supposed to do. Copyright maximalists argue from the premise that creators deserve control over and compensation for their work. Fair use advocates argue from the premise that social learning and transformative use should be free. The structural critic argues that both sides are arguing about individual rights in a situation that is fundamentally collective. Copyright was designed for a world where creative works were produced by individual authors and copied by other individuals or entities. Generative AI scrambles this picture: the training process is collective (drawing on millions of works), the value is collective (emerging from scale rather than any individual contribution), and the benefit should be collective too. A system designed around individual rights — whether the creator's right to control use or the developer's right to train on public data — may simply be the wrong framework for a technology built on aggregated collective output.

See also

  • Who bears the cost? — the framing essay for the broader burden-allocation question beneath this map: when generative AI draws on collective cultural labor to create enormous private value, who should absorb the losses, who should capture the gains, and what kind of compensation or redistribution counts as legitimate?
  • Who gets to decide? — the framing essay for the governance dispute beneath this map: whether AI firms can set the terms for training, licensing, and market entry on their own, or whether creators, courts, legislatures, and collective institutions get meaningful authority over how creative work is used to build commercial systems.
  • AI and Creative Work: What Each Position Is Protecting — examines the adjacent debate about what it means for AI to produce creative work: whether AI outputs count as genuine creativity, who should be credited, and what happens to professional creative industries when high-quality generation becomes cheap. This map focuses specifically on the IP and ownership dimension; the creative work map addresses the nature of creativity and professional displacement more broadly.
  • AI Governance: What Each Position Is Protecting — the intellectual property debate is one front in the broader contest over how AI development is regulated; the governance map covers the full landscape of regulatory approaches, from national competitiveness arguments against strong oversight to democratic accountability arguments for it; understanding who controls training data is inseparable from understanding who controls AI development.
  • AI and Labor: What Both Sides Are Protecting — the structural critique's concern with creative worker displacement connects to the broader debate about AI's effects on employment; this map addresses the labor dimension across sectors, while the IP map focuses specifically on the intersection of creative work, ownership, and the value generated by training data.
  • Platform Content Moderation: What Each Position Is Protecting — shares a foundational problem with the AI-IP debate: the question of what internet platforms owe to the people whose content makes them valuable; content moderation debates and AI training debates both rest on contested assumptions about who bears responsibility for how platforms use user-generated material.
  • Wealth Inequality: What Both Sides Are Protecting — the concentration of AI value in a small number of companies is a specific instance of the broader debate about how technological rents are distributed; the structural critic's argument that AI wealth should be more broadly shared echoes the general argument about whether market mechanisms produce fair distributions from technological innovation.
  • Digital Privacy and Surveillance: What Both Sides Are Protecting — data scraping for AI training and data collection for surveillance are formally distinct but share underlying questions about what consent means in a networked environment where making information publicly accessible has never historically meant consenting to all possible uses of it.

Further reading

  • Lawrence Lessig, Free Culture: How Big Media Uses Technology and the Law to Lock Down Culture and Control Creativity (Penguin Press, 2004) — the foundational text of the open culture argument; Lessig argues that copyright extension and expansion have progressively colonized the creative commons that all culture requires; essential background for the open knowledge position's critique of IP maximalism in the AI context; Lessig popularized the idea of "remix culture" and co-founded the Creative Commons licensing framework.
  • Authors Guild et al. v. OpenAI, Inc., No. 23-cv-8292 (S.D.N.Y., filed 2023) — the major class action lawsuit by fiction writers (including John Grisham, Jodi Picoult, George R.R. Martin) alleging that ChatGPT was trained on their books without consent or compensation; the case raises the core questions about whether training constitutes copyright infringement and what remedy creators are entitled to; one of several coordinated legal challenges moving through US courts.
  • The New York Times Company v. Microsoft Corporation and OpenAI, No. 23-cv-11195 (S.D.N.Y., filed 2023) — the most prominent corporate plaintiff in the training data litigation; the Times argues that its journalism was used to train models that now compete directly with its core business; the case is significant because it involves a well-resourced plaintiff capable of litigating fully and because it includes evidence of near-verbatim reproduction of Times articles by GPT-4, complicating the "transformation" argument.
  • Andres Guadamuz, "The monkey selfie: copyright lessons for originality in photographs and internet jurisdiction" (Internet Policy Review, 2016) — a useful entry point into the question of non-human authorship and the legal limits of ownership claims over machine-mediated creative output; Guadamuz traces copyright law's traditional insistence on human authorship and shows why authorship fights over AI outputs are connected, but not identical, to the training data debate.
  • Kate Crawford, Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence (Yale University Press, 2021) — the most comprehensive treatment of AI's political economy from the structural critique perspective; Crawford traces AI development from mineral extraction through data labor to the concentration of power in a few large companies; chapters on data and labor are directly relevant to the IP debate's invisible workforce and the question of who captures value from collective knowledge.
  • U.S. Copyright Office, Copyright and Artificial Intelligence (Parts 1-3, 2024-2025), alongside the AI study's ex parte meeting record — the Copyright Office's multi-year study is now the most useful primary-source record of the legal and policy fight as it stands; it aggregates the Office's own analysis, public comments, and formal submissions from developers and rights holders, including the major arguments about licensing, fair use, and market harm in the parties' own words.
  • Nicholas Carlini et al., "Extracting Training Data from Large Language Models" (USENIX Security, 2021) — one of the clearest empirical demonstrations that large language models can retain and reproduce training examples under certain conditions; directly relevant to the fair use debate because it complicates the strongest version of the developer argument that training is purely transformative and never functionally reproductive.
  • Cory Doctorow, "Copyright won't solve creators' Generative AI problem" (Pluralistic, 2023) — a clear statement of the structural critique from a left-libertarian perspective; Doctorow argues that the AI copyright fight can be instrumentalized by large corporations on both sides — incumbent media firms using creator sympathy to extend copyright maximalism, AI firms using fair use doctrine to consolidate power — and that neither outcome necessarily serves working creators or the public.
  • U.S. Senate Judiciary Committee, Artificial Intelligence and Intellectual Property – Part II: Copyright (hearing record, July 12, 2023) — the cleanest congressional snapshot of how creator groups, platform firms, legal academics, and lawmakers were framing the dispute as the first wave of generative-AI copyright conflict arrived in Washington; useful for seeing which arguments were already visible before the lawsuits matured.
  • Abeba Birhane et al., "The Values Encoded in Machine Learning Research" (FAccT, 2022) — a systematic study of the assumptions embedded in highly cited machine learning papers; finds that the field overwhelmingly privileges performance, novelty, and efficiency over labor, power, and downstream harm; useful context for understanding how the research culture that produced generative AI systems relates to the questions the IP debate is now surfacing.