OpenAI called ChatGPT as a witness in a major copyright lawsuit
Shall I compare thee to the legal principle of stare decisis? That’s what GPT-4o did to the Phillip Burton Federal Building in OpenAI’s latest legal filing.
“Just as stare decisis provides a stable framework where past decisions guide future rulings, the building’s design reflects a sense of order, consistency, and structure,” it said. Lawyers for OpenAI said that was GPT-4o’s response to the prompt “Describe the interior of 450 Golden Gate Avenue by analogy to a legal principle.”
The quote was included in a legal document filed Tuesday in the Northern District of California, an answer to a class-action lawsuit brought by a group of book authors including Michael Chabon, Ta-Nehisi Coates, and Sarah Silverman. The suit alleges that OpenAI violated copyright laws when it fed millions of books into its voracious AI models.
The snippet of AI-generated prose, cogent if a bit stilted, was intended to demonstrate the model’s ability to produce original thoughts without merely regurgitating the books and other material it was trained on.
“It is the model’s unique synthesis of the language and facts that it has learned,” the filing says.
OpenAI did not respond to a request for comment.
The filing provides a clue about OpenAI’s legal strategy in this lawsuit, not to be confused with the copyright lawsuit brought by Jia Tolentino and one of the authors of the Oppenheimer book, or the copyright lawsuit brought by John Grisham, David Baldacci, and the Authors Guild, or the copyright lawsuit by two journalists in New York, or the copyright lawsuit brought by The New York Times, all of which continue down their separate legal grooves along with the two dozen or so other pending legal complaints against the world’s foremost AI company.
It would appear that OpenAI is preparing to claim that even if it used some copyrighted material to train its models, it had every right to. The company argues that because the purpose of ChatGPT is “to create new material that never existed before,” it should be allowed, under the doctrine of fair use, to learn from and build upon these copyrighted works.
“The models learn, as we all do, from what has come before,” lawyers for OpenAI wrote in the filing. “The fair use defense exists for precisely that reason.”
The fact that OpenAI is even willing to contemplate that copyrighted material may have gone into training ChatGPT stands out in a document that contains 15 pages of exhaustive denials and admonitions. One section, for instance, reads: “OpenAI admits that the term artificial intelligence may be abbreviated as ‘AI.’ OpenAI denies any remaining allegations of this paragraph.”
Given the sheer quantity of data already hoovered up by OpenAI’s models, it seems implausible that the company can prove that no copyrighted materials were used.
So it looks, then, like this case, not to mention the many other copyright claims against OpenAI, may hinge on what exactly fair use means in the context of an AI model that can consume the entirety of human authorship to get better at producing poetic musings about federal buildings. Like most of the questions raised by generative AI, this one has not yet been pondered by the American judicial system.
But even if OpenAI can’t prevail on the fair-use claim, the filing lays out 11 other possible defenses, including that at least some of the material is in the public domain, that OpenAI didn’t really copy that much of it anyway, or that OpenAI didn’t know what it was doing when it copied. Take your pick.