By Michael Justus

Thomson Reuters Enterprise Centre GmbH v. ROSS Intelligence Inc. is the first summary judgment ruling regarding fair use of copyrighted material to train generative artificial intelligence models, and it is a must-read.

Judge Stephanos Bibas normally sits on the US Court of Appeals for the Third Circuit, and is sitting on the US District Court for the District of Delaware by designation.

So, the decision provides an early glimpse of how an appeals court judge views key generative AI copyright issues.

In short, the case involves alleged unauthorized use of proprietary content from Thomson's Westlaw legal research database as training data for ROSS' generative AI legal research tool.

The court mostly denied the parties' motions for summary judgment, holding that "many of the critical facts in this case remain genuinely disputed" and must go to a jury. But the decision is packed with guidance that applies beyond the facts of the case.

Here are five key takeaways from the decision.

First, from a 10,000-foot view, the decision previews how judges — and a court of appeals judge in particular — may view generative AI copyright issues. For example, Bibas reflected on the difficulty of deciding between competing public policy interests:

"[W]e run into a hotly debated question: Is it in the public benefit to allow AI to be trained with copyrighted material? The value of any given AI is likely to be reflected in the traditional factors: How transformative is it? Can the public use it for free? Does it discourage other creators by swallowing up their markets? So an independent evaluation of the benefits of AI is unlikely to be useful yet, even though both the potential benefits and risks are huge. Suffice it to say, each side presents a plausible and powerful account of the public benefit that would result from ruling for it. So a jury must decide the fourth factor — and the ultimate conclusion on fair use."

Second, throughout the decision, the court highlighted the crucial difference under copyright law between unprotectable facts and ideas, and protectable creative expression. That issue permeates the fair-use test.

For example, the court explained that Thomson's Westlaw headnotes are more likely to be protected by copyright the more they differ from the underlying unprotectable judicial opinions they are meant to summarize. This is likely to be a theme in future cases, as courts delve into complex issues regarding which specific materials were allegedly copied by generative AI tools, and whether such materials constitute protectable creative expression rather than facts or ideas.

Third, the court addressed the impact of the US Supreme Court's May 18, 2023, decision in Andy Warhol Foundation for the Visual Arts Inc. v. Goldsmith on the first fair-use factor. The court held that Warhol leaves room for a commercial use to be transformative:

"[In Warhol], the Court determined that the use in question was not fair largely by emphasizing its commercial nature. But I decline to overread one decision, especially because the Court recognized that "use's transformativeness may outweigh its commercial character" and that in Warhol, 'both elements point[ed] in the same direction.' Plus, just two terms ago, in a technological context much more like this one, the Court placed much more weight on transformation than commercialism. So I focus on transformativeness."

Fourth, the court parsed the "intermediate copying" case law that some commentators believe will drive the results in generative AI litigation.

In those cases, copying material to discover unprotectable information or as a minor step toward developing an entirely new product — e.g., to understand technological compatibility of software — was fair use.

Applying those cases to generative AI tools, the court suggested that both the training process and the output of generative AI tools inform a fact-specific analysis:

"[W]hether the intermediate copying caselaw tells us that Ross's use was transformative depends on the precise nature of Ross's actions. It was transformative intermediate copying if Ross's AI only studied the language patterns in the headnotes to learn how to produce judicial opinion quotes. But if Thomson Reuters is right that Ross used the untransformed text of headnotes to get its AI to replicate and reproduce the creative drafting done by Westlaw's attorney-editors, then Ross's comparisons to [the intermediate copying caselaw] are not apt. Again, this is a material question of fact that the jury needs to decide."

Fifth, the court addressed application of the third fair-use factor — the amount and substantiality of the copying — to the fact that generative AI tools necessarily require vast amounts of training data.

The court suggested that the vast amount of training data must be balanced against the practical need for the data to further a transformative purpose:

"Westlaw says Ross copied far more than it needed. Ross says it needed a vast, diverse set of material to train its AI effectively. Though Ross need not prove that each headnote was strictly necessary, it must show that the scale of copying (if any) was practically necessary and furthered its transformative goals. So the third factor hinges on the answers to these disputed factual questions which the jury needs to resolve."

The court teed up but did not decide key issues that are likely to recur in other cases. Still, there is much to be learned from which factual issues the court focused on, and from how it telegraphed which way the legal analysis may go depending on how a jury resolves those issues.

I've highlighted only a handful of the many useful points in the decision.

Among other issues, the decision provides guidance regarding registration and infringement of compilations; the analyses for direct, contributory and vicarious infringement; the role, if any, of bad faith in the fair-use test; the test for market substitution or other market impacts under the fourth fair-use factor; Thomson's tortious interference claims, which will partially go to a jury; and a variety of affirmative defenses asserted by ROSS, which were dismissed.

The court indicated that a jury trial will be set for May 2024. In the meantime, the court's detailed decision provides a partial road map for litigants on both sides of generative AI copyright cases.

"5 Takeaways From Bellwether AI Copyright Case" *Law360, October 2, 2023

*Subscription may be required for article access.