- OpenAI and the White House have accused DeepSeek of using ChatGPT to cheaply train its new chatbot.
- Experts in tech law say OpenAI has little recourse under intellectual property and contract law.
- OpenAI's terms of use may apply but are largely unenforceable, they say.

Today, OpenAI and the White House accused DeepSeek of something akin to theft.

In a flurry of press statements, they said the Chinese upstart had bombarded OpenAI's chatbots with queries and hoovered up the resulting data trove to quickly and cheaply train a model that's now almost as good.

The Trump administration's top AI czar said this training process, called "distilling," amounted to intellectual property theft. OpenAI, meanwhile, told Business Insider and other outlets that it's investigating whether "DeepSeek may have inappropriately distilled our models."

OpenAI is not saying whether the company plans to pursue legal action, instead promising what a spokesperson termed "aggressive, proactive countermeasures to protect our technology."

But could it? Could it sue DeepSeek on "you stole our content" grounds, much like the grounds OpenAI was itself sued on in an ongoing copyright claim filed in 2023 by The New York Times and other news outlets?

BI posed this question to experts in technology law, who said challenging DeepSeek in the courts would be an uphill battle for OpenAI now that the content-appropriation shoe is on the other foot.

OpenAI would have a hard time proving an intellectual property or copyright claim, these lawyers said.

"The question is whether ChatGPT outputs" - meaning the answers it generates in response to queries - "are copyrightable at all," Mason Kortz of Harvard Law School said.

That's because it's unclear whether the answers ChatGPT spits out qualify as "creativity," he said.

"There's a doctrine that says creative expression is copyrightable, but facts and ideas are not," Kortz, who teaches at Harvard's Cyberlaw Clinic, said.

"There's a huge question in intellectual property law right now about whether the outputs of a generative AI can ever constitute creative expression or if they are necessarily unprotected facts," he added.

Could OpenAI roll those dice anyway and claim that its outputs are protected?

That's unlikely, the lawyers said.

OpenAI is already on the record in The New York Times' copyright case arguing that training AI is an allowable "fair use" exception to copyright protection.

If they do a 180 and tell DeepSeek that training is not a fair use, "that might come back to kind of bite them," Kortz said. "DeepSeek could say, 'Hey, weren't you just saying that training is fair use?'"

There may be a distinction between the Times and DeepSeek cases, Kortz added.

"Maybe it's more transformative to turn news articles into a model" - as the Times accuses OpenAI of doing - "than it is to turn outputs of a model into another model," as DeepSeek is said to have done, Kortz said.

"But this still puts OpenAI in a pretty tricky situation with regard to the line it's been toeing regarding fair use," he added.

A breach-of-contract suit is more likely

A breach-of-contract lawsuit is much likelier than an IP-based lawsuit, though it comes with its own set of problems, said Anupam Chander, who teaches technology law at Georgetown University.

The terms of service for Big Tech chatbots like those developed by OpenAI and Anthropic forbid using their content as training fodder for a competing AI model.

"So maybe that's the lawsuit you might possibly bring - a contract-based claim, not an IP-based claim," Chander said.

"Not, 'You copied something from me,' but that you benefited from my model to do something that you were not allowed to do under our contract."

There may be a hitch, Chander and Kortz said. OpenAI's terms of service require that most claims be resolved through arbitration, not lawsuits. There's an exception for lawsuits "to stop unauthorized use or abuse of the Services or intellectual property infringement or misappropriation."

There's a bigger hitch, though, experts said.

"You should know that the brilliant scholar Mark Lemley and a coauthor argue that AI terms of use are likely unenforceable," Chander said. He was referring to a January 10 paper, "The Mirage of Artificial Intelligence Terms of Use Restrictions," by Stanford Law's Mark A. Lemley and Peter Henderson of Princeton University's Center for Information Technology Policy.

To date, "no model creator has actually tried to enforce these terms with monetary penalties or injunctive relief," the paper says.

"This is likely for good reason: we think that the legal enforceability of these licenses is questionable," it adds. That's in part because model outputs "are largely not copyrightable" and because laws like the Digital Millennium Copyright Act and the Computer Fraud and Abuse Act "offer limited recourse," it says.

"I think they are likely unenforceable," Lemley told BI of OpenAI's terms of service, "because DeepSeek didn't take anything copyrighted by OpenAI and because courts generally won't enforce agreements not to compete in the absence of an IP right that would prevent that competition."

Lawsuits between parties in different nations, each with its own legal and enforcement systems, are always tricky, Kortz said.

Even if OpenAI cleared all the above hurdles and won a judgment from a US court or arbitrator, "in order to get DeepSeek to turn over money or stop doing what it's doing, the enforcement would come down to the Chinese legal system," he said.

Here, OpenAI would be at the mercy of another extremely complicated area of law - the enforcement of foreign judgments and the balancing of individual and corporate rights and national sovereignty - that stretches back to before the founding of the US.

"So this is, a long, complicated, fraught process," Kortz added.

Could OpenAI have protected itself better from a distilling attack?

"They could have used technical measures to block repeated access to their site," Lemley said. "But doing so would also interfere with normal customers."

He added: "I don't think they could, or should, have a valid legal claim against the searching of uncopyrightable information from a public site."

Representatives for DeepSeek did not immediately respond to a request for comment.

"We know that groups in the PRC are actively working to use methods, including what's known as distillation, to try to replicate advanced U.S. AI models," Rhianna Donaldson, an OpenAI spokesperson, told BI in an emailed statement.