ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data, and the Irony Is Not Lost on the Internet
OpenAI has voiced concerns that China's DeepSeek AI models, known for their remarkably low cost, may have been developed using data from OpenAI. Donald Trump called DeepSeek a wake-up call for the U.S. tech industry after its emergence triggered a sharp decline in the stock prices of major AI-focused companies. Nvidia, the dominant supplier of the GPUs needed to run AI models, fell 16.86%, wiping nearly $600 billion off its market value, the largest single-day loss of market value in Wall Street history. Microsoft, Meta Platforms, Alphabet (Google's parent company), and Dell Technologies also saw substantial losses.
DeepSeek promotes its R1 model as a significantly more affordable alternative to Western AI offerings like ChatGPT. Built upon the open-source DeepSeek-V3, it reportedly requires far less computing power than Western models, with an estimated training cost of just $6 million. While this cost figure has been disputed, DeepSeek's emergence has raised questions about the massive investments made by American tech companies in AI, unsettling investors. The app quickly climbed to the top of the U.S. free app download charts, fueled by discussions about its effectiveness.
Bloomberg reported that OpenAI and Microsoft are investigating whether DeepSeek leveraged OpenAI's API to integrate OpenAI's AI models into its own. OpenAI acknowledged to Bloomberg that Chinese companies, among others, actively seek to adapt leading U.S. AI models, a practice it considers a violation of its terms of service. OpenAI said it employs countermeasures to protect its intellectual property, including careful selection of the capabilities included in released models, and is collaborating with the U.S. government to safeguard advanced models from unauthorized use.
David Sacks, President Donald Trump's AI czar, suggested to Fox News that DeepSeek trained its models through distillation, a technique in which one model learns from the outputs of a larger model. OpenAI reportedly objects to the practice. Sacks predicted that leading AI companies will implement measures to prevent such distillation in the coming months.
The situation has highlighted the irony of OpenAI's position, given past accusations about its own data-sourcing practices. Ed Zitron, a tech PR professional and writer, pointed out the hypocrisy, referencing OpenAI's reliance on vast amounts of internet data in the creation of ChatGPT.
OpenAI has previously acknowledged its reliance on copyrighted material for training its models. In a January 2024 submission to the UK's House of Lords, it stated that it was "impossible" to create AI tools like ChatGPT without access to copyrighted work, a claim that fed the ongoing debate over the use of copyrighted material in AI model training. The New York Times filed a lawsuit against OpenAI and Microsoft in December 2023, alleging unlawful use of its content, while OpenAI maintains that its training practices constitute "fair use." That suit followed a September 2023 lawsuit from 17 authors alleging widespread copyright infringement. Further complicating the issue, a 2018 U.S. Copyright Office finding stated that AI-generated art cannot be copyrighted.