Audrey Woods | CSAIL Alliances  

January 28th, 2025 

What is DeepSeek?
  • DeepSeek is a small artificial intelligence lab and startup based in Hangzhou, China, founded in 2023 by Liang Wenfeng, a prominent investor and entrepreneur in AI technology. In addition to being the company’s CEO, Liang also created High-Flyer, the hedge fund solely responsible for funding DeepSeek. Forbes says, “This unique funding model has allowed DeepSeek to pursue ambitious AI projects without the pressure of external investors, enabling it to prioritize long-term research and development.”
  • On January 20th, 2025, DeepSeek released DeepSeek R1, a new open-source Large Language Model (LLM) that is comparable to top AI models like ChatGPT but was built at a fraction of the cost, allegedly coming in at only $6 million. For comparison, GPT-4 is estimated to have cost OpenAI over $100 million.
  • DeepSeek R1 has about 670 billion parameters, making it the largest open-source LLM yet, according to BBC.
  • DeepSeek’s success with the R1 model is based on several key innovations, Forbes reports, such as:
    • Heavily relying on reinforcement learning.
    • Utilizing a “mixture-of-experts” architecture, which allows it to activate only a small number of parameters for any given task (cutting down on costs and enhancing efficiency).
    • Incorporating multi-head latent attention to handle multiple input aspects simultaneously.
    • Employing distillation techniques to transfer the knowledge of larger and more capable models into smaller, more efficient ones.
  • By January 26th, DeepSeek’s mobile app reached the number one spot on the Apple App Store, bumping ChatGPT to number two on the same chart.
  • According to the Artificial Analysis Quality Index, DeepSeek R1 is now second only to OpenAI’s o1 model in overall quality, beating leading models from Google, Meta, and Anthropic.
  • Unlike many other commercial AI models, DeepSeek R1 has been released as open-source software, which has allowed scientists around the world to verify the model’s capabilities. DeepSeek is also pricing its API significantly lower than its competitors, encouraging widespread use.
  • More details about the methodology behind DeepSeek R1 can be found in their paper, DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.
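
The “mixture-of-experts” idea mentioned above can be illustrated with a toy sketch. This is not DeepSeek’s implementation: the function names, the eight scalar “experts,” and the fixed gate scores are all hypothetical, chosen only to show how a router can pick the top-k experts so that most parameters stay idle for a given input.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    # Renormalize over the selected experts only, so the weights sum to 1.
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the k selected experts and mix their outputs."""
    routed = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routed)
    return output, [i for i, _ in routed]

# Hypothetical setup: 8 tiny "experts", each just scaling its input.
NUM_EXPERTS = 8
experts = [lambda x, scale=s: scale * x for s in range(1, NUM_EXPERTS + 1)]

# In a real model a learned router computes these scores from the input;
# here they are fixed by hand purely for illustration.
gate_scores = [0.1, 5.0, 0.2, 4.0, -1.0, 0.3, 0.0, 1.5]

output, active = moe_forward(1.0, experts, gate_scores, k=2)
print(f"active experts: {active} of {NUM_EXPERTS}; output = {output:.3f}")
```

Only two of the eight experts run for this input, which is the source of the cost savings the article describes: the total parameter count can be large while the compute per input stays small.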
Why is DeepSeek important?
  • The Monday after the announcement of DeepSeek R1, the US stock market lost more than $1 trillion in market cap, with the S&P 500 dropping 1.5% and the more tech-heavy Nasdaq Composite dropping 3%. Nvidia was the most negatively affected: its stock dropped almost 17%, erasing $589 billion in market value (the largest single-day loss of market value for any company in history) and allowing Apple to overtake Nvidia as the most valuable company in the world. Other leading chipmakers saw similar declines.
  • The existence of DeepSeek R1 challenges many key assumptions in AI development, including:
    • The enormous assumed cost of building competitive AI models: DeepSeek R1 shows that quality models can be built faster and cheaper than previously thought.
    • The inevitable environmental impact of the burgeoning AI industry: By using fewer resources and leveraging more efficient programming methods, DeepSeek’s approach could mitigate the environmental and energy concerns surrounding AI.
    • The traditional belief that larger models and datasets are inherently superior: DeepSeek R1 shows that relatively small models can match or exceed the performance of much larger models when trained properly.   
    • That AI models can only be developed by large tech companies: DeepSeek’s accomplishment shows that startups and smaller companies are capable of competing with Silicon Valley giants.
    • That the US is leading in AI development: That this innovation came from China, even after US export restrictions on powerful AI chips, highlights China’s advanced AI capabilities and raises questions about the efficacy of US government efforts in what some are calling “the AI Arms Race.” Some even argue that US restrictions pressured Chinese AI firms to be more innovative and prioritize resource optimization.
    • The importance of proprietary IP: A notable aspect of DeepSeek’s strategy is its embrace of open source, which has fostered collaboration and pooled expertise, accelerating innovation.
  • Marc Andreessen, a prominent figure in the tech community, called the release of DeepSeek R1 “AI’s Sputnik Moment.” 
What are the concerns?
  • While DeepSeek R1 is rapidly gaining popularity, many worry about security threats. Forbes points out that DeepSeek’s privacy policy expressly tells users that extensive personal information (including IP addresses and keystrokes) is collected and stored in the People’s Republic of China. Chinese national security laws require Chinese firms to share data with government agencies (one reason for the scrutiny of TikTok), so businesses should exercise caution in using DeepSeek with sensitive data.
  • With its explosion in popularity, DeepSeek has already faced a large-scale cyberattack which forced the platform to disable new user registrations.
  • Users have discovered several ways in which DeepSeek’s training leads it to spread disinformation or avoid answering questions about subjects that are censored in China, such as Taiwan and Tiananmen Square.
  • Some are concerned that downloading and using DeepSeek code might expose users to later security risks, such as backdoor access.