DeepSeek is the Fork in History the US Needed
In reality, this is the best thing that could have happened and reinforces why the US will dominate the AI race.
Scrolling X this weekend, it was DeepSeek, DeepSeek, DeepSeek, and more DeepSeek.



And for good reason. DeepSeek just shook the AI world, forcing us all to fundamentally rethink how we develop AI going forward.
And of course the spiraling narratives started: calls for the end of the AI bubble, the end of Nvidia’s ascent, the rise of Chinese AI, and a slowdown in growth. Nasdaq futures were already down Sunday night: “DeepSeek Shakes Up Stocks as Traders Fear for US Tech Leadership.”
But I am calling BS on it all.
In reality, this is the best thing that could have happened and further reinforces why the US will dominate the AI race.
A New Fork in History
On one hand, DeepSeek represents an incredible breakthrough in performance and cost-efficiency. DeepSeek-R1 and DeepSeek-R1-Zero have set a new standard for AI development at a fraction of the cost.
On the other hand, it’s not short of controversy: questions around the ethics of how the model was developed, its actual development costs, and its biases toward Chinese government-controlled narratives.
But it doesn’t matter. What DeepSeek just demonstrated is a fork in history.
From a pre-DeepSeek timeline:
Dominated by a few companies with access to vast computational resources, leading to high-cost, high-barrier models.
A narrative of ever-increasing computational power and data size dictating AI progress.
To a post-DeepSeek timeline:
Instead of a race for ever-larger models, a belief that efficiency in training and inference can yield superior results.
AI development that focuses more on algorithmic innovation and less on brute-force computation.
Innovation not as a result of unfettered access to resources, but as a creative response to constraints, possibly leading to more sustainable tech development practices.
First, how DeepSeek is different from other models
DeepSeek’s approach to AI stands out from OpenAI, Google, and Anthropic in several key ways.
1. Pure Reinforcement Learning (RL)
Traditional Models: Most AI models, including OpenAI’s GPT series, rely heavily on supervised fine-tuning (SFT), where humans provide labeled data to teach the model.
DeepSeek’s Innovation: DeepSeek-R1-Zero is the first model trained entirely via reinforcement learning (RL) without SFT. This allows the model to self-evolve its reasoning capabilities through trial and error, leading to behaviors like self-reflection and multi-step error correction.
In simple terms, most AI models start by memorizing examples (like a student copying answers). DeepSeek R1 skips this and learns entirely through trial and error, like a robot figuring out how to walk by experimenting. It tries solutions, gets feedback on what worked, and improves over time.
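To make that concrete, here is a minimal, purely illustrative sketch of learning from trial and error alone: a toy "policy" with no labeled examples, only a rule-based reward that checks whether the final answer is correct. Everything here (the candidate strategies, the reward, the update rule) is hypothetical and vastly simplified; it shows the shape of pure-RL learning, not DeepSeek's actual training code.

```python
import random
from collections import defaultdict

# Toy illustration of learning purely from trial and error: no labeled
# examples, only a verifiable reward (did the final answer check out?).
# All names and strategies here are hypothetical stand-ins.

def reward(answer: int, problem: tuple[int, int]) -> float:
    """Rule-based reward: 1.0 if the answer is correct, else 0.0."""
    a, b = problem
    return 1.0 if answer == a + b else 0.0

# A trivially simple "policy": a preference score per candidate strategy.
strategies = {
    "add": lambda a, b: a + b,        # the correct strategy
    "subtract": lambda a, b: a - b,   # a wrong strategy
    "multiply": lambda a, b: a * b,   # another (mostly) wrong strategy
}
preferences = defaultdict(float)

for step in range(2000):
    problem = (random.randint(1, 9), random.randint(1, 9))
    # Sample a strategy in proportion to its learned preference.
    weights = [1.0 + preferences[name] for name in strategies]
    name = random.choices(list(strategies), weights=weights)[0]
    answer = strategies[name](*problem)
    # Reinforce whatever earned reward; no human-labeled data anywhere.
    preferences[name] += reward(answer, problem)

print(max(preferences, key=preferences.get))  # -> "add" after enough trials
```

The only feedback in this loop is a verifiable outcome, and preferences for whatever earns reward strengthen over time; that is the essence of learning without supervised fine-tuning.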
2. Group Relative Policy Optimization (GRPO)
Traditional Models: Training often requires a separate "critic" model to judge responses, which is computationally expensive.
DeepSeek’s Innovation: GRPO eliminates the need for a critic model by comparing batches of answers at once and rewarding the best ones. This reduces costs and speeds up training while maintaining high performance.
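Here is a minimal sketch of the group-relative scoring idea, assuming a rule-based checker has already scored each sampled answer. This is just the baseline-from-the-group trick at the heart of GRPO, not the full objective (which also involves a clipped policy-ratio update and a KL penalty).

```python
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Score each sampled answer relative to its group's mean reward.
    The group itself is the baseline, so no learned critic model is needed."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid division by zero
    return [(r - mean) / std for r in rewards]

# Example: 4 sampled answers to the same prompt, scored by a rule-based checker.
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))
# Answers that beat the group average get a positive advantage and are
# reinforced; below-average answers are pushed down.
```

Because the group mean serves as the baseline, no separate critic/value model has to be trained or run alongside the policy, which is where the cost savings come from.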
3. Self-Improvement Loop
Traditional Models: These models depend on human-generated data for learning, which limits scalability.
DeepSeek’s Innovation: DeepSeek-R1 generates its own practice problems and learns from its mistakes, creating a self-improvement loop. This allows the model to scale learning without being bottlenecked by human input.
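A hypothetical sketch of that loop: generate practice problems, attempt them, keep only the attempts an automatic verifier confirms, and feed those back in as training data. The problem generator, attempt function, and verifier below are toy stand-ins, not DeepSeek's pipeline.

```python
import random

# Hypothetical self-improvement loop: the system generates its own practice
# problems, checks its attempts with an automatic verifier, and keeps only
# the verified traces as new training data. No human labeling in the loop.

def generate_problem() -> tuple[int, int]:
    """Self-generated practice problem (here: trivial addition)."""
    return random.randint(1, 99), random.randint(1, 99)

def attempt(problem: tuple[int, int]) -> int:
    """Stand-in for the model's answer; sometimes wrong on purpose."""
    a, b = problem
    return a + b if random.random() > 0.3 else a + b + 1

def verify(problem: tuple[int, int], answer: int) -> bool:
    """Automatic checker -- the only feedback signal in the loop."""
    return answer == sum(problem)

training_pool: list[dict] = []
for _ in range(1000):
    problem = generate_problem()
    answer = attempt(problem)
    if verify(problem, answer):  # keep only verified solutions
        training_pool.append({"problem": problem, "answer": answer})

print(f"kept {len(training_pool)} verified examples for the next training round")
```

The key property is that the verifier, not a human annotator, gates what enters the training pool, so the loop scales with compute rather than with labeling effort.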
4. Internal Monologue on Full Display
Traditional Models: Models like ChatGPT often operate as "black boxes," making it difficult to understand their decision-making processes.
DeepSeek’s Innovation: DeepSeek-R1 puts its reasoning on display, showing its internal chain of thought while it works through a problem. You can follow its logic step by step to understand how it arrived at the eventual output.
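If you want to work with that visible reasoning programmatically, a small sketch like the one below separates the reasoning trace from the final answer, assuming the model wraps its chain of thought in <think>...</think> tags (the convention DeepSeek-R1 uses in its released outputs).

```python
import re

# Sketch of separating the visible reasoning trace from the final answer,
# assuming the chain of thought is wrapped in <think>...</think> tags.

def split_reasoning(raw_output: str) -> tuple[str, str]:
    match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", raw_output, flags=re.DOTALL).strip()
    return reasoning, answer

raw = "<think>12 * 9 = 108, then 108 + 4 = 112.</think>The result is 112."
reasoning, answer = split_reasoning(raw)
print("Reasoning:", reasoning)
print("Answer:", answer)
```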
Second, the importance of high-quality training data
One of the key factors behind DeepSeek’s success is the high-quality training data it started with. While many AI models rely on vast amounts of uncurated or noisy data, DeepSeek took a different approach:
1. Curated "Cold Start" Data
Before diving into full reinforcement learning, DeepSeek-R1 was initialized with a small but highly curated dataset of 80,000 reasoning chains. These examples were carefully selected to ensure clarity, accuracy, and logical structure.
Why this matters: This "cold start" phase provided the model with a strong foundation in reasoning and problem-solving, enabling it to generate coherent and logical outputs from the outset while requiring orders of magnitude less training data to get started.
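As a rough illustration of what "curated cold start" means in practice, the sketch below filters a pool of candidate reasoning chains down to clean, well-structured examples before any reinforcement learning begins. The filter criteria are made up for illustration; they are not DeepSeek's actual curation rules.

```python
# Hypothetical cold-start curation: keep only reasoning chains that are
# clearly structured, state a final answer, and stay readable in length.

def is_clean_chain(example: dict) -> bool:
    chain = example["reasoning"]
    return (
        "Step" in chain                      # explicit, numbered reasoning steps
        and example["answer"].strip() != ""  # a clearly stated final answer
        and len(chain.split()) < 2000        # readable length, not rambling
    )

candidate_pool = [
    {"reasoning": "Step 1: 3 * 4 = 12. Step 2: 12 + 5 = 17.", "answer": "17"},
    {"reasoning": "uhh it's probably around 20 or so", "answer": ""},
]

cold_start_data = [ex for ex in candidate_pool if is_clean_chain(ex)]
print(f"{len(cold_start_data)} of {len(candidate_pool)} chains kept for cold start")
```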
2. Focus on Reasoning and Logic
Unlike models trained on general-purpose datasets, DeepSeek’s initial data emphasized reasoning, logic, and structured thinking. This focus allowed the model to excel in tasks requiring multi-step problem-solving, such as math, coding, and scientific research.
Why this matters: By starting with high-quality, reasoning-focused data, DeepSeek avoided the pitfalls of models that struggle with logical consistency or fail to "show their work."
Why the U.S. Wins in AI Thanks to These Innovations
It’s simple. We have what the rest of the world doesn’t – the most sophisticated, purpose-built AI compute clusters in the world. DeepSeek represents the fork in history we needed to use that infrastructure 100x more effectively.
1. Cheaper, Faster Training - 90% Cheaper, or 10x Faster.
DeepSeek’s Innovation: Group Relative Policy Optimization (GRPO) eliminates the need for a separate "critic" model, reducing training costs by up to 90%.
Why the U.S. Wins: By lowering training costs, GRPO allows U.S. companies and research institutions to train more models with the same resources. This efficiency could lead to a surge in AI development, growing model performance by 10x from today with the same compute.
2. Self-Improvement Loop on Steroids - 10x More Effective Data Utilization
DeepSeek’s Innovation: The self-improvement loop allows models to generate their own practice problems and learn through trial and error, scaling efficiently with massive computing power.
Why the U.S. Wins: With our vast network of data centers and Nvidia's progress, we can generate billions of synthetic problems and solutions in days. This accelerates learning without the bottleneck of human input, enabling faster and more efficient model training. Models could generate 10x more training data through reasoning and understanding from the same underlying datasets.
3. Better Use of Compute = Better Models = 10x More Inference Time
DeepSeek’s Innovation: Redirects wasted computing power toward productive tasks like self-play and distillation, creating smarter, more efficient models.
Why the U.S. Wins: We can shift time and resources away from massive training clusters built to train ever-larger and more complex models, and toward massive inference clusters focused on solving real problems, from drug discovery to climate modeling, rather than wasting energy on inefficient training processes.
US strengths about to go on display
The creative innovations from DeepSeek, combined with US strengths in computing power, innovation, and infrastructure, can lead to 90% cheaper training costs and 10x faster training, with models that utilize data 10x more effectively and spend 10x more time in inference than in training.