DeepSeek has released DeepSeek-V4 in preview, and the benchmarks are turning heads. The Chinese AI startup continues its streak of delivering frontier-competitive models at radical cost efficiency, this time with an open-weight Mixture-of-Experts (MoE) architecture that supports context windows of up to one million tokens.
Architecture Deep Dive
DeepSeek-V4 builds on the sparse MoE paradigm that made V3 a breakthrough, but with several major advances:
- Expanded context: 1M token support enables processing of entire codebases, legal document collections, and multi-day conversation histories in a single prompt
- Improved routing: V4’s expert routing mechanism activates only ~37B parameters per token from a total pool of ~670B, keeping inference costs dramatically lower than dense models of equivalent capability (see the routing sketch after this list)
- Enhanced reasoning: Extended chain-of-thought capabilities with built-in verification steps, closing the gap with proprietary models like Claude Opus 4.7 and GPT-5.4
- Multi-modal inputs: Native image and document understanding, with video comprehension marked as coming in the V4.1 update
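To make the routing idea concrete, here is a minimal sketch of top-k expert routing in the style sparse MoE layers generally use. The expert count, hidden size, and k value are illustrative assumptions, not DeepSeek's published configuration:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes only -- NOT DeepSeek-V4's actual configuration.
NUM_EXPERTS = 64   # total experts in the layer
TOP_K = 4          # experts activated per token
D_MODEL = 512      # hidden dimension

class TopKRouter(torch.nn.Module):
    """Sparse MoE routing sketch: each token is dispatched to its top-k
    experts, so only a small fraction of total parameters runs per token."""
    def __init__(self):
        super().__init__()
        self.gate = torch.nn.Linear(D_MODEL, NUM_EXPERTS, bias=False)
        self.experts = torch.nn.ModuleList(
            [torch.nn.Linear(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)]
        )

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.gate(x)                          # router logits
        weights, idx = scores.topk(TOP_K, dim=-1)      # pick k experts/token
        weights = F.softmax(weights, dim=-1)           # normalize over chosen k
        out = torch.zeros_like(x)
        for t in range(x.size(0)):                     # loops for clarity, not speed
            for slot in range(TOP_K):
                e = idx[t, slot].item()
                out[t] += weights[t, slot] * self.experts[e](x[t])
        return out

router = TopKRouter()
tokens = torch.randn(8, D_MODEL)
print(router(tokens).shape)  # torch.Size([8, 512])
```

The key property this illustrates: total capacity scales with the number of experts, while per-token compute scales only with k, which is why a ~670B-parameter model can serve tokens at roughly 37B-parameter cost.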
Cost Disruption Continues
At $0.27 per million input tokens and $1.10 per million output tokens through DeepSeek’s API, V4 undercuts OpenAI by roughly 20x and Anthropic by roughly 18x. For enterprises processing massive document collections or running persistent agent sessions, the cost savings are transformational.
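For a back-of-envelope sense of that pricing, the snippet below costs out a bulk document-processing job at the quoted V4 rates; the workload numbers are invented for illustration:

```python
# Back-of-envelope cost at DeepSeek's quoted V4 rates (USD per 1M tokens).
INPUT_RATE = 0.27
OUTPUT_RATE = 1.10

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a job at the quoted per-million-token rates."""
    return (input_tokens / 1e6) * INPUT_RATE + (output_tokens / 1e6) * OUTPUT_RATE

# Hypothetical workload: summarizing 10,000 contracts at
# ~50k input tokens and ~2k output tokens each.
docs = 10_000
total = api_cost(docs * 50_000, docs * 2_000)
print(f"${total:,.2f}")  # $157.00 for half a billion input tokens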
Open Weights, Open Questions
True to form, DeepSeek is releasing V4’s weights under a permissive license, allowing self-hosting and fine-tuning. This strategy has made DeepSeek one of the most influential forces in democratizing access to frontier-class AI.
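For teams planning to self-host, loading an open-weight checkpoint typically looks like the sketch below. The Hugging Face model identifier is hypothetical; check deepseek.com for the actual repository name and hardware requirements once the weights land:

```python
# Minimal self-hosting sketch using Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-V4"  # hypothetical repo name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",      # use the checkpoint's native precision
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # DeepSeek releases have typically shipped custom code
)

inputs = tokenizer("Summarize this contract:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```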
However, the open-weight approach raises familiar concerns:
- Safety: Without API-level guardrails, organizations deploying self-hosted V4 instances bear full responsibility for use-case restrictions
- Geopolitics: US policymakers continue to debate whether open-weight models from Chinese labs should face export or usage restrictions
- Competition: Meta’s Llama and Google’s Gemma teams are now competing for third place in a race that DeepSeek keeps redefining
Why It Matters
DeepSeek-V4 represents the most capable open-weight model ever released. Its combination of million-token context, efficient MoE architecture, and aggressive pricing makes it the default choice for cost-conscious enterprises and researchers worldwide. The frontier is no longer exclusive to billion-dollar labs.
Sources: aibusiness.com, deepseek.com