Details
- DeepSeek has introduced V3.2-Exp, an experimental large language model that builds on the previous V3.1-Terminus version.
- The standout feature is DeepSeek Sparse Attention (DSA), which reduces the computational cost of long-context processing by having each query attend only to a selected subset of tokens rather than the full context.
- Company benchmarks report that V3.2-Exp matches V3.1-Terminus in output quality while speeding up both training and inference.
- The model is accessible now via DeepSeek’s mobile app, web console, and public API.
- API prices per 1,000 tokens have been cut by over 50 percent across all tiers, starting September 29, 2025.
- For direct comparison, V3.1-Terminus will remain available on a legacy endpoint until October 15, 2025, at 15:59 UTC.
- Weights, an in-depth technical report, and TileLang/CUDA kernels are published under an open-source license to encourage community engagement and replication.
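To make the sparse-attention idea concrete, here is a minimal toy sketch of top-k sparse attention, where each query attends only to its k highest-scoring keys instead of the full context. This is an illustrative approximation of the general technique, not DeepSeek's actual DSA algorithm or its TileLang/CUDA kernels; the function name and shapes are assumptions for the example.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Toy top-k sparse attention (illustrative only, not DeepSeek's DSA).

    Each query attends to just its k highest-scoring keys, so softmax
    mass concentrates on a small subset of the context.
    Q: (n_q, d), K: (n_k, d), V: (n_k, d_v) -> output (n_q, d_v).
    """
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n_q, n_k)
    # Indices of the top-k keys for each query row.
    topk = np.argpartition(scores, -k, axis=-1)[:, -k:]
    # Mask every position except the top-k with -inf before softmax.
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, topk, 0.0, axis=-1)
    masked = scores + mask
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

With k equal to the full context length this reduces to ordinary dense softmax attention; the savings come from shrinking k relative to the context, which is where long-context workloads benefit.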
Impact
This dramatic price drop puts pressure on major LLM providers like OpenAI and Anthropic, particularly for long-context tasks where costs are typically high. DeepSeek’s efficiency advances and open-sourcing strategy align with growing industry demands for transparency and signal a shift toward hardware-conscious AI architectures. As adoption grows, these moves may reshape funding flows and spur greater investment in efficient, sparse computation techniques across the AI landscape.