Details
- DeepSeek has introduced V3.2-Exp, an experimental large language model that builds on the previous V3.1-Terminus version.
- The standout feature is DeepSeek Sparse Attention (DSA), which reduces the computational cost of long-context processing by having each query attend only to a selected subset of tokens rather than the full context.
- Company benchmarks report that V3.2-Exp matches V3.1-Terminus in output quality while speeding up both training and inference.
- The model is accessible now via DeepSeek’s mobile app, web console, and public API.
- API prices per 1,000 tokens have been cut by over 50 percent across all tiers, starting September 29, 2025.
- For direct comparison, V3.1-Terminus will remain available on a legacy endpoint until October 15, 2025, at 15:59 UTC.
- Weights, an in-depth technical report, and TileLang/CUDA kernels are published under an open-source license to encourage community engagement and replication.
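To make the sparse-attention idea concrete, here is a minimal toy sketch of top-k sparse attention, where each query attends only to its k highest-scoring keys instead of the full context. This is an illustrative approximation of the general technique, not DeepSeek's actual DSA algorithm or its TileLang/CUDA kernels; the function name and shapes are assumptions for the example.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Toy top-k sparse attention (illustrative only, not DeepSeek's DSA).

    Each query attends to just its k highest-scoring keys, so softmax
    mass concentrates on a small subset of the context.
    Q: (n_q, d), K: (n_k, d), V: (n_k, d_v) -> output (n_q, d_v).
    """
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n_q, n_k)
    # Indices of the top-k keys for each query row.
    topk = np.argpartition(scores, -k, axis=-1)[:, -k:]
    # Mask every position except the top-k with -inf before softmax.
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, topk, 0.0, axis=-1)
    masked = scores + mask
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

With k equal to the full context length this reduces to ordinary dense softmax attention; the savings come from shrinking k relative to the context, which is where long-context workloads benefit.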
Impact
This dramatic price drop puts pressure on major LLM providers like OpenAI and Anthropic, particularly for long-context tasks where costs are typically high. DeepSeek’s efficiency advances and open-sourcing strategy align with growing industry demands for transparency and signal a shift toward hardware-conscious AI architectures. As adoption grows, these moves may reshape funding flows and spur greater investment in efficient, sparse computation techniques across the AI landscape.