DeepSeek V4

DeepSeek V4 Release & Features | 1.6T-Param MoE, 1M Context, Open-Source MIT

DeepSeek V4 is here: 1.6T-parameter MoE (49B active), 1M-token context, CSA+HCA hybrid attention, SWE-bench 80.6% | Released and open-sourced April 24, 2026

Last Update: April 2026

DeepSeek V4 was officially released and open-sourced under the MIT license on April 24, 2026, with weights on Hugging Face. It ships in two versions: V4-Pro (1.6T total / 49B active) for high-end reasoning and agentic coding, and V4-Flash (284B / 13B) for faster, lower-cost use. Both offer a 1M-token context window powered by a hybrid attention architecture (CSA + HCA) that cuts per-token compute to ~27% of V3.2 and KV-cache memory to ~10% at 1M context. V4-Pro scores 80.6% on SWE-bench Verified — the highest among open models, tied with Gemini 3.1 Pro (80.6%) and ahead of GPT-5.4 (77.2%). API pricing is $0.435/$0.87 per M tokens (Pro) and $0.14/$0.28 (Flash), roughly 5-30x cheaper than closed frontier models.

📅 Release Timeline

2024.12

DeepSeek-V3 Released

671B params, 37B active, MoE architecture

2025.01

MODEL1 Code Appears

MODEL1 identifier found in GitHub FlashMLA repo

2026.04

V4 Released & Open-Sourced

Launched April 24, 2026 (MIT). V4-Pro 1.6T/49B, V4-Flash 284B/13B, 1M context

2026.04

Enterprise Version Live

Atlas Cloud syncs V4 enterprise service on release

🚀 Core Features

From the official April 24, 2026 release

🌐

Two Versions: Pro & Flash

DeepSeek V4 ships in two open-source versions: V4-Pro for high-end reasoning and agentic coding, and V4-Flash for faster, lower-cost workloads. Both focus on text, code and reasoning.

• V4-Pro: 1.6T total / 49B active parameters
• V4-Flash: 284B total / 13B active parameters
• Both support a 1M-token context window
• Focused on text, code and reasoning

Source: DeepSeek Official

🏗️

1.6 Trillion Parameter MoE

V4-Pro features 1.6 trillion total parameters with only 49B active per token via Mixture-of-Experts. This delivers frontier performance at a fraction of closed-model cost.

• V4-Pro: 1.6T total, 49B active per token
• Mixture-of-Experts (MoE) architecture
• API pricing: $0.435/$0.87 per M tokens (Pro)
• Open-source under MIT license

Source: DeepSeek Official

📚

Million-Token Context

Both V4-Pro and V4-Flash support a 1M-token context window (max output ~384K), enough to process entire books, large codebases or ultra-long documents.

• 1M-token context by default on both versions
• Max output around 384K tokens
• Process entire books (~500K words)
• Analyze complete large project codebases

Source: DeepSeek Official

🧬

CSA + HCA Hybrid Attention

V4 combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) on top of its MoE design, making a 1M-token context window practical and cheap.

• Hybrid of CSA and HCA attention
• ~27% of V3.2 per-token compute at 1M context
• ~10% of V3.2 KV-cache memory at 1M context
• Enables low-cost ultra-long context

Source: DeepSeek Official

⚡

Ultra-Efficient 1M Context

Thanks to the CSA+HCA hybrid attention, V4 serves a full 1M-token context at a fraction of prior compute and memory cost — the core efficiency story of the release.

• Per-token compute ~27% of V3.2 at 1M context
• KV-cache memory ~10% of V3.2 at 1M context
• Makes million-token workloads affordable
• Strong throughput for agentic, long-context tasks

Source: DeepSeek Official

🧠

Strong Reasoning & Math

V4 posts strong reasoning and math results: GPQA Diamond 90.1%, MMLU-Pro 87.5%, GSM8K 92.6%, and Terminal-Bench 2.0 67.9% for agentic tasks.

• GPQA Diamond: 90.1%
• MMLU-Pro: 87.5%
• GSM8K: 92.6%
• Terminal-Bench 2.0: 67.9%

Source: DeepSeek Official Benchmarks

💰

Far Cheaper Than Closed Models

DeepSeek V4-Pro API pricing is $0.435/M input and $0.87/M output (after a 75% cut); V4-Flash is $0.14/$0.28. Open-source means free self-hosting.

• V4-Pro: $0.435 input / $0.87 output per M tokens
• V4-Flash: $0.14 input / $0.28 output per M tokens
• Roughly 5-30x cheaper than closed frontier models
• Open-source (MIT): free to self-host

Source: DeepSeek Official Pricing

🎯

Leads Open Models in Coding

V4-Pro scores 80.6% on SWE-bench Verified — the highest among open models, tied with Gemini 3.1 Pro (80.6%) and ahead of GPT-5.4 (77.2%).

• SWE-bench Verified: 80.6% (highest open model)
• LiveCodeBench Pass@1: 93.5
• Codeforces rating: 3206
• Strong multi-language and repo-level coding

Source: DeepSeek Official Benchmarks

🔬 Technical Deep Analysis

Inside the V4 hybrid attention architecture

Architecture Innovation

✓ MoE backbone with two sizes (1.6T/49B Pro, 284B/13B Flash)
✓ Hybrid attention combining CSA and HCA
✓ 1M-token context window by default
✓ Open-sourced under the MIT license

Efficiency at 1M Context

✓ Per-token compute ~27% of V3.2
✓ KV-cache memory ~10% of V3.2
✓ CSA compresses sparse long-range attention
✓ HCA heavily compresses retained context

Performance & Cost

✓ SWE-bench Verified 80.6% (highest open model)
✓ LiveCodeBench Pass@1 93.5, Codeforces 3206
✓ V4-Pro pricing $0.435/$0.87 per M tokens
✓ V4-Flash pricing $0.14/$0.28 per M tokens

📊 V3 vs V4 Comparison

Main upgrade points overview

Feature

Parameters

671B total / 37B active

Pro 1.6T/49B · Flash 284B/13B

Modality

Text only

Text, code, reasoning focused

Context

128K tokens

1M tokens (max output ~384K)

Memory

KV Cache

~10% of V3.2 KV-cache memory at 1M context

Attention

Standard MLA

CSA + HCA hybrid attention (~27% compute at 1M)

Reasoning

Standard

GPQA 90.1%, MMLU-Pro 87.5%, GSM8K 92.6%

API Price (input)

$0.28/1M tokens

$0.435/1M (Pro) · $0.14/1M (Flash)

Coding (SWE-bench)

~70%

80.6% (highest open model)

Open Source

Yes, Apache 2.0

Yes, MIT

Hardware

H800 optimized

Blackwell + Huawei Ascend + Cambricon

🏆 V4 vs Frontier Models

How DeepSeek V4 stacks up against GPT-5.4, Claude 4.6, and Gemini 3.1 Pro

Feature

DeepSeek V4

GPT-5.4

Claude 4.6

Gemini 3.1 Pro

Release Date

Apr 24, 2026

Mar 5, 2026

Feb 5, 2026

Feb 19, 2026

Context Window

1.05M

Architecture

MoE + CSA/HCA

MoE

Dense

MoE

Input Price

$0.435/M (Pro)

$2.50/M

$5.00/M

$2.00/M

Output Price

$0.87/M (Pro)

$15.00/M

$25.00/M

$12.00/M

SWE-bench

80.6%

77.2%

80.8%

80.6%

LiveCodeBench

93.5 (Pass@1)

N/A

Multimodal

Text/code/reasoning focused

Text + Vision + Audio

Text + Vision

Native (text/image/video/audio)

Open Source

✅ MIT

❌ Closed

Local Deploy

✅ Free self-host

❌ API only

📎 Information Sources

The following is based on DeepSeek's official release (2026-04-24)

🟢

Official Release

• Released and open-sourced (MIT) on April 24, 2026, weights on Hugging Face
• V4-Pro 1.6T/49B and V4-Flash 284B/13B, both 1M-token context
• CSA + HCA hybrid attention: ~27% compute, ~10% KV-cache memory vs V3.2 at 1M

🟢

Official Benchmarks

• SWE-bench Verified 80.6% (highest open model, tied with Gemini 3.1 Pro)
• LiveCodeBench Pass@1 93.5, Codeforces 3206
• MMLU-Pro 87.5%, GPQA Diamond 90.1%, GSM8K 92.6%, Terminal-Bench 2.0 67.9%

🟢

Official Pricing & Access

• V4-Pro $0.435/$0.87 per M tokens, V4-Flash $0.14/$0.28 per M tokens
• Available via chat.deepseek.com (Expert/Instant Mode), official API, Atlas Cloud
• Legacy deepseek-chat and deepseek-reasoner retire on July 24, 2026

⚠️ Disclaimer: The above reflects DeepSeek's official release on 2026-04-24. Some third-party benchmark figures may shift as evaluations are updated.

🎁 How to Use V4 Today

DeepSeek V4 is live on Atlas Cloud

✅ Available now, no waiting

✅ No server configuration needed, direct API calls

✅ Compatible with V3 code, zero upgrade cost

✅ Enterprise-grade stability and tech support

Get Your API Key

Create an API key in the console

Switch Model

Set model to 'deepseek-v4-pro' (or 'deepseek-v4-flash') in your API request

📬 Subscribe to V4 Updates

Get DeepSeek V4 news, tutorials and updates

✓ Release and update notifications✓ Technical analysis articles✓ Usage tutorials✓ Special offers

DeepSeek V4 Release & Features | 1.6T-Param MoE, 1M Context, Open-Source MIT

📅 Release Timeline

🚀 Core Features

Two Versions: Pro & Flash

1.6 Trillion Parameter MoE

Million-Token Context

CSA + HCA Hybrid Attention

Ultra-Efficient 1M Context

Strong Reasoning & Math

Far Cheaper Than Closed Models

Leads Open Models in Coding

🔬 Technical Deep Analysis

Architecture Innovation

Efficiency at 1M Context

Performance & Cost

📊 V3 vs V4 Comparison

🏆 V4 vs Frontier Models

📎 Information Sources

Official Release

Official Benchmarks

Official Pricing & Access

🎁 How to Use V4 Today

📬 Subscribe to V4 Updates

DeepSeek V4 Is Live — Try It Now