DeepSeek V4 is out, and it’s a bigger deal than you think

DeepSeek just released V4, their first flagship model since R1 shook the AI world back in January 2025. For those who forgot, R1 was the model that made everyone stop and pay attention—trained on a shoestring budget, it matched or beat much more expensive models. That single release turned DeepSeek from a quiet research team into China’s most famous AI company overnight.

Since then, DeepSeek has stayed relatively quiet. There were rumors of delays, some high-profile departures, and a lot of scrutiny from both the US and Chinese governments. But earlier this month, they teased something big by adding “expert” and “flash” modes to their online model. Now we know why.

V4 comes in two flavors: V4-Pro, a larger model for coding and complex agent tasks, and V4-Flash, a smaller, cheaper version for everyday use. Both are open-source, meaning anyone can download, modify, or build on them. Both handle up to 1 million tokens of context, roughly the length of three Lord of the Rings books in a single prompt. That’s a big deal for anyone working with long documents, codebases, or research papers.

The price is ridiculous

Let’s talk cost. V4-Pro charges $1.74 per million input tokens and $3.48 per million output tokens. V4-Flash is even cheaper: $0.14 per million input tokens and $0.28 per million output tokens. For comparison, OpenAI’s GPT-5.4 and Anthropic’s Claude-Opus-4.6 cost several times as much. This isn’t just cheap. It’s disruptive. Developers and startups can now build on frontier-level AI without burning through their entire budget.
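To make those prices concrete, here is a rough back-of-the-envelope sketch in Python. It only multiplies token counts by the per-million rates quoted above. The function name `request_cost` and the dictionary keys are my own labels for illustration, not official API identifiers, and the calculation ignores any caching discounts, rate tiers, or other billing details DeepSeek may apply.

```python
# Prices in USD per million tokens, as quoted in this post.
# The keys are illustrative labels, not official API model names.
PRICES = {
    "V4-Pro": {"input": 1.74, "output": 3.48},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full 1M-token prompt with a 2,000-token reply.
    for model in PRICES:
        cost = request_cost(model, input_tokens=1_000_000, output_tokens=2_000)
        print(f"{model}: ${cost:.4f}")
```

Under these assumptions, filling the entire context window once costs under two dollars on Pro and about fourteen cents on Flash, and the input side dominates the bill in both cases.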

And the performance? According to DeepSeek’s own results, V4-Pro matches or beats Claude-Opus-4.6, GPT-5.4, and Gemini-3.1 on major benchmarks. Against other open-source models like Alibaba’s Qwen-3.5 or Z.ai’s GLM-5.1, V4 leads on coding, math, and STEM problems. DeepSeek also ran an internal survey of 85 experienced developers, and over 90% ranked V4-Pro among their top choices for coding tasks. That’s not nothing.

A new approach to memory

The 1 million token context window isn’t just a number. DeepSeek claims they’ve redesigned the architecture to handle long sequences more efficiently. I haven’t tested it myself yet, but if true, this is a meaningful step forward. Many models lose coherence or start hallucinating well before they reach their advertised context limit. If V4 can actually hold a million tokens of context without falling apart, that changes what’s possible with AI agents, document analysis, and code review.

Why this matters beyond the numbers

This release is a signal. DeepSeek has been under a lot of pressure: personnel changes, government scrutiny, and the weight of being China’s AI poster child. V4 shows they can still ship. More importantly, it shows that open-source AI isn’t just catching up. It’s leading in some areas. The price-performance ratio here is absurd. US labs are spending billions on training runs, and DeepSeek is releasing competitive models at a fraction of the price.

Will V4 cause the same shock as R1? Probably not. The AI industry has gotten used to Chinese labs releasing strong open models. But this release matters because it keeps the pressure on proprietary models. It gives developers real alternatives. And it proves that the open-source approach isn’t a compromise—it’s a competitive advantage.

I’m curious to see how the US labs respond. They’ve been raising prices and locking down features. DeepSeek is going the opposite direction: open weights, cheap API, and strong performance. That’s a bet on ecosystem and adoption over short-term revenue. I think it’s the right call.

If you’re building on AI, V4 is worth a serious look. The Flash version is so cheap you can experiment without worrying about costs. And if you need raw power for agentic tasks, the Pro version is a no-brainer. DeepSeek is back, and they’re not messing around.
