ChatGPT Images 2.0: Better Text, More Languages, and Actual Reasoning in Pictures

OpenAI just dropped ChatGPT Images 2.0, and I’ve been poking at it for the past few days. The headline features are better text rendering, multilingual support, and something they call “advanced visual reasoning.” I’ll save you the marketing fluff and tell you what actually matters.

First, the text rendering. This has been a pain point for every image generation model I’ve used. DALL-E, Midjourney, and Stable Diffusion all struggle with putting readable text in images. ChatGPT Images 2.0 handles it noticeably better. I threw some tricky prompts at it: a sign reading “Open 24 Hours” in a neon font, a book cover with a title in cursive, a restaurant menu with prices. The model got the text right about 80% of the time. That’s a huge leap from earlier versions, where you’d be lucky to get readable text 30% of the time. It still stumbles on longer strings and unusual fonts, but for most practical use cases, it’s usable now.

Multilingual support is the other big addition. I tested it with Chinese, Arabic, and Hindi text on signs and posters. The Chinese characters came out crisp, with no weird artifacts or broken strokes. Arabic script flowed correctly right-to-left, which is something even some dedicated text tools mess up. Hindi was decent but had occasional spacing issues. This is better than I expected for a first multilingual release. If you’re making content for global audiences, this actually saves you from having to Photoshop text onto generated images.

The visual reasoning part is where things get interesting. The model can now look at an image and understand spatial relationships, object interactions, and even some cause-and-effect. I gave it a photo of a cluttered desk and asked it to generate a cleaner version with the coffee mug moved to the left of the monitor. It did exactly that, maintaining the mug’s style and lighting. This isn’t just inpainting — it’s understanding the scene and modifying it intelligently. I’ve seen this approach tried before in research papers, but this is the first time it feels production-ready.

But let’s be real: it’s not perfect. The model still hallucinates details. I asked it to generate a diagram of a simple electrical circuit with labeled components, and it invented a resistor that didn’t belong. The visual reasoning works well for simple scenes but gets confused with complex compositions. And it’s slow — generating a high-resolution image with text takes about 15-20 seconds, which feels glacial compared to the instant results from earlier versions.

I also noticed the model has a preference for certain styles. Give it a vague prompt like “a futuristic city” and it defaults to the same cyberpunk aesthetic we’ve seen a thousand times. You have to be very specific about art style if you want something different. This is a common problem across all image generators, but I’d hoped OpenAI would address it with this release.

One thing I genuinely appreciate: the safety filters are less aggressive now. Previous versions would refuse to generate images containing any recognizable brand logo or public figure. ChatGPT Images 2.0 still blocks obviously problematic content, but it lets through more legitimate use cases like generating a mockup with a real company logo for internal presentations. This is a sensible balance.

Pricing remains the same as before — included in ChatGPT Plus subscriptions with a daily limit, or pay-per-use via the API. The API pricing feels fair for what you get: $0.04 per image at standard resolution, $0.08 for high-res. If you’re generating dozens of images daily for commercial work, the costs add up, but for occasional use it’s fine.
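
To put “the costs add up” in numbers, here is a quick back-of-the-envelope sketch. It assumes the per-image prices quoted above, a 30-day month, and a steady daily volume; the function and variable names are mine, not anything from OpenAI’s API.

```python
# Per-image prices in USD as quoted in this post; treat them as an
# assumption and check the current price list before budgeting.
PRICE_PER_IMAGE = {"standard": 0.04, "high": 0.08}


def monthly_cost(images_per_day: int, resolution: str = "standard", days: int = 30) -> float:
    """Estimate API spend for a steady daily volume (hypothetical helper)."""
    return round(images_per_day * days * PRICE_PER_IMAGE[resolution], 2)


if __name__ == "__main__":
    for per_day in (5, 50):
        for res in ("standard", "high"):
            print(f"{per_day}/day at {res}: ${monthly_cost(per_day, res):.2f} per month")
```

At 5 standard images a day that works out to about $6 a month; at 50 high-res images a day it is about $120.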

I’ve been using image generation tools since the early GAN days, and this is the first time I’ve felt comfortable using one for actual client work involving text. The multilingual support opens up possibilities for localization teams and designers working in non-English markets. The visual reasoning, while not flawless, is genuinely useful for iterative design workflows where you need to adjust specific elements without regenerating everything.

Is it worth it if you’re already on ChatGPT Plus? There’s nothing to decide: the update is included automatically. If you’re paying for a different image generation service, this is worth a trial run, especially if text rendering or multilingual support matters for your work. Just don’t expect it to replace a human designer for complex compositions. It’s a tool, not a magic wand.
