The new GPT-5 Codex model is a version of GPT-5 tuned for agentic tasks and coding, particularly inside Codex. OpenAI claims it is better at coding, costs less (it uses roughly 90% fewer tokens), is faster on most tasks, and can think longer on complex ones. In testing, it improved significantly on the previous GPT-5 Codex results. The model is now integrated into the Codex tool and Codex web. It performs well in many areas, but some limitations remain, notably complex multi-file edits and GDScript. Compared with CodeBuff, Claude, and GLM Code, the new GPT-5 Codex generally performs very well.
GPT-5 Codex Model Enhancements #
- Improved for Agentic Tasks & Coding: Tuned specifically for agentic tasks and coding, particularly inside Codex.
- Integration: Now integrated throughout the Codex tool, Codex web, and similar tools.
- Performance Claims:
  - Better at coding.
  - Costs less (uses ~90% fewer tokens).
  - Faster for most tasks.
  - Can think for longer on complex tasks.
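
To make the token claim concrete, here is a minimal arithmetic sketch. The ~90% reduction is the claim reported above; the baseline token count and the per-million-token price are hypothetical placeholders chosen for illustration, not published figures, and actual savings will depend on the task.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float = 0.90) -> int:
    """Return the token count left after applying a fractional reduction."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Return the cost in USD for a token count at a flat per-million rate."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical inputs: neither number comes from the review or from OpenAI.
baseline_tokens = 200_000
usd_per_million = 10.0

reduced_tokens = tokens_after_reduction(baseline_tokens)
print(f"tokens: {baseline_tokens} -> {reduced_tokens}")
print(f"cost:   ${cost_usd(baseline_tokens, usd_per_million):.2f} "
      f"-> ${cost_usd(reduced_tokens, usd_per_million):.2f}")
```

At a flat per-token price, cost scales linearly with token count, so a 90% token reduction would translate into roughly a 90% cost reduction for the same task.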
Benchmarking and Comparisons #
- Previous GPT-5 Codex: Was the lowest scorer in earlier tests.
- Current GPT-5 Codex (New):
  - Movie Tracker Expo App (TMDB API): Makes a "kind of good" app with a different style. Lacks the movie details page and other features found in the Claude and CodeBuff versions. Rated "amazingly good" compared to the previous GPT-5 Codex.
  - Visual Calculator (Go and Bubble Tea): Creates an "awesome" visual calculator, "one of the best," while using very few tokens.
  - FPS Game Editing (Godot): Still struggles with Godot and GDScript, producing syntax errors and attempting to use Python scripts. Remains "still bad at Godot."
  - SVG Creation Modal (OpenCode repo): Fails on multi-file edits, a task only CodeBuff handles successfully.
  - Overall Ranking: Considered better than Claude Code, taking second place behind CodeBuff.
- CodeBuff:
  - "Awesome" and "really good," but costs "almost more than double the price."
  - Best for the movie tracker app and the SVG creation modal.
- Claude:
  - Makes a better movie tracker app than the new GPT-5 Codex in terms of desired features.
- GLM Code:
  - The $3 coding plan is considered "unbeatable for students or people on a budget."
Usage and Pricing #
- Activation: Simple upgrade command in Codex.
- ChatGPT Subscription: Available with a ChatGPT subscription.
- Message Limits:
  - $20 plan: 30 to 150 messages per 5 hours (limits are uncertain and possibly load-dependent).
  - Pro plan: 300 to 1,500 messages (a wide and uncertain range).
- API Pricing: No API details are given; the reviewer hopes an API for the new model will be released.
Ninja Chat (Sponsor) #
- All-in-one AI platform: Access to top AI models (GPT-4o, Claude 4 Sonnet, Gemini 2.5 Pro) for $11/month.
- Features:
  - AI playground for side-by-side model comparison.
  - Mind map generator.
- Basic plan: 1,000 messages, 30 images, 5 videos monthly.
- Discount Codes:
  - KING25 for 25% off any plan.
  - KING40YEARLY for 40% off annual subscriptions.
Future Hopes and Overall Impression #
- Message Limits: Hopes for a stable limit around 300 messages.
- API Release: Wants an API release so the model can be used with tools like Roo Code or Cline (preferred over Codex).
- Codex VS Code Extension: Likes it for being less memory-hungry than Roo, but it doesn't match Roo's raw performance.
- Combined Tools: Suggests that combining Codex with Roo could offer a strong alternative to Sonnet.
- Mini Variant: Hopes for a GPT-5 Mini Codex variant that costs less.
- Token Efficiency: Appreciates that the new model uses fewer tokens, costs less, and performs better.
- OpenAI's Direction: Glad OpenAI is "listening and building useful stuff instead of just benchmark maxing."
- Overall: "Pretty cool."