MiniMax M2.5 Overview and Pricing
- Model Background: MiniMax M2.5 is the successor to the M2.1 and M2 models. It features 230 billion parameters, making it small enough to potentially run locally.
- "Cheap to Meter" Intelligence: MiniMax aims to make intelligence inexpensive: running the model continuously for an hour at 100 tokens per second (TPS) costs approximately $1.
- Model Variants:
  - M2.5 Lightning: Offers a steady throughput of 100 TPS, roughly twice as fast as other frontier models. Costs $0.30 per million input tokens and $2.40 per million output tokens.
  - M2.5: Operates at 50 TPS and costs half as much as the Lightning version.
- Cost Efficiency: The output price is 1/10th to 1/20th the cost of competitors like Claude 3 Opus, Gemini 1.5 Pro, and GPT-4o.
- Release Context: The model is being released ahead of the Chinese New Year, a period when many Chinese AI labs deploy new updates.
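
The "approximately $1 per hour" figure can be sanity-checked from the listed prices. The sketch below is only a back-of-the-envelope estimate: it counts output tokens only (input tokens are ignored), and it assumes that "half the price" for the standard M2.5 variant applies to the output rate. The function name `hourly_output_cost` is hypothetical and used purely for illustration.

```python
# Back-of-the-envelope check of the "about $1 per hour" claim.
# Prices come from the list above (USD per million tokens).
LIGHTNING_OUTPUT_PER_M = 2.40

# Assumption: "half the price" applies to the output rate of the standard variant.
STANDARD_OUTPUT_PER_M = LIGHTNING_OUTPUT_PER_M / 2


def hourly_output_cost(tokens_per_second: float, price_per_million: float) -> float:
    """Cost in USD of generating output tokens continuously for one hour."""
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour / 1_000_000 * price_per_million


lightning = hourly_output_cost(100, LIGHTNING_OUTPUT_PER_M)
standard = hourly_output_cost(50, STANDARD_OUTPUT_PER_M)

print(f"M2.5 Lightning: ${lightning:.2f} per hour")  # $0.86
print(f"M2.5:           ${standard:.2f} per hour")   # $0.22
```

At 100 TPS the model emits 360,000 output tokens per hour, which comes to about $0.86 at the Lightning output rate; input tokens would account for the remainder of the quoted $1.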
Capabilities and Use Cases
- Agentic Focus: The model is designed specifically as a "workhorse" for agentic workflows (coding, autonomous tasks) rather than general chat.
- Performance Benchmarks: Despite its smaller size and lower cost, it performs comparably to Claude 3 Opus and sometimes exceeds it.
- Inner Thinking: The model features robust "inner thinking" capabilities, allowing it to self-correct during complex tasks.
Coding and Performance Tests
- Expo Movie Tracker: Successfully built a movie tracking app using Expo. It handled over 52,000 tokens of generation and completed the task in 4 minutes.
- Go/Bubble Tea Terminal Calculator: Created a CLI calculator, correctly managing library installations and layout.
- Tauri Desktop App (Image Cropper): Built a functional image cropping tool in a single prompt, a task that often challenges much larger models like Claude 3 Opus.
- Nuxt Stack Overflow Clone: Successfully generated a clone including database integration and authentication.
- Svelte Kanban App: Developed a fully functional project management board with boards, lists, and tasks.
Summary
MiniMax M2.5 represents a significant shift toward high-performance, low-cost AI specialized for agentic applications and coding. Occupying the fourth spot on the creator's leaderboard, it achieves performance parity with frontier models like Claude 3 Opus while being roughly 30 times cheaper. Its high throughput (up to 100 TPS) and low price point are intended to enable the development of complex, autonomous agents without the "barrier of cost."