OpenAI has launched o3-mini, the latest model in its reasoning series, now available in ChatGPT and the API. First previewed in December 2024, the model improves performance in science, math, and coding while costing less and responding faster than its predecessor, o1-mini, making it well suited to complex STEM tasks.
Key Features and Capabilities
The o3-mini model introduces several developer-focused features, including:
- Function calling, Structured Outputs, and developer messages for seamless integration into production environments.
- Streaming support, similar to o1-mini and o1-preview.
- Adjustable reasoning effort with three levels (low, medium, and high) to trade off speed against accuracy for a given task, as illustrated in the sketch below.
However, o3-mini does not support vision-related tasks. Developers requiring visual reasoning should continue using o1.
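The developer features above map onto the standard Chat Completions interface. The sketch below, written against the official OpenAI Python SDK, combines a developer message, an explicit reasoning_effort setting, and a Structured Outputs schema; the prompt, schema, and field names are illustrative rather than taken from OpenAI's documentation.

```python
# Minimal sketch with the official OpenAI Python SDK (pip install openai).
# The prompt and JSON schema are illustrative; check the API reference for
# the parameters your account and model snapshot actually support.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="high",  # "low", "medium" (the ChatGPT default), or "high"
    messages=[
        # o-series models take a "developer" message in place of a system message
        {"role": "developer", "content": "You are a concise math tutor."},
        {"role": "user", "content": "Factor x^2 - 5x + 6 and explain each step."},
    ],
    # Structured Outputs: constrain the reply to a JSON Schema
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "factoring_result",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "factors": {"type": "array", "items": {"type": "string"}},
                    "explanation": {"type": "string"},
                },
                "required": ["factors", "explanation"],
                "additionalProperties": False,
            },
        },
    },
)

print(response.choices[0].message.content)  # JSON string matching the schema
```

Dropping reasoning_effort to "low" trades some accuracy for faster, cheaper responses; that setting is the lever referred to throughout the benchmark results below.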
OpenAI describes o3-mini as a specialized model for technical fields, prioritizing precision and speed. In ChatGPT, it defaults to medium reasoning effort, balancing response time and accuracy.
Compared to o1-mini, it delivers clearer answers with stronger reasoning capabilities, reducing major errors by 39% on complex real-world questions. In blind evaluations, testers preferred o3-mini's responses over o1-mini's 56% of the time.
Performance Benchmarks
The o3-mini model has been tested across multiple STEM-focused evaluations, demonstrating superior performance:
- Mathematics (AIME 2024): Matches o1 at medium reasoning effort and surpasses both o1 and o1-mini at high effort.
- PhD-Level Science (GPQA Diamond): Outperforms o1-mini at low effort and matches o1 at high effort in biology, chemistry, and physics.
- Research-Level Math (FrontierMath): Solves 32% of problems on the first attempt at high reasoning effort, including 28% of the most difficult tasks.
- Competitive Coding (Codeforces): Achieves higher Elo scores than o1-mini across all effort levels and matches o1 at medium effort.
- Software Engineering (SWE-bench Verified): Scores highest among OpenAI’s released models, achieving up to 61% accuracy with internal tools.
- Coding Efficiency (LiveBench Coding): Surpasses o1-high even at medium effort and further extends its lead at high effort.
- General Knowledge: Outperforms o1-mini across various knowledge domains.
Additionally, o3-mini reaches its first token roughly 2,500ms faster on average than o1-mini, noticeably reducing perceived latency.
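The latency difference is easy to observe directly by streaming a response and timing the first content chunk. The sketch below does exactly that; the prompt is arbitrary, and absolute numbers will vary with network conditions and server load.

```python
# Rough time-to-first-token measurement via streaming (official OpenAI Python SDK).
# The prompt is arbitrary; results depend heavily on network and server load.
import time
from openai import OpenAI

client = OpenAI()

def time_to_first_token(model: str, prompt: str) -> float:
    start = time.monotonic()
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            return time.monotonic() - start  # first chunk carrying actual text
    return time.monotonic() - start

for model in ("o3-mini", "o1-mini"):
    print(f"{model}: {time_to_first_token(model, 'Summarize the Pythagorean theorem.'):.2f}s")
```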
Safety and Testing
OpenAI has emphasized safety in o3-mini’s development by implementing deliberative alignment, a training method that enables the model to analyze safety guidelines before responding. This approach improves its ability to handle sensitive prompts while maintaining accuracy and reliability.
OpenAI reports that o3-mini surpasses GPT-4o in safety and jailbreak evaluations. Prior to release, OpenAI conducted extensive safety testing, including external red-teaming and internal risk assessments. Further details on potential risks and mitigation strategies are outlined in the o3-mini system card.
Availability
The o3-mini model became available on January 31, 2025, for ChatGPT Plus, Team, and Pro users. Enterprise access will roll out in February 2025. It replaces o1-mini in the model picker, offering higher rate limits and lower latency, making it particularly suited for STEM, coding, and logical reasoning tasks.
- Plus and Team users now have an increased daily message limit of 150 (previously 50 with o1-mini).
- Pro users have unlimited access to both o3-mini and o3-mini-high, a higher-intelligence version that takes longer to generate responses.
- Free ChatGPT users can now access o3-mini by selecting ‘Reason’ in the message composer, marking the first time a reasoning model is available for free-tier users.
For developers, o3-mini is available through the Chat Completions API, Assistants API, and Batch API for selected users in tiers 3-5.
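For teams wiring o3-mini into agents or tooling, function calling works the same way as on other Chat Completions models. The sketch below is a hedged illustration: the tool name, its parameter schema, and the prompt are hypothetical.

```python
# Hypothetical function-calling sketch for o3-mini over the Chat Completions API.
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "run_unit_tests",  # hypothetical tool
            "description": "Run the project's unit tests and return pass/fail counts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "test_path": {
                        "type": "string",
                        "description": "Directory containing the tests to run.",
                    }
                },
                "required": ["test_path"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="medium",
    messages=[{"role": "user", "content": "Check whether the tests under ./tests still pass."}],
    tools=tools,
)

# If the model decides to call the tool, the call arrives as structured JSON arguments.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```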
Pricing
- $1.10 per million input tokens
- $0.55 per million cached input tokens
- $4.40 per million output tokens
- 63% cheaper than o1-mini (a worked cost estimate follows below)
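Put together, those prices make per-request costs straightforward to estimate. The snippet below is a back-of-the-envelope calculation with hypothetical token counts; note that for reasoning models, the tokens spent on internal reasoning are billed as output tokens.

```python
# Back-of-the-envelope cost estimate from the per-million-token prices above (USD).
# Token counts are hypothetical; reasoning tokens count toward output tokens.
INPUT_PRICE = 1.10          # uncached input, per 1M tokens
CACHED_INPUT_PRICE = 0.55   # cached input, per 1M tokens
OUTPUT_PRICE = 4.40         # output (including reasoning tokens), per 1M tokens

def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    return (
        (input_tokens - cached_tokens) * INPUT_PRICE / 1_000_000
        + cached_tokens * CACHED_INPUT_PRICE / 1_000_000
        + output_tokens * OUTPUT_PRICE / 1_000_000
    )

# e.g. an 8,000-token prompt, half served from the prompt cache, with 2,000 output tokens
print(f"${request_cost(8_000, 4_000, 2_000):.4f}")  # ≈ $0.0154
```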
Additionally, o3-mini now integrates with search, providing real-time answers with linked sources. This feature is an early prototype as OpenAI continues developing search capabilities for its reasoning models.
Future Outlook
OpenAI views o3-mini as part of its ongoing effort to make high-quality AI more accessible and cost-effective. The model builds on OpenAI’s 95% reduction in per-token costs since GPT-4, while maintaining strong reasoning capabilities.
As AI adoption expands, OpenAI says it remains focused on developing models that balance intelligence, efficiency, and safety, ensuring scalable AI solutions for technical and general use cases.