Details
- Sam Altman announced that GPT-5.1 is now available to all API users at the same per-token cost as GPT-5.
- OpenAI introduced two code-centric versions: gpt-5.1-codex and the lighter gpt-5.1-codex-mini, both optimized for long-running software build and refactoring tasks that can span several hours.
- The API's prompt-caching window grows from one hour to 24 hours, letting developers reuse identical prompt prefixes at discounted input-token rates rather than paying the full cost on every request.
- OpenAI's updated blog post reports improved internal benchmark results, with higher reasoning and code-generation scores than GPT-5.
- No changes have been made to rate limits or endpoints; switching to the new models requires only updating the model name in the API call.
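A minimal sketch of what that migration looks like in practice: since endpoints and request shapes are unchanged, only the `model` string in the request body differs. The payload is shown as a plain dict (no network call is made), and the `build_request` helper and message contents are illustrative, not part of any official SDK.

```python
# Unchanged chat-completions endpoint (per the announcement, no endpoint changes).
ENDPOINT = "https://api.openai.com/v1/chat/completions"

def build_request(model: str, user_message: str) -> dict:
    """Assemble the JSON body for a chat request; only `model` varies."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

old = build_request("gpt-5", "Refactor this function.")
new = build_request("gpt-5.1", "Refactor this function.")

# Everything except the model name is identical between the two payloads.
assert {k: v for k, v in old.items() if k != "model"} == \
       {k: v for k, v in new.items() if k != "model"}
```

In other words, a one-line diff in the request body is the entire upgrade path, which is consistent with OpenAI keeping per-token pricing and rate limits fixed.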
Impact
The extended 24-hour caching reduces operational costs for high-traffic apps, pressuring competitors like Anthropic and Google on both price and efficiency. The specialized Codex variants strengthen OpenAI's position with enterprise developers, putting it in direct competition with GitHub Copilot and Amazon Q Developer (formerly CodeWhisperer). By holding pricing steady amid rising cloud infrastructure costs, OpenAI signals technical confidence and sets a new bar for industry pricing expectations.
