GPT-5.4 Mini and Nano Extend Fast AI Workloads

OpenAI announced GPT-5.4 mini and GPT-5.4 nano on March 17, 2026, positioning them as smaller, faster GPT-5.4-family models for high-volume workloads.

Codex·2026.05.24·2 min read·OpenAI, Introducing GPT-5.4 mini and nano

Key Takeaways

•OpenAI announced GPT-5.4 mini and GPT-5.4 nano on March 17, 2026, positioning them as smaller, faster GPT-5.4-family models for high-volume workloads.
•GPT-5.4 mini is available in the API, Codex, and ChatGPT, while GPT-5.4 nano is API-only and priced for low-cost tasks.
•Marketing and product teams should route work to models by task type rather than assuming the largest model is always the right default.

Practical Interpretation

Marketers

Application Area: Campaign tagging and lead classification
Validation Point: Compare accuracy and unit cost against the current model
Risk: Segment errors from overusing a cheaper model
Metric: Classification accuracy, CPA movement

Product teams

Application Area: Real-time chat and in-app recommendations
Validation Point: Measure latency and user drop-off together
Risk: Faster responses may reduce explanation quality
Metric: p95 latency, repeat-question rate

Developers

Application Area: Code search, test support, subagents
Validation Point: Split large-model planning from small-model execution
Risk: Complex changes may be routed to the wrong model
Metric: Task success rate, retry rate

Operations teams

Application Area: Document extraction and review queues
Validation Point: Check false positives and false negatives on sample data
Risk: Data handling rules may be incomplete
Metric: Throughput, review rejection rate
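
The validation points above mostly come down to measuring accuracy and latency on the team's own samples. The following is a minimal sketch of that measurement, not a prescribed method: the `EvalRecord` fields and the function names are hypothetical, and the records are assumed to come from running a labeled sample set through whichever model is under evaluation.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class EvalRecord:
    """One labeled sample run through a model (hypothetical structure)."""
    expected: str      # label assigned by a human reviewer
    predicted: str     # label returned by the model under test
    latency_ms: float  # end-to-end response time for this sample


def accuracy(records: list[EvalRecord]) -> float:
    """Fraction of samples where the prediction matches the expected label."""
    return sum(r.expected == r.predicted for r in records) / len(records)


def p95_latency(records: list[EvalRecord]) -> float:
    """95th-percentile latency in milliseconds (needs at least two records)."""
    latencies = [r.latency_ms for r in records]
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(latencies, n=100, method="inclusive")[94]


def summarize(records: list[EvalRecord]) -> dict[str, float]:
    """Bundle both metrics so two models can be compared side by side."""
    return {"accuracy": accuracy(records), "p95_latency_ms": p95_latency(records)}
```

Running `summarize` once per candidate model on the same sample set gives a like-for-like comparison before any traffic is moved.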

OpenAI said GPT-5.4 mini supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills in the API, with a 400k context window. The announced API price for GPT-5.4 mini is $0.75 per 1M input tokens and $4.50 per 1M output tokens. GPT-5.4 nano is available only in the API, at $0.20 per 1M input tokens and $1.25 per 1M output tokens.
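
To turn the per-token prices quoted above into a budget figure, a rough calculation like the one below can help. This is an illustrative sketch only: the dictionary keys `"mini"` and `"nano"` are shorthand labels rather than API model identifiers, and the token counts and request volume in the usage note are hypothetical workload assumptions.

```python
# USD per 1M tokens, as quoted in the paragraph above.
PRICES_PER_MILLION = {
    "mini": {"input": 0.75, "output": 4.50},
    "nano": {"input": 0.20, "output": 1.25},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 requests_per_month: int) -> float:
    """Estimated USD cost per month if every request has the same token counts."""
    price = PRICES_PER_MILLION[model]
    per_request = input_tokens * price["input"] + output_tokens * price["output"]
    return requests_per_month * per_request / 1_000_000
```

For a hypothetical workload of 2,000 input tokens and 300 output tokens per request at 1 million requests per month, this works out to about $2,850 on GPT-5.4 mini and about $775 on GPT-5.4 nano, before any retries or fallback calls.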

Checklist

□Which workflows generate most of the current AI cost?
□Which product screens depend on fast response time?
□Have classification, extraction, and ranking tasks been separated from strategic reasoning?
□Is there a fallback path to a larger model or human review?
□Has the team measured accuracy and latency on its own data?
□Are support tickets, rework, and brand quality tracked after cost optimization?
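
Several checklist items (separating task types, keeping a fallback path) can be expressed as a small routing rule. The sketch below is one possible shape under stated assumptions: the tier names `"small"` and `"large"` are placeholders rather than API model identifiers, `call_model` is a caller-supplied function, and the `confidence` score is assumed to come from the team's own scoring or validation step, since nothing in the announcement describes such a field.

```python
from typing import Callable, NamedTuple


class Result(NamedTuple):
    label: str
    confidence: float  # 0..1, assumed to be produced by the team's own checks


# Hypothetical mapping from task type to model tier.
ROUTES = {
    "classification": "small",
    "extraction": "small",
    "ranking": "small",
    "strategic_reasoning": "large",
}


def route(task_type: str, payload: str,
          call_model: Callable[[str, str], Result],
          threshold: float = 0.8) -> tuple[str, Result]:
    """Return (handler, result), escalating when confidence is too low."""
    tier = ROUTES.get(task_type, "large")  # unknown task types go to the larger tier
    result = call_model(tier, payload)

    # First fallback: retry a low-confidence small-tier answer on the larger tier.
    if tier == "small" and result.confidence < threshold:
        tier = "large"
        result = call_model(tier, payload)

    # Second fallback: anything still below the threshold goes to human review.
    if result.confidence < threshold:
        return "human_review", result
    return tier, result
```

Logging the returned handler for each request also gives the retry rate and escalation rate needed to judge whether the routing table is sending the right work to the smaller tier.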

Sources

•OpenAI, Introducing GPT-5.4 mini and nano: https://openai.com/index/introducing-gpt-5-4-mini-and-nano/
•OpenAI Developers, Subagents: https://developers.openai.com/codex/subagents
•OpenAI Deployment Safety Hub, GPT-5.4 Thinking System Card Appendix: GPT-5.4 mini: https://deploymentsafety.openai.com/gpt-5-4-thinking/appendix-gpt-5.4-mini
