GPT‑4 API Structured Outputs: A Hands‑On Tutorial for Reliable JSON
A practical GPT‑4 API guide to Structured Outputs: enforce JSON Schemas via Responses and Chat Completions, with code, streaming, and production tips.
Overview
Structured outputs let you ask GPT‑4 class models (like GPT‑4o) for JSON that exactly matches a schema you define. This makes downstream code simpler, safer, and easier to test—no brittle regexes or ad‑hoc post‑processing. You can enable it in two ways:
- Responses API: set text.format to a JSON Schema definition (recommended for new projects).
- Chat Completions API: set response_format to type=json_schema; or use function calling with strict: true to constrain tool arguments. (platform.openai.com)
Why it matters: with constrained decoding, the API masks invalid tokens so the model can only produce outputs that satisfy your schema. This substantially improves reliability over plain “JSON mode.” (openai.com)
When to use structured outputs
Use structured outputs when you need parseable data, not prose. Common patterns:
- Information extraction (entities, PII redaction, receipts, contracts)
- Classification with enums and confidence scores
- UI/command generation for agents and tools
- Safety gating: extract only whitelisted fields to limit prompt injection blast radius (platform.openai.com)
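For instance, the classification pattern above usually pairs an enum with a bounded confidence score. A sketch of such a schema (field names are illustrative, not from any official example):

```python
# Illustrative strict-mode classifier schema: the enum pins the label set,
# every field is required, and extra keys are forbidden.
sentiment_schema = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["label", "confidence"],
    "additionalProperties": False,
}
```

Because the model can only emit one of the three enum values, downstream code can switch on `label` without normalizing free-text variants.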
Which models and APIs support it
- Responses API: structured outputs via text.format are available on GPT‑4o and later families. Use Responses for new builds. (platform.openai.com)
- Chat Completions API: structured outputs via response_format, and strict function calling for tool arguments. (platform.openai.com)
Notes you should know before shipping:
- Only a subset of JSON Schema is enforced in strict mode; prefer simple, explicit schemas. (platform.openai.com)
- First use of a new schema may add preprocessing latency before results are cached. (openai.com)
- If a request is refused for safety or runs out of tokens, the API indicates that (e.g., refusal field), and the payload may not match your schema. Handle this explicitly. (platform.openai.com)
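One consequence of the subset rule: in strict mode every property must appear in required, so the documented way to express an "optional" field is a type union with null. A small helper sketch (the function name is ours, not part of any SDK):

```python
def make_nullable(schema_fragment: dict) -> dict:
    """Return a copy of a property schema that also admits null.

    Strict mode requires every property to be listed in `required`,
    so "optional" fields are modeled as type unions with "null".
    """
    out = dict(schema_fragment)
    t = out.get("type")
    if isinstance(t, str):
        # Turn "string" into ["string", "null"].
        out["type"] = [t, "null"]
    elif isinstance(t, list) and "null" not in t:
        out["type"] = t + ["null"]
    return out

# Example: a nullable discount field that still sits in `required`.
discount = make_nullable({"type": "number", "minimum": 0})
```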
Quickstart (Responses API, recommended)
The Responses API replaces response_format with text.format. Here’s a minimal example that extracts product data as structured JSON.
Python
import json

from openai import OpenAI

client = OpenAI()

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price_usd": {"type": "number"},
        "in_stock": {"type": "boolean"},
        "tags": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["name", "price_usd", "in_stock"],
    "additionalProperties": False
}

resp = client.responses.create(
    model="gpt-4o-2024-08-06",
    input="Acme SuperWidget — $129.99 — ships today; backorder tag removed",
    text={
        "format": {
            "type": "json_schema",
            "name": "product",
            "strict": True,
            "schema": schema
        }
    }
)

# output_text joins the model's text output; with a JSON Schema format it is the schema-valid JSON payload.
product = json.loads(resp.output_text)
print(product)
JavaScript/TypeScript
import OpenAI from "openai";

const client = new OpenAI();

const schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    price_usd: { type: "number" },
    in_stock: { type: "boolean" },
    tags: { type: "array", items: { type: "string" } }
  },
  required: ["name", "price_usd", "in_stock"],
  additionalProperties: false
};

const resp = await client.responses.create({
  model: "gpt-4o-2024-08-06",
  input: "Acme SuperWidget — $129.99 — ships today; backorder tag removed",
  text: {
    format: { type: "json_schema", name: "product", strict: true, schema }
  }
});

const product = JSON.parse(resp.output_text);
console.log(product);
The Responses API also provides parse helpers with typed schemas (Pydantic/Zod) so you can access response.output_parsed directly:
import OpenAI from "openai";
import { zodTextFormat } from "openai/helpers/zod";
import { z } from "zod";

const client = new OpenAI();

const Product = z.object({
  name: z.string(),
  price_usd: z.number(),
  in_stock: z.boolean(),
  tags: z.array(z.string()).default([])
});

const response = await client.responses.parse({
  model: "gpt-4o-2024-08-06",
  input: "Acme SuperWidget — $129.99 — ships today; backorder tag removed",
  text: { format: zodTextFormat(Product, "product") }
});

console.log(response.output_parsed); // typed object
Python (Pydantic helper)
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class Product(BaseModel):
    name: str
    price_usd: float
    in_stock: bool
    tags: list[str] = []

response = client.responses.parse(
    model="gpt-4o-2024-08-06",
    input="Acme SuperWidget — $129.99 — ships today; backorder tag removed",
    text_format=Product,
)

print(response.output_parsed)  # typed object
These helpers compile your schema and return parsed objects. (platform.openai.com)
Using Chat Completions instead
If you’re on the Chat Completions API, set response_format to a JSON Schema object. Example:
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Summarize: Mars is the 4th planet."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "summary",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"bullets": {"type": "array", "items": {"type": "string"}}},
                "required": ["bullets"],
                "additionalProperties": False
            }
        }
    }
)

print(completion.choices[0].message.content)
Prefer json_schema over the older json_object (JSON mode) on models that support it. If you stay on Chat Completions long‑term, plan a migration to Responses, where structured outputs move to text.format. (platform.openai.com)
Strict function calling (tool arguments)
When you use function calling, set strict: true on the function definition so that the model can only emit arguments that fit your parameters schema. Disable parallel tool calls if you need exact shape matching per call.
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "create_ticket",
        "description": "Create a support ticket",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "priority": {"type": "string", "enum": ["low", "med", "high"]},
            "title": {"type": "string", "minLength": 3}
          },
          "required": ["priority", "title"],
          "additionalProperties": false
        }
      }
    }
  ],
  "parallel_tool_calls": false
}
This ensures tool call arguments respect your schema; parallel tool calls can break shape guarantees if multiple calls are interleaved. (openai.com)
Streaming structured outputs
You can stream parsed chunks of a structured response, reducing time‑to‑first‑byte while maintaining schema guarantees.
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

const client = new OpenAI();

const Entities = z.object({
  attributes: z.array(z.string()),
  colors: z.array(z.string()),
  animals: z.array(z.string())
});

client.beta.chat.completions
  .stream({
    model: "gpt-4.1",
    messages: [
      { role: "system", content: "Extract entities" },
      { role: "user", content: "The quick brown fox..." }
    ],
    response_format: zodResponseFormat(Entities, "entities")
  })
  .on("content.delta", ({ parsed }) => console.log(parsed))
  .on("refusal.done", () => console.log("request refused"));
The stream surfaces parsed deltas and explicit refusal events. (platform.openai.com)
Handling errors and edge cases
- Refusals and truncation: Check refusal, finish_reason, and max token limits; don’t assume a valid payload if these are present. (platform.openai.com)
- Unsupported keywords: Some JSON Schema features are unsupported in strict mode—simplify and test your schema. (platform.openai.com)
- Latency: The first request with a new schema can incur extra latency while the grammar is prepared and cached. Plan warm‑ups during deploys. (openai.com)
How it works under the hood (the short version)
The API compiles your JSON Schema to a grammar and uses constrained decoding to mask invalid next tokens. This enables reliable matching—even with nested or recursive structures that are hard to enforce with simple FSMs or regexes. (openai.com)
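As a toy illustration of the idea (not OpenAI's actual implementation), imagine filtering each step's candidate tokens down to those the grammar allows next:

```python
import string

def mask_candidates(candidates, allowed):
    """Keep only candidate tokens the current grammar state permits.

    In real constrained decoding this is a logit mask applied before
    sampling; here it is reduced to a list filter for illustration.
    """
    return [tok for tok in candidates if tok in allowed]

# Inside a JSON number, only digits, a decimal point, or closing
# punctuation can legally come next.
allowed_in_number = set(string.digits) | {".", ",", "}"}
candidates = ["a", "7", "\"", ".", "}"]
print(mask_candidates(candidates, allowed_in_number))  # ['7', '.', '}']
```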
Best‑practice checklist
- Prefer Responses API + text.format for new work; migrate from response_format when convenient. (platform.openai.com)
- Always set strict: true and define additionalProperties: false on objects.
- Use enums for classifiers; add string and numeric constraints (minLength/maxLength, minimum/maximum) where strict mode supports them.
- Keep schemas small and composable; avoid deeply nested objects unless required. (platform.openai.com)
- For function calling, add strict: true and consider parallel_tool_calls: false. (openai.com)
- Add explicit error handling for refusals and incomplete generations. (platform.openai.com)
- Use structured outputs within agents to reduce injection surface area. (platform.openai.com)
- Evaluate your tasks with the Evals examples for structured output quality. (cookbook.openai.com)
Bonus: JSON mode vs. structured outputs
JSON mode (json_object) only guarantees syntactically valid JSON; structured outputs (json_schema/text.format) guarantee schema adherence on supported models. Prefer the latter for production. (platform.openai.com)
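In request terms the switch is a one-field change to response_format, shown here as Python dicts mirroring the Chat Completions example above:

```python
# JSON mode: guarantees well-formed JSON only; the shape is up to the model.
json_mode = {"type": "json_object"}

# Structured outputs: guarantees the payload matches the named schema.
structured = {
    "type": "json_schema",
    "json_schema": {
        "name": "summary",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"bullets": {"type": "array", "items": {"type": "string"}}},
            "required": ["bullets"],
            "additionalProperties": False,
        },
    },
}
```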
Full example schema (reference)
{
  "type": "object",
  "properties": {
    "id": {"type": "string"},
    "status": {"type": "string", "enum": ["draft", "published", "archived"]},
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "title": {"type": "string", "minLength": 1},
          "qty": {"type": "integer", "minimum": 1},
          "price": {"type": "number", "minimum": 0}
        },
        "required": ["title", "qty", "price"],
        "additionalProperties": false
      }
    }
  },
  "required": ["id", "status", "items"],
  "additionalProperties": false
}
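To sanity-check payloads against this schema locally without extra dependencies, here is a minimal validator covering only the keywords used in it. This is a testing sketch, not a full JSON Schema implementation; use a real validator library in production:

```python
# The reference schema, restated so this sketch is self-contained.
order_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string", "enum": ["draft", "published", "archived"]},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string", "minLength": 1},
                    "qty": {"type": "integer", "minimum": 1},
                    "price": {"type": "number", "minimum": 0},
                },
                "required": ["title", "qty", "price"],
                "additionalProperties": False,
            },
        },
    },
    "required": ["id", "status", "items"],
    "additionalProperties": False,
}

def validate(data, schema):
    """Check data against the keywords used above: type, enum, required,
    additionalProperties, items, minimum, minLength."""
    t = schema.get("type")
    if t == "object":
        if not isinstance(data, dict):
            return False
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is False and set(data) - set(props):
            return False
        if any(key not in data for key in schema.get("required", [])):
            return False
        return all(validate(val, props[key]) for key, val in data.items() if key in props)
    if t == "array":
        return isinstance(data, list) and all(validate(v, schema["items"]) for v in data)
    if t == "string":
        if not isinstance(data, str) or len(data) < schema.get("minLength", 0):
            return False
        return data in schema["enum"] if "enum" in schema else True
    if t in ("integer", "number"):
        if isinstance(data, bool):
            return False
        if t == "integer" and not isinstance(data, int):
            return False
        if t == "number" and not isinstance(data, (int, float)):
            return False
        return data >= schema.get("minimum", float("-inf"))
    return True

ok = validate(
    {"id": "A1", "status": "published",
     "items": [{"title": "Widget", "qty": 2, "price": 9.5}]},
    order_schema,
)
bad = validate({"id": "A1", "status": "deleted", "items": []}, order_schema)
```

Here `ok` is True and `bad` is False (the status value is outside the enum).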
Summary
- Turn on structured outputs in Responses (text.format) or Chat Completions (response_format) to get reliable, schema‑valid JSON.
- Favor GPT‑4o models for strong adherence; plan for first‑schema warm‑up and handle refusals.
- Keep schemas explicit and tight; test with Evals; use strict function calling when integrating tools. (platform.openai.com)