GPT‑4 API Structured Outputs: A Hands‑On Tutorial for Reliable JSON

A practical GPT‑4 API guide to Structured Outputs: enforce JSON Schemas via Responses and Chat Completions, with code, streaming, and production tips.

ASOasis

Overview

Structured outputs let you ask GPT‑4 class models (like GPT‑4o) for JSON that exactly matches a schema you define. This makes downstream code simpler, safer, and easier to test—no brittle regexes or ad‑hoc post‑processing. You can enable it in two ways:

  • Responses API: set text.format to a JSON Schema definition (recommended for new projects).
  • Chat Completions API: set response_format to type=json_schema; or use function calling with strict: true to constrain tool arguments. (platform.openai.com )

Why it matters: with constrained decoding, the API masks invalid tokens so the model can only produce outputs that satisfy your schema. This substantially improves reliability over plain “JSON mode.” (openai.com )

When to use structured outputs

Use structured outputs when you need parseable data, not prose. Common patterns:

  • Information extraction (entities, PII redaction, receipts, contracts)
  • Classification with enums and confidence scores
  • UI/command generation for agents and tools
  • Safety gating: extract only whitelisted fields to limit prompt injection blast radius (platform.openai.com )

Which models and APIs support it

  • Responses API: structured outputs via text.format are available on GPT‑4o and later families. Use Responses for new builds. (platform.openai.com )
  • Chat Completions API: structured outputs via response_format, and strict function calling for tool arguments. (platform.openai.com )

Notes you should know before shipping:

  • Only a subset of JSON Schema is enforced in strict mode; prefer simple, explicit schemas. (platform.openai.com )
  • First use of a new schema may add preprocessing latency before results are cached. (openai.com )
  • If a request is refused for safety or runs out of tokens, the API indicates that (e.g., refusal field), and the payload may not match your schema. Handle this explicitly. (platform.openai.com )

Quick start with the Responses API

In the Responses API, the schema goes in text.format rather than in response_format. Here’s a minimal example that extracts product data as structured JSON.

Python

import json

from openai import OpenAI

client = OpenAI()

# Strict mode expects every property to be listed in "required".
schema = {
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "price_usd": {"type": "number"},
    "in_stock": {"type": "boolean"},
    "tags": {"type": "array", "items": {"type": "string"}}
  },
  "required": ["name", "price_usd", "in_stock", "tags"],
  "additionalProperties": False
}

resp = client.responses.create(
  model="gpt-4o-2024-08-06",
  input="Acme SuperWidget — $129.99 — ships today; backorder tag removed",
  text={
    "format": {
      "type": "json_schema",
      "name": "product",
      "strict": True,
      "schema": schema
    }
  }
)

# output_text is the SDK's convenience accessor for the response text; here that text is JSON.
product = json.loads(resp.output_text)
print(product)

JavaScript/TypeScript

import OpenAI from "openai";
const client = new OpenAI();

const schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    price_usd: { type: "number" },
    in_stock: { type: "boolean" },
    tags: { type: "array", items: { type: "string" } }
  },
  required: ["name", "price_usd", "in_stock", "tags"], // strict mode expects every property here
  additionalProperties: false
};

const resp = await client.responses.create({
  model: "gpt-4o-2024-08-06",
  input: "Acme SuperWidget — $129.99 — ships today; backorder tag removed",
  text: {
    format: { type: "json_schema", name: "product", strict: true, schema }
  }
});

const product = JSON.parse(resp.output_text);
console.log(product);

The SDKs also provide parse helpers that accept typed schemas (Zod in JavaScript, Pydantic in Python), so you can read response.output_parsed directly:

JavaScript/TypeScript

import OpenAI from "openai";
import { zodTextFormat } from "openai/helpers/zod";
import { z } from "zod";
const client = new OpenAI();

// No default on tags: strict schemas expect every field to be present.
const Product = z.object({
  name: z.string(),
  price_usd: z.number(),
  in_stock: z.boolean(),
  tags: z.array(z.string())
});

const response = await client.responses.parse({
  model: "gpt-4o-2024-08-06",
  input: "Acme SuperWidget — $129.99 — ships today; backorder tag removed",
  text: { format: zodTextFormat(Product, "product") }
});

console.log(response.output_parsed); // typed object

Python

from pydantic import BaseModel
from openai import OpenAI
client = OpenAI()

# No default on tags: strict schemas expect every field to be present.
class Product(BaseModel):
    name: str
    price_usd: float
    in_stock: bool
    tags: list[str]

response = client.responses.parse(
    model="gpt-4o-2024-08-06",
    input="Acme SuperWidget — $129.99 — ships today; backorder tag removed",
    text_format=Product,
)
print(response.output_parsed)  # typed object

These helpers compile your schema and return parsed objects. (platform.openai.com )
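
To make "compile your schema and return parsed objects" concrete, here is a rough standard-library sketch of the idea: derive a strict-style JSON Schema from a typed class, then load JSON text back into that class. The names schema_for and parse_into are hypothetical (they are not SDK functions), the type mapping covers only the few types used in this tutorial, and the JSON string is hand-written rather than real model output.

```python
import json
from dataclasses import dataclass, fields
from typing import get_args, get_origin, get_type_hints

# Illustrative mapping only; real helpers support many more types.
_PRIMITIVES = {str: "string", float: "number", int: "integer", bool: "boolean"}


def _type_to_schema(tp) -> dict:
    """Translate a small subset of Python type hints into JSON Schema."""
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return {"type": "array", "items": _type_to_schema(item_type)}
    raise TypeError(f"unsupported type in this sketch: {tp!r}")


def schema_for(cls) -> dict:
    """Build a strict-style object schema: every field required, no extras."""
    hints = get_type_hints(cls)
    names = [f.name for f in fields(cls)]
    return {
        "type": "object",
        "properties": {n: _type_to_schema(hints[n]) for n in names},
        "required": names,
        "additionalProperties": False,
    }


def parse_into(cls, json_text: str):
    """Load JSON text (such as a response's output text) into an instance of cls."""
    return cls(**json.loads(json_text))


@dataclass
class Product:
    name: str
    price_usd: float
    in_stock: bool
    tags: list[str]


product_schema = schema_for(Product)
product = parse_into(
    Product,
    '{"name": "Acme SuperWidget", "price_usd": 129.99, "in_stock": true, "tags": []}',
)
print(product_schema["required"])
print(product)
```

In practice the SDK helpers are the better choice, since they also handle nested models, enums, and refusals; the sketch only shows the two halves of the round trip.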

Using Chat Completions instead

If you’re on the Chat Completions API, set response_format to a JSON Schema object. Example:

from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-4o-2024-08-06",
  messages=[{"role": "user", "content": "Summarize: Mars is the 4th planet."}],
  response_format={
    "type": "json_schema",
    "json_schema": {
      "name": "summary",
      "strict": True,
      "schema": {
        "type": "object",
        "properties": {"bullets": {"type": "array", "items": {"type": "string"}}},
        "required": ["bullets"],
        "additionalProperties": False
      }
    }
  }
)
print(completion.choices[0].message.content)

Prefer json_schema over the older json_object (JSON mode) on models that support it. If you expect to stay on Chat Completions long-term, consider migrating to Responses, where the same setting lives under text.format. (platform.openai.com)

Strict function calling (tool arguments)

When you use function calling, set strict: true on the function definition (alongside name and parameters) so that the model can only emit arguments that fit your schema. Set parallel_tool_calls to false as well, because the schema guarantee does not extend to calls generated in parallel.

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "create_ticket",
        "description": "Create a support ticket",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "priority": {"type": "string", "enum": ["low", "med", "high"]},
            "title": {"type": "string", "minLength": 3}
          },
          "required": ["priority", "title"],
          "additionalProperties": false
        }
      }
    }
  ],
  "parallel_tool_calls": false
}

With strict enabled, tool call arguments are constrained to your schema. Arguments produced in a parallel tool call may not match the supplied schema, which is why the example turns parallel calls off. (openai.com)
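
On the receiving side, your code still has to decode the arguments and route the call. The sketch below assumes the Chat Completions tool call shape, where function.arguments is a JSON-encoded string; the dict is a hand-written stand-in for a real response, and create_ticket, HANDLERS, and dispatch_tool_call are hypothetical names used only for illustration.

```python
import json


def create_ticket(priority: str, title: str) -> dict:
    """Hypothetical local implementation of the create_ticket tool."""
    return {"id": "TICKET-1", "priority": priority, "title": title}


# Registry mapping tool names to local callables (illustrative).
HANDLERS = {"create_ticket": create_ticket}


def dispatch_tool_call(tool_call: dict) -> dict:
    """Decode a tool call's JSON arguments and run the matching handler."""
    function = tool_call["function"]
    handler = HANDLERS.get(function["name"])
    if handler is None:
        raise KeyError(f"unknown tool: {function['name']}")
    # Arguments arrive as a JSON-encoded string, not as a dict.
    arguments = json.loads(function["arguments"])
    return handler(**arguments)


# Hand-written stand-in for a tool call taken from an API response.
example_call = {
    "id": "call_123",
    "type": "function",
    "function": {
        "name": "create_ticket",
        "arguments": '{"priority": "high", "title": "Login page returns 500"}',
    },
}
ticket = dispatch_tool_call(example_call)
print(ticket)
```

Because strict mode constrains the argument keys to the schema, passing them straight through as keyword arguments is reasonable here; without strict mode you would want to validate them first.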

Streaming structured outputs

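You can stream a structured response and read partially parsed snapshots as tokens arrive, which reduces time-to-first-byte. The schema guarantee applies to the completed output, so treat intermediate snapshots as provisional.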

import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";
const client = new OpenAI();
const Entities = z.object({
  attributes: z.array(z.string()),
  colors: z.array(z.string()),
  animals: z.array(z.string())
});

// Depending on your SDK version, stream() may live at client.chat.completions instead of client.beta.
const stream = client.beta.chat.completions
  .stream({
    model: "gpt-4.1",
    messages: [ { role: "system", content: "Extract entities" }, { role: "user", content: "The quick brown fox..." } ],
    response_format: zodResponseFormat(Entities, "entities")
  })
  .on("content.delta", ({ parsed }) => console.log(parsed)) // partial parse of the JSON received so far
  .on("refusal.done", () => console.log("request refused"));

await stream.done(); // wait until the stream has finished

The stream surfaces parsed deltas and explicit refusal events. (platform.openai.com )
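
If you consume raw text deltas yourself rather than the SDK's parsed events, remember that the accumulated text only becomes valid JSON once the stream is complete. The sketch below simulates that with hand-written chunks (not real API output); collect_stream is a hypothetical helper name.

```python
import json


def collect_stream(deltas):
    """Concatenate text deltas and record whether each intermediate buffer parses."""
    buffer = ""
    parse_attempts = []
    for delta in deltas:
        buffer += delta
        try:
            json.loads(buffer)
            parse_attempts.append(True)
        except json.JSONDecodeError:
            # Expected while the JSON document is still incomplete.
            parse_attempts.append(False)
    return json.loads(buffer), parse_attempts


# Fabricated chunks standing in for streamed text deltas.
chunks = ['{"attributes": ["quick"', '], "colors": ["brown"], ', '"animals": ["fox"]}']
entities, attempts = collect_stream(chunks)
print(attempts)
print(entities)
```

Only the final buffer parses, so downstream code that needs the full object should wait for the end of the stream.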

Handling errors and edge cases

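  • Refusals and truncation: Before parsing, check for a refusal and for a finish_reason that indicates truncation (for example, hitting the max token limit). In either case the payload may not match your schema. (platform.openai.com)
  • Unsupported keywords: Strict mode rejects schemas that use keywords outside its supported subset. That subset has changed over time (string-length and numeric-range keywords were among those unsupported at launch), so test each schema against the model you deploy. (platform.openai.com)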
  • Latency: The first request with a new schema can incur extra latency while the grammar is prepared and cached. Plan warm‑ups during deploys. (openai.com )
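
The refusal and truncation checks above can be sketched as a small guard that runs before any JSON parsing. It assumes the Chat Completions choice shape (finish_reason plus a message with content and refusal) and works on plain dicts so it runs without an API call; extract_payload and StructuredOutputError are hypothetical names.

```python
import json


class StructuredOutputError(Exception):
    """Raised when a response cannot be trusted to match the schema."""


def extract_payload(choice: dict) -> dict:
    """Return parsed JSON from a Chat Completions-style choice, or raise."""
    message = choice["message"]
    if message.get("refusal"):
        raise StructuredOutputError(f"model refused: {message['refusal']}")
    if choice.get("finish_reason") == "length":
        raise StructuredOutputError("output truncated by the token limit")
    try:
        return json.loads(message["content"])
    except (TypeError, json.JSONDecodeError) as exc:
        # TypeError covers content being None; JSONDecodeError covers malformed text.
        raise StructuredOutputError("content is not valid JSON") from exc


# Hand-written stand-in for a successful choice.
ok_choice = {
    "finish_reason": "stop",
    "message": {"content": '{"bullets": ["Mars is the 4th planet."]}', "refusal": None},
}
print(extract_payload(ok_choice))
```

Raising a dedicated exception keeps the failure path explicit, so callers can retry, raise the token limit, or surface the refusal instead of passing a partial object downstream.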

How it works under the hood (the short version)

The API compiles your JSON Schema to a grammar and uses constrained decoding to mask invalid next tokens. This enables reliable matching—even with nested or recursive structures that are hard to enforce with simple FSMs or regexes. (openai.com )
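
As a toy illustration of token masking (not OpenAI's implementation, which compiles the schema to a grammar), the sketch below decodes one character at a time and only lets characters compete if they keep the output on track toward one of the allowed enum values. The scoring function is a made-up stand-in for model logits, and all names are hypothetical.

```python
def allowed_next_tokens(prefix: str, valid_outputs: list[str]) -> set[str]:
    """Characters that keep `prefix` on track toward some valid output."""
    return {
        candidate[len(prefix)]
        for candidate in valid_outputs
        if candidate.startswith(prefix) and len(candidate) > len(prefix)
    }


def constrained_decode(score, valid_outputs: list[str]) -> str:
    """Greedy decoding where invalid tokens are masked before the argmax."""
    prefix = ""
    while prefix not in valid_outputs:
        allowed = allowed_next_tokens(prefix, valid_outputs)
        # Masking step: only tokens in `allowed` compete, however the model scores the rest.
        prefix += max(sorted(allowed), key=lambda tok: score(prefix, tok))
    return prefix


def toy_score(prefix: str, token: str) -> float:
    """Stand-in for model logits: this fake model simply prefers later letters."""
    return float(ord(token))


# JSON string literals for a status enum.
STATUSES = ['"draft"', '"published"', '"archived"']
result = constrained_decode(toy_score, STATUSES)
print(result)
```

Even though the fake model would happily emit any character, the mask guarantees the result is one of the three enum values. A grammar generalizes the same idea from a fixed list of strings to nested and recursive structures.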

Best‑practice checklist

  • Prefer Responses API + text.format for new work; migrate from response_format when convenient. (platform.openai.com )
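  • Always set strict: true, define additionalProperties: false on objects, and list every property in required.
  • Use enums for classifiers. Add string-length (minLength/maxLength) and numeric-range (minimum/maximum) constraints where the strict subset accepts them; otherwise enforce those limits in your own validation code.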
  • Keep schemas small and composable; avoid deeply nested objects unless required. (platform.openai.com )
  • For function calling, add strict: true and consider parallel_tool_calls: false. (openai.com )
  • Add explicit error handling for refusal and incomplete generations. (platform.openai.com )
  • Use structured outputs within agents to reduce injection surface area. (platform.openai.com )
  • Evaluate your tasks with the Evals examples for structured output quality. (cookbook.openai.com )
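
Two of the checklist items can be checked mechanically before a schema ever reaches the API. The sketch below walks a schema and reports objects that do not set additionalProperties to false or that leave a property out of required (strict mode expects every property to be listed there). It is a minimal lint with a hypothetical function name, not a full validator, and it only descends into properties and array items.

```python
def strict_schema_problems(schema: dict, path: str = "$") -> list[str]:
    """Collect violations of two object rules that strict mode relies on."""
    problems = []
    if schema.get("type") == "object":
        properties = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(properties) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for name, sub_schema in properties.items():
            problems.extend(strict_schema_problems(sub_schema, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_schema_problems(schema["items"], f"{path}[]"))
    return problems


# A deliberately loose schema to show what the lint reports.
loose_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "object", "properties": {}}},
    },
    "required": ["name"],
}
for problem in strict_schema_problems(loose_schema):
    print(problem)
```

Running a check like this in CI catches schema mistakes earlier than a rejected API request would.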

Bonus: JSON mode vs. structured outputs

JSON mode (json_object) only guarantees syntactically valid JSON; structured outputs (json_schema/text.format) guarantee schema adherence on supported models. Prefer the latter for production. (platform.openai.com )
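
The difference is easy to see with two hand-written strings: both parse as JSON, but only one has the shape the downstream code expects. The sketch below uses a simplified three-field product shape and a hypothetical matches_product_shape helper; with JSON mode you would need a check like this on every response, whereas structured outputs enforce the shape during generation.

```python
import json

REQUIRED_KEYS = {"name", "price_usd", "in_stock"}


def matches_product_shape(json_text: str) -> bool:
    """True only if the text is JSON and has exactly the expected keys and types."""
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and set(data) == REQUIRED_KEYS
        and isinstance(data["name"], str)
        and isinstance(data["price_usd"], (int, float))
        and not isinstance(data["price_usd"], bool)  # bool is a subclass of int
        and isinstance(data["in_stock"], bool)
    )


# Both strings are syntactically valid JSON; only the first has the expected shape.
good = '{"name": "Acme SuperWidget", "price_usd": 129.99, "in_stock": true}'
drifted = '{"product": "Acme SuperWidget", "price": "$129.99", "available": "yes"}'
print(matches_product_shape(good), matches_product_shape(drifted))
```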

Full example schema (reference)

{
  "type": "object",
  "properties": {
    "id": {"type": "string"},
    "status": {"type": "string", "enum": ["draft", "published", "archived"]},
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "title": {"type": "string", "minLength": 1},
          "qty": {"type": "integer", "minimum": 1},
          "price": {"type": "number", "minimum": 0}
        },
        "required": ["title", "qty", "price"],
        "additionalProperties": false
      }
    }
  },
  "required": ["id", "status", "items"],
  "additionalProperties": false
}

Summary

  • Turn on structured outputs in Responses (text.format) or Chat Completions (response_format) to get reliable, schema‑valid JSON.
  • Favor GPT‑4o models for strong adherence; plan for first‑schema warm‑up and handle refusals.
  • Keep schemas explicit and tight; test with Evals; use strict function calling when integrating tools. (platform.openai.com )
