Optional fields with OpenAI Structured Output

A small tip on improving optional fields with OpenAI's new response schema

Posted 20 Aug 2024


Recently, OpenAI announced an improvement to their structured outputs API. Among other things, the update included a way to enforce a single response shape, via the response_format configuration field. In short, they’ve developed an approach that guarantees the output adheres to your schema, by restricting the tokens the model can emit to those that are a) valid JSON and b) valid for your specific schema. For this reason, the first request with a new schema takes longer than the rest, since the schema must first be compiled into a form the model can be constrained by, as a one-off cost.

To get this guarantee, we developers are required to enable the strict: true option. In turn, we must also disallow additional properties and make every field required. However, real-world applications might not need every field to be present. In fact, requiring all fields may actively hurt our ability to extract the desired structured information.
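The two strict-mode rules above can be checked mechanically. The following is a minimal sketch, assuming a flat object schema; the ObjectSchema type and the strictModeProblems helper are hypothetical names for illustration and are not part of the OpenAI SDK. It only inspects the top level and does not recurse into nested objects.

```typescript
// Hypothetical shape for a flat JSONSchema object (illustration only).
type ObjectSchema = {
  type: "object";
  properties: Record<string, unknown>;
  required?: string[];
  additionalProperties?: boolean;
};

// Lists the ways a schema falls short of the two strict-mode rules:
// no additional properties, and every declared property marked required.
function strictModeProblems(schema: ObjectSchema): string[] {
  const problems: string[] = [];
  if (schema.additionalProperties !== false) {
    problems.push("additionalProperties must be false");
  }
  const required = new Set(schema.required ?? []);
  for (const key of Object.keys(schema.properties)) {
    if (!required.has(key)) {
      problems.push(`"${key}" must be listed in required`);
    }
  }
  return problems;
}

// A schema with one genuinely optional field fails both checks.
const loose: ObjectSchema = {
  type: "object",
  properties: { name: { type: "string" }, age: { type: "number" } },
  required: ["name"],
};

console.log(strictModeProblems(loose));
```

An empty array would mean the top level of the schema satisfies both rules.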

Using the zodResponseFormat helper, as shown in the docs, we might naively write our schema like so for our OpenAI call.

import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const client = new OpenAI();

const PersonSchema = z.object({
  name: z.string(),
  job: z.string().optional(),
  age: z.number().optional(),
});

const completion = await client.beta.chat.completions.parse({
  model: "gpt-4o-2024-08-06",
  messages: [
    /** ...config */
  ],
  response_format: zodResponseFormat(PersonSchema, "person"),
});

Here, we want the name, for sure, but the age and job are optional: maybe the input text didn’t provide that information, fair enough. However, by virtue of using zodResponseFormat, the strict, additionalProperties (irrelevant here) and required modifications are all applied for us. In other words, our .optional() calls are ignored as our schema gets converted into JSONSchema form. As a result, the LLM will always populate these fields. When the information is missing from the text we are parsing, this means hallucinated and/or filler values being returned. Not ideal.

The underlying structure for the above zod schema, in JSONSchema format, is something like:

{
  "strict": true,
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "job": {
      "type": "string"
    },
    "age": {
      "type": "number"
    }
  },
  "additionalProperties": false,
  "required": ["name", "job", "age"]
}
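To make the loss of optionality concrete, here is a rough sketch of the effect described above. It is an illustration only, not the SDK's actual conversion code; forceAllRequired and both schema types are hypothetical names. Whatever the input marks as optional, every declared property ends up in the required list.

```typescript
// Hypothetical, simplified schema shapes (illustration only).
type LooseObjectSchema = {
  type: "object";
  properties: Record<string, { type: string }>;
  required?: string[];
};

type StrictObjectSchema = LooseObjectSchema & {
  required: string[];
  additionalProperties: false;
};

// Mimics the effect of strict mode: all declared properties become
// required and additional properties are disallowed.
function forceAllRequired(schema: LooseObjectSchema): StrictObjectSchema {
  return {
    ...schema,
    required: Object.keys(schema.properties),
    additionalProperties: false,
  };
}

// What we meant: only `name` is required.
const intended: LooseObjectSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    job: { type: "string" },
    age: { type: "number" },
  },
  required: ["name"],
};

// What the model is constrained by: all three fields are required.
console.log(forceAllRequired(intended).required);
```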

Fix #1

An improvement, you might suggest, could be to make the .optionals nullable instead.

const PersonSchema = z.object({
  name: z.string(),
  job: z.string().nullable(), // <- was .optional()
  age: z.number().nullable(), // <- was .optional()
});

Resulting in

{
  "strict": true,
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "job": {
      "type": "string",
      "nullable": true // <--
    },
    "age": {
      "type": "number",
      "nullable": true // <--
    }
  },
  "additionalProperties": false,
  "required": ["name", "job", "age"]
}

This is certainly an improvement, insofar as the model now has some signal that these fields are not essential. Anecdotally, the results were slightly better, but this didn’t fully solve the problem.

The Best Solution

The nullable modifier here isn’t actually valid JSONSchema. Instead, it’s an OpenAPI/Swagger invention. Introduced in OpenAPI 3.0 (2017), the nullable keyword was removed in OpenAPI 3.1 (2021), which aligns with the JSONSchema standard and uses type unions to represent the nullability of a type.
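The difference between the two styles is small enough to express as a conversion. This is a minimal sketch for simple scalar fields only; toTypeUnion and the two field types are hypothetical names introduced for illustration.

```typescript
// OpenAPI 3.0 style: a single type plus an optional `nullable` flag.
type OpenApi30Field = { type: string; nullable?: boolean };

// JSONSchema / OpenAPI 3.1 style: nullability is part of the type itself.
type JsonSchemaField = { type: string | string[] };

// Rewrites `nullable: true` as a union of the original type and "null".
function toTypeUnion(field: OpenApi30Field): JsonSchemaField {
  const { type, nullable } = field;
  return nullable ? { type: [type, "null"] } : { type };
}

console.log(JSON.stringify(toTypeUnion({ type: "string", nullable: true })));
// prints {"type":["string","null"]}
```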

By using the desired type, unioned with a z.null(), the results were much better in my experimentation. Presumably, this is because of the better adherence to more modern standards.

Whether it’s better directly because of standards, or because of some downstream effect of that, like greater adoption/usage, is still to be confirmed.

const PersonSchema = z.object({
  name: z.string(),
  job: z.union([z.string(), z.null()]),
  age: z.union([z.number(), z.null()]),
});

The resulting JSONSchema looks like

{
  "strict": true,
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "job": {
      "type": ["string", "null"]
    },
    "age": {
      "type": ["number", "null"]
    }
  },
  "additionalProperties": false,
  "required": ["name", "job", "age"]
}

So now all fields are still “required”, but they can be null, in a way that is officially part of a) the most recent OpenAPI spec and b) JSONSchema. Ostensibly, this is why the approach is better than the prior two.
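Since every field now comes back present but possibly null, application code that prefers truly optional fields may want to normalise the parsed result. Here is one possible sketch, assuming the person shape used throughout this post; PersonFromModel, Person and toPerson are hypothetical names, not part of any library.

```typescript
// Shape returned by the model: every key present, some values null.
type PersonFromModel = {
  name: string;
  job: string | null;
  age: number | null;
};

// Shape the rest of the application works with: missing means unknown.
type Person = {
  name: string;
  job?: string;
  age?: number;
};

// Drops null fields so downstream code can treat them as absent.
function toPerson(raw: PersonFromModel): Person {
  const person: Person = { name: raw.name };
  if (raw.job !== null) person.job = raw.job;
  if (raw.age !== null) person.age = raw.age;
  return person;
}

console.log(toPerson({ name: "Ada", job: null, age: 36 }));
```

Keeping this step separate from the schema means the schema can stay strict while the rest of the codebase keeps its optional-field semantics.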

One caveat to this approach is that decorations like .describe(), which aid prompting, must be applied to the union itself, not the boxed value. As the schema above shows, only the type of each union member (the string, for example) is retained, so a description attached to the inner value would be dropped.
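In JSONSchema terms, the description has to sit next to the type array, at the level of the field as a whole. The sketch below builds such a field by hand to show where it belongs; nullableField and NullableField are hypothetical names for illustration, and the description text is made up.

```typescript
// A nullable scalar field, with an optional description at the same
// level as the type union (illustration only).
type NullableField = {
  type: [string, "null"];
  description?: string;
};

function nullableField(
  type: "string" | "number" | "boolean",
  description?: string,
): NullableField {
  const field: NullableField = { type: [type, "null"] };
  // The description is attached to the union as a whole, because the
  // individual members only contribute their type names.
  if (description !== undefined) field.description = description;
  return field;
}

console.log(JSON.stringify(nullableField("string", "Job title, or null if not stated")));
// prints {"type":["string","null"],"description":"Job title, or null if not stated"}
```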