fix!: align chat completions API with OpenAI spec #4598
Conversation
```python
limit: int | None = 20,
model: str | None = None,
order: Order | None = Order.desc,
after: str = None,  # type: ignore[assignment]
```
I've proposed a way of allowing for optional but non-nullable types in #4644.
Specifically, I created a helper `_remove_null_from_anyof` to ensure `null` does not get added to the OpenAPI schema, and I added runtime validation to ensure `null` is not explicitly provided.
This allows the type annotations to remain accurate (e.g. `str | None` with no MyPy errors) while ensuring both the spec and the runtime behavior conform to the OpenAI spec.
Would be great to get your thoughts on this.
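For context, a minimal sketch of what such a helper could look like (the real implementation lives in #4644; everything below is an illustrative assumption, not the actual code):

```python
def _remove_null_from_anyof(schema: dict) -> None:
    # Illustrative sketch of the helper described above; see #4644 for
    # the actual implementation. Drops the `null` variant from an
    # `anyOf`, leaving the field optional but not nullable in OpenAPI.
    any_of = schema.get("anyOf")
    if not any_of:
        return
    non_null = [s for s in any_of if s.get("type") != "null"]
    if len(non_null) == 1:
        # Collapse a single remaining variant into the parent schema.
        schema.pop("anyOf")
        schema.update(non_null[0])
    else:
        schema["anyOf"] = non_null
```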
Thanks! I took a look and compared. I am going to adopt your `_remove_null_from_anyof` here. Both of our approaches work, but yours is more sustainable and reusable.
Actually, I am seeing issues here. The inference API is still using `@webmethod` rather than FastAPI routers, which causes the OpenAPI spec to still contain `anyOf` if I use your approach. I might have to stick with this until the inference API is converted.
- Fix GET /chat/completions query parameters to match OpenAI:
- Remove nullable types from after, limit, model, order
- Add enum constraint to order parameter (asc, desc)
- Set correct default values
- Make finish_reason nullable in OpenAIChoice and OpenAIChunkChoice
to match OpenAI streaming response spec
- Make input_messages optional in OpenAICompletionWithInputMessages
since it's not part of standard OpenAI response
Fixed errors:
```
error [request-parameter-became-enum] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
the 'query' request parameter 'order' was restricted to a list of enum values
error [request-parameter-default-value-changed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
for the 'query' request parameter 'order', default value was changed from 'desc' to 'asc'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'after' list-of-types was narrowed by removing types 'null'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'limit' list-of-types was narrowed by removing types 'null'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'model' list-of-types was narrowed by removing types 'null'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'order' list-of-types was narrowed by removing types 'null'
error [request-parameter-type-changed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
for the 'query' request parameter 'limit', the type/format was changed from ''/'' to 'integer'/''
error [response-property-became-nullable] at docs/static/openai-spec-2.3.0.yml
in API POST /chat/completions
the response property 'choices/items/finish_reason' became nullable for the status '200'
error [response-required-property-removed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
removed the required property 'data/items/input_messages' from the response with the '200' status
error [response-required-property-removed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions/{completion_id}
removed the required property 'input_messages' from the response with the '200' status
```
Signed-off-by: Charlie Doern <cdoern@redhat.com>
mattf left a comment
We need to avoid `# type: ignore` comments.
```python
after: str = None,  # type: ignore[assignment]
limit: int = 20,
model: str = None,  # type: ignore[assignment]
order: Literal["asc", "desc"] = "asc",
```
This change requires updates to providers, which are using `order.value`.
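To illustrate the breakage (a hedged sketch; actual provider code varies):

```python
from enum import Enum


class Order(Enum):
    asc = "asc"
    desc = "desc"


# Before this change, providers received the enum and unwrapped it:
order = Order.desc
assert order.value == "desc"

# With `order: Literal["asc", "desc"]`, providers get a plain str,
# so the existing `.value` access raises at runtime:
order = "asc"
order.value  # AttributeError: 'str' object has no attribute 'value'
```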
Another example of why we need to use request classes, so we don't have to update all provider impls. Coming soon with #4445.
Yeah, I can make this more proper. I can hold off until #4445 merges.
```diff
 delta: OpenAIChoiceDelta
-finish_reason: str
+finish_reason: str | None
```
That doesn't seem to be correct; the spec says:

```yaml
finish_reason:
  type: string
  description: >
    The reason the model stopped generating tokens. This will be `stop` if the model hit a
    natural stop point or a provided stop sequence,
    `length` if the maximum number of tokens specified in the request was reached,
    `content_filter` if content was omitted due to a flag from our content filters,
    `tool_calls` if the model called a tool, or `function_call` (deprecated) if the model called
    a function.
  enum:
    - stop
    - length
    - tool_calls
    - content_filter
    - function_call
```

Shouldn't this be?

```python
OpenAIFinishReason = Literal["stop", "length", "tool_calls", "content_filter", "function_call"]


@json_schema_type
class OpenAIChunkChoice(BaseModel):
    ...
    finish_reason: OpenAIFinishReason | None = None
    ...


@json_schema_type
class OpenAIChoice(BaseModel):
    ...
    finish_reason: OpenAIFinishReason
    ...
```
Putting this in draft while waiting for a few PRs to merge, like the FastAPI migration of inference and the new conformance test. Will revisit then!
What does this PR do?
The changes here align the GET /chat/completions endpoint with OpenAI's API spec.
When you write `after: str | None = None`, the schema generator produces something like:
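```yaml
# Reconstructed illustration: `str | None` typically renders as a
# nullable anyOf in the generated OpenAPI document.
after:
  anyOf:
    - type: string
    - type: "null"
```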
But OpenAI's spec has these parameters as optional but not nullable: you can omit them, but you can't explicitly pass `null`.

Using `after: str = None` produces the correct schema (`type: string`). The `# type: ignore[assignment]` silences mypy, since `None` isn't technically a valid `str`, but the schema generator only sees `str` and outputs the correct OpenAPI spec.
I looked at adopting the `_remove_null_from_anyof` approach from #4644, but that works for Pydantic model fields (like the embeddings request body), not for query parameters on `@webmethod`-decorated methods. The custom schema generator doesn't process `Query(json_schema_extra=...)` metadata. Once Inference is converted to a FastAPI router, we could revisit this (see the sketch below).

Resolves #4622
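As a rough sketch of that revisit (hypothetical; the function name and wiring below are assumptions, not part of this PR), the FastAPI-generated OpenAPI document could be post-processed to strip `null` from query-parameter schemas:

```python
def strip_nullable_query_params(openapi_schema: dict) -> dict:
    # Hypothetical post-processing pass over a FastAPI-generated OpenAPI
    # document: make query parameters optional but not nullable by
    # removing the `null` variant from their anyOf schemas.
    for path_item in openapi_schema.get("paths", {}).values():
        for operation in path_item.values():
            if not isinstance(operation, dict):
                continue
            for param in operation.get("parameters", []):
                schema = param.get("schema", {})
                any_of = schema.get("anyOf")
                if param.get("in") != "query" or not any_of:
                    continue
                non_null = [s for s in any_of if s.get("type") != "null"]
                if len(non_null) == 1:
                    # Collapse a single remaining variant into the schema.
                    schema.pop("anyOf")
                    schema.update(non_null[0])
                else:
                    schema["anyOf"] = non_null
    return openapi_schema
```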
Test Plan
The conformance diff between the LLS spec and the OpenAI spec should no longer report these errors.