fix!: align chat completions API with OpenAI spec #4598
Conversation
```python
limit: int | None = 20,
model: str | None = None,
order: Order | None = Order.desc,
after: str = None,  # type: ignore[assignment]
```
I've proposed a way of allowing for optional but non-nullable types in #4644.
Specifically, I created a helper `_remove_null_from_anyof` to ensure `null` does not get added to the OpenAPI schema, and I added runtime validation to ensure `null` is not explicitly provided.
This allows the type annotations to remain accurate (e.g. `str | None` with no MyPy errors) while ensuring both the spec and the runtime behavior conform to the OpenAI spec.
Would be great to get your thoughts on this.
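For context, a minimal sketch of what such a helper could look like (the real implementation lives in #4644; everything below is an illustrative assumption, not the actual code):

```python
def _remove_null_from_anyof(schema: dict) -> None:
    # Illustrative sketch of the helper described above; see #4644 for
    # the actual implementation. Drops the `null` variant from an
    # `anyOf`, leaving the field optional but not nullable in OpenAPI.
    any_of = schema.get("anyOf")
    if not any_of:
        return
    non_null = [s for s in any_of if s.get("type") != "null"]
    if len(non_null) == 1:
        # Collapse a single remaining variant into the parent schema.
        schema.pop("anyOf")
        schema.update(non_null[0])
    else:
        schema["anyOf"] = non_null
```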
Thanks! I took a look and compared. I am going to adopt your `_remove_null_from_anyof` here. Both of our approaches work, but yours is more sustainable and reusable.
Actually, I am seeing issues here. The inference API is still using `@webmethod` rather than FastAPI routers, which causes the OpenAPI spec to still contain `anyOf` if I use your approach. I might have to stick with this until the inference API is converted.
- Fix GET /chat/completions query parameters to match OpenAI:
- Remove nullable types from after, limit, model, order
- Add enum constraint to order parameter (asc, desc)
- Set correct default values
- Make finish_reason nullable in OpenAIChoice and OpenAIChunkChoice
to match OpenAI streaming response spec
- Make input_messages optional in OpenAICompletionWithInputMessages
since it's not part of standard OpenAI response
Fixed errors:
```
error [request-parameter-became-enum] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
the 'query' request parameter 'order' was restricted to a list of enum values
error [request-parameter-default-value-changed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
for the 'query' request parameter 'order', default value was changed from 'desc' to 'asc'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'after' list-of-types was narrowed by removing types 'null'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'limit' list-of-types was narrowed by removing types 'null'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'model' list-of-types was narrowed by removing types 'null'
error [request-parameter-list-of-types-narrowed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
'query' request parameter 'order' list-of-types was narrowed by removing types 'null'
error [request-parameter-type-changed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
for the 'query' request parameter 'limit', the type/format was changed from ''/'' to 'integer'/''
error [response-property-became-nullable] at docs/static/openai-spec-2.3.0.yml
in API POST /chat/completions
the response property 'choices/items/finish_reason' became nullable for the status '200'
error [response-required-property-removed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions
removed the required property 'data/items/input_messages' from the response with the '200' status
error [response-required-property-removed] at docs/static/openai-spec-2.3.0.yml
in API GET /chat/completions/{completion_id}
removed the required property 'input_messages' from the response with the '200' status
```
Signed-off-by: Charlie Doern <cdoern@redhat.com>
mattf left a comment
We need to avoid `# type: ignore` comments.
```python
after: str = None,  # type: ignore[assignment]
limit: int = 20,
model: str = None,  # type: ignore[assignment]
order: Literal["asc", "desc"] = "asc",
```
This change requires updates to providers, which are using `order.value`.
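To illustrate the breakage (a hedged sketch; actual provider code varies):

```python
from enum import Enum


class Order(Enum):
    asc = "asc"
    desc = "desc"


# Before this change, providers received the enum and unwrapped it:
order = Order.desc
assert order.value == "desc"

# With `order: Literal["asc", "desc"]`, providers get a plain str,
# so the existing `.value` access raises at runtime:
order = "asc"
order.value  # AttributeError: 'str' object has no attribute 'value'
```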
Another example of why we need to use request classes, so we don't have to update all provider impls. Coming soon with #4445.
Yeah, I can make this more proper. I can hold off until #4445 merges.
```diff
 delta: OpenAIChoiceDelta
-finish_reason: str
+finish_reason: str | None
```
That doesn't seem to be correct; the spec says:

```yaml
finish_reason:
  type: string
  description: >
    The reason the model stopped generating tokens. This will be `stop` if the model hit a
    natural stop point or a provided stop sequence,
    `length` if the maximum number of tokens specified in the request was reached,
    `content_filter` if content was omitted due to a flag from our content filters,
    `tool_calls` if the model called a tool, or `function_call` (deprecated) if the model called
    a function.
  enum:
    - stop
    - length
    - tool_calls
    - content_filter
    - function_call
```

Shouldn't this be?

```python
OpenAIFinishReason = Literal["stop", "length", "tool_calls", "content_filter", "function_call"]


@json_schema_type
class OpenAIChunkChoice(BaseModel):
    ...
    finish_reason: OpenAIFinishReason | None = None
    ...


@json_schema_type
class OpenAIChoice(BaseModel):
    ...
    finish_reason: OpenAIFinishReason
    ...
```
Putting this in draft while waiting for a few PRs to merge, like the FastAPI migration of inference and the new conformance test. Will revisit then!
What does this PR do?
The changes here align the GET /chat/completions endpoint with OpenAI's API spec.
When you write `after: str | None = None`, the schema generator produces something like:
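```yaml
# Reconstructed illustration: `str | None` typically renders as a
# nullable anyOf in the generated OpenAPI document.
after:
  anyOf:
    - type: string
    - type: "null"
```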
But OpenAI's spec has these parameters as optional but not nullable: you can omit them, but you can't explicitly pass `null`.

Using `after: str = None` produces the correct schema (`type: string`). The `# type: ignore[assignment]` silences mypy, since `None` isn't technically a valid `str`, but the schema generator only sees `str` and outputs the correct OpenAPI spec.
I looked at adopting the `_remove_null_from_anyof` approach from #4644, but that works for Pydantic model fields (like the embeddings request body), not for query parameters on `@webmethod`-decorated methods. The custom schema generator doesn't process `Query(json_schema_extra=...)` metadata. Once Inference is converted to a FastAPI router, we could revisit this (see the sketch below).

Resolves #4622
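As a rough sketch of that revisit (hypothetical; the function name and wiring below are assumptions, not part of this PR), the FastAPI-generated OpenAPI document could be post-processed to strip `null` from query-parameter schemas:

```python
def strip_nullable_query_params(openapi_schema: dict) -> dict:
    # Hypothetical post-processing pass over a FastAPI-generated OpenAPI
    # document: make query parameters optional but not nullable by
    # removing the `null` variant from their anyOf schemas.
    for path_item in openapi_schema.get("paths", {}).values():
        for operation in path_item.values():
            if not isinstance(operation, dict):
                continue
            for param in operation.get("parameters", []):
                schema = param.get("schema", {})
                any_of = schema.get("anyOf")
                if param.get("in") != "query" or not any_of:
                    continue
                non_null = [s for s in any_of if s.get("type") != "null"]
                if len(non_null) == 1:
                    # Collapse a single remaining variant into the schema.
                    schema.pop("anyOf")
                    schema.update(non_null[0])
                else:
                    schema["anyOf"] = non_null
    return openapi_schema
```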
Test Plan
The conformance diff between the LLS spec and the OpenAI spec should no longer report these errors.