
feat(llm): support OpenAI Codex endpoints #62

Open
linux-devil wants to merge 1 commit into alibaba:main from linux-devil:codex/openai-codex-endpoint

Conversation


@linux-devil linux-devil commented Jun 6, 2026

Summary

  • add an OpenAI Responses API client selected by /v1/responses endpoints
  • send max_completion_tokens for GPT-5/Codex/o-series chat-completions models while preserving max_tokens for legacy models
  • make OCR environment variables take precedence over config and accept CI-compatible OCR_LLM_AUTH_TOKEN / OCR_LLM_USE_ANTHROPIC names
  • document Codex/GPT-5 setup examples

Tests

  • git diff --check
  • go test ./internal/llm
  • make build
  • local ocr llm test against an Azure GPT-5.5 chat-completions deployment

Note: go test ./... currently fails in internal/config/rules because existing tests expect Chinese default-rule snippets while the embedded default rules resolve to English text; that failure is unrelated to this LLM endpoint change.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Harshit Sharma does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Comment thread internal/llm/client.go
Comment on lines +507 to +513
if req.MaxTokens > 0 && useMaxCompletionTokens(model) {
    delete(body, "max_tokens")
    body["max_completion_tokens"] = req.MaxTokens
}
for k, v := range c.cfg.ExtraBody {
    body[k] = v
}

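Bug: ExtraBody can override or undo the max_completion_tokens conversion

This code deletes max_tokens and replaces it with max_completion_tokens, but the loop over c.cfg.ExtraBody that follows can write a max_tokens key back in, so the request body ends up containing both max_tokens and max_completion_tokens. The OpenAI API makes no explicit guarantee about its behavior when both fields are present, so the call may fail or behave unexpectedly.

Suggestion: perform the max_tokens -> max_completion_tokens conversion after merging ExtraBody, or clean up conflicting fields again after the merge. For example: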

for k, v := range c.cfg.ExtraBody {
    body[k] = v
}
// Apply the conversion only after ExtraBody has been merged
if req.MaxTokens > 0 && useMaxCompletionTokens(model) {
    delete(body, "max_tokens")
    body["max_completion_tokens"] = req.MaxTokens
}
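
The ordering issue can be reproduced outside the client. The following is a minimal sketch, not the PR's code: mergeRequestBody and its parameters are hypothetical names, and the policy that a positive request-level limit wins over ExtraBody is an assumption carried over from the suggestion above.

```go
package main

import "fmt"

// mergeRequestBody is a hypothetical helper (not part of the PR). It merges
// extra into a copy of body and only then normalizes the token-limit field,
// so a user-supplied max_tokens cannot survive next to max_completion_tokens.
func mergeRequestBody(body, extra map[string]any, maxTokens int, useCompletionTokens bool) map[string]any {
	out := make(map[string]any, len(body)+len(extra)+1)
	for k, v := range body {
		out[k] = v
	}
	for k, v := range extra {
		out[k] = v
	}
	if !useCompletionTokens {
		return out
	}
	// Carry a legacy value over unless the new field is already set.
	if legacy, ok := out["max_tokens"]; ok {
		delete(out, "max_tokens")
		if _, exists := out["max_completion_tokens"]; !exists {
			out["max_completion_tokens"] = legacy
		}
	}
	// Assumption: a positive request-level limit takes precedence.
	if maxTokens > 0 {
		out["max_completion_tokens"] = maxTokens
	}
	return out
}

func main() {
	body := map[string]any{"model": "gpt-5", "max_tokens": 1024}
	extra := map[string]any{"max_tokens": 2048, "temperature": 0.2}

	merged := mergeRequestBody(body, extra, 1024, true)
	_, hasLegacy := merged["max_tokens"]
	fmt.Println(hasLegacy, merged["max_completion_tokens"]) // false 1024
}
```

Unlike the inline suggestion, this variant also removes a max_tokens key that arrives through ExtraBody when the request itself sets no limit; whether that is the desired policy is for the maintainers to decide.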

Comment thread internal/llm/client.go
Comment on lines +323 to +330
func useMaxCompletionTokens(model string) bool {
    lower := strings.ToLower(model)
    return strings.Contains(lower, "gpt-5") ||
        strings.Contains(lower, "codex") ||
        strings.Contains(lower, "o1") ||
        strings.Contains(lower, "o3") ||
        strings.Contains(lower, "o4")
}

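DRY violation: the model-matching logic in encodingForModel and useMaxCompletionTokens is fully duplicated

Both functions maintain the same model-name conditions (gpt-5, codex, o1, o3, o4). If a future model is added to one function but not the other, token counting will disagree with the parameters actually sent in the request.

Suggestion: extract the matching logic into a shared predicate, for example: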

func isAdvancedReasoningModel(modelName string) bool {
    lower := strings.ToLower(modelName)
    return strings.Contains(lower, "gpt-5") ||
        strings.Contains(lower, "codex") ||
        strings.Contains(lower, "o1") ||
        strings.Contains(lower, "o3") ||
        strings.Contains(lower, "o4")
}

Then reuse that function in both encodingForModel and useMaxCompletionTokens.
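
A sketch of how the two call sites might share the predicate. The real signature of encodingForModel is not shown in this thread, so the version below is a hypothetical stand-in, and the encoding names it returns are assumptions chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// reasoningModelMarkers is the single list both call sites consult.
var reasoningModelMarkers = []string{"gpt-5", "codex", "o1", "o3", "o4"}

// isAdvancedReasoningModel reports whether the model name contains any marker.
// It keeps the PR's case-insensitive substring matching.
func isAdvancedReasoningModel(modelName string) bool {
	lower := strings.ToLower(modelName)
	for _, marker := range reasoningModelMarkers {
		if strings.Contains(lower, marker) {
			return true
		}
	}
	return false
}

// useMaxCompletionTokens delegates to the shared predicate.
func useMaxCompletionTokens(model string) bool {
	return isAdvancedReasoningModel(model)
}

// encodingForModel is a hypothetical stand-in; the returned encoding names
// are assumptions, not taken from the PR.
func encodingForModel(model string) string {
	if isAdvancedReasoningModel(model) {
		return "o200k_base"
	}
	return "cl100k_base"
}

func main() {
	for _, m := range []string{"GPT-5-Codex", "o3-mini", "gpt-4o"} {
		fmt.Println(m, useMaxCompletionTokens(m), encodingForModel(m))
	}
}
```

With one list of markers, adding a model family touches a single line, and the request parameters and token counting cannot drift apart.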

Comment thread internal/llm/client.go
Comment on lines +319 to +321
func isResponsesEndpoint(rawURL string) bool {
    return strings.HasSuffix(strings.TrimRight(rawURL, "/"), "/responses")
}

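isResponsesEndpoint may fail to match URLs that carry query parameters

The function uses strings.HasSuffix to check whether the URL ends with /responses, but it does not account for URLs that carry query parameters (such as https://example.com/v1/responses?api-version=2024-01) or a fragment identifier. Such requests should be routed to OpenAIResponsesClient but are routed to OpenAIClient instead.

Suggestion: parse the URL with the net/url package first and match on the path component: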

func isResponsesEndpoint(rawURL string) bool {
    u, err := url.Parse(rawURL)
    if err != nil {
        return strings.HasSuffix(strings.TrimRight(rawURL, "/"), "/responses")
    }
    return strings.HasSuffix(strings.TrimRight(u.Path, "/"), "/responses")
}
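
To make the difference concrete, here is a small runnable check of the path-based variant against a few URL shapes. The example.com URLs are illustrative only, and the function body is a compact restatement of the suggestion rather than the PR's code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isResponsesEndpoint matches on the parsed path so that query strings and
// fragments do not hide the /responses suffix. If parsing fails, it falls
// back to checking the raw string.
func isResponsesEndpoint(rawURL string) bool {
	path := rawURL
	if u, err := url.Parse(rawURL); err == nil {
		path = u.Path
	}
	return strings.HasSuffix(strings.TrimRight(path, "/"), "/responses")
}

func main() {
	urls := []string{
		"https://example.com/v1/responses",
		"https://example.com/v1/responses/",
		"https://example.com/v1/responses?api-version=2024-01",
		"https://example.com/v1/responses#section",
		"https://example.com/v1/chat/completions",
	}
	for _, raw := range urls {
		fmt.Printf("%v\t%s\n", isResponsesEndpoint(raw), raw)
	}
}
```

The first four URLs report true and the last reports false; with the suffix check on the raw string, the third and fourth would report false.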

Comment on lines +223 to +233
func responseContentAsString(content any) string {
    switch v := content.(type) {
    case string:
        return v
    case []ContentBlock:
        msg := Message{Content: v}
        return msg.ExtractText()
    default:
        return fmt.Sprintf("%v", v)
    }
}

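Bug: when content is nil (Message.Content has type any, so nil is entirely possible), fmt.Sprintf("%v", nil) returns the string "<nil>", which is then sent to the API as tool-call output or message content, corrupting the request.

Note that the default branch of ExtractText() returns the empty string "", and this function should stay consistent with it.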

Suggestion:

Suggested change
func responseContentAsString(content any) string {
    switch v := content.(type) {
    case string:
        return v
    case []ContentBlock:
        msg := Message{Content: v}
        return msg.ExtractText()
    case nil:
        return ""
    default:
        return fmt.Sprintf("%v", v)
    }
}
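
The "<nil>" behavior is easy to confirm in isolation. The sketch below is self-contained, so ContentBlock is a simplified stand-in with assumed fields (Type, Text), and the block concatenation replaces the PR's Message.ExtractText(), whose implementation is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// ContentBlock is a simplified stand-in; the field names are assumptions.
type ContentBlock struct {
	Type string
	Text string
}

// responseContentAsString returns "" for nil content instead of "<nil>".
func responseContentAsString(content any) string {
	switch v := content.(type) {
	case nil:
		return ""
	case string:
		return v
	case []ContentBlock:
		// Stand-in for ExtractText(): join the text blocks.
		var sb strings.Builder
		for _, block := range v {
			if block.Type == "text" {
				sb.WriteString(block.Text)
			}
		}
		return sb.String()
	default:
		return fmt.Sprintf("%v", v)
	}
}

func main() {
	var content any // nil interface value, as an unset Message.Content would be

	fmt.Printf("%q\n", fmt.Sprintf("%v", content))       // "<nil>"
	fmt.Printf("%q\n", responseContentAsString(content)) // ""
}
```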

Comment on lines +312 to +315
finishReason := "stop"
if len(toolCalls) > 0 {
    finishReason = "tool_calls"
}

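Bug: finishReason is derived only from whether tool calls are present and ignores the completion state that the Responses API actually returns. If the response is truncated by the max_output_tokens limit, this code still reports "stop", so callers cannot tell a normal finish from a truncated one.

The Responses API reports truncation through the response's status field ("incomplete") together with incomplete_details.reason (for example "max_output_tokens"), not through a stop_reason field on the output message. The client should parse those fields and map them to a finish reason such as "length". The Anthropic client is a useful reference for the pattern (finishReason := resp.StopReason, falling back to "stop" when empty).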
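
One way the mapping could look, as a hedged sketch rather than the PR's implementation. The type and function names are hypothetical, and the JSON field names (status, incomplete_details.reason) are assumptions based on the public Responses API reference that should be verified against the API version in use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// responsesResult holds only the fields this sketch needs. The JSON names
// are assumptions and are not taken from the PR.
type responsesResult struct {
	Status            string `json:"status"`
	IncompleteDetails *struct {
		Reason string `json:"reason"`
	} `json:"incomplete_details"`
}

// parseResponsesResult decodes the subset of the response used below.
func parseResponsesResult(data []byte) (responsesResult, error) {
	var r responsesResult
	err := json.Unmarshal(data, &r)
	return r, err
}

// finishReasonFor maps the response state to a chat-completions style
// finish reason, checking for truncation before tool calls.
func finishReasonFor(r responsesResult, toolCallCount int) string {
	if r.Status == "incomplete" && r.IncompleteDetails != nil {
		switch r.IncompleteDetails.Reason {
		case "max_output_tokens":
			return "length"
		case "content_filter":
			return "content_filter"
		}
	}
	if toolCallCount > 0 {
		return "tool_calls"
	}
	return "stop"
}

func main() {
	raw := []byte(`{"status":"incomplete","incomplete_details":{"reason":"max_output_tokens"}}`)
	r, err := parseResponsesResult(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(finishReasonFor(r, 0)) // length
}
```

Checking truncation first means a response that was cut off while emitting tool calls is reported as "length" rather than "tool_calls"; the opposite priority is equally defensible and is a design choice for the maintainers.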

Comment on lines +26 to +29
baseURL := strings.TrimRight(cfg.URL, "/")
if !strings.HasSuffix(baseURL, "/responses") {
    cfg.URL = baseURL + "/responses"
}

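Potential issue: NewOpenAIResponsesClient modifies cfg.URL by appending /responses. Because cfg is a value copy of ClientConfig, the change affects only the copy held inside this struct. The detection logic in isResponsesEndpoint and the append logic here are written separately, though, and must stay in sync:

  • isResponsesEndpoint checks HasSuffix(TrimRight(url, "/"), "/responses")
  • this code checks HasSuffix(baseURL, "/responses"), where baseURL = TrimRight(cfg.URL, "/")

The two checks are equivalent, so nothing breaks today. A full path such as https://example.com/v1/responses is left unchanged and the request goes to that address, while a URL such as https://example.com/v1 that lacks /responses has it appended automatically. Suggestion: document this URL auto-completion behavior in a comment to make future maintenance easier.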
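
The auto-completion rule could be isolated in a small documented helper. The sketch below is illustrative only: ensureResponsesPath is a hypothetical name, and unlike the code under review it also drops a trailing slash from a URL that already ends in /responses.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureResponsesPath (hypothetical name) returns rawURL with a single
// "/responses" suffix:
//   - "https://example.com/v1"            -> "https://example.com/v1/responses"
//   - "https://example.com/v1/"           -> "https://example.com/v1/responses"
//   - "https://example.com/v1/responses"  -> unchanged
//   - "https://example.com/v1/responses/" -> trailing slash removed
//
// It works on the raw string, so URLs with query parameters are out of scope.
func ensureResponsesPath(rawURL string) string {
	base := strings.TrimRight(rawURL, "/")
	if strings.HasSuffix(base, "/responses") {
		return base
	}
	return base + "/responses"
}

func main() {
	for _, raw := range []string{
		"https://example.com/v1",
		"https://example.com/v1/",
		"https://example.com/v1/responses",
	} {
		fmt.Println(ensureResponsesPath(raw))
	}
}
```

All three inputs print https://example.com/v1/responses, which is the behavior the suggested comment would describe.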

@github-actions

github-actions Bot commented Jun 6, 2026

🔍 OpenCodeReview found 8 issue(s) in this PR.

  • ✅ 8 posted as inline comment(s)
  • 📝 0 posted as summary (missing line info)

📊 Posting Statistics:

  • ✅ Successfully posted: 6 comment(s)
  • ❌ Failed to post: 2 comment(s)
❌ Failed Comments Details
  • internal/llm/resolver.go: Unprocessable Entity: "Line could not be resolved"
  • internal/llm/resolver.go: Unprocessable Entity: "Line could not be resolved"
