Skip to content

enhancement: add lightweight tool calling in Draft Patch for lower token usage and better repository navigation #28

@dangzitou

Description

@dangzitou

Summary

当前 OpenMeta 的 agent loop 已经具备 patch draft、implementation、validation、draft PR 的完整流水线,但在 Draft patch 和 implementation context expansion 阶段,仍然主要依赖“预加载 snippets + 模型一次性判断”。

这会带来两个明显问题:

  1. token 消耗偏高
    还没确认真实实现入口前,就把较多候选文件内容直接塞进上下文。

  2. 文件定位容易漂移
    一旦初始 patch draft 偏离真实业务实现文件,后续扩展轮次可能继续把低信号文件带进来,最终出现 changed files: noneInsufficient context for a safe code patch

建议在 不全量重构 agent 架构 的前提下,只在 Draft patch 的内部导航和上下文扩展阶段,引入一个轻量、受限的 tool calling / function calling 机制。

Motivation

OpenMeta 的项目定位更像“可控的贡献流水线”,而不是完全开放式的通用 coding agent。

因此这里不建议直接改成 Claude Code 风格的全开放工具代理,而是保留现有:

  • structured JSON outputs
  • validation / repair 流程
  • artifact 产物链路
  • draft PR 发布流程
  • 现有安全门

只增强最影响效果的部分:

  • 仓库导航
  • 文件定位
  • 按需读取上下文

这样可以用更少的 token,把更多预算花在真正相关的实现文件上。

Proposed Direction

Draft patch 阶段增加一个 tool-assisted navigation loop

Phase 1: 只读工具导航

先不给模型大量文件正文,只提供轻量仓库上下文,例如:

  • repo summary
  • candidate file paths
  • top-level directories
  • validation command summary
  • issue context
  • memory summary

然后允许模型在有限轮次内调用少量只读工具,例如:

  • search_code(query, limit)
  • read_file(path, start?, end?)
  • list_dir(path)
  • find_related(path)

其中 find_related(path) 可以由宿主基于规则返回:

  • 同目录文件
  • 父目录关键文件
  • 同名 service / route / validator / query-config
  • 对应 __tests__ / *.test.* / *.spec.*

Phase 2: 输出现有 patch draft schema

工具调用结束后,模型仍然输出当前已有的 patch_draft JSON schema,而不是引入全新的产物格式。

这样可以保持对现有下游逻辑的兼容。

Phase 3: Implementation 继续沿用现有结构

implementation_draft 仍然输出当前 schema:

  • summary
  • fileChanges[]
  • full final file content

但它的输入文件集合,不再主要来自“预加载 snippets”,而是来自前面工具导航后已经确认的高置信文件集。

Non-Goals

本 issue 不打算:

  • 把整个 OpenMeta 改造成完全开放式 coding agent
  • 让模型直接自由执行 shell / git / network 工具
  • 移除现有 structured JSON output
  • 移除当前 validation / repair / artifact / PR 流程
  • 在第一版就引入自由写文件工具

Expected Benefits

  • 降低 token 使用量
  • 提高真实实现入口命中率
  • 减少低信号文件污染上下文
  • 降低 changed files: none 的发生概率
  • 保持现有 OpenMeta 的可控性和可审计性

Possible Implementation Notes

可以优先只在 Draft patch 阶段引入工具导航,而不是一次性改完整个 loop。

推荐第一版约束:

  • 每轮工具调用数上限固定
  • 总读取文件数上限固定
  • 总读取字符数 / token budget 固定
  • 只允许仓库内只读操作
  • 最终仍由宿主统一执行写文件和验证

Acceptance Criteria

  • Draft patch 阶段支持有限轮次的只读 tool calling / function calling
  • 模型可以按需搜索和读取文件,而不是仅依赖预加载 snippets
  • 最终仍输出兼容现有系统的 patch_draft JSON
  • implementation 阶段可消费工具导航后筛选出的文件集合
  • 默认 trace 中可看到简洁的导航过程,例如:
    • searching ...
    • reading ...
    • expanding context ...
  • 相比当前实现,典型仓库上的上下文 token 使用量下降
  • changed files: none 的发生率在典型 issue 上有所降低

Summary

OpenMeta already has a solid contribution pipeline with patch drafting, implementation drafting, validation, and draft PR generation.

However, the current Draft patch and implementation context expansion flow still relies mostly on “preloaded snippets + one-shot model judgment”.

This creates two recurring problems:

  1. token usage is higher than necessary
    file contents are loaded into context before the model has confidently identified the real implementation entry points

  2. repository navigation can drift
    once the initial patch draft points slightly off target, later expansion rounds may continue pulling in low-signal files, ending in changed files: none or Insufficient context for a safe code patch

This issue proposes a limited, lightweight tool calling / function calling layer inside Draft Patch only, without replacing the overall OpenMeta architecture.

Motivation

OpenMeta is better positioned as a controlled contribution pipeline than as a fully open-ended coding agent.

Because of that, this proposal does not aim to convert the whole agent into a Claude Code style unrestricted tool agent.

Instead, it keeps the current strengths intact:

  • structured JSON outputs
  • validation / repair flow
  • artifact generation
  • draft PR publishing flow
  • existing safety gates

The goal is to improve the weakest part of the current loop:

  • repository navigation
  • file targeting
  • on-demand context retrieval

This should reduce wasted tokens and improve implementation accuracy.

Proposed Direction

Add a tool-assisted navigation loop inside the Draft patch stage.

Phase 1: read-only navigation tools

Instead of providing large file snippets up front, first provide only lightweight repository context such as:

  • repo summary
  • candidate file paths
  • top-level directories
  • validation command summary
  • issue context
  • memory summary

Then allow the model to make a limited number of read-only tool calls, such as:

  • search_code(query, limit)
  • read_file(path, start?, end?)
  • list_dir(path)
  • find_related(path)

find_related(path) can be host-driven and return nearby files like:

  • sibling files in the same directory
  • parent-directory implementation files
  • matching service / route / validator / query-config
  • matching __tests__ / *.test.* / *.spec.*

Phase 2: keep the existing patch draft schema

After the tool loop finishes, the model should still return the current patch_draft JSON schema rather than introducing a new output format.

This keeps downstream compatibility intact.

Phase 3: keep the current implementation draft schema

implementation_draft should continue using the current schema:

  • summary
  • fileChanges[]
  • full final file content

But its input file set should now come from the tool-assisted navigation result instead of mostly from preloaded broad snippets.

Non-Goals

This issue does not aim to:

  • turn OpenMeta into a fully open-ended coding agent
  • let the model directly run shell, git, or network tools
  • remove structured JSON outputs
  • replace the existing validation / repair / artifact / PR flow
  • introduce unrestricted write/edit tools in the first version

Expected Benefits

  • lower token usage
  • better implementation entry point targeting
  • less low-signal context pollution
  • fewer changed files: none outcomes
  • preserved controllability and auditability

Possible Implementation Notes

A good first step is to add tool-assisted navigation only to Draft patch, instead of rewriting the whole agent loop.

Suggested constraints for v1:

  • fixed max tool calls per round
  • fixed max number of files read
  • fixed total read budget in characters or tokens
  • repository-local read-only operations only
  • file writing and validation still executed by the host

Acceptance Criteria

  • Draft patch supports a bounded read-only tool calling / function calling loop
  • the model can search and read files on demand instead of depending only on preloaded snippets
  • the final output remains compatible with the existing patch_draft JSON schema
  • implementation can consume the filtered high-confidence file set selected through the tool loop
  • default trace shows concise navigation steps such as:
    • searching ...
    • reading ...
    • expanding context ...
  • context token usage decreases on representative repositories
  • changed files: none becomes less frequent on representative issues

Metadata

Metadata

Labels

No labels
No labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions