Tool Eval

You starred 300 GitHub repos and never opened one of them. Let AI decide.

npx skills add wujiajun4/tool-eval -g

Star hoarding is the new "read later": it means "some other time," which in practice means never. Tool Eval reads any GitHub repo, cross-references your actual tool stack, and gives you a straight verdict in one table. 30 seconds. No ambiguity.

The Problem

You see a repo trending on GitHub. 20k stars. The README is well written. You star it.

Multiply by 300.

Now you have a graveyard of bookmarks, no idea which ones overlap, and no time to evaluate any of them. Starring is the new "read later": you never read it.

None of the usual alternatives work:

| Approach | What actually happens |
|------|---------|
| "I'll evaluate it myself" | 30 minutes reading docs, comparing features, and checking for conflicts. You don't have 30 minutes. |
| "I'll ask the team" | Nobody volunteers. |
| "I'll just install it" | 147 more dependencies and a Docker daemon for a tool you used once. |
| Existing eval skills | They only describe the tool. They don't know your stack. |

The Solution

An AI agent skill that reads any GitHub repo, cross-references your actual tool stack, and gives you a straight verdict with a comparison table, in 30 seconds.

Demo

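🇨🇳 Chinese (translated)

You: https://github.com/Panniantong/Agent-Reach 看一下 ("take a look")

AI:

## Agent Reach — Give AI "eyes" on the whole web

> ⭐ 21.9k · 249 commits · Python · MIT
> Search Twitter, Reddit, YouTube, GitHub, Bilibili, and Xiaohongshu with zero API fees

### Quick Comparison
| Dimension | Your Stack | Agent Reach | Verdict |
|------|---------|-------------|------|
| Search Reddit/X/YT | last30days | Cookie auth, same platforms | ❌ Overlap |
| Chinese platforms (Bilibili/Xiaohongshu) | None | xhs-cli, bilibili-cli | ✅ Fills a gap |
| Install complexity | One-line plugin install | pip install + 6 CLIs + 6 logins | ❌ High maintenance cost |
| API cost | last30days is free | All free | 🔗 Comparable |

### Verdict
**❌ Skip** — You already have last30days covering the core platforms.
The Chinese-platform advantage is not worth maintaining 6 extra CLI tools.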

🇬🇧 English

You: evaluate https://github.com/khoj-ai/khoj

AI:

## Khoj — AI Second Brain

> ⭐ 34.9k · 5,168 commits · Python · AGPL-3.0
> Self-hosted AI app: chat with any LLM, search your docs, build agents, schedule automations.

### Quick Comparison
| Capability | Your Stack | Khoj | Verdict |
|------|---------|------|------|
| Document semantic search | Obsidian + ctx_search | PDF/Word/Markdown/Notion | ❌ Overlap |
| AI chat | Claude Code | Any LLM, self-hosted | ❌ Different use case |
| Scheduled automation | scheduled-tasks MCP | Built-in cron + email | 🔗 Comparable |
| Obsidian integration | Native Obsidian | Plugin available | 🔗 Comparable |

### Verdict
**❌ Skip** — Your existing stack already covers Khoj's core value prop.

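Features

| Feature | What it does |
|------|---------|
| 🔍 Auto-profiling | Reads the README, SKILL.md, and docs; extracts capabilities, setup cost, pricing, and security |
| 🗺️ Stack-aware | Searches your Obsidian vault, orchestra definitions, and installed skills, so it knows what you have |
| ⚖️ Head-to-head | New tool vs your actual stack, item by item |
| ⚡ 30s verdict | Three outcomes: install / nice-to-have / skip. Never vague |
| 🌐 Bilingual | Chinese and English triggers, Chinese and English output |
| 🆚 Compare | "A vs B": two tools head to head |

Three outcomes, never ambiguous: install, nice-to-have, or skip.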
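As a rough illustration of "three outcomes, never ambiguous," the verdict can be modeled as a closed enum plus a rule that always returns exactly one member. Tool Eval itself is a pure-instruction skill with no code, so everything here is an assumption made for illustration: the `Verdict` and `decide` names, the three inputs, and the thresholds are all hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    """The three outcomes named in the README; there is no fourth 'maybe'."""
    INSTALL = "install"
    NICE_TO_HAVE = "nice-to-have"
    SKIP = "skip"


def decide(gaps_filled: int, overlaps: int, heavy_setup: bool) -> Verdict:
    """Hypothetical rule that collapses comparison rows into one verdict.

    The thresholds are illustrative assumptions, not the skill's actual logic.
    """
    if gaps_filled == 0:
        # Nothing new compared with the existing stack.
        return Verdict.SKIP
    if heavy_setup and gaps_filled <= overlaps:
        # Some gaps are filled, but not enough to justify the setup cost.
        return Verdict.SKIP
    if heavy_setup or overlaps > 0:
        # Real value, with a trade-off attached.
        return Verdict.NICE_TO_HAVE
    return Verdict.INSTALL


# Mirrors the Agent Reach demo: one gap filled, one overlap, heavy setup.
agent_reach = decide(gaps_filled=1, overlaps=1, heavy_setup=True)  # Verdict.SKIP
```

Because every branch returns a `Verdict` member, the function cannot produce a hedged or empty answer, which is the property the feature list describes.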

How It Works

GitHub URL
    │
    ▼
Step 1: Fetch ─────────── analyze the repo, docs, SKILL.md
    │
    ▼
Step 2: Analyze ───────── extract capabilities, cost, security, signals
    │
    ▼
Step 3: Cross-ref ─────── vs the user's Obsidian vault + orchestra inventory
    │
    ▼
Step 4: Compare ───────── item by item against existing tools
    │
    ▼
Step 5: Verdict ───────── install / nice-to-have / skip

The secret sauce: It doesn't just describe the tool. It compares the tool against your context: your Obsidian vault, your orchestra definitions, and every skill you have installed. Other evaluators say "here is what this tool can do." Tool Eval says "here is what this tool does for you."
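The cross-reference step can be pictured as a lookup of each candidate capability against the capabilities your existing tools already cover. This is a minimal sketch under stated assumptions: the skill is instruction-only, so `ComparisonRow`, `cross_reference`, and the capability labels are hypothetical names chosen for illustration. The `last30days` coverage is taken from the examples in this README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComparisonRow:
    capability: str
    covered_by: Optional[str]  # existing tool that already does this, if any
    status: str                # "overlap" or "fills gap"


def cross_reference(candidate_caps: list[str],
                    stack: dict[str, set[str]]) -> list[ComparisonRow]:
    """Compare each capability of a new tool against the existing stack."""
    rows = []
    for cap in candidate_caps:
        # First existing tool (in name order) that already covers this capability.
        owner = next(
            (tool for tool, caps in sorted(stack.items()) if cap in caps),
            None,
        )
        rows.append(ComparisonRow(cap, owner, "overlap" if owner else "fills gap"))
    return rows


# Hypothetical stack: last30days covers Reddit, X, and YouTube.
stack = {"last30days": {"reddit", "x", "youtube"}}
rows = cross_reference(["reddit", "bilibili"], stack)
```

Here `rows` holds one overlap (`reddit`, covered by `last30days`) and one gap (`bilibili`), which is the shape of information the comparison table reports.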

Tech Stack

| Category | Technology |
|------|---------|
| Platform | Claude Code, OpenClaw, Agent Skills (67+ platforms) |
| Language | Markdown (SKILL.md) |
| Tools | WebFetch, Read, Write, WebSearch |
| Dependencies | Zero. A pure-instruction skill with no memory footprint |

Project Structure

tool-eval/
├── assets/
│   └── icon.svg          # magnifying glass + checkmark + comparison arrows
├── .gitignore
├── LICENSE               # MIT
├── README.md             # ← you are here
└── SKILL.md              # runtime spec: frontmatter + 5-step pipeline + output template

Install

# Claude Code (one command)
mkdir -p ~/.claude/skills && git clone https://github.com/wujiajun4/tool-eval.git ~/.claude/skills/tool-eval

# Agent Skills (any platform)
npx skills add wujiajun4/tool-eval -g

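Trigger it with:

| You say | What happens |
|------|---------|
| `https://github.com/xxx 看一下` ("take a look") | Evaluates the repo |
| `evaluate this https://...` | English trigger |
| `A vs B 怎么选` ("which one should I pick") | Two tools head to head |
| `这个要不要装` ("should I install this") | Quick verdict |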
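To make the trigger table concrete, here is a minimal sketch of how those phrases map to modes. In practice the agent decides when to load the skill by reading SKILL.md, not by running regular expressions, so the `classify` function, the mode names, and both patterns are assumptions for illustration only.

```python
import re

# Hypothetical patterns; the real skill is matched by the agent, not by regex.
GITHUB_URL = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+")
A_VS_B = re.compile(r"\S+\s+vs\.?\s+\S+", re.IGNORECASE)


def classify(message: str) -> str:
    """Map a user message to one of the modes in the trigger table."""
    if A_VS_B.search(message):
        return "compare"        # "A vs B": two tools head to head
    if GITHUB_URL.search(message):
        return "evaluate"       # a repo link: full evaluation
    if "要不要装" in message:
        return "quick-verdict"  # "should I install this?"
    return "none"
```

For example, `classify("A vs B 怎么选")` returns `"compare"`, and a message containing a `https://github.com/<owner>/<repo>` link returns `"evaluate"`.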

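vs Alternatives

| | Tool Eval | repo-analyst | evaluate-and-improve |
|------|---------|------|------|
| Knows your stack | ✅ Searches your knowledge base | ❌ Generic | ❌ Looks at the codebase only |
| Comparison table | ✅ Head-to-head vs your tools | ❌ Describes methodology | ❌ Skeptical prose |
| Straight verdict | ✅ Install / nice-to-have / skip | ❌ Academic analysis | ⚠️ "Does this help?" |
| No overlap | ✅ Aware of your existing tools | ❌ Single repo | ❌ Single repo |
| Bilingual | ✅ Chinese + English | ❌ English only | ❌ English only |
| Setup | One-line `git clone` | One-line `git clone` | One-line `git clone` |

Tool Eval is the only one that answers the question you are actually asking: "I already have X, Y, and Z. Should I still install this?"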

Principles

Stars ≠ Value. A 30k-star repo may overlap 100% with the tools you already have.
Setup cost is part of the verdict. A tool that needs 5 API keys plus Docker is not better than a one-line pip install.
Compare, don't describe. "It can search Reddit" is noise. "It can search Reddit, but you already have last30days, which covers Reddit + X + YouTube" is information.
Don't be a cheerleader. Every tool has trade-offs. Present them honestly.
Table format always. Developers read tables. They only skim long prose.
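The "table format always" principle amounts to emitting every comparison as a Markdown table in the same shape as the demos. A minimal sketch, assuming a hypothetical `render_table` helper; the skill itself produces the table from its output template rather than from code.

```python
def render_table(headers: list[str], rows: list[list[str]]) -> str:
    """Render a comparison as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "|".join("------" for _ in headers) + "|",
    ]
    for row in rows:
        if len(row) != len(headers):
            # A ragged row would silently break the table layout.
            raise ValueError("row width must match header width")
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


# One row taken from the Khoj demo.
table = render_table(
    ["Capability", "Your Stack", "Khoj", "Verdict"],
    [["Scheduled automation", "scheduled-tasks MCP",
      "Built-in cron + email", "🔗 Comparable"]],
)
```

Rejecting ragged rows keeps every verdict tied to a filled-in comparison, which matches the "compare, don't describe" principle.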

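FAQ

Q: Skills only, or any repo?
Any repo. Tools, libraries, frameworks, skills, MCP servers: if it has a README, Tool Eval can profile it.

Q: How does it know my stack?
It reads your Obsidian vault, your orchestra definitions (orchestra-system.md), and your installed skills. The richer your knowledge base, the more precise the comparison.

Q: Is it biased toward "skip"?
No, it is biased toward honesty. If a tool really fills a gap, it says "install." But most of the repos you star overlap with tools you already have. Tool Eval just lays out the facts.

Q: How is this different from asking Claude directly?
Without Tool Eval loaded, Claude only describes the tool. It does not cross-reference your stack, because it does not know it should. Tool Eval is a 5-step pipeline that forces the cross-reference every time.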
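The FAQ says that part of "your stack" comes from your installed skills. One way to picture that inventory is a scan of a skills directory for folders that contain a SKILL.md, which is how this project itself is laid out. This is a sketch under assumptions: `installed_skills` is a hypothetical helper, and the agent actually gathers this with its own Read tool rather than with Python.

```python
from pathlib import Path


def installed_skills(skills_root: Path) -> list[str]:
    """List skill names: subfolders of skills_root that contain a SKILL.md."""
    if not skills_root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in skills_root.iterdir()
        if (entry / "SKILL.md").is_file()
    )


# The Claude Code install command above clones into this directory.
default_root = Path.home() / ".claude" / "skills"
```

Calling `installed_skills(default_root)` after the Claude Code install would be expected to include `tool-eval`, alongside any other skills already present.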

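Roadmap

Support NPM packages, not just GitHub repos
Evaluation history log (track what you evaluated and why you skipped it)
Team mode (compare against a shared team stack rather than personal tools)
Periodic re-evaluation (re-check skipped tools every 3 months)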

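Contributing

Issues and PRs are welcome. Before submitting:

/preflight          # run the pre-release checklist

This project is paired with preflight, a 6-item quality gate that scans for placeholders, a missing LICENSE, references to private tools, and more.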