Add SafeSkill security badge (44/100 — Use with Caution)#1
Open
OyaAIProd wants to merge 1 commit into
Signed-off-by: SafeSkill Scanner <mk@oya.ai>
guodaxia103 added a commit that referenced this pull request on Apr 23, 2026
Source-of-truth sync for V24-05 (real-browser benchmark v2 framework + v2.4.0+ release gate; transformer / gate / CLI / spawn-tests landed in commits a4c8d34 + 6b2a815).

* `docs/PRODUCT_BACKLOG.md`: new 2026-04-23 changelog entry for V24-05, placed immediately above the V24-03 entry. Records the pair-aware schema, K1..K8 metrics (K1..K4 carryover, K5..K8 new with v23 unchanged), the dual-gate release-check routing (`v2.4.0+` → v24 gate; `v2.3.x` → v23 gate; mutually exclusive), the gate-then-write CLI semantics, the `--allow-missing-notes` invariant, and the K8 < 0.40 trigger that reactivates the V24-04 candidate for v2.5.
* `docs/TASK_ROADMAP.md`: §20 follow-up #5 added (v2.4.0 release gate landing entry, mirroring follow-ups #1..#4) + §21 Changelog `v1.2.0` row covering V24-01 / V24-02 / V24-03 / V24-05 in one v2.4.0 development-package summary.
* `docs/TASK_ROADMAP_zh.md`: Chinese mirror.
* `docs/RELEASE_NOTES_v2.4.0_DRAFT.md`: release-notes draft. Intentionally suffixed `_DRAFT.md` so `check-release-readiness.mjs` still treats v2.4.0 as un-noted until the maintainer (a) pastes the auto-generated baseline-comparison table from `pnpm run benchmark:v24 -- --baseline …`, (b) renames the file to `docs/RELEASE_NOTES_v2.4.0.md`, and (c) bumps the five `package.json` files to `2.4.0` in lockstep. Documents the maintainer command list, the public KPI scenario list with the `pairCount ≥ 3` requirement per scenario, the K5..K8 evidence-only status, and the carry-forward known limitations.

Verified: `pnpm run docs:check` green; `pnpm -r typecheck` green; full `pnpm -C app/native-server test:ci` green (49 suites, 582 passed, 24 skipped); `pnpm run release:check` green on the v2.3.0 path (the repo `package.json` is still 2.3.0; the v2.4.0+ branch is exercised end-to-end via the V24-05 spawn tests).

Real-browser MCP run NOT executed by Claude: the owner-lane / Codex / maintainer must run the `tabrix-private-tests` `acceptance:v2.4.0` runner, project the NDJSON via `pnpm run benchmark:v24 -- --gate --baseline …`, paste the generated baseline table into the renamed release-notes file, and bump the five `package.json` versions before `release:check` passes on the v2.4.0+ branch.

Refs: V24-05.
Made-with: Cursor
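The dual-gate routing described in this commit message (`v2.4.0+` goes to the v24 gate, `v2.3.x` goes to the v23 gate, never both) can be pictured with a minimal sketch. The function names `parseVersion` and `selectReleaseGate`, the return values, and the handling of versions outside both ranges are assumptions made for illustration; they are not taken from `check-release-readiness.mjs`.

```typescript
// Hypothetical sketch: names and return values are illustrative, not the repository's code.
type ReleaseGate = "v23" | "v24";

// Parse "2.4.0" or "v2.4.0" into numeric parts; return null for anything else.
function parseVersion(version: string): [number, number, number] | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version.trim());
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Pick exactly one gate per version, so the two gates stay mutually exclusive.
function selectReleaseGate(version: string): ReleaseGate | null {
  const parsed = parseVersion(version);
  if (!parsed) return null;
  const [major, minor] = parsed;
  // v2.4.0 and anything later routes to the v24 gate.
  if (major > 2 || (major === 2 && minor >= 4)) return "v24";
  // Only the v2.3.x line routes to the v23 gate.
  if (major === 2 && minor === 3) return "v23";
  // Assumption: versions older than v2.3.0 fall outside both gates.
  return null;
}
```

Because the function returns a single value, a version can never be checked by both gates, which is one simple way to get the mutual exclusivity the message calls out.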
guodaxia103 added a commit that referenced this pull request on Apr 23, 2026
…loseout)

v2.4.0 release-blocking review finding #1: the aggregator only excluded `experience_suggest_plan` from self-projection, leaving three other internal/native MCP tools to seed bogus `(unknown, "run mcp tool <internal>")` action-path buckets every time they were invoked:

* `experience_score_step` (V24-02 write-back)
* `tabrix_choose_context` (V24-03 chooser)
* `tabrix_choose_context_record_outcome` (V24-03 outcome write-back)

Extend `EXPERIENCE_AGGREGATION_EXCLUDED_TOOLS` to cover all four, keeping the existing per-session semantics: a session whose ENTIRE step list is internal gets `aggregated_at` marked but is NOT upserted, so both successful and failed invocations skip Experience cleanly and the pending-aggregation scan does not re-encounter them on replay. Mixed sessions (any step is a real Memory tool) still aggregate normally.

Tests pin all three new tools twice (success + failure both no-op), add a multi-internal-tools session case, and add a mixed-session control that proves the per-session (not per-step) rule still aggregates prefixed real flows.

Made-with: Cursor
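The per-session exclusion rule in this commit message can be sketched roughly as follows. Only the constant name `EXPERIENCE_AGGREGATION_EXCLUDED_TOOLS` and the four tool names come from the message; the `SessionStep` and `AggregationDecision` shapes, the `decideAggregation` function, and the treatment of an empty step list are hypothetical and may differ from the actual aggregator.

```typescript
// Tool names are from the commit message; every type and function below is a hypothetical sketch.
const EXPERIENCE_AGGREGATION_EXCLUDED_TOOLS: ReadonlySet<string> = new Set([
  "experience_suggest_plan",
  "experience_score_step",
  "tabrix_choose_context",
  "tabrix_choose_context_record_outcome",
]);

// Assumed minimal shape of a recorded step.
interface SessionStep {
  toolName: string;
  succeeded: boolean;
}

interface AggregationDecision {
  markAggregated: boolean; // stamp aggregated_at so a replay scan skips the session
  upsert: boolean;         // write the session into Experience
}

// The rule is per session, not per step: only a session made up ENTIRELY of
// internal tools is skipped. Step success or failure does not change the outcome.
function decideAggregation(steps: readonly SessionStep[]): AggregationDecision {
  // Assumption: an empty step list is also treated as "nothing to upsert",
  // since Array.prototype.every returns true for an empty array.
  const allInternal = steps.every((step) =>
    EXPERIENCE_AGGREGATION_EXCLUDED_TOOLS.has(step.toolName),
  );
  return { markAggregated: true, upsert: !allInternal };
}
```

Marking the session as aggregated even when nothing is upserted is what keeps the pending-aggregation scan from picking the same internal-only session up again on replay.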
🟠 SafeSkill Security Scan Results
Top Findings
* app/chrome-extension/wxt.config.ts:131
* app/chrome-extension/wxt.config.ts:136
* app/chrome-extension/inject-scripts/web-fetcher-helper.js:2234
* app/chrome-extension/utils/image-utils.ts:19
* app/native-server/src/cli.ts:6

View full report on SafeSkill
About SafeSkill
SafeSkill is a free, open-source security scanner for AI tools, MCP servers, and Claude Code skills. We scan for code exploits, prompt injection, and data exfiltration risks.
False positive? We take accuracy seriously. If any finding above is incorrect, please open an issue and we will fix it immediately.