fix(memory_tree/jobs): scrub credentials from worker error logs#1363
Conversation
Anyhow chains and handler-supplied `reason` strings can carry upstream auth tokens; mask them at log emission while keeping the full text in `last_error` for diagnostics. Fixes tinyhumansai#1346
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
Fixes #1346.
Free-form
reasonandlast_errorstrings inmem_tree_jobsare logged verbatim today. Anyhow chains and handler-supplied reasons can carry upstream provider responses, auth headers, or API keys, so the log sink (often a remote aggregator or a shared bug-report attachment) gets material that the persisted DB column doesn't need to redact. Per @sanil-23's review on #1345 the policy needs to be applied consistently acrossmark_failedandmark_deferred, not piecemeal.Introduces
src/openhuman/memory/tree/jobs/redact.rswithscrub_for_log(&str) -> String. Applied at the four log call-sites that emit free-form text:worker::run_once—JobOutcome::Deferinfo log (handler reason)worker::run_once—Err(_)warn log (anyhow{:#}chain)store::mark_failed— terminal-failure warn logstore::mark_failed— retry info logstore::mark_deferredonly logs the no-op stale-lease warning (no reason), so no log site there to scrub. Persistedlast_errorcolumns keep the original string for diagnostics — two new tests (mark_failed_persists_full_error_unredacted,mark_deferred_persists_full_reason_unredacted) lock that contract.What gets scrubbed
Bearer …(with or withoutAuthorization:prefix)sk-…ghp_…/ghs_…/gho_…xox?-…api_key=…/password=…/token=…/secret=…/auth=…assignments (bare, double-quoted, or single-quoted values)https://user:pass@host→https://***@host…(truncated, N more bytes)suffixTest plan
cargo test ... openhuman::memory::tree::jobs::redact::— 11 unit tests for the scrubber (Bearer, OpenAI, GitHub variants, Slack, generic key=value, URL userinfo, email, oversized truncation, multibyte boundary, idempotence, safe-string passthrough).cargo test ... openhuman::memory::tree::jobs::— 43/43 pass (11 redact + 30 existing + 2 new persistence contract tests).cargo check --manifest-path Cargo.toml --lib— clean.cargo fmt.PR Submission Checklist
redact.rsmodule is fully covered by its 11 unit tests; new code inworker.rs/store.rsis exercised by the two new persistence tests plus the existingmark_failed_retries_then_terminates/mark_deferred_*suite.scrub_for_logdocuments the policy; in-code comment at the call sites explains why we scrub the log copy but not the persisted column.Related
src/openhuman/memory/tree/jobs/{mod.rs,redact.rs,worker.rs,store.rs}