Redaction and Safety Rules for Audit Text #954

@SorraTheOrc

Redaction and Safety Rules for Audit Text (WL-0MMNCOIYS15A1YSI)

Problem statement
Apply deterministic, irreversible redaction of email addresses in free-form audit.text before persistence so audit entries remain useful while preventing accidental storage of raw email PII.

Users

  • Operators and maintainers who create short audit notes from the CLI or API and need to avoid leaking email addresses.
    • Example: "As a maintainer, when I run wl update <id> --audit-text 'Applied DB migration', I want any emails in that text masked before they are stored."
  • Automation authors and bots that write audit text programmatically and must not persist raw email addresses.

Success criteria

  • Email-like strings in audit.text are masked before being persisted (create and overwrite flows).
  • Masking is deterministic and irreversible: only the masked value is stored, and no original values are retained in the work item record.
  • Masking preserves domain for context and uses the agreed format (keep first character of local part, replace remainder with three asterisks): alice@example.com -> a***@example.com.
  • Unit and integration tests cover positive and negative email cases and verify that stored audit text never contains raw email fixtures used in tests.

Constraints

  • Scope: only email addresses are redacted in this work item (other PII types are out of scope).
  • Redaction must be irreversible (no plaintext originals stored) to minimise risk and avoid new secrets/key management.
  • Detection should use a practical, common pattern (not full RFC-level parsing) to balance coverage and complexity.
  • Backwards compatibility: legacy comment-based audits remain unchanged; no automatic migration of historical comments.

Existing state

  • A companion work item Audit Write Path via CLI Update (WL-0MMNCOIYF18YPLFB) implements wl update <id> --audit-text and explicitly defers redaction to this item.
  • The project already expects tests and integration coverage for write+show+redaction flows and contains intake drafts referencing these items for traceability.

Desired change

  • Implement a small, well-tested redaction utility used by the audit write path to mask emails before persistence.
  • Ensure the audit write handler (create/overwrite) calls the utility prior to saving the audit.text field.
  • Add unit tests for redaction helper and integration tests exercising the CLI wl update --audit-text path to assert masked text is persisted and raw emails are not present.
  • Update the docs/help text with a short note describing the masking behavior and its rationale.
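
The call-site requirement above can be sketched as follows. This is a minimal illustration, not the project's actual code: WorkItem, InMemoryStore and writeAuditText are hypothetical names, the in-memory map stands in for whatever src/persistent-store.ts really provides, and the regex is one assumed "practical" pattern. The point it shows is that the raw text is masked before it ever reaches the store, on both the create and the overwrite path.

```typescript
// Hypothetical shapes; the real work item model and store API may differ.
interface WorkItem {
  id: string;
  audit?: { text: string };
}

class InMemoryStore {
  private items = new Map<string, WorkItem>();
  save(item: WorkItem): void {
    this.items.set(item.id, item);
  }
  get(id: string): WorkItem | undefined {
    return this.items.get(id);
  }
}

// Compact stand-in for the redaction utility (assumed pattern, not RFC-complete).
function redactEmails(text: string): string {
  const pattern = /([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@((?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,})/g;
  return text.replace(pattern, (_m: string, first: string, domain: string) => `${first}***@${domain}`);
}

// Create and overwrite share one path: mask first, then persist.
function writeAuditText(store: InMemoryStore, id: string, rawText: string): WorkItem {
  const existing: WorkItem = store.get(id) ?? { id };
  const updated: WorkItem = { ...existing, audit: { text: redactEmails(rawText) } };
  store.save(updated); // only the masked text is ever handed to the store
  return updated;
}
```

Because masking happens inside the single write function, a caller cannot persist audit text without it, and an overwrite simply replaces the previously masked value with a newly masked one.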

Related work

  • Audit Write Path via CLI Update (WL-0MMNCOIYF18YPLFB) — implements --audit-text; blocked-by this redaction item for PII handling.
  • End-to-End Validation, Docs and Reuse Alignment (WL-0MMNCOQY30S8312J) — integration tests and final docs referencing the redaction behavior.
  • Replace comment-based audits with structured audit field (WL-0MLDJ34RQ1ODWRY0) — parent item introducing the audit field; this item is a child concerned with safety rules.
  • Relevant files to check when implementing: src/commands/update.ts, src/persistent-store.ts, tests for CLI update and model behaviour.

Implementation notes (developer-focused)

  • Detection: use a practical regex that matches common email forms: a local part (optionally with a +tag), then @, then a domain containing at least one dot and ending in a TLD. Avoid over-engineering with full RFC parsing.
  • Masking rule: keep the first character of the local part and replace the remainder with exactly three asterisks, then append @ and the original domain. Examples:
    • alice@example.com -> a***@example.com
    • a@x.io -> a***@x.io (a single-character local part still yields a***, so the output format is uniform and deterministic)
  • Irreversibility: do not store or log originals; only masked values persist in the audit.text field. Ensure tests do not leave raw emails in fixtures that are persisted.
  • Determinism: the same input must always produce the same masked output; avoid nondeterministic salts or runtime randomness.
  • Call sites: apply redaction on both create and overwrite paths of the audit write handler before any persistence or temporary logging that might end up in storage.
  • Tests: include canonical vectors below and negative cases to avoid false positives.
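
The detection and masking rules above could be implemented along these lines. This is a sketch under stated assumptions: the name redactEmails and the exact regex are illustrative choices, not something the work item prescribes. The pattern requires a dotted domain ending in an alphabetic TLD of two or more letters, which is one reasonable reading of "practical, common pattern".

```typescript
// Assumed pattern: capture the first local-part character, skip the rest of
// the local part, then capture a domain with at least one dot and an
// alphabetic TLD. Not RFC-complete by design.
const EMAIL_PATTERN =
  /([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@((?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,})/g;

// Hypothetical helper: returns the text with every matched email masked as
// <first char>***@<original domain>. Pure function, no randomness or salts.
function redactEmails(text: string): string {
  return text.replace(
    EMAIL_PATTERN,
    (_match: string, first: string, domain: string) => `${first}***@${domain}`,
  );
}
```

Two properties follow from this shape. The function is deterministic because it depends only on its input. It is also idempotent: an already masked value such as a***@example.com no longer matches, since * is not an allowed local-part character in the pattern, so re-running redaction on overwrite does not alter stored text.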

Canonical test vectors (to include as unit tests)

  • Positive: alice@example.com -> a***@example.com
  • Positive: first.last+tag@sub.domain.co.uk -> f***@sub.domain.co.uk
  • Positive: a@x.io -> a***@x.io
  • Negative: not-an-email@ -> no-match (unchanged)
  • Negative: user@localhost -> no-match (unchanged) unless project policy decides otherwise
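
The vectors above translate naturally into a table-driven check. The sketch below is framework-agnostic and self-contained; the redactEmails stand-in and its regex are assumptions for illustration, and a real test would import the project's utility and use the project's test runner instead.

```typescript
// Stand-in for the utility under test (assumed pattern, not RFC-complete).
function redactEmails(text: string): string {
  const pattern = /([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@((?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,})/g;
  return text.replace(pattern, (_m: string, first: string, domain: string) => `${first}***@${domain}`);
}

// Each entry is [input, expected output]; negatives expect unchanged text.
const vectors: Array<[string, string]> = [
  ["alice@example.com", "a***@example.com"],
  ["first.last+tag@sub.domain.co.uk", "f***@sub.domain.co.uk"],
  ["a@x.io", "a***@x.io"],
  ["not-an-email@", "not-an-email@"],
  ["user@localhost", "user@localhost"], // default policy: not treated as an email
];

// Collect mismatches so a failure reports every broken vector at once.
const failures = vectors.filter(([input, expected]) => redactEmails(input) !== expected);

if (failures.length > 0) {
  throw new Error(`Redaction vectors failed for: ${failures.map(([input]) => input).join(", ")}`);
}
```

If project policy later decides that user@localhost should be redacted, only the last vector and the domain part of the pattern need to change.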

Open questions (left in the intake for traceability)

  • Should user@localhost be considered an email to redact (default: no)?

File: .opencode/tmp/intake-draft-Redaction-and-Safety-Rules-for-Audit-Text-WL-0MMNCOIYS15A1YSI.md
