[FR] Introduce Auditing and Visual Feedback Capabilities for Computer Use #39

@soimy

Feature Description

As computer-use capabilities grant agents direct control over desktop environments, security, transparency, and traceability become paramount. Currently, it is difficult for users to track exactly what the agent is doing in real time or to review its actions after the fact.

I would like to propose a comprehensive Auditing and Visual Feedback System to make agent operations explicit, safe, and fully auditable.

Detailed Requirements

1. Real-time Visual Feedback (Active Border)

  • Desktop-level: When the agent is performing full-screen or system-wide operations, a distinct colored border, either flashing or solid (e.g., green or purple, customizable), should frame the entire desktop.
  • App-level: If the agent's action target is confined to a specific application window, that active window should be highlighted with the colored border to clearly show the focus of the interaction.
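
A minimal sketch of how the border target could be chosen between the two modes above. The `Rect` type and `border_rect` function are hypothetical names used only for illustration; they are not part of any existing API. The sketch assumes the caller already knows the screen bounds and, when applicable, the bounds of the target window. Falling back to the desktop border when the window lies entirely off-screen is an assumption, not a stated requirement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen pixels (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def border_rect(screen: Rect, target_window: Optional[Rect]) -> Rect:
    """Return the rectangle the active border should frame.

    Desktop-level: no target window, so the whole screen is framed.
    App-level: the target window is framed, clipped to the visible screen.
    """
    if target_window is None:
        return screen

    # Clip the window to the screen so the border is never drawn off-screen.
    left = max(screen.x, target_window.x)
    top = max(screen.y, target_window.y)
    right = min(screen.x + screen.width, target_window.x + target_window.width)
    bottom = min(screen.y + screen.height, target_window.y + target_window.height)

    # Assumption: a window that is entirely off-screen falls back to the desktop border.
    if right <= left or bottom <= top:
        return screen
    return Rect(left, top, right - left, bottom - top)
```

Drawing the border itself is platform-specific (a transparent always-on-top window is one common approach) and is left out of this sketch.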

2. Persistent Floating Overlay (HUD / Status Widget)

An "Always-on-Top" floating bubble or mini-dashboard should be displayed during execution, providing the following live metadata:

  • Source/Trigger: Where the request came from (e.g., API, specific workflow, or user session).
  • Current Action & Prompt: The active sub-task name (e.g., click, type) and the underlying prompt or reasoning.
  • Execution Metrics: Elapsed time, task duration, and step counter.
  • Note: The widget should be draggable so it doesn't block critical UI elements.
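
The metadata listed above could be held in a small state object that the overlay re-renders on each step. This is only a sketch: `HudStatus`, its field names, and the one-line text layout are illustrative assumptions, and a real widget would bind these fields to UI elements rather than a string.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HudStatus:
    """Live metadata for the floating overlay (hypothetical structure)."""
    source: str                      # where the request came from, e.g. "api"
    action: str = ""                 # current sub-task name, e.g. "click"
    prompt: str = ""                 # underlying prompt or reasoning
    step: int = 0                    # step counter
    started_at: float = field(default_factory=time.monotonic)

    def begin_step(self, action: str, prompt: str) -> None:
        """Advance the step counter and record the new action."""
        self.step += 1
        self.action = action
        self.prompt = prompt

    def elapsed_seconds(self, now: Optional[float] = None) -> float:
        """Seconds since the task started; `now` can be injected for testing."""
        current = time.monotonic() if now is None else now
        return current - self.started_at

    def render(self, now: Optional[float] = None) -> str:
        """Format the status as a single line of text (assumed layout)."""
        minutes, seconds = divmod(int(self.elapsed_seconds(now)), 60)
        return (
            f"[{self.source}] step {self.step}: {self.action} "
            f"| {minutes:02d}:{seconds:02d} | {self.prompt}"
        )
```

Using a monotonic clock keeps the elapsed-time display stable even if the system clock is adjusted during a run.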

3. Structured Audit Logging

All interactions must be recorded in a dedicated local log file (ideally in a structured format such as JSON or JSONL) for post-incident review and compliance. The logs should capture:

  • Source prompts and intent.
  • Exact tool calls executed (coordinates, keystrokes, etc.).
  • Screenshots/Artifacts taken during the process.
  • Execution outcomes and system responses.
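
As a rough illustration of the JSONL option, the sketch below appends one JSON object per action to a local file. The function names (`append_audit_record`, `read_audit_log`) and the record field names are assumptions chosen to mirror the four items above; they do not describe an existing schema. Screenshots are referenced by file path rather than embedded, which is also an assumption.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Optional


def append_audit_record(
    log_path: str,
    *,
    source: str,
    prompt: str,
    tool: str,
    arguments: dict[str, Any],
    outcome: str,
    screenshot_path: Optional[str] = None,
) -> dict[str, Any]:
    """Append one audit record as a single JSON line and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,                                   # origin of the request
        "prompt": prompt,                                   # source prompt / intent
        "tool_call": {"name": tool, "arguments": arguments},  # exact call executed
        "screenshot": screenshot_path,                      # artifact reference, if any
        "outcome": outcome,                                 # execution result
    }
    # Append mode keeps earlier records intact; one object per line is JSONL.
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_audit_log(log_path: str) -> list[dict[str, Any]]:
    """Load all records from a JSONL audit log for later review."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A call such as `append_audit_record("audit.jsonl", source="api", prompt="Open settings", tool="click", arguments={"x": 10, "y": 20}, outcome="ok")` would add one line to the file. An append-only, line-oriented format means a crash mid-run loses at most the last record, and the log can be streamed or filtered with ordinary line-based tools.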

Why is this needed?

  • Security & Trust: Users need to know exactly what an autonomous agent is doing on their machine to prevent unintended or malicious actions.
  • Debugging & Optimization: Developers can easily pinpoint where a workflow fails or where an agent misunderstands a prompt.
  • Compliance: Enterprises require strict audit trails for any automated system interacting with production environments or sensitive data.
