
OpticLM/agent-scaffold


agent-scaffold — cross-platform accessibility automation for LLM agents

A Rust workspace that gives an LLM-driven agent (or any program) the ability to operate the user's computer through the OS accessibility APIs — Windows UI Automation, Linux AT-SPI2, macOS Accessibility — behind a single, narrow trait surface. Bindings ship for Node.js (via napi-rs) and for Swift / Kotlin / Java (via boltffi).

Status

| Backend | Status | Notes |
| --- | --- | --- |
| Windows (UI Automation) | working | Wraps uiautomation 0.24. All trait methods implemented. |
| Linux (AT-SPI2) | working | Wraps atspi. |
| macOS (Accessibility) | stub | Compiles, every method returns Error::Unsupported. |

| Binding | Status |
| --- | --- |
| Pure Rust (scfd facade) | working |
| Node.js (scfd-napi, napi-rs 3) | working; async/Promise surface |
| Swift / Kotlin / Java (scfd-boltffi) | working; async on each target via boltffi codegen |

At a glance

use scfd::{Backend, Element, Platform, Query, Role, TextMatch, SnapshotOptions};

let backend = Platform::new()?;

// Snapshot the foreground window for the LLM to reason about.
let focused = backend.focused()?;
let tree    = focused.snapshot(&SnapshotOptions::default())?;
println!("{}", serde_json::to_string_pretty(&tree)?);

// Act on a button by name.
if let Some(save) = focused.find(
    &Query::new()
        .role(Role::Button)
        .name(TextMatch::Equals("Save".into()))
        .timeout(std::time::Duration::from_secs(2)),
)? {
    save.invoke()?;
}
# Ok::<(), scfd::Error>(())

The same operations on the Node side:

import { Automation } from "scfd";

const a   = new Automation();
const el  = await a.focused();
const tree = await el.snapshot({ maxDepth: 6 });
const btn = await el.find({ role: "button", name: "Save" });
if (btn) await btn.invoke();

Workspace layout

crates/
├── scfd-core/      Trait surface + value types. Pure Rust, zero platform deps.
├── scfd-windows/   Windows backend  (uiautomation crate).
├── scfd-linux/     Linux backend    (atspi crate).
├── scfd-macos/     macOS backend stub.
├── scfd/           Facade — `use scfd::Platform;` picks the right backend.
├── scfd-napi/      Node.js bindings (cdylib).
└── scfd-boltffi/   Swift / Kotlin / Java bindings.

API at a glance

pub trait Backend {
    type Element: Element;
    fn root(&self)              -> Result<Self::Element>;
    fn focused(&self)           -> Result<Self::Element>;
    fn at_point(&self, p: Point) -> Result<Self::Element>;
    fn windows_for_pid(&self, pid: u32) -> Result<Vec<Self::Element>>;
}

pub trait Element: Sized {
    fn info(&self)                  -> Result<ElementInfo>;
    fn supports(&self, p: Pattern)  -> bool;

    fn parent(&self)                -> Result<Option<Self>>;
    fn children(&self)              -> Result<Vec<Self>>;
    fn find(&self, q: &Query)       -> Result<Option<Self>>;
    fn find_all(&self, q: &Query)   -> Result<Vec<Self>>;

    fn snapshot(&self, opts: &SnapshotOptions) -> Result<Tree>;

    fn invoke(&self)            -> Result<()>;
    fn focus(&self)             -> Result<()>;
    fn set_value(&self, v: &str)-> Result<()>;
    fn get_text(&self)          -> Result<String>;
    fn toggle(&self)            -> Result<()>;
    fn select(&self)            -> Result<()>;
    fn expand(&self)            -> Result<()>;
    fn collapse(&self)          -> Result<()>;
    fn scroll_into_view(&self)  -> Result<()>;
    fn window_close(&self)      -> Result<()>;
    fn window_set_focus(&self)  -> Result<()>;
}

That's the whole contract. ElementInfo, Tree, Query, Role, Pattern, and Error are all plain serde-serialisable values, ready to be handed to a model.
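Because `supports` is a cheap boolean check, generic code can probe for a pattern before acting and fall back cleanly when an element (or a stub backend) offers nothing. The sketch below is illustrative only: it redeclares a trimmed-down `Element` trait so it compiles on its own, the `Pattern::Invoke` and `Pattern::Toggle` variant names are assumptions, and `activate`, `FakeCheckbox`, and `FakeLabel` are hypothetical helpers that are not part of scfd.

```rust
use std::cell::Cell;

// Stand-ins for the scfd-core types. Variant names here are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pattern {
    Invoke,
    Toggle,
}

#[derive(Debug, PartialEq)]
enum Error {
    Unsupported,
}

type Result<T> = std::result::Result<T, Error>;

// Trimmed-down subset of the Element trait shown above.
trait Element: Sized {
    fn supports(&self, p: Pattern) -> bool;
    fn invoke(&self) -> Result<()>;
    fn toggle(&self) -> Result<()>;
}

// Hypothetical helper: activate an element with whichever pattern it advertises.
fn activate<E: Element>(el: &E) -> Result<()> {
    if el.supports(Pattern::Invoke) {
        el.invoke()
    } else if el.supports(Pattern::Toggle) {
        el.toggle()
    } else {
        Err(Error::Unsupported)
    }
}

// In-memory fake that only supports Toggle.
struct FakeCheckbox {
    checked: Cell<bool>,
}

impl Element for FakeCheckbox {
    fn supports(&self, p: Pattern) -> bool {
        p == Pattern::Toggle
    }
    fn invoke(&self) -> Result<()> {
        Err(Error::Unsupported)
    }
    fn toggle(&self) -> Result<()> {
        self.checked.set(!self.checked.get());
        Ok(())
    }
}

// In-memory fake that supports nothing, like a static text label.
struct FakeLabel;

impl Element for FakeLabel {
    fn supports(&self, _p: Pattern) -> bool {
        false
    }
    fn invoke(&self) -> Result<()> {
        Err(Error::Unsupported)
    }
    fn toggle(&self) -> Result<()> {
        Err(Error::Unsupported)
    }
}
```

Writing helpers against the trait rather than a concrete backend keeps them testable with in-memory fakes like these, without an interactive desktop session.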

Threading model

Windows UI Automation objects live in a COM apartment, so the trait surface is synchronous and the backend is bound to the thread that constructed it. The Node and boltffi binding crates each spawn a single dedicated worker thread that owns the Platform, and they serialise calls to it over an mpsc channel. From the caller's side this looks like ordinary async/await: no blocking, no apartment-marshalling surprises.

If you embed the pure-Rust scfd crate directly, construct your Platform once on a long-lived thread and call into it from there.
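The worker-thread pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the code used by scfd-napi or scfd-boltffi: `FakePlatform` is a hypothetical stand-in for `Platform`, `Worker` and `demo` are invented names, and `call` blocks on the reply where the real bindings presumably complete a Promise or future instead.

```rust
use std::marker::PhantomData;
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for a thread-bound backend such as scfd's Platform.
// The raw-pointer marker makes it !Send, so it cannot leave its thread.
struct FakePlatform {
    _not_send: PhantomData<*const ()>,
}

impl FakePlatform {
    fn new() -> Self {
        FakePlatform { _not_send: PhantomData }
    }
    fn focused_name(&self) -> String {
        "Example window".to_string()
    }
}

// A job is a closure that runs on the worker thread with access to the platform.
type Job = Box<dyn FnOnce(&FakePlatform) + Send>;

struct Worker {
    tx: mpsc::Sender<Job>,
}

impl Worker {
    fn spawn() -> Self {
        let (tx, rx) = mpsc::channel::<Job>();
        thread::spawn(move || {
            // Constructed on the worker, so it never crosses a thread boundary.
            let platform = FakePlatform::new();
            // The loop ends once every Sender has been dropped.
            for job in rx {
                job(&platform);
            }
        });
        Worker { tx }
    }

    // Run `f` on the worker thread and wait for its result.
    fn call<R, F>(&self, f: F) -> R
    where
        R: Send + 'static,
        F: FnOnce(&FakePlatform) -> R + Send + 'static,
    {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.tx
            .send(Box::new(move |p: &FakePlatform| {
                // Ignore the error if the caller stopped waiting.
                let _ = reply_tx.send(f(p));
            }))
            .expect("worker thread has exited");
        reply_rx.recv().expect("worker dropped the reply")
    }
}

// Usage: every call is funnelled through the single owning thread.
fn demo() -> String {
    let worker = Worker::spawn();
    worker.call(|p| p.focused_name())
}
```

Only closures and plain result values cross the channel, so the backend itself never needs to be `Send`; jobs run one at a time in the order they were submitted.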

Build & test

cargo check  --workspace
cargo test   --workspace --exclude scfd-napi
cargo clippy --workspace --all-targets -- -D warnings

# Windows smoke test — needs an interactive desktop session.
cargo test -p scfd-windows -- --ignored notepad_smoke

# Node bindings (in crates/scfd-napi):
npm install && npm run build && npm test

# Boltffi codegen for Swift / Kotlin / TypeScript:
cargo build -p scfd-boltffi --release
boltffi generate --crate scfd-boltffi --target swift,kotlin,ts

License

MIT
