marx161-cmd/voomd


voomd

voomd is an experimental GPU pressure daemon for Linux systems running AMD GPUs.

The goal is simple: treat VRAM pressure more like RAM pressure, with explicit policy, hysteresis, workload classification, and reclaim decisions that do better than "hope the driver recovers".

Status

Important

This is a dry-run prototype published as a useful snapshot, not a production-ready daemon. It is currently good for observation, policy iteration, and homelab experimentation. It is not yet something I would tell strangers to trust with real enforcement on important workloads.

What exists today:

  • a runnable daemon script: voomd
  • dry-run pressure tracking
  • workload classification by cgroup/unit/cmdline
  • per-device pressure states with hysteresis
  • structured decision logging
  • multi-source telemetry strategy:
    • amd-smi as primary
    • DRM sysfs as fallback
    • reuse of the last known (stale) workload data when the richer telemetry source stalls
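
The telemetry strategy above can be sketched roughly as follows. This is a minimal illustration, not voomd's actual code: the `Telemetry` class, the snapshot dictionary shape, and the `max_stale_s` cutoff are all assumptions, and the primary provider is passed in as a plain callable rather than a real amd-smi invocation. The README does not say which sysfs files voomd reads; the sketch uses the `mem_info_vram_used` and `mem_info_vram_total` counters that the amdgpu driver exposes under a device directory such as /sys/class/drm/card0/device.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def read_vram_sysfs(device_dir: str) -> dict:
    """Read device-level VRAM counters (in bytes) from amdgpu sysfs files."""
    base = Path(device_dir)
    used = int((base / "mem_info_vram_used").read_text().strip())
    total = int((base / "mem_info_vram_total").read_text().strip())
    # These counters carry no per-process breakdown, so workloads are unknown.
    return {"used": used, "total": total, "workloads": None, "source": "sysfs"}


class Telemetry:
    """Hypothetical provider chain: primary, then fallback plus stale reuse."""

    def __init__(
        self,
        primary: Callable[[], dict],
        fallback: Callable[[], dict],
        max_stale_s: float = 30.0,  # assumed cutoff, not taken from voomd
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.primary = primary
        self.fallback = fallback
        self.max_stale_s = max_stale_s
        self.clock = clock
        self._last_workloads: Optional[list] = None
        self._last_at: Optional[float] = None

    def sample(self) -> dict:
        try:
            snap = dict(self.primary())
        except Exception:
            snap = None  # primary stalled or failed; fall through to fallback

        if snap is not None:
            snap.setdefault("source", "primary")
            snap["workloads_stale"] = False
            # Remember the rich per-workload view for later reuse.
            self._last_workloads = snap.get("workloads")
            self._last_at = self.clock()
            return snap

        snap = dict(self.fallback())
        fresh_enough = (
            self._last_workloads is not None
            and self._last_at is not None
            and self.clock() - self._last_at <= self.max_stale_s
        )
        if fresh_enough:
            # Device totals are current; the workload list is reused and flagged.
            snap["workloads"] = self._last_workloads
            snap["workloads_stale"] = True
        else:
            snap["workloads"] = []
            snap["workloads_stale"] = False
        return snap
```

Flagging reused data with `workloads_stale` lets downstream policy decide how much to trust per-process attribution when only the fallback source is answering.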

What does not exist yet:

  • production-grade enforcement
  • strong guarantees across different AMD stacks
  • polished packaging/service/install UX
  • broad hardware validation

Why Publish It Now

Because the idea and the current implementation are already useful.

I would rather publish an honest experimental repo than wait for a level of completeness that may take months while my attention moves to other projects. Maintenance may be sporadic. README statements about future work are intent, not a promise.

Current Design

voomd currently models four pressure states per device:

  • NORMAL
  • GUARDED
  • PRESSURE
  • CRITICAL

and classifies workloads roughly into:

  • critical-no-kill
  • graceful-reclaim
  • killable
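
As a rough illustration of classification by cgroup, unit, and cmdline, the sketch below matches shell-style glob rules against those three fields. The rule list, the `Rule` type, the `read_proc_facts` helper, and the choice to default unknown workloads to critical-no-kill are all assumptions made for this example; the README does not describe voomd's actual rules or defaults.

```python
import fnmatch
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Rule:
    field: str    # one of "cgroup", "unit", "cmdline"
    pattern: str  # shell-style glob
    klass: str    # one of the three workload classes


# Hypothetical rules; first match wins.
RULES = [
    Rule("unit", "display-manager.service", "critical-no-kill"),
    Rule("cgroup", "*/app.slice/*", "graceful-reclaim"),
    Rule("cmdline", "*benchmark*", "killable"),
]

# Assumed fail-safe default: a workload nobody classified is never killed.
DEFAULT_CLASS = "critical-no-kill"


def classify(cgroup: str, unit: str, cmdline: str, rules=RULES) -> str:
    """Return the class of the first rule whose glob matches its field."""
    facts = {"cgroup": cgroup, "unit": unit, "cmdline": cmdline}
    for rule in rules:
        value = facts.get(rule.field, "")
        if value and fnmatch.fnmatchcase(value, rule.pattern):
            return rule.klass
    return DEFAULT_CLASS


def read_proc_facts(pid: int) -> dict:
    """Collect the three fields for a PID from /proc (Linux, cgroup v2)."""
    cgroup = ""
    for line in Path(f"/proc/{pid}/cgroup").read_text().splitlines():
        hierarchy, _controllers, path = line.split(":", 2)
        if hierarchy == "0":  # the unified (v2) hierarchy entry
            cgroup = path
    # Heuristic: treat the last cgroup path component as the unit name.
    unit = cgroup.rsplit("/", 1)[-1]
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
    return {"cgroup": cgroup, "unit": unit, "cmdline": cmdline}
```

On a Linux host, `classify(**read_proc_facts(pid))` would tie the two helpers together for a given process.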

It watches GPU pressure, keeps state in ~/.local/state/voomd, and logs what it would reclaim in dry-run mode.
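
To make the hysteresis idea concrete, here is a minimal sketch of a per-device state transition function. The threshold values, the use of VRAM used fraction as the only input, the severity ordering of the four states, and the JSON record format are assumptions for illustration; the README does not specify how voomd computes or logs transitions.

```python
import json
from enum import IntEnum


class State(IntEnum):
    # Assumed severity ordering, lowest to highest.
    NORMAL = 0
    GUARDED = 1
    PRESSURE = 2
    CRITICAL = 3


# Hypothetical (enter, exit) thresholds as a fraction of VRAM in use.
# exit < enter gives each state a band in which it neither rises nor falls.
THRESHOLDS = {
    State.GUARDED: (0.80, 0.75),
    State.PRESSURE: (0.90, 0.85),
    State.CRITICAL: (0.97, 0.93),
}


def next_state(current: State, used_fraction: float) -> State:
    """Escalate on enter thresholds, de-escalate only below exit thresholds."""
    target = current
    for state, (enter, _exit) in THRESHOLDS.items():
        if used_fraction >= enter and state > target:
            target = state
    if target > current:
        return target
    # Step down one level at a time while usage is under the exit threshold.
    while current > State.NORMAL and used_fraction < THRESHOLDS[current][1]:
        current = State(current - 1)
    return current


def transition_record(device: str, old: State, new: State, used_fraction: float) -> str:
    """Build one structured, dry-run decision record as a JSON line."""
    return json.dumps(
        {
            "device": device,
            "from": old.name,
            "to": new.name,
            "used_fraction": round(used_fraction, 4),
            "dry_run": True,
        }
    )


if __name__ == "__main__":
    states = {"card0": State.NORMAL}  # per-device state, keyed by device name
    for sample in (0.82, 0.78, 0.98, 0.88, 0.50):
        old = states["card0"]
        states["card0"] = next_state(old, sample)
        print(transition_record("card0", old, states["card0"], sample))
```

The gap between each enter and exit threshold is what stops a device hovering near a boundary from flapping between states on every sample.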

Files

Known Limitations

  • telemetry/provider mapping is still being refined
  • per-process attribution is only as good as the active provider
  • current behavior is tuned on one homelab, not generalized across many AMD systems
  • the daemon should be considered "policy research with working code", not finished infrastructure

Suggested Use

Use it if you want to:

  • inspect GPU pressure behavior
  • prototype reclaim policy on AMD/Linux
  • compare telemetry sources under load
  • build your own GPU oomd-style control loop

Do not use it yet if you need:

  • low-risk autonomous kill decisions
  • polished deployment
  • wide hardware support guarantees

License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
