bitboyro/scanos

scanos

Recursive disk scanner with filtering, sorting, multi-format export, and an MCP server. It ships as a single Python file with no required dependencies and runs on macOS and Linux.

scanos ~/Downloads --min 100mb --format tree
scanos . --largest
scanos --from scan.json --group-by ext --format excel

Install

Homebrew (macOS / Linux)

brew tap bitboyro/scanos
brew install scanos

Upgrade later with brew upgrade scanos.

Direct download (no brew, no pip)

curl -fsSL https://github.com/bitboyro/scanos/releases/latest/download/scanos \
  -o /usr/local/bin/scanos && chmod +x /usr/local/bin/scanos

pip (optional)

pip install scanos

Excel support

Excel export requires one optional dependency:

pip install openpyxl

All other formats (JSON, CSV, tree, XML, HTML) work with the Python standard library only.
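
If you want to confirm that the optional dependency is visible to the interpreter before requesting Excel output, a minimal check could look like the sketch below. The helper name `excel_available` is hypothetical and is not part of scanos; it only tests whether `openpyxl` is importable.

```python
import importlib.util


def excel_available() -> bool:
    """Return True if openpyxl can be imported by this interpreter."""
    # find_spec locates the package without actually importing it.
    return importlib.util.find_spec("openpyxl") is not None


if __name__ == "__main__":
    if excel_available():
        print("openpyxl found: --format excel should work")
    else:
        print("openpyxl missing: run `pip install openpyxl` first")
```

Run it with the same Python that runs scanos; a different interpreter or virtual environment may give a different answer.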


Quick start

# Scan current directory, print JSON
scanos

# ASCII tree of Downloads
scanos ~/Downloads --format tree

# Save everything over 100 MB in Downloads to an auto-named CSV in your cwd
scanos ~/Downloads --min 100mb --format csv --save

# Top 10 largest files anywhere under home
scanos ~ --largest

# Top 10 heaviest directories
scanos ~ --heavy-dirs

# File-type breakdown saved as Excel
scanos ~/Documents --group-by ext --format excel --save

# Re-process a saved scan with a tighter filter — no re-scanning
scanos --from _save_Downloads_20260422-090000.json --min 500mb --format html --save

All options

scanos [PATH] [OPTIONS]

  PATH                    Directory to scan (default: current directory)

source:
  --from FILE             Re-process an existing .json scan instead of scanning

filter:
  --min SIZE              Minimum size  e.g. 10mb, 500kb, 1.5gb
  --max SIZE              Maximum size
  --ext EXT[,EXT]         Extension allowlist  e.g. .mp4,.mov
  --name PATTERN          Glob on filename  e.g. '*.log'
  --since DATE|DUR        Modified after  e.g. 2024-01-01  or  30d
  --until DATE|DUR        Modified before
  --depth N               Max recursion depth
  --follow-symlinks       Follow symbolic links (default: skip)
  --empty                 Show only empty files/folders

output:
  --format FMT            json | csv | excel | tree | xml | html  (default: json)
  --output FILE           Write to a specific path (format inferred from extension)
  --save                  Auto-save to _save_<name>_<timestamp>.<ext> in cwd
  --sort FIELD[:DIR]      Sort by size|name|date with optional :asc or :desc
  --top N                 Show top N results
  --bottom N              Show bottom N results
  --offset N              Skip first N results (pagination with --top)
  --flat                  Flatten tree to a plain list
  --files-only            Include only files
  --dirs-only             Include only directories
  --group-by ext|dir|date Aggregate results into groups with totals
  --rollup                Show only immediate children of the scan root
  --summary               Print a one-line stats summary

shortcuts:
  --largest               Top 10 largest files (flat, sorted by size desc)
  --newest                Top 10 most recently modified files
  --oldest                Top 10 oldest files
  --heavy-dirs            Top 10 largest directories (flat)

mcp:
  --mcp                   Run as MCP stdio server (JSON-RPC 2.0)
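
To make the SIZE and DATE|DUR argument shapes concrete, here is a rough Python sketch of how such values could be interpreted. It is not scanos's actual parser. The function names `parse_size` and `parse_when` are hypothetical, the use of binary multiples (1 KB = 1024 bytes) is an assumption, and only the `d` (days) duration suffix shown above is handled.

```python
import re
from datetime import datetime, timedelta

# Assumption: binary multiples. scanos may use decimal (1 KB = 1000 bytes).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text: str) -> int:
    """Convert a string such as '10mb' or '1.5gb' to a byte count."""
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([kmgt]?b)?\s*", text.lower())
    if m is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = m.groups()
    # A bare number is treated as bytes.
    return int(float(number) * _UNITS[unit or "b"])


def parse_when(text: str, now: datetime) -> datetime:
    """Convert '30d' (days before now) or 'YYYY-MM-DD' to a datetime."""
    value = text.strip().lower()
    m = re.fullmatch(r"(\d+)d", value)
    if m:
        return now - timedelta(days=int(m.group(1)))
    return datetime.strptime(value, "%Y-%m-%d")


if __name__ == "__main__":
    print(parse_size("10mb"), parse_size("1.5gb"))
    print(parse_when("30d", datetime.now()))
```

Passing `now` explicitly keeps `parse_when` deterministic, which makes a relative duration like `30d` easy to test.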

Output formats

Format  Best for                            Deps
json    Piping, scripting, re-processing    stdlib
csv     Spreadsheets, shell tools           stdlib
tree    Quick visual overview in terminal   stdlib
xml     Structured data pipelines           stdlib
html    Shareable, sortable report          stdlib
excel   Rich spreadsheet with formatting    openpyxl

Saved file naming

When --save is used, files are written to your current directory under the name below. Excel output is always saved this way, with or without --save:

_save_<dirname>_<YYYYMMDD-HHMMSS>.<ext>

Example: _save_Downloads_20260422-143022.csv

When re-processing with --from, the stem of the input filename replaces the directory name. For example, --from scan.json with Excel output produces:

_save_scan_20260422-143022.xlsx
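
The naming pattern can be reproduced in a few lines of Python, which is handy when a script needs to predict or match saved filenames. This is a sketch of the documented pattern only. The helper `save_name` is hypothetical, and how scanos itself labels paths such as `.` or `~` is not covered here.

```python
from datetime import datetime
from pathlib import Path


def save_name(source: str, ext: str, when: datetime) -> str:
    """Build _save_<label>_<YYYYMMDD-HHMMSS>.<ext> for a directory or a --from file."""
    p = Path(source)
    # For a saved .json scan use the stem (scan.json -> scan);
    # for a directory use its final path component.
    label = p.stem if p.suffix == ".json" else p.name
    return f"_save_{label}_{when:%Y%m%d-%H%M%S}.{ext}"


if __name__ == "__main__":
    stamp = datetime(2026, 4, 22, 14, 30, 22)
    print(save_name("~/Downloads", "csv", stamp))
    print(save_name("scan.json", "xlsx", stamp))
```

With the timestamp above, the two calls reproduce the example names shown in this section.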

MCP server

scanos can run as an MCP stdio server, letting AI assistants like Claude scan and analyse your disk directly.

Claude Desktop setup

Add to claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "scanos": {
      "command": "scanos",
      "args": ["--mcp"]
    }
  }
}

Tools exposed

Tool     Description
scan     Scan a directory, returns full metadata tree
summary  Human-readable disk usage summary
largest  Top N largest files
filter   Filter an existing scan tree by size/ext/date
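
For readers curious what travels over stdio, the sketch below builds the kind of JSON-RPC 2.0 requests an MCP client sends to list and call tools. The `tools/list` and `tools/call` methods and the newline-delimited framing come from the MCP specification. The argument names `path` and `n` are hypothetical, since the tools' input schemas are not documented here, and a real client must complete the MCP `initialize` handshake before sending either request.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def rpc_request(method: str, params: dict) -> str:
    """Serialise one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    # json.dumps emits no raw newlines, so each message stays on one line.
    return json.dumps(message) + "\n"


# Ask the server which tools it offers.
list_tools = rpc_request("tools/list", {})

# Call the "largest" tool. "path" and "n" are assumed argument names.
call_largest = rpc_request(
    "tools/call",
    {"name": "largest", "arguments": {"path": "~/Downloads", "n": 10}},
)

if __name__ == "__main__":
    print(list_tools, end="")
    print(call_largest, end="")
```

In practice an MCP client library handles this framing for you; the sketch is only meant to show the message shape.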

Claude Code skill

You can also use scanos from Claude Code by registering it as a project-scoped MCP server. Add to .mcp.json in your project root:

{
  "mcpServers": {
    "scanos": {
      "command": "scanos",
      "args": ["--mcp"]
    }
  }
}

Then ask Claude things like:

"What are the 10 biggest files in my Downloads folder?"
"Show me all video files larger than 500 MB in my home directory."
"Give me a breakdown of disk usage by file type in ~/Documents."


Re-processing saved scans

Scan once, slice many ways without re-traversing:

# Save a full scan
scanos ~ --save

# Later: get only videos > 200 MB from that scan
scanos --from _save_root_20260422-090000.json --ext .mp4,.mov --min 200mb --format csv --save

# And the Excel version
scanos --from _save_root_20260422-090000.json --ext .mp4,.mov --min 200mb --format excel
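
The same idea can be applied from your own scripts: load the saved tree once and filter it in memory. The sketch below assumes a hypothetical node shape with `name`, `size`, and an optional `children` list. The real scanos JSON schema is not documented here, so inspect a saved file and adjust the keys before relying on this.

```python
from pathlib import PurePath
from typing import Iterator


def iter_files(node: dict, parent: str = "") -> Iterator[dict]:
    """Yield {'path', 'size'} for every leaf (file) in a nested scan tree."""
    path = f"{parent}/{node['name']}" if parent else node["name"]
    children = node.get("children")
    if children is None:
        # Assumed convention: nodes without "children" are files.
        yield {"path": path, "size": node["size"]}
    else:
        for child in children:
            yield from iter_files(child, path)


def select(tree: dict, exts: set, min_bytes: int) -> list:
    """Keep files whose extension is in exts and whose size is at least min_bytes."""
    return [
        f for f in iter_files(tree)
        if PurePath(f["path"]).suffix.lower() in exts and f["size"] >= min_bytes
    ]


# Hand-written sample in the assumed shape (not real scanos output).
sample = {
    "name": "root", "size": 0, "children": [
        {"name": "clip.mp4", "size": 300 * 1024**2},
        {"name": "notes.txt", "size": 1024},
        {"name": "videos", "size": 0, "children": [
            {"name": "small.mov", "size": 50 * 1024**2},
            {"name": "big.MOV", "size": 900 * 1024**2},
        ]},
    ],
}

if __name__ == "__main__":
    for f in select(sample, {".mp4", ".mov"}, 200 * 1024**2):
        print(f["path"], f["size"])
```

For a real file, replace `sample` with the result of `json.load` on the saved scan.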

Requirements

  • Python 3.8 or later
  • macOS or Linux
  • openpyxl for Excel export only (pip install openpyxl)
