██╗ ██╗███████╗██╗ ██╗██╗ ██╗
██║ ██║██╔════╝██║ ██║╚██╗██╔╝
███████║█████╗ ██║ ██║ ╚███╔╝
██╔══██║██╔══╝ ██║ ██║ ██╔██╗
██║ ██║███████╗███████╗██║██╔╝ ██╗
╚═╝ ╚═╝╚══════╝╚══════╝╚═╝╚═╝ ╚═╝
Helix is a next-generation open-source OSINT framework that goes far beyond username checking.
It maps the actual connections between a target's online identities — then renders them as a
live, interactive D3.js relational graph you can explore, filter, and export.
Quick Start · Features · Modules · Graph · Install
Most OSINT tools answer one question: "Does this username exist on Platform X?"
Helix answers a harder one: "How do all these accounts connect to the same person?"
It extracts cross-platform links from bios, matches profile pictures by perceptual hash, infers timezone from commit patterns, discovers domains via certificate transparency, and plots every relationship as a glowing edge in a browser-based network graph — all in a single command.
python helix.py -u johndoe --wayback --crt --paste --pivot --phash

| Capability | Sherlock | SpiderFoot | Maltego | Helix |
|---|---|---|---|---|
| Username enumeration | ✓ | ✓ | ✓ | ✓ |
| Relational bio-link graph | ✗ | ✗ | Partial | ✓ |
| Recursive alias pivot | ✗ | ✗ | Manual | ✓ auto |
| Perceptual avatar matching | ✗ | ✗ | ✗ | ✓ |
| Timezone inference | ✗ | ✗ | ✗ | ✓ |
| Wayback identity timeline | ✗ | Partial | ✗ | ✓ |
| Certificate transparency | ✗ | ✓ | ✓ | ✓ |
| GitHub commit email extraction | ✗ | ✗ | ✗ | ✓ |
| Local heuristic verifier | ✗ | ✗ | ✗ | ✓ always-on |
| Multi-AI false-positive filter | ✗ | ✗ | ✗ | ✓ 3 providers |
| Async speed | ✗ | ✗ | ✗ | ✓ |
| 100% free & open source | ✓ | ✓ | ✗ | ✓ |

- Local Heuristic Verifier: Zero-dependency false-positive engine. Scores every result across 8 signals, including WAF pages, generic titles, login redirects, and homepage redirects. Runs before anything else on every scan.
| Flag | What it does |
|---|---|
| --wmn | Loads WhatsMyName database at runtime: 700+ platforms, community-maintained |
| --maigret | Loads Maigret database at runtime: sophisticated detection with presenceStrs/absenceStrs, 24h cached |
| --sherlock | Loads Sherlock's database at runtime: 400+ platforms, cached 24h locally |
| --pivot | Recursive bio pivot: finds aliases in bios and auto-scans them, up to 4 hops deep |
| --phash | Perceptual avatar hash: downloads profile pics, hashes them, cross-matches across platforms. Finds the same person even if they changed their username |
| --wayback | Wayback Machine: fetches snapshot history + parses archived HTML for old usernames, historic emails, and past bios |
| --crt | Certificate Transparency: queries crt.sh for SSL certs containing the target's name or email. Finds personal domains that never appeared in any bio |
| --paste | Paste Intelligence: searches GitHub Gists and public Pastebin index for mentions |
| --breach | Breach check: queries XposedOrNot for breach metadata (names, dates, data types exposed). No credentials returned |
| --holehe | Deep email scan: hands off to holehe for 120+ platform email-registration checks |
| --ai | AI false-positive filter: second verification pass via Claude, OpenRouter (free), or NVIDIA NIM (free) |
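
To make the --phash row concrete, here is a minimal, dependency-free sketch of perceptual matching. According to the install notes, Helix relies on imagehash and Pillow for this; the code below does not use them and instead shows the general idea with a simple average hash over an already-decoded grayscale grid. The function names and the distance cutoff of 5 are illustrative assumptions, not Helix internals.

```python
from typing import Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Return an average hash for a small grayscale grid (values 0-255).

    Each pixel contributes one bit: 1 if it is brighter than the grid mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def same_avatar(a: int, b: int, max_distance: int = 5) -> bool:
    # max_distance is an illustrative cutoff, not a value taken from Helix.
    return hamming_distance(a, b) <= max_distance


# A 4x4 checkerboard "avatar" and a slightly brightened copy of it.
avatar_a = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
avatar_b = [[min(255, v + 12) for v in row] for row in avatar_a]
# A different image: left half dark, right half bright.
avatar_c = [[10, 10, 200, 200] for _ in range(4)]

hash_a = average_hash(avatar_a)
hash_b = average_hash(avatar_b)
hash_c = average_hash(avatar_c)
```

Because each bit only records whether a pixel is above the image's own mean, a uniform brightness change leaves the hash unchanged, which is why re-encoded or lightly edited avatars can still match.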
- GitHub Deep Recon — runs automatically when a GitHub profile is found. Extracts real emails from public commits (filters noreply), org memberships, language stats, npm packages, and infers timezone from commit timestamp distribution (requires ≥15 commits for confidence)
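
The README does not say how the timezone is derived from the commit distribution, so the following is only one plausible sketch: treat the quietest 8-hour window of the day as sleep and assume it starts around 23:00 local time. Only the 15-commit minimum comes from the text above; `infer_utc_offset`, the window length, and the assumed sleep start are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Optional

MIN_COMMITS = 15        # from the README: fewer commits gives no confident answer
SLEEP_HOURS = 8         # assumption: the quietest 8-hour window is sleep
SLEEP_START_LOCAL = 23  # assumption: that window starts around 23:00 local time


def infer_utc_offset(commit_times: Iterable[datetime]) -> Optional[int]:
    """Guess a whole-hour UTC offset from timezone-aware commit timestamps.

    Returns None when there are too few commits to say anything.
    """
    hours = [t.astimezone(timezone.utc).hour for t in commit_times]
    if len(hours) < MIN_COMMITS:
        return None
    counts = Counter(hours)

    def window_load(start: int) -> int:
        # Number of commits inside the circular window beginning at `start`.
        return sum(counts[(start + i) % 24] for i in range(SLEEP_HOURS))

    quiet_start_utc = min(range(24), key=window_load)
    # local = utc + offset, so offset = local - utc, normalised to -12..+11.
    offset = (SLEEP_START_LOCAL - quiet_start_utc) % 24
    return offset - 24 if offset >= 12 else offset


# Sixteen commits spread over 04:00-19:00 UTC, nothing between 20:00 and 03:59.
commits = [
    datetime(2024, 1, 1 + (h % 5), h, 30, tzinfo=timezone.utc)
    for h in range(4, 20)
]
guess = infer_utc_offset(commits)
too_few = infer_utc_offset(commits[:10])
```

A heuristic like this is easily fooled by night-shift workers or scheduled bots, which is one reason a minimum sample size matters.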
The HTML output is a standalone zero-dependency interactive network — no server needed, just open in a browser.
White pulsing node → Username root
Amber pulsing node → Email root
Amber/orange nodes → Pivot-discovered aliases
Green solid edges → Bio-extracted cross-links (proven connections)
Pink dashed edges → Avatar hash matches (same person across accounts)
Amber dashed edges → Email-matched platforms
Green ring on node → High confidence (OG meta validated)
Blue ring on node → Medium confidence
Controls: drag nodes · scroll to zoom · click node to open profile · hover for tooltip (confidence, og:title, cross-link partners, bio-extracted alias details) · ⌕ search · ◌ not-found overlay · ☰ labels · ↓ SVG export · filter by confidence
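
As a rough illustration of the data behind such a graph, the sketch below assembles a node-link structure with `nodes` and `links` arrays, where each link has a `source` and a `target`, the shape that d3-force layouts commonly consume. The field names `kind`, `type`, and `confidence`, and the `build_graph` helper itself, are assumptions made for this example and are not taken from Helix's graph.py.

```python
import json
from typing import Dict, List


def build_graph(root: str, found: List[Dict], edges: List[Dict]) -> Dict:
    """Assemble a node-link dict from a root username, hits, and relations."""
    nodes = [{"id": root, "kind": "username_root"}]
    for hit in found:
        nodes.append(
            {"id": hit["platform"], "kind": "platform", "confidence": hit["confidence"]}
        )
    known = {node["id"] for node in nodes}
    # Drop edges that point at nodes which are not part of the graph.
    links = [e for e in edges if e["source"] in known and e["target"] in known]
    return {"nodes": nodes, "links": links}


def filter_by_confidence(graph: Dict, level: str) -> List[str]:
    """Return the ids of platform nodes with the given confidence level."""
    return [n["id"] for n in graph["nodes"] if n.get("confidence") == level]


graph = build_graph(
    "johndoe",
    found=[
        {"platform": "github", "confidence": "high"},
        {"platform": "reddit", "confidence": "medium"},
    ],
    edges=[
        {"source": "johndoe", "target": "github", "type": "found"},
        {"source": "github", "target": "reddit", "type": "bio_link"},
        {"source": "github", "target": "mastodon", "type": "avatar_match"},
    ],
)
graph_json = json.dumps(graph, indent=2)
```

The edge to `mastodon` is dropped because no node with that id exists, which keeps the serialized graph safe to hand to a layout that resolves links by node id.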
git clone https://github.com/thalha-a9/helix.git
cd helix
pip install -r requirements.txt
python helix.py -u johndoe

Required

pip install aiohttp

Optional: unlock more power
pip install curl-cffi # WAF bypass for Twitter, Instagram, TikTok, Patreon
pip install imagehash Pillow # Perceptual avatar hash matching (--phash)
pip install holehe # Deep email scanning 120+ platforms (--holehe)
pip install anthropic # Claude AI verification (--ai claude)
pip install openai # OpenRouter / NVIDIA AI verification (--ai openrouter)

Or install everything at once

pip install "helix-osint[full]"

Set GITHUB_TOKEN for 5000 req/hr on GitHub API (optional, default is 60/hr):

export GITHUB_TOKEN=ghp_yourtoken

# Basic scan: opens interactive graph automatically
python helix.py -u johndoe
# Full power — all intelligence modules
python helix.py -u johndoe --wayback --crt --paste --pivot --phash
# Username + email — two root nodes, cross-matched in graph
python helix.py -u johndoe -e johndoe@gmail.com --breach --holehe
# Massive scan — 1100+ platforms
python helix.py -u johndoe --wmn --sherlock
# AI-verified scan (OpenRouter free tier; requires OPENROUTER_API_KEY)
python helix.py -u johndoe --ai openrouter
# Recursive pivot — auto-scan aliases up to 4 hops deep
python helix.py -u johndoe --pivot --pivot-depth 4
# Permutations — scan johndoe1, john.doe, realjohndoe, etc.
python helix.py -u johndoe --permutations
# Everything, saved to custom dir, no browser
python helix.py -u johndoe -e johndoe@gmail.com \
--wmn --sherlock --wayback --crt --paste \
--pivot --phash --breach --holehe \
--ai openrouter --format all --no-browser --output ~/Desktop/report
# Check which AI providers are configured
python helix.py --providers

Helix has a two-layer false-positive filter:
Layer 1 — Local heuristic verifier (always on, zero cost)
Scores every result across 8 signals. A single generic title (e.g. "Pinterest" instead of a username) immediately purges the result. WAF/Cloudflare pages are scored separately at 80 points. Thresholds: 60 for normal results, 85 for OG-validated high-confidence results.
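
One way to read the Layer 1 numbers is as a false-positive score that is compared against a per-result threshold. The sketch below follows that reading; it is an assumption, not a description of verifier.py. Only the WAF score of 80 and the thresholds of 60 and 85 come from the text above. The other weights, the summing rule, and the function names are hypothetical.

```python
from typing import Dict, Iterable

# Values stated in the README.
WAF_SCORE = 80
NORMAL_THRESHOLD = 60
OG_VALIDATED_THRESHOLD = 85

# Hypothetical weights for the remaining signals named in the README.
SIGNAL_SCORES: Dict[str, int] = {
    "generic_title": 100,     # assumed: high enough to purge on its own
    "waf_page": WAF_SCORE,
    "login_redirect": 40,     # assumed
    "homepage_redirect": 40,  # assumed
}


def false_positive_score(signals: Iterable[str]) -> int:
    """Sum the weights of the triggered signals, capped at 100 (assumed rule)."""
    return min(100, sum(SIGNAL_SCORES[name] for name in set(signals)))


def is_purged(signals: Iterable[str], og_validated: bool = False) -> bool:
    """Reject a result when its score reaches the applicable threshold."""
    threshold = OG_VALIDATED_THRESHOLD if og_validated else NORMAL_THRESHOLD
    return false_positive_score(signals) >= threshold
```

Under these assumptions, a WAF page alone removes an ordinary result (80 is at least 60) but not one that already passed OG validation (80 is below 85), which matches the idea that stronger evidence of a real profile needs stronger evidence to overturn.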
Layer 2 — AI verifier (--ai, optional)
Sends uncertain results to an LLM with a strict system prompt. Three providers:
| Provider | Flag | Cost | Setup |
|---|---|---|---|
| Anthropic Claude | --ai claude | Paid | export ANTHROPIC_API_KEY=... |
| OpenRouter Llama 3.1 | --ai openrouter | Free tier | export OPENROUTER_API_KEY=... → openrouter.ai |
| NVIDIA NIM Llama 3.1 | --ai nvidia | Free tier | export NVIDIA_API_KEY=... → build.nvidia.com |

helix/
├── helix.py                     ← CLI entry point + orchestrator
├── pyproject.toml               ← pip installable (helix-osint)
├── osint/
│   ├── checker.py               ← Async engine (aiohttp + optional curl_cffi)
│   ├── platforms.py             ← 70+ platform definitions with OG/API detection
│   ├── verifier.py              ← Local heuristic false-positive engine
│   ├── graph.py                 ← D3.js relational graph generator
│   ├── report.py                ← JSON / CSV / TXT exporters
│   ├── permutations.py          ← Username variation generator
│   ├── pivot.py                 ← Concurrent BFS alias pivot engine
│   ├── phash.py                 ← Perceptual avatar hash matcher
│   └── modules/
│       ├── wayback.py           ← Archive.org CDX API + archived HTML parser
│       ├── github_deep.py       ← GitHub API deep recon + timezone inference
│       ├── crt.py               ← Certificate transparency (crt.sh)
│       └── paste.py             ← Gist + Pastebin intelligence
│   └── adapters/
│       ├── sherlock_adapter.py  ← Sherlock data.json loader (24h cached)
│       ├── wmn_adapter.py       ← WhatsMyName loader
│       ├── holehe_adapter.py    ← holehe email scanner wrapper
│       ├── breach_adapter.py    ← XposedOrNot breach metadata
│       └── ai_verifier.py       ← Multi-provider async AI verification
└── results/                     ← Output (git-ignored)
    └── username/
        ├── username_graph.html      ← Interactive D3.js network graph
        ├── username_TIMESTAMP.json  ← Full structured report
        ├── username_TIMESTAMP.csv
        └── username_TIMESTAMP.txt
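
The tree describes pivot.py as a concurrent BFS alias pivot engine. A minimal sketch of that pattern, assuming a level-by-level breadth-first search in which every alias on the current level is scanned concurrently with `asyncio.gather`, could look like this. The `pivot` function, its signature, and the `fake_scan` stand-in are hypothetical; only the BFS approach and the 4-hop limit come from this README.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def pivot(
    root: str,
    scan: Callable[[str], Awaitable[List[str]]],
    max_depth: int = 4,
) -> Dict[str, int]:
    """Breadth-first alias discovery; maps each alias to its hop count."""
    depth_of: Dict[str, int] = {root: 0}
    frontier = [root]
    for depth in range(1, max_depth + 1):
        # Scan every alias on the current level concurrently.
        results = await asyncio.gather(*(scan(name) for name in frontier))
        next_frontier: List[str] = []
        for aliases in results:
            for alias in aliases:
                if alias not in depth_of:  # skip aliases that were already seen
                    depth_of[alias] = depth
                    next_frontier.append(alias)
        if not next_frontier:
            break
        frontier = next_frontier
    return depth_of


# Stand-in for a real scan: a fixed map from username to aliases found in bios.
FAKE_BIOS = {
    "johndoe": ["jdoe_dev", "john.doe"],
    "jdoe_dev": ["johndoe", "jd0e"],
    "jd0e": ["deep1"],
    "deep1": ["deep2"],
    "deep2": ["deep3"],
}


async def fake_scan(name: str) -> List[str]:
    await asyncio.sleep(0)  # yield control, as a network call would
    return FAKE_BIOS.get(name, [])


found = asyncio.run(pivot("johndoe", fake_scan, max_depth=4))
```

The `depth_of` map doubles as the visited set, so an alias that links back to the root (as `jdoe_dev` does here) never causes a rescan, and `deep3` is never reached because it sits five hops away.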
Helix uses the right detection method per platform instead of naive HTTP 200 checks:
| Platform | Method | Why |
|---|---|---|
| Reddit | reddit.com/user/{u}/about.json → "is_employee" field | JSON API; field only exists for valid users |
| Bluesky | AT Protocol API | SPA; static HTML is useless |
| Chess.com | api.chess.com/pub/player/{u} | Official public API |
| Lichess | lichess.org/api/user/{u} | Official public API |
| GitHub | og:title parsed + validated against known error strings | Server-side rendered, reliable |
| Medium | og:title rejects homepage redirect string | Catches "Where good ideas find you" |
| Twitter/X | curl_cffi TLS impersonation | Skipped gracefully without it |

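
The og:title rows above can be illustrated with a small standard-library sketch: parse the `og:title` meta tag, reject it if it contains a known error or redirect string, and otherwise require the username to appear in it. The `profile_exists` helper and the username-in-title rule are assumptions for this example, not the logic in platforms.py; the Medium string is the one quoted in the table.

```python
from html.parser import HTMLParser
from typing import Optional, Sequence


class OGTitleParser(HTMLParser):
    """Collect the content of the first <meta property="og:title"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.og_title: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.og_title is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:title":
            self.og_title = attributes.get("content")


def profile_exists(html: str, username: str, error_strings: Sequence[str]) -> bool:
    """Return True only if og:title looks like a real profile for `username`."""
    parser = OGTitleParser()
    parser.feed(html)
    title = parser.og_title
    if not title:
        return False  # no og:title at all: treat as not found
    lowered = title.lower()
    if any(error.lower() in lowered for error in error_strings):
        return False  # homepage redirect or error page
    # Assumed rule: a genuine profile title mentions the username.
    return username.lower() in lowered


ERROR_STRINGS = ["Where good ideas find you", "Page not found"]

real_page = '<html><head><meta property="og:title" content="johndoe (John Doe)"></head></html>'
redirect_page = '<meta property="og:title" content="Medium: Where good ideas find you.">'
```

Checking the title text rather than the HTTP status is what lets this approach catch sites that answer 200 for missing users and quietly serve their homepage.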
| Format | Contents |
|---|---|
| .html | Standalone interactive D3.js graph, no server needed |
| .json | Full structured report including intel bundle (wayback, GitHub deep, CRT, paste) |
| .csv | Spreadsheet-friendly, all platforms |
| .txt | Clean terminal-style summary |

Pull requests are welcome. For major changes open an issue first.
When adding a platform to platforms.py:
- Prefer og_meta or API endpoints over text_not_present
- Always test against a non-existent username first; if it returns found=True, your detection is wrong
- Add bio_extract: True + bio_patterns if the platform renders bio text server-side

Helix is built for security research, bug bounty reconnaissance, and OSINT education. All data sources used are publicly accessible. Always ensure you have proper authorization before running reconnaissance on any target. The author is not responsible for misuse.
- esp32-iot-audit — ESP32 IoT security scanner
- esp-pentest-toolkit — Wireless ESP32/8266 pentest toolkit
Built by Thalha Ahmed · @thalha-a9
If Helix helped you — drop a ⭐ and share it with your security community.