Omni-Search-Skill

Omni-Search-Skill is a full-stack search and retrieval skill for agentic workflows.

Its vision is simple:

No-blind-spot, high-speed search and fetching across the public web.

It combines search, fetch, search-then-fetch, and crawl into one skill with a unified output shape and provider routing layer.

What it does

Searches the live web through multiple providers
Fetches a specific page as clean Markdown
Resolves a query into top search hits and fetches the best page(s)
Crawls a site for relevant pages when a docs map or content graph is needed
Routes automatically between local, free, and paid providers
Detects junk content (captcha, JS-required pages) and falls back automatically
Skips known-blocked domains to avoid wasted attempts

Built-in providers

Search (12 providers)

Provider	Type	Free Tier
Jina Search	API (key optional)	Generous free tier
DuckDuckGo (ddgs)	Library	Unlimited
CN free (DDG + Bing CN)	HTML scraping	Unlimited
Brave Search	API	2,000/month
Serper.dev	API	2,500/month
Google CSE	API	100/day
Bing Web Search	API (Azure)	1,000/month
Tavily Search	API	1,000/month
Baidu AI Search	API	With key
Exa	via `mcporter`	With key

Fetch (4 providers)

Provider	Type	Notes
Local Scrapling	Local browser	Fast mode, with automatic fallback to stealth mode (Camoufox)
Jina Reader	API	Good for JS-heavy sites
Tavily Extract	API	Paid fallback
Firecrawl Scrape	API	Paid fallback

Crawl

Tavily Crawl

Routing model

Smart provider selection

The router uses a tiered, cost-optimized strategy:

Search routing:

Tier 1 — Free: Jina (with key) → DuckDuckGo (ddgs library) → CN free HTML
Tier 2 — Freemium: Tavily → Brave → Serper → Google CSE → Bing
Tier 3 — Specialized: Baidu (Chinese), Exa
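The tier order above can be sketched as an ordered list filtered by which keys are configured. This is a minimal illustration, not the project's router: the provider labels, the `REQUIRED_KEYS` mapping, and `search_order` are hypothetical names, and the environment variable names are taken from the API keys table in this README. Exa is left out because the README does not say how its availability is checked.

```python
import os

# Tier order as listed above. Provider labels are illustrative, not the project's identifiers.
SEARCH_TIERS = [
    ("free", ["jina", "ddgs", "cn_free"]),
    ("freemium", ["tavily", "brave", "serper", "google_cse", "bing"]),
    ("specialized", ["baidu"]),  # Exa omitted: its availability check is not documented here
]

# Environment variables each provider is assumed to need (names from the API keys table).
# Providers missing from this mapping (ddgs, cn_free) need no key.
REQUIRED_KEYS = {
    "jina": ["JINA_API_KEY"],  # the tier list says Jina is used "with key"
    "tavily": ["TAVILY_API_KEY"],
    "brave": ["BRAVE_API_KEY"],
    "serper": ["SERPER_API_KEY"],
    "google_cse": ["GOOGLE_CSE_API_KEY", "GOOGLE_CSE_CX"],
    "bing": ["BING_API_KEY"],
    "baidu": ["BAIDU_API_KEY"],
}


def search_order(env=None):
    """Return usable providers in tier order, skipping any whose keys are missing."""
    env = os.environ if env is None else env
    order = []
    for _tier, providers in SEARCH_TIERS:
        for name in providers:
            if all(env.get(key) for key in REQUIRED_KEYS.get(name, [])):
                order.append(name)
    return order


print(search_order({}))  # ['ddgs', 'cn_free']
```

With no keys set, only the key-free providers remain, which matches the zero-key mode described under API keys.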

Fetch routing:

Local Scrapling (fast mode, then stealth auto-fallback)
Jina Reader
Tavily Extract / Firecrawl (when paid allowed)

Domain-aware optimization: Sites known to block local fetching (x.com, zhihu, weibo, bloomberg, wsj, etc.) skip the local fetcher and go straight to API providers, which saves time and avoids flaky failures.
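A rough sketch of the fetch order and the blocked-domain shortcut described above. Everything named here is an assumption for illustration: the `BLOCKED_FOR_LOCAL` set (including the `.com` suffixes, which the list above does not spell out), the provider labels, and the `fetch_order` function. The real blocklist is longer than the examples given.

```python
from urllib.parse import urlparse

# Hypothetical blocklist built from the examples above; the ".com" suffixes are assumed.
BLOCKED_FOR_LOCAL = {"x.com", "zhihu.com", "weibo.com", "bloomberg.com", "wsj.com"}


def fetch_order(url, allow_paid=False):
    """Return the provider labels to try, in order, for one URL."""
    host = (urlparse(url).hostname or "").lower()
    # Match the domain itself or any subdomain, but not look-alikes such as "notx.com".
    blocked = any(host == d or host.endswith("." + d) for d in BLOCKED_FOR_LOCAL)

    order = [] if blocked else ["scrapling_fast", "scrapling_stealth"]
    order.append("jina_reader")
    if allow_paid:
        order += ["tavily_extract", "firecrawl_scrape"]
    return order


print(fetch_order("https://www.zhihu.com/question/1"))  # ['jina_reader']
```

Matching on the parsed hostname rather than on a substring of the URL keeps a path such as `/share?to=x.com` from triggering the shortcut.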

Resilience features

Request-level retry with backoff for transient HTTP errors (429, 5xx)
Stealth fetch retry for Camoufox browser crashes
Junk content detection: captcha pages, JS-required shells → auto fallback
Graceful degradation: returns best available result instead of failing
Content quality threshold: a result is accepted only if it contains at least 500 characters of usable content
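The junk detection, quality threshold, and graceful degradation listed above could fit together roughly as follows. This is a sketch under stated assumptions: the marker patterns are illustrative guesses, not the project's heuristics, and `is_usable` and `fetch_with_fallback` are hypothetical names. Treating the longest rejected result as the "best available" one is also an assumption.

```python
import re

MIN_USABLE_CHARS = 500  # threshold stated above

# Illustrative markers only; the project's actual heuristics are not shown in this README.
JUNK_PATTERNS = [
    re.compile(r"captcha", re.I),
    re.compile(r"verify (that )?you are (a )?human", re.I),
    re.compile(r"enable javascript", re.I),
    re.compile(r"javascript is (required|disabled)", re.I),
]


def is_usable(markdown):
    """Accept content that is long enough and does not open with a junk marker."""
    text = markdown.strip()
    if len(text) < MIN_USABLE_CHARS:
        return False
    # Scan only the head, so a long article that mentions captchas later still passes.
    head = text[:1000]
    return not any(pattern.search(head) for pattern in JUNK_PATTERNS)


def fetch_with_fallback(url, providers):
    """Try each (name, fetch) pair in order; fetch(url) returns Markdown or raises."""
    best = ("", "")
    for name, fetch in providers:
        try:
            content = fetch(url)
        except Exception:
            continue  # fall through to the next provider
        if is_usable(content):
            return name, content
        if len(content) > len(best[1]):
            best = (name, content)  # remember the longest rejected result
    return best  # graceful degradation: best available result, possibly empty
```

The caller passes the providers in routing order, so the first usable result always comes from the cheapest route that worked.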

Repository layout

omni-search-skill/
  SKILL.md
  README.md
  README.zh-CN.md
  requirements.txt
  .env.example
  scripts/
    omni_search.py
    eval_benchmark.py
  omni_search_skill/
    cli.py
    models.py
    providers.py
    router.py
    utils.py

Installation

git clone https://github.com/d-wwei/omni-search-skill.git
cd omni-search-skill
python3 -m pip install -r requirements.txt

For stealth fetching (JS-heavy sites), also install the Camoufox browser:

python3 -m camoufox fetch

API keys (all optional)

The system works with zero API keys (using ddgs + local fetch), but adding keys unlocks more providers and better coverage:

Key	Provider	How to get
`JINA_API_KEY`	Jina Search + Reader	jina.ai
`BRAVE_API_KEY`	Brave Search	brave.com/search/api
`SERPER_API_KEY`	Serper.dev (Google SERP)	serper.dev
`TAVILY_API_KEY`	Tavily Search/Extract/Crawl	tavily.com
`GOOGLE_CSE_API_KEY` + `GOOGLE_CSE_CX`	Google Custom Search	developers.google.com
`BING_API_KEY`	Bing Web Search (Azure)	azure.microsoft.com
`BAIDU_API_KEY`	Baidu AI Search	cloud.baidu.com
`FIRECRAWL_API_KEY`	Firecrawl Scrape	firecrawl.dev

Copy .env.example to .env and fill in the keys you have.
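For readers unfamiliar with the format, a .env file is a list of `KEY=value` lines. The sketch below shows one way such a file could be parsed with the standard library alone. How the project itself loads the file is not stated here, so `parse_env` is a hypothetical helper, and the values in the sample are placeholders.

```python
def parse_env(text):
    """Parse KEY=value lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = """
# Placeholder values, not real keys
JINA_API_KEY=your-jina-key
BRAVE_API_KEY=
"""

settings = parse_env(sample)
missing = [name for name, value in settings.items() if not value]
print(missing)  # ['BRAVE_API_KEY']
```

An empty value is treated the same as a missing key, so leaving a line blank keeps that provider disabled.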

Quick start

# Check what is available in the current environment
python3 scripts/omni_search.py providers

# Search the web
python3 scripts/omni_search.py search "latest AI news"

# Fetch a page
python3 scripts/omni_search.py fetch "https://openai.com/news"

# Search first, then fetch top result(s)
python3 scripts/omni_search.py resolve "Tavily extract docs" --fetch-top 2

# Crawl a docs site
python3 scripts/omni_search.py crawl "https://docs.tavily.com"

Benchmark

The project includes a benchmark (scripts/eval_benchmark.py) that runs 35 fetch targets and 22 search queries across these categories:

Social media (X, Reddit, Instagram, TikTok, Xiaohongshu)
Finance (Seeking Alpha, Yahoo Finance, Bloomberg, WSJ, FT)
Chinese web (Douban, Zhihu, 36kr, Weibo, Bilibili)
Tech (HN, GitHub, arXiv, StackOverflow, OpenAI docs)
News (Wikipedia, BBC, NYT, whitehouse.gov, WHO)
Hard targets (LinkedIn, Medium, Pinterest, Amazon, Google Scholar)
Multilingual search (English, Chinese, Japanese, French, Korean)

python3 scripts/eval_benchmark.py

Safety model

Blocks localhost and private-network fetch targets by default
Prefers local and lower-cost routes first
Uses paid providers only when they unlock better quality or coverage
Falls through to the next provider on failure instead of retrying the same route
Retries only on transient HTTP errors (429, 5xx) with backoff
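Two of the rules above lend themselves to a short illustration: refusing localhost and private-network targets, and retrying only on transient status codes. The sketch below uses the standard `ipaddress` module. The function names, the injectable `resolve` parameter, and the 0.5-second doubling schedule are assumptions, not the project's actual values.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_target(url, resolve=socket.getaddrinfo):
    """Return True only if every address the host resolves to is globally routable."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, None)
    except OSError:  # includes DNS failures (socket.gaierror)
        return False
    for info in infos:
        raw = info[4][0].split("%")[0]  # drop any IPv6 scope id
        if not ipaddress.ip_address(raw).is_global:
            return False  # loopback, private, link-local, and similar ranges
    return bool(infos)


def is_transient(status):
    """429 and 5xx are the only statuses treated as worth retrying."""
    return status == 429 or 500 <= status <= 599


def backoff_delays(attempts, base=0.5):
    """Exponential schedule in seconds; the base and doubling factor are assumed."""
    return [base * 2 ** i for i in range(attempts)]


print(backoff_delays(3))  # [0.5, 1.0, 2.0]
```

Checking the resolved addresses rather than the hostname string also catches names that point at internal IPs. A production check would additionally need to make sure the fetcher connects to the same address that was validated.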

License

MIT
