
Commit 5932c47

feat: invert auth model — all routes protected, whitelist public paths
- Everything requires token by default when auth is enabled
- New WEB2API_PUBLIC_PATHS env var for whitelisting (glob patterns)
- Only / and /health are public by default
- Updated README with new auth model and examples
- Updated tests for new auth behavior
1 parent 65efe68 commit 5932c47

6 files changed

Lines changed: 249 additions & 39 deletions


README.md

Lines changed: 38 additions & 21 deletions
Original file line number | Diff line number | Diff line change
@@ -108,35 +108,50 @@ docker compose exec web2api web2api recipes catalog add hackernews --yes
108108

109109
## Access Token
110110

111-
Web2API can protect the sensitive management and MCP surfaces with a shared access token.
111+
Web2API can protect all HTTP routes except selected public paths with a shared access token.
112112

113113
Set one of:
114114
- `WEB2API_ACCESS_TOKEN`
115115
- `WEB2API_ACCESS_TOKEN_FILE` (path to a file containing the token)
116116

117-
When configured, Web2API requires the token for:
118-
- `/api/recipes/manage*`
119-
- `/mcp*`
117+
By default, when configured, Web2API requires the token for everything except:
118+
- `/`
119+
- `/health`
120+
121+
You can keep extra routes public while token auth stays enabled by setting
122+
`WEB2API_PUBLIC_PATHS` to a comma- or newline-separated list of exact paths or shell-style
123+
glob patterns matched against the request path.
124+
125+
Common examples:
126+
- `/api/sites`
127+
- `/docs`
128+
- `/openapi.json`
129+
- `/allenai/*`
130+
- `/*/chat`
131+
132+
Any path that matches one of those patterns skips token auth, so use this allowlist sparingly.
120133

121134
Send the token as either:
122135
- `Authorization: Bearer <token>`
123136
- `X-Web2API-Key: <token>`
124137

125-
Public scrape/discovery routes remain open:
126-
- `/`
127-
- `/health`
128-
- `/api/sites`
129-
- `/{slug}/{endpoint}`
138+
Example mixed setup:
139+
140+
```bash
141+
export WEB2API_ACCESS_TOKEN="secret-token"
142+
export WEB2API_PUBLIC_PATHS="/api/sites,/allenai/*,/docs,/openapi.json"
143+
```
130144

131-
Examples:
145+
Authenticated request examples:
132146

133147
```bash
148+
curl -H "Authorization: Bearer $WEB2API_ACCESS_TOKEN" "http://localhost:8010/allenai/chat?q=example&page=1"
134149
curl -H "Authorization: Bearer $WEB2API_ACCESS_TOKEN" http://localhost:8010/api/recipes/manage
135150
curl -H "Authorization: Bearer $WEB2API_ACCESS_TOKEN" http://localhost:8010/mcp/tools
136151
```
137152

138153
When token auth is enabled, the built-in web UI shows an access-token input and stores the token in
139-
browser local storage for repository/MCP actions.
154+
browser local storage for protected browser actions.
140155

141156
## CLI
142157

@@ -387,15 +402,15 @@ A simpler HTTP-based tool bridge is also available for non-MCP clients:
387402

388403
| Endpoint | Description |
389404
|---|---|
390-
| `GET /` | HTML index listing all recipes and endpoints |
391-
| `GET /health` | Service, browser pool, and cache health |
392-
| `GET /api/sites` | JSON list of all recipes with endpoint metadata |
393-
| `GET /api/recipes/manage` | JSON catalog + installed recipe state for UI/automation (protected when token auth is enabled) |
394-
| `POST /api/recipes/manage/install/{name}` | Install recipe by catalog entry name (protected when token auth is enabled) |
395-
| `POST /api/recipes/manage/update/{slug}` | Update installed managed recipe (protected when token auth is enabled) |
396-
| `POST /api/recipes/manage/uninstall/{slug}` | Uninstall recipe (add `?force=true` for unmanaged local recipes, protected when token auth is enabled) |
397-
| `POST /api/recipes/manage/enable/{slug}` | Enable installed recipe (protected when token auth is enabled) |
398-
| `POST /api/recipes/manage/disable/{slug}` | Disable installed recipe (protected when token auth is enabled) |
405+
| `GET /` | HTML index listing all recipes and endpoints (always public) |
406+
| `GET /health` | Service, browser pool, and cache health (always public) |
407+
| `GET /api/sites` | JSON list of all recipes with endpoint metadata (protected by default when token auth is enabled) |
408+
| `GET /api/recipes/manage` | JSON catalog + installed recipe state for UI/automation (protected by default when token auth is enabled) |
409+
| `POST /api/recipes/manage/install/{name}` | Install recipe by catalog entry name (protected by default when token auth is enabled) |
410+
| `POST /api/recipes/manage/update/{slug}` | Update installed managed recipe (protected by default when token auth is enabled) |
411+
| `POST /api/recipes/manage/uninstall/{slug}` | Uninstall recipe (add `?force=true` for unmanaged local recipes, protected by default when token auth is enabled) |
412+
| `POST /api/recipes/manage/enable/{slug}` | Enable installed recipe (protected by default when token auth is enabled) |
413+
| `POST /api/recipes/manage/disable/{slug}` | Disable installed recipe (protected by default when token auth is enabled) |
399414

400415
`GET /api/recipes/manage` includes:
401416
- `catalog`: entries from the current catalog source
@@ -405,6 +420,7 @@ A simpler HTTP-based tool bridge is also available for non-MCP clients:
405420
### Recipe Endpoints
406421

407422
All recipe endpoints follow the pattern: `GET /{slug}/{endpoint}?page=1&q=...`
423+
and require the access token by default when token auth is enabled.
408424

409425
- `page` — pagination (default: 1)
410426
- `q` — query text (required when `requires_query: true`)
@@ -644,8 +660,9 @@ Environment variables (with defaults):
644660
| `WEB2API_RECIPE_CATALOG_REF` | empty | Optional git ref for catalog source |
645661
| `WEB2API_RECIPE_CATALOG_PATH` | `catalog.yaml` | Catalog file path inside catalog source |
646662
| `PLUGIN_ENFORCE_COMPATIBILITY` | false | Skip plugin recipes outside declared `web2api` version bounds |
647-
| `WEB2API_ACCESS_TOKEN` | empty | Shared access token for admin API and MCP endpoints |
663+
| `WEB2API_ACCESS_TOKEN` | empty | Shared access token for all routes except public paths |
648664
| `WEB2API_ACCESS_TOKEN_FILE` | empty | Path to file containing the access token (alternative to `WEB2API_ACCESS_TOKEN`) |
665+
| `WEB2API_PUBLIC_PATHS` | empty | Extra public path patterns to allow without auth while token auth is enabled |
649666
| `BIRD_AUTH_TOKEN` | empty | X/Twitter auth token for `x` recipe |
650667
| `BIRD_CT0` | empty | X/Twitter ct0 token for `x` recipe |
651668
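
The README text in this diff describes `WEB2API_PUBLIC_PATHS` as a comma- or newline-separated list of exact paths or shell-style globs matched against the request path. The commit's `web2api/auth.py` is not shown here, so the following is only a minimal sketch of how such parsing and matching could work, assuming Python's standard `fnmatch` module and trailing-slash normalization. The names `parse_public_paths` and `requires_auth` are hypothetical stand-ins, not the project's actual functions.

```python
from fnmatch import fnmatchcase

# Paths the README says are public by default when token auth is enabled.
DEFAULT_PUBLIC_PATHS = ("/", "/health")


def parse_public_paths(raw: str) -> tuple[str, ...]:
    """Split a comma- or newline-separated value into path patterns.

    Hypothetical helper: the defaults come first, then any extra entries.
    """
    patterns = list(DEFAULT_PUBLIC_PATHS)
    for entry in raw.replace("\n", ",").split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        if not entry.startswith("/"):
            raise ValueError("WEB2API_PUBLIC_PATHS entries must start with '/'")
        if entry not in patterns:
            patterns.append(entry)
    return tuple(patterns)


def requires_auth(path: str, patterns: tuple[str, ...]) -> bool:
    """Return True unless the request path matches a public pattern."""
    # Assumption: "/health/" is treated the same as "/health".
    normalized = path.rstrip("/") or "/"
    return not any(fnmatchcase(normalized, pattern) for pattern in patterns)


patterns = parse_public_paths("/api/sites, /allenai/*\n/docs,/openapi.json")
print(requires_auth("/allenai/chat", patterns))  # public via "/allenai/*"
print(requires_auth("/beta/read", patterns))     # protected
```

With `fnmatch`, `*` also matches `/`, so under this sketch a pattern such as `/allenai/*` would cover nested paths like `/allenai/a/b` as well. That is one more reason to keep the allowlist short.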

docker-compose.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@ services:
2424
WEB2API_RECIPE_CATALOG_REF: "${WEB2API_RECIPE_CATALOG_REF:-}"
2525
WEB2API_RECIPE_CATALOG_PATH: "${WEB2API_RECIPE_CATALOG_PATH:-}"
2626
WEB2API_ACCESS_TOKEN: "${WEB2API_ACCESS_TOKEN:-}"
27+
WEB2API_PUBLIC_PATHS: "${WEB2API_PUBLIC_PATHS:-}"
2728
LOG_LEVEL: "${LOG_LEVEL:-info}"
2829
BIRD_AUTH_TOKEN: "${BIRD_AUTH_TOKEN:-}"
2930
BIRD_CT0: "${BIRD_CT0:-}"

tests/integration/test_api.py

Lines changed: 74 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -397,7 +397,7 @@ async def fake_scrape(
397397

398398

399399
@pytest.mark.asyncio
400-
async def test_access_token_protects_admin_and_mcp_surfaces(
400+
async def test_access_token_protects_all_routes_except_public_surfaces(
401401
tmp_path: Path,
402402
monkeypatch: pytest.MonkeyPatch,
403403
) -> None:
@@ -427,17 +427,18 @@ async def fake_scrape(
427427
transport = ASGITransport(app=app)
428428
async with AsyncClient(transport=transport, base_url="http://testserver") as client:
429429
public_resp = await client.get("/alpha/read")
430-
assert public_resp.status_code == 200
430+
assert public_resp.status_code == 401
431431

432432
sites_resp = await client.get("/api/sites")
433-
assert sites_resp.status_code == 200
433+
assert sites_resp.status_code == 401
434434

435435
health_resp = await client.get("/health")
436436
assert health_resp.status_code == 200
437437

438438
index_resp = await client.get("/")
439439
assert index_resp.status_code == 200
440440
assert "Paste access token" in index_resp.text
441+
assert "public paths shown below" in index_resp.text
441442

442443
manage_resp = await client.get("/api/recipes/manage")
443444
assert manage_resp.status_code == 401
@@ -470,6 +471,76 @@ async def fake_scrape(
470471
)
471472
assert authorized_mcp.status_code == 200
472473

474+
authorized_sites = await client.get(
475+
"/api/sites",
476+
headers={"Authorization": "Bearer secret-token"},
477+
)
478+
assert authorized_sites.status_code == 200
479+
480+
authorized_recipe = await client.get(
481+
"/alpha/read",
482+
headers={"Authorization": "Bearer secret-token"},
483+
)
484+
assert authorized_recipe.status_code == 200
485+
486+
487+
@pytest.mark.asyncio
488+
async def test_access_token_allows_configured_public_path_patterns(
489+
tmp_path: Path,
490+
monkeypatch: pytest.MonkeyPatch,
491+
) -> None:
492+
recipes_dir = tmp_path / "recipes"
493+
_write_recipe(recipes_dir, "alpha")
494+
_write_recipe(recipes_dir, "beta")
495+
496+
async def fake_scrape(
497+
*,
498+
pool: FakePool,
499+
recipe,
500+
endpoint: str,
501+
page: int = 1,
502+
query: str | None = None,
503+
extra_params: dict[str, str] | None = None,
504+
scrape_timeout: float = 30.0,
505+
) -> ApiResponse:
506+
_ = pool, endpoint, query, extra_params, scrape_timeout
507+
return _success_response(slug=recipe.config.slug, endpoint="read", page=page)
508+
509+
monkeypatch.setattr("web2api.main.scrape", fake_scrape)
510+
monkeypatch.setenv("WEB2API_ACCESS_TOKEN", "secret-token")
511+
monkeypatch.setenv("WEB2API_PUBLIC_PATHS", "/api/sites,/alpha/*")
512+
513+
fake_pool = FakePool()
514+
app = create_app(recipes_dir=recipes_dir, pool=fake_pool)
515+
516+
async with app.router.lifespan_context(app):
517+
transport = ASGITransport(app=app)
518+
async with AsyncClient(transport=transport, base_url="http://testserver") as client:
519+
index_resp = await client.get("/")
520+
assert index_resp.status_code == 200
521+
assert "/alpha/*" in index_resp.text
522+
523+
health_resp = await client.get("/health")
524+
assert health_resp.status_code == 200
525+
526+
sites_resp = await client.get("/api/sites")
527+
assert sites_resp.status_code == 200
528+
529+
alpha_resp = await client.get("/alpha/read")
530+
assert alpha_resp.status_code == 200
531+
532+
beta_resp = await client.get("/beta/read")
533+
assert beta_resp.status_code == 401
534+
535+
manage_resp = await client.get("/api/recipes/manage")
536+
assert manage_resp.status_code == 401
537+
538+
authorized_beta = await client.get(
539+
"/beta/read",
540+
headers={"Authorization": "Bearer secret-token"},
541+
)
542+
assert authorized_beta.status_code == 200
543+
473544

474545
@pytest.mark.asyncio
475546
async def test_response_cache_serves_fresh_and_stale_results(

tests/unit/test_auth.py

Lines changed: 71 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
import pytest
88
from starlette.datastructures import Headers
99

10-
from web2api.auth import AuthConfig, load_auth_config, request_is_authorized
10+
from web2api.auth import AuthConfig, load_auth_config, public_auth_payload, request_is_authorized
1111

1212

1313
def test_auth_config_loads_token_from_file(
@@ -23,6 +23,29 @@ def test_auth_config_loads_token_from_file(
2323

2424
assert config.enabled is True
2525
assert config.access_token == "secret-token"
26+
assert config.public_path_patterns == ("/", "/health")
27+
28+
29+
def test_auth_config_loads_extra_public_paths_from_env(
30+
monkeypatch: pytest.MonkeyPatch,
31+
) -> None:
32+
monkeypatch.delenv("WEB2API_ACCESS_TOKEN_FILE", raising=False)
33+
monkeypatch.setenv("WEB2API_ACCESS_TOKEN", "secret-token")
34+
monkeypatch.setenv(
35+
"WEB2API_PUBLIC_PATHS",
36+
"/api/sites, /allenai/*\n/docs,/openapi.json",
37+
)
38+
39+
config = load_auth_config()
40+
41+
assert config.public_path_patterns == (
42+
"/",
43+
"/health",
44+
"/api/sites",
45+
"/allenai/*",
46+
"/docs",
47+
"/openapi.json",
48+
)
2649

2750

2851
def test_request_is_authorized_supports_bearer_and_alt_header() -> None:
@@ -40,3 +63,50 @@ def test_request_is_authorized_supports_bearer_and_alt_header() -> None:
4063
Headers({"authorization": "Bearer wrong"}),
4164
config,
4265
) is False
66+
67+
68+
def test_auth_config_requires_auth_for_all_non_public_routes() -> None:
69+
config = AuthConfig(access_token="secret-token")
70+
71+
assert config.requires_auth("/") is False
72+
assert config.requires_auth("/health") is False
73+
assert config.requires_auth("/health/") is False
74+
assert config.requires_auth("/api/sites") is True
75+
assert config.requires_auth("/alpha/read") is True
76+
assert config.requires_auth("/mcp/tools") is True
77+
78+
79+
def test_auth_config_matches_additional_public_path_patterns() -> None:
80+
config = AuthConfig(
81+
access_token="secret-token",
82+
public_path_patterns=("/", "/health", "/api/sites", "/allenai/*", "/*/chat"),
83+
)
84+
85+
assert config.requires_auth("/api/sites") is False
86+
assert config.requires_auth("/allenai/chat") is False
87+
assert config.requires_auth("/foo/chat") is False
88+
assert config.requires_auth("/foo/read") is True
89+
90+
91+
def test_auth_config_rejects_invalid_public_path_patterns(
92+
monkeypatch: pytest.MonkeyPatch,
93+
) -> None:
94+
monkeypatch.setenv("WEB2API_ACCESS_TOKEN", "secret-token")
95+
monkeypatch.setenv("WEB2API_PUBLIC_PATHS", "api/sites")
96+
97+
with pytest.raises(ValueError, match="WEB2API_PUBLIC_PATHS entries must start with '/'"):
98+
load_auth_config()
99+
100+
101+
def test_public_auth_payload_describes_public_and_protected_surfaces() -> None:
102+
payload = public_auth_payload(
103+
AuthConfig(
104+
access_token="secret-token",
105+
public_path_patterns=("/", "/health", "/api/sites"),
106+
)
107+
)
108+
109+
assert payload["enabled"] is True
110+
assert payload["protected_surfaces"] == ["all routes except configured public path patterns"]
111+
assert payload["public_surfaces"] == ["/", "/health", "/api/sites"]
112+
assert payload["public_paths_env"] == "WEB2API_PUBLIC_PATHS"
