---
**Memory Bank Protocol**
**Reading Priority:** CRITICAL
**Read When:** When choosing libraries, verifying dependencies, resolving conflicts, cost planning
**Dependencies:** Read project-brief.md for "Glue Coding" philosophy
**Purpose:** Actual technology stack, dependency versions, MVP cost structure
**Last Updated:** 2026-03-08
---
Constraint: Minimize cost for PMF validation. Only two paid services are required — everything else runs on free tiers.
| Service | Cost | Tier | Purpose |
|---|---|---|---|
| Supabase Pro | $25/mo | Pro (required) | PostgreSQL + Auth, 8GB, 200 connections |
| Gemini API | ~$1-5/mo | Pay-as-you-go | Primary LLM — $0.10/M input, $0.40/M output |
| Railway | $0 | Free (500h/mo) | Backend hosting + native Redis |
| Vercel | $0 | Hobby | Frontend hosting, unlimited bandwidth |
| Sentry | $0 | Free (5K events/mo) | Error tracking + performance |
| Resend | $0 | Free (3K emails/mo) | Transactional email |
| PRAW | $0 | Free API | Reddit scraping |
| pytrends | $0 | Unofficial API | Google Trends (no key required) |
| Firecrawl | $0 | Free (500 pages/mo) | Web → Markdown; free tier covers MVP |
| Crawl4AI | $0 | Self-hosted | Free Firecrawl alternative (runs in container) |
| Twitter/X | $0 | Free basic tier | 500K tweets/mo read access |
| Stripe | $0/mo + 2.9% | No monthly fee | Payments (fees only on revenue) |
| Total | ~$30/mo | — | Full production platform |
Scalability note: At 10K paying users ($59K MRR target), costs scale to ~$50-70/mo (still 99%+ margin). Firecrawl Pro ($149/mo) is the only meaningful cost addition at that scale.
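
As a rough check on the Gemini line item and the margin claim, the sketch below turns the listed prices ($0.10/M input, $0.40/M output) into a monthly estimate. The helper names `estimate_gemini_cost` and `gross_margin` are hypothetical, and the token volumes in the usage note are illustrative assumptions, not measured usage.

```python
# Prices per million tokens, taken from the cost table above.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def estimate_gemini_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a given token volume."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return round(input_cost + output_cost, 4)


def gross_margin(revenue: float, cost: float) -> float:
    """Return gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue
```

For example, an assumed 20M input tokens and 5M output tokens in a month would cost about $4, which sits inside the ~$1-5/mo range quoted in the table. `gross_margin(59_000, 70)` evaluates to roughly 0.999, consistent with the 99%+ figure in the scalability note.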
- Next.js 16.1.3 — App Router, React Server Components, Turbopack
- React 19.2.3 — Latest stable with concurrent features
- TypeScript — Strict typing throughout
- Node.js 22.x LTS
- Tailwind CSS 4.0 — New CSS architecture with `@layer` syntax
- shadcn/ui — 25 components (copy-paste, based on Radix UI)
- Design system: Deep teal primary + warm amber accent — "Data Intelligence" aesthetic
- Typography: Instrument Serif (headings) · Satoshi (body) · JetBrains Mono (data/scores)
  - Satoshi self-hosted: 3 woff2 files in `frontend/public/fonts/`, loaded via `next/font/local` with `display:'optional'`; `globals.css` uses `var(--font-satoshi)` (NOT the string `"Satoshi"`)
- Animation: Framer Motion 12.x — stagger reveals, counter animations, skeleton morphing
- TanStack Query 5.x (`@tanstack/react-query`) — Server state, 60s stale time, 2 retries
- React Hook Form + @hookform/resolvers — Forms with Zod schema validation
- Zod 4.x — Runtime type validation of API responses and form data
- Recharts 3.6.0 — React 19 compatible; radar charts, scatter charts, line/area/bar
- All visualizations implemented via Recharts. Tremor is not installed (not needed).
- PostHog — Product analytics; loaded via deferred dynamic import inside `useEffect` (removes ~40KB from main bundle). `NEXT_PUBLIC_POSTHOG_KEY` + `NEXT_PUBLIC_POSTHOG_HOST` env vars required.
{
"next": "16.1.3",
"react": "19.2.3",
"react-dom": "19.2.3",
"@tanstack/react-query": "^5.90.19",
"recharts": "^3.6.0",
"framer-motion": "^12.29.2",
"tailwindcss": "^4.0",
"@sentry/nextjs": "^10.38.0",
"@supabase/ssr": "^0.8.0",
"@supabase/supabase-js": "^2.91.1",
"axios": "^1.13.2",
"date-fns": "^4.1.0",
"zod": "^4.3.6",
"react-hook-form": "^7.71.1",
"react-markdown": "^10.1.0",
"remark-gfm": "^4.0.1",
"lucide-react": "^0.562.0",
"next-themes": "^0.4.6",
"sonner": "^2.0.7",
"@vercel/analytics": "^1.6.1",
"@vercel/speed-insights": "^1.3.1"
}

- FastAPI ≥0.109 — Async-first, automatic OpenAPI/Swagger
- Python 3.11+
- Uvicorn[standard] — ASGI server
- Pydantic V2 — Strict validation throughout
- Pydantic Settings — Environment variable management
- PostgreSQL via Supabase Pro (ap-southeast-2 Sydney)
  - Connection: session-mode pooler, SSL required
  - Pool: 20 size, 30 max overflow, recycle 3600s
  - 70 tables, all with Row-Level Security enabled
  - 16 Alembic migrations (c016 at head)
- SQLAlchemy 2.0 async (`sqlalchemy[asyncio]>=2.0.25`)
- asyncpg ≥0.29 — Async PostgreSQL driver
- Alembic ≥1.13 — Migration version control
- Arq ≥0.25 — Async task queue (built on Redis), lighter than Celery
- APScheduler ≥3.11 — Cron/interval job scheduling within FastAPI lifespan
- Redis — Railway native Redis (`redis.railway.internal:6379`, free, no TLS needed)
  - `conn_retries=0, conn_timeout=3` — Fast startup even if Redis unavailable
- PydanticAI ≥1.x — Agent orchestration, native async, structured JSON output
  - 8 active agents: `analyzer`, `enhanced_analyzer`, `research`, `competitive_intel`, `market_intel`, `content_generator`, `chat_agent`, `market_insight_publisher`
  - All agents use `output_type=` and `result.output` (v1.x API — `result_type` / `result.data` are deprecated)
  - Model string: `"google-gla:gemini-2.0-flash"` (hardcoded in all agents)
- Primary LLM: Gemini 2.0 Flash via `GOOGLE_API_KEY`
  - $0.10/M input · $0.40/M output — 97% cheaper than Claude
  - 1M token context window
- Fallback LLMs (installed but not used in production to minimize cost):
  - `anthropic>=0.25.0` — Claude 3.5 Sonnet (activate only if Gemini quota exceeded)
  - `openai>=1.12.0` — GPT-4o (activate only for vision tasks if needed)
- PRAW ≥7.7 — Reddit API (structured posts, scores, metadata). Free.
- pytrends ≥4.9 — Google Trends (unofficial API, no key needed). Free.
- Tweepy ≥4.14 — Twitter/X API (free basic tier, 500K reads/mo)
- Firecrawl (`firecrawl-py>=0.0.16`) — Web → Markdown. Free tier (500 pages/mo) for MVP.
- Crawl4AI ≥0.2 (`USE_CRAWL4AI=true` on Railway) — Self-hosted Firecrawl alternative, $0 cost
- BeautifulSoup4 — HTML parsing (Product Hunt scraper + Hacker News scraper)
- python-jose[cryptography] ≥3.3 — JWT ES256 validation via Supabase JWKS
  - JWKS URL: `{supabase_url}/auth/v1/.well-known/jwks.json`
  - Issuer: `{supabase_url}/auth/v1`
- bleach ≥6.1 — XSS prevention on all user inputs
- itsdangerous ≥2.1 — Token signing for email unsubscribe
- Stripe ≥7.0 — Subscriptions + webhooks. No monthly fee, 2.9% + 30¢ per transaction.
- Resend ≥0.7 — Transactional email. Free tier (3K/mo) covers MVP.
- SlowAPI ≥0.1.9 — Rate limiting (tier-based, in-memory for non-production)
- sse-starlette ≥1.6.5 — Server-Sent Events for real-time feed + admin updates
- sentry-sdk[fastapi] ≥2.0.0
  - `enable_logs=True` + `LoggingIntegration` (WARNING+ → Sentry Logs)
  - `FastApiIntegration` + `SqlalchemyIntegration`
  - User context via `sentry_sdk.set_user()` in `deps.py`
  - Manual `gen_ai.request` spans in enhanced_analyzer, research_agent, market_intel_agent
  - Sample rate: 10% production, 100% staging
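
The JWKS URL and issuer listed under python-jose above are both derived from the Supabase project URL. A minimal sketch of that derivation follows; `SupabaseJwtConfig` and `build_jwt_config` are hypothetical names, and the snippet only builds the values a verifier would need, it does not fetch keys or validate tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupabaseJwtConfig:
    """Values a JWT verifier needs; the names here are illustrative."""
    jwks_url: str
    issuer: str
    algorithm: str = "ES256"


def build_jwt_config(supabase_url: str) -> SupabaseJwtConfig:
    # Strip a trailing slash so the joined paths never contain "//".
    base = supabase_url.rstrip("/")
    return SupabaseJwtConfig(
        jwks_url=f"{base}/auth/v1/.well-known/jwks.json",
        issuer=f"{base}/auth/v1",
    )
```

Deriving both values in one place keeps the issuer check and the key lookup pointed at the same project, which matters when staging and production use different Supabase URLs.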
dependencies = [
"fastapi>=0.109.0",
"uvicorn[standard]>=0.27.0",
"pydantic>=2.5.0",
"pydantic-settings>=2.1.0",
"sqlalchemy[asyncio]>=2.0.25",
"alembic>=1.13.0",
"asyncpg>=0.29.0",
"redis>=5.0.1",
"pydantic-ai>=0.0.13", # v1.x in practice
"anthropic>=0.25.0", # fallback only
"openai>=1.12.0", # fallback only
"firecrawl-py>=0.0.16",
"crawl4ai>=0.2.0", # free alternative (USE_CRAWL4AI=true)
"playwright>=1.40.0", # required by crawl4ai
"praw>=7.7.1",
"pytrends>=4.9.2",
"tweepy>=4.14.0",
"arq>=0.25.0",
"apscheduler>=3.11.2",
"python-dotenv>=1.0.0",
"httpx>=0.26.0",
"python-jose[cryptography]>=3.3.0",
"sse-starlette>=1.6.5",
"stripe>=7.0.0",
"resend>=0.7.0",
"slowapi>=0.1.9",
"sentry-sdk[fastapi]>=2.0.0",
"beautifulsoup4>=4.14.3",
"bleach>=6.1.0",
"itsdangerous>=2.1.0",
"rich>=14.2.0",
"cachetools>=5.3.0", # Phase 6.3A: L1 in-memory TTL cache
]

| Service | Purpose | Tier | Notes |
|---|---|---|---|
| Railway | Backend (FastAPI + Arq worker) | Free (500h/mo) | Docker, port 8080, railway.toml |
| Railway Redis | Task queue + rate limiting + L2 cache | Free (native) | redis.railway.internal:6379, plain TCP; 3-tier caching: L1 cachetools TTLCache (30s) → L2 Redis (60-300s) → stale fallback |
| Vercel | Frontend (Next.js) | Hobby (free) | App Router, unlimited bandwidth |
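
The 3-tier caching noted in the Railway Redis row (L1 TTL cache, then L2 Redis, then stale fallback) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: `ThreeTierCache` is a hypothetical class, a plain dict stands in for `cachetools.TTLCache`, the L2 store is injected as two callables instead of a real Redis client, and "stale fallback" is assumed to mean returning the last known value when L2 or the loader fails.

```python
import time
from typing import Any, Callable, Optional


class ThreeTierCache:
    """L1 in-memory TTL -> L2 external store -> last known (stale) value."""

    def __init__(
        self,
        l2_get: Callable[[str], Optional[Any]],
        l2_set: Callable[[str, Any, int], None],
        l1_ttl: float = 30.0,   # L1 lifetime in seconds, per the table
        l2_ttl: int = 300,      # upper end of the 60-300s L2 range
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._l2_get, self._l2_set = l2_get, l2_set
        self._l1_ttl, self._l2_ttl, self._clock = l1_ttl, l2_ttl, clock
        self._l1: dict[str, tuple[float, Any]] = {}  # key -> (expires_at, value)
        self._stale: dict[str, Any] = {}             # last successfully loaded value

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._l1.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                          # L1 hit
        try:
            value = self._l2_get(key)                # L2 lookup
            if value is None:
                value = loader()                     # miss everywhere: hit the source
                self._l2_set(key, value, self._l2_ttl)
        except Exception:
            if key in self._stale:
                return self._stale[key]              # degrade to stale data
            raise
        self._l1[key] = (now + self._l1_ttl, value)
        self._stale[key] = value
        return value
```

Injecting `l2_get` and `l2_set` keeps the sketch testable without a running Redis; in a real service they would wrap the Redis client's get and set-with-expiry calls.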
- GitHub Actions — `.github/workflows/ci-cd.yml`
  - `main` push → production (Security Scan → Tests → Migrate → Build → Deploy)
  - `develop` push → staging (same pipeline, staging env)
  - `RAILWAY_API_TOKEN` (account-level) + `--project` flag required
- Alembic — Python-managed, 16 migrations (c001–c016)
  - Never re-run migration c006 (not idempotent)
  - Migration c008 (`purge_seed_data`) is irreversible — run it on staging first
  - Current head: `c016`
- uv (Python) — Blazing-fast, replaces pip/poetry: `uv sync`, `uv run`
- npm (Node.js) — Standard package management
- Ruff — Python linter + formatter (replaces flake8 + black)
- ESLint + Prettier — TypeScript/React
- pytest ≥7.4 with `pytest-asyncio` + `pytest-cov` — 398 tests, 47% coverage
- Playwright — 47 E2E tests, 5 browsers (Chrome, Firefox, Safari, Mobile Chrome, Mobile Safari)
[Reddit / PH / Google Trends / HN / Twitter/X]
↓ PRAW, pytrends, Tweepy, Firecrawl/Crawl4AI
[Arq Worker + APScheduler] ──► [Railway Redis Queue]
↓ every 6h
[Supabase PostgreSQL: raw_signals table]
↓ unprocessed signals
[8 PydanticAI Agents] ──► [Gemini 2.0 Flash $0.10/M]
↓ structured insights
[Supabase PostgreSQL: insights table (70 tables total)]
↓ REST / SSE
[FastAPI: 235+ endpoints]
↓ JSON
[Next.js 16 App Router: 35+ routes]
Loop 1 — Collection (every 6h via APScheduler):
- 6 scrapers produce 150+ signals/day
- Raw signals stored with provenance (source, URL, timestamp)
Loop 2 — Analysis (after each collection):
- `analyze_signals_task` processes unprocessed signals in batches of 10
- Gemini 2.0 Flash produces 8-dimension scored insights
- Post-LLM validation: 300+ word minimum, SHA-256 deduplication
Loop 3 — Presentation (on-demand):
- FastAPI serves ranked insights, SSE streams live updates
- Next.js renders editorial UI with radar charts, evidence badges, trend sparklines
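
The batching and post-LLM validation described in Loop 2 (batches of 10, a 300-word minimum, SHA-256 deduplication) can be sketched as follows. The function names are hypothetical, and the whitespace and case normalization applied before hashing is an assumption; the source only states that SHA-256 is used.

```python
import hashlib
from typing import Iterator

MIN_WORDS = 300   # post-LLM minimum length, from Loop 2
BATCH_SIZE = 10   # signals per analysis batch, from Loop 2


def batched(items: list, size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def content_hash(text: str) -> str:
    """SHA-256 of the text after collapsing whitespace and lowercasing (assumed normalization)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def validate_insight(text: str, seen_hashes: set[str]) -> bool:
    """Accept an insight only if it is long enough and not a duplicate."""
    if len(text.split()) < MIN_WORDS:
        return False
    digest = content_hash(text)
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    return True
```

In a persistent setup the `seen_hashes` set would more likely be a unique column in the database than an in-memory set, so that deduplication survives worker restarts.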
| Agent | File | Task Function | Schedule |
|---|---|---|---|
| `enhanced_analyzer` | `enhanced_analyzer.py` | `analyze_signals_task` | Every 6h |
| `research` | `research_agent.py` | `fetch_daily_insight_task` | Daily 08:00 UTC |
| `competitive_intel` | `competitive_intel_agent.py` | — | Manual |
| `market_intel` | `market_intel_agent.py` | `market_insight_publisher_task` | Wed every 3 days |
| `content_generator` | `content_generator.py` | `run_content_pipeline_task` | Dynamic |
| `chat_agent` | `chat_agent.py` | — | On-demand (SSE) |
| `quality_reviewer` | `quality_reviewer.py` | `insight_quality_audit_task` | Dynamic |
| `market_insight_publisher` | `market_insight_publisher.py` | `market_insight_publisher_task` | Wed every 3 days |

All agents use `output_type=` (PydanticAI v1.x) and `result.output` (not `result.data`).
Model string: `"google-gla:gemini-2.0-flash"` (hardcoded, no `settings.llm_model`).
ENVIRONMENT=production
DATABASE_URL=postgresql+asyncpg://postgres.[REF]:[PASS]@aws-1-ap-southeast-2.pooler.supabase.com:5432/postgres
SUPABASE_URL=https://[REF].supabase.co
SUPABASE_ANON_KEY=eyJ...
SUPABASE_SERVICE_ROLE_KEY=eyJ...
GOOGLE_API_KEY=AIza... # Gemini 2.0 Flash
REDIS_URL=redis://default:[PASS]@redis.railway.internal:6379 # Railway native
STRIPE_SECRET_KEY=sk_live_...
STRIPE_WEBHOOK_SECRET=whsec_...
RESEND_API_KEY=re_...
SENTRY_DSN=https://...@ingest.us.sentry.io/...
CORS_ORIGINS=https://start-insight-ascentias-projects.vercel.app,...
PORT=8080
DB_SSL=true
USE_CRAWL4AI=true  # Free scraping alternative

NEXT_PUBLIC_API_URL=https://backend-production-e845.up.railway.app
NEXT_PUBLIC_SUPABASE_URL=https://[REF].supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJ...
NEXT_PUBLIC_ENVIRONMENT=production
NEXT_PUBLIC_SENTRY_DSN=https://...
SENTRY_DSN=https://...
SENTRY_ORG=ascentia-km
SENTRY_PROJECT=frontend
SENTRY_AUTH_TOKEN=sntrys_...

Key rule: `NEXT_PUBLIC_*` vars are baked at build time — changing them in Vercel requires a redeploy.
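import logging
import os

import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
from sentry_sdk.integrations.sqlalchemy import SqlalchemyIntegration

# `settings` is the application's Pydantic Settings instance, imported elsewhere.
sentry_sdk.init(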
dsn=settings.sentry_dsn,
environment=settings.environment, # "production" or "staging"
release=os.environ.get("RAILWAY_GIT_COMMIT_SHA", "local"),
traces_sample_rate=settings.sentry_traces_sample_rate, # 0.1 prod, 1.0 staging
enable_logs=True,
integrations=[
FastApiIntegration(transaction_style="url"),
SqlalchemyIntegration(),
LoggingIntegration(
level=logging.INFO,
event_level=logging.ERROR,
sentry_logs_level=logging.WARNING,
),
],
before_send=lambda event, hint:
None if event.get("request", {}).get("url", "").endswith("/health") else event,
)

Sentry.init({
dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
environment: process.env.NEXT_PUBLIC_ENVIRONMENT,
enableLogs: true,
integrations: [
Sentry.browserTracingIntegration(),
Sentry.replayIntegration({ maskAllText: true, blockAllMedia: true }),
Sentry.consoleLoggingIntegration({ levels: ["warn", "error"] }),
],
});

| Layer | Implementation | Library |
|---|---|---|
| HTTP Security Headers | HSTS + CSP + X-Frame-Options | Custom SecurityMiddleware |
| JWT Authentication | ES256 via Supabase JWKS endpoint | python-jose[cryptography] |
| Input Sanitization | XSS prevention on all user inputs | bleach |
| Rate Limiting | Tier-based sliding window (in-memory for non-prod) | slowapi |
| Password Recovery | `/auth/update-password` with Supabase recovery token | Supabase Auth |
| RBAC | 4 roles: superadmin, admin, member, viewer | deps.py |
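
The tier-based sliding window in the Rate Limiting row can be illustrated with a small in-memory limiter. This is a stand-alone sketch of the technique, not slowapi's API or internals: `SlidingWindowLimiter` is a hypothetical class, and the per-tier limits in `TIER_LIMITS` are placeholder values because the real limits are not given in this document.

```python
import time
from collections import defaultdict, deque
from typing import Callable

# Hypothetical requests-per-window limits; the real tier values are not documented here.
TIER_LIMITS = {"free": 3, "pro": 10}


class SlidingWindowLimiter:
    """Sliding-window log limiter: keeps one timestamp per accepted request."""

    def __init__(
        self,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str, tier: str) -> bool:
        now = self._clock()
        hits = self._hits[user_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= TIER_LIMITS[tier]:
            return False
        hits.append(now)
        return True
```

Because the state lives in process memory, a limiter like this only works for a single worker, which matches the "in-memory for non-prod" note; a multi-instance deployment would need shared storage such as Redis.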
| Decision | Choice | Reason |
|---|---|---|
| LLM primary | Gemini 2.0 Flash | 97% cheaper than Claude, 1M context |
| LLM framework | PydanticAI v1.x | Type-safe, native async, no LangChain complexity |
| Task queue | Arq + Redis | Lighter than Celery, async-native |
| Redis provider | Railway native | Free, same private network, no TLS needed |
| Database | Supabase Pro | $25/mo, built-in auth, RLS, APAC region |
| Scraping primary | PRAW + pytrends + Tweepy | Free APIs, no per-call costs |
| Scraping supplement | Crawl4AI (self-hosted) | Free Firecrawl alternative (USE_CRAWL4AI=true) |
| Auth | Supabase JWT (ES256) | No shared secret needed, JWKS endpoint |
| Frontend hosting | Vercel Hobby | Free, global CDN, Next.js-native |
| Backend hosting | Railway Free | Docker, 500h/mo, Redis built-in |
| Email | Resend Free | 3K/mo covers MVP, no monthly fee |
| Payments | Stripe | No monthly fee, industry standard |