Parth576 · Mar 5, 2026 · Mar 1, 2026
diff --git a/.agents/scratchpad/2026-02-15-smolterms/project-landing-page/context.md b/.agents/scratchpad/2026-02-15-smolterms/project-landing-page/context.md
@@ -0,0 +1,35 @@
+# Context: SmolTerms Project Landing Page
+
+## Requirements
+- Static landing page at `website/` directory
+- Vanilla HTML + Tailwind CSS (CDN) + minimal JS
+- No build system required
+- Light pastel/offwhite color palette with darker contrast buttons
+- Responsive (mobile 375px, desktop 1280px)
+
+## Sections Required
+1. Hero/Header - project name, tagline
+2. What it does - privacy policy analyzer, 5 dimensions
+3. Motivation - why privacy policies need simplification
+4. How it works - extension flow explanation
+5. Install - Firefox/Chrome buttons (placeholder URLs)
+6. Scoring explanation - 5 dimensions + 4 risk levels with colors
+7. Try It (placeholder) - `<section id="try-it">` for URL analysis (task-03)
+8. Footer - GitHub repo, license
+
+## Design Decisions
+- **Tailwind CSS via CDN** - user preference, no build step needed
+- **Color palette**: Offwhite backgrounds (#f8f7f4 / #faf9f6), darker offwhite buttons (#e8e5df / #d4d0c8)
+- **Risk level colors from design doc**: Green (8-10), Yellow (5-7.9), Orange (3-4.9), Red (1-2.9)
+- **Font**: Inter via Google Fonts for clean modern look
+
+## Scoring System (from design doc)
+| Dimension | What it measures |
+|-----------|-----------------|
+| Data Collection | How much personal data is collected |
+| Data Sharing | Whether data is shared/sold to third parties |
+| User Rights | Ability to access, delete, export data |
+| Retention | How long data is kept |
+| Security | Security measures, encryption, breach notification |
+
+Risk levels: Low (8-10, green), Moderate (5-7.9, yellow), High (3-4.9, orange), Critical (1-2.9, red)
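The risk bands listed at the end of this file can be expressed as a small lookup. The following is a minimal Go sketch, not code from the repository: the function name `riskLevel` is hypothetical, and it assumes the bands are half-open, so a score such as 7.95 that falls between the listed ranges is treated as moderate.

```go
package main

import "fmt"

// riskLevel maps an overall score (1-10) to a risk label.
// Assumption: each band starts at its lower bound and runs up to the next
// band, so values between the documented ranges (e.g. 7.95) round down.
func riskLevel(score float64) string {
	switch {
	case score >= 8:
		return "low" // green
	case score >= 5:
		return "moderate" // yellow
	case score >= 3:
		return "high" // orange
	default:
		return "critical" // red
	}
}

func main() {
	for _, s := range []float64{9.1, 7.2, 4.9, 1.5} {
		fmt.Printf("%.1f -> %s\n", s, riskLevel(s))
	}
}
```

Checking from the highest band downward keeps each case to a single comparison.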
diff --git a/.agents/scratchpad/2026-02-15-smolterms/project-landing-page/plan.md b/.agents/scratchpad/2026-02-15-smolterms/project-landing-page/plan.md
@@ -0,0 +1,33 @@
+# Plan: SmolTerms Project Landing Page
+
+## Test Strategy
+Since this is a static HTML/CSS/JS page with no build system, traditional unit tests don't apply. Validation:
+- Manual browser check: all sections render correctly
+- HTML structure verification: `<section id="try-it">` placeholder exists
+- Responsive: verify layout at 375px and 1280px viewports
+- All required sections present in HTML
+
+## Implementation Plan
+
+### Files to Create
+1. `website/index.html` - Main page with all sections, Tailwind CDN
+2. `website/script.js` - Smooth scrolling, mobile nav toggle
+3. `website/README.md` - Local dev and GitHub Pages deployment instructions
+
+### Design Approach
+- Tailwind CSS via CDN (no build step)
+- Google Fonts: Inter
+- Color palette: offwhite backgrounds, darker offwhite/warm gray buttons
+- Risk level colors used as accents in scoring section
+- Clean, minimal layout with generous whitespace
+
+### Section Breakdown
+1. **Nav** - Fixed top, logo + nav links
+2. **Hero** - Large heading, tagline, CTA buttons
+3. **What it does** - Brief feature overview
+4. **Motivation** - Why SmolTerms exists
+5. **How it works** - 3-step visual flow
+6. **Install** - Browser extension download buttons
+7. **Scoring** - 5 dimensions grid + risk level legend
+8. **Try It** - Placeholder section for URL analysis
+9. **Footer** - Links, license
diff --git a/.agents/scratchpad/2026-02-15-smolterms/project-landing-page/progress.md b/.agents/scratchpad/2026-02-15-smolterms/project-landing-page/progress.md
@@ -0,0 +1,21 @@
+# Progress: SmolTerms Project Landing Page
+
+## Setup
+- [x] Create documentation directory
+- [x] Read task file and design document
+- [x] Create context.md
+
+## Implementation
+- [x] Create `website/` directory structure
+- [x] Build `index.html` with all sections (hero, features, motivation, how-it-works, install, scoring, try-it, footer)
+- [x] Style with Tailwind CSS (CDN) - pastel/offwhite palette with custom cream color scale
+- [x] Add `script.js` for smooth scrolling and mobile nav toggle
+- [x] Create `website/README.md` with local dev and GitHub Pages deploy instructions
+- [x] Validate: all acceptance criteria met (sections, scoring, responsive classes, try-it placeholder)
+- [ ] Commit
+
+## Decisions
+- Used Tailwind CDN with custom config (cream color palette, risk level colors)
+- Inter font via Google Fonts for clean modern look
+- SVG icons inline (no external icon library dependency)
+- Firefox/Chrome install buttons use placeholder `#` URLs
diff --git a/.agents/scratchpad/2026-02-15-smolterms/website-url-analysis/context.md b/.agents/scratchpad/2026-02-15-smolterms/website-url-analysis/context.md
@@ -0,0 +1,75 @@
+# Context: Website URL Analysis Feature
+
+## Project Structure
+- `website/index.html` - Landing page using Tailwind CSS CDN, Inter font, vanilla JS
+- `website/script.js` - Mobile menu toggle + smooth scrolling (28 lines)
+- No `styles.css` - all styling via Tailwind utility classes inline
+- No build system - plain static files served directly
+
+## Requirements Summary
+
+### Functional
+1. Replace the `<section id="try-it">` placeholder (lines 279-290) with:
+   - URL input field with placeholder text
+   - "Analyze" submit button
+   - Results container (hidden by default)
+2. Form submission flow:
+   - Client-side URL validation (basic format check)
+   - Loading state (spinner, disabled button, progress text)
+   - POST to `{API_BASE_URL}/api/v1/analyze-url` with `{ "url": "..." }`
+   - Render results or errors
+3. Results display: overall score + risk badge, 5 dimension scores, key concerns list, summary, cached indicator
+4. Error handling: network errors, fetch failures (suggest extension), rate limiting (429), invalid URL, non-policy content
+5. Configurable API base URL (default `http://localhost:8080`)
+6. Smooth scroll to results after analysis
+
+### Backend API Contract
+- **Request:** `POST /api/v1/analyze-url` with `{ "url": "..." }`
+- **Success Response (200):** `AnalysisResult` struct:
+  ```json
+  {
+    "url": "...",
+    "overall_score": 7.2,
+    "risk_level": "moderate",  // "low"|"moderate"|"high"|"critical"|"not_policy"
+    "dimensions": {
+      "data_collection": { "score": 7.0, "summary": "..." },
+      "data_sharing": { "score": 6.5, "summary": "..." },
+      "user_rights": { "score": 8.0, "summary": "..." },
+      "retention": { "score": 7.5, "summary": "..." },
+      "security": { "score": 7.0, "summary": "..." }
+    },
+    "key_concerns": ["concern1", "concern2"],
+    "summary": "...",
+    "cached": false,
+    "analyzed_at": "2026-02-15T..."
+  }
+  ```
+- **Error Response:** `{ "error": "..." }` with appropriate HTTP status codes
+  - 400: Invalid request / empty URL
+  - 429: Rate limited (5 req/min/IP)
+  - 502: Fetch failure / analysis failure
+  - 504: Analysis timeout
+
+### Risk Level Colors (from Tailwind config)
+- Low (8-10): `risk-low` (#4ade80 green)
+- Moderate (5-7.9): `risk-moderate` (#facc15 yellow)
+- High (3-4.9): `risk-high` (#fb923c orange)
+- Critical (1-2.9): `risk-critical` (#f87171 red)
+
+## Existing Patterns
+- Website uses Tailwind CSS CDN - no custom CSS file, all utility classes
+- Existing sections use `max-w-3xl` or `max-w-5xl` with `mx-auto`, `px-6`, `py-20`
+- Cards use `bg-cream-100 rounded-xl p-5/p-6 border border-cream-300` pattern
+- Colors from custom cream palette (cream-50 through cream-900)
+- Risk colors already defined in Tailwind config
+- JS is vanilla, no framework, no modules
+- Try-it section currently has `max-w-3xl mx-auto text-center` layout
+
+## Implementation Path
+1. Edit `website/index.html` - Replace try-it placeholder with form HTML + results container
+2. Edit `website/script.js` - Add all JS logic (validation, API calls, rendering, error handling)
+3. Add inline `<style>` tag for spinner animation (only thing Tailwind can't do with utilities)
+
+## Dependencies
+- Backend `POST /api/v1/analyze-url` endpoint (already implemented)
+- Backend CORS must allow website origin
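The success response in the API contract above could be decoded on the Go side roughly as follows. This is a sketch under assumptions: the real `AnalysisResult` struct is not shown in this diff, so the field names, the `DimensionScore` type, and the `decodeResult` helper are hypothetical mirrors of the documented JSON. `analyzed_at` is kept as a plain string because the example timestamp is elided in the contract.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DimensionScore mirrors one entry of the "dimensions" object (hypothetical type).
type DimensionScore struct {
	Score   float64 `json:"score"`
	Summary string  `json:"summary"`
}

// AnalysisResult mirrors the documented 200 response body (hypothetical layout).
type AnalysisResult struct {
	URL          string                    `json:"url"`
	OverallScore float64                   `json:"overall_score"`
	RiskLevel    string                    `json:"risk_level"`
	Dimensions   map[string]DimensionScore `json:"dimensions"`
	KeyConcerns  []string                  `json:"key_concerns"`
	Summary      string                    `json:"summary"`
	Cached       bool                      `json:"cached"`
	AnalyzedAt   string                    `json:"analyzed_at"`
}

// decodeResult parses a response body into an AnalysisResult.
func decodeResult(data []byte) (AnalysisResult, error) {
	var r AnalysisResult
	err := json.Unmarshal(data, &r)
	return r, err
}

// sample repeats the contract's example, minus the inline comment,
// which is not valid JSON.
const sample = `{
  "url": "...",
  "overall_score": 7.2,
  "risk_level": "moderate",
  "dimensions": {
    "data_collection": { "score": 7.0, "summary": "..." },
    "data_sharing": { "score": 6.5, "summary": "..." },
    "user_rights": { "score": 8.0, "summary": "..." },
    "retention": { "score": 7.5, "summary": "..." },
    "security": { "score": 7.0, "summary": "..." }
  },
  "key_concerns": ["concern1", "concern2"],
  "summary": "...",
  "cached": false,
  "analyzed_at": "2026-02-15T..."
}`

func main() {
	r, err := decodeResult([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(r.RiskLevel, r.OverallScore, len(r.Dimensions))
}
```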
diff --git a/.agents/scratchpad/2026-02-15-smolterms/website-url-analysis/plan.md b/.agents/scratchpad/2026-02-15-smolterms/website-url-analysis/plan.md
@@ -0,0 +1,73 @@
+# Plan: Website URL Analysis Feature
+
+## Test Strategy
+
+This is a vanilla JS frontend with no build system or test framework. Testing is **manual verification** against the acceptance criteria. Each criterion maps to a specific UI interaction that can be verified by running the local dev server and interacting with the form.
+
+### Manual Test Scenarios
+
+1. **URL Form Displays** - Load page, scroll to try-it section, verify input + button visible
+2. **Empty URL validation** - Click Analyze with empty input -> validation message, no API call
+3. **Invalid URL validation** - Type "not a url", click Analyze -> validation message, no API call
+4. **Loading state** - Submit valid URL -> spinner visible, button disabled, progress text shows
+5. **Successful analysis** - Submit real policy URL with running backend -> scores, risk badge, dimensions, concerns, summary all render
+6. **Risk level colors** - Verify green/yellow/orange/red match score ranges
+7. **Fetch failure** - Submit URL backend can't fetch -> error + extension suggestion
+8. **Rate limit (429)** - Trigger rate limit -> "too many requests" message
+9. **Non-policy content** - Submit non-policy URL -> friendly "not a privacy policy" message
+10. **Smooth scroll** - After results render, page scrolls to results section
+11. **Responsive** - Check results on mobile viewport (no horizontal scroll)
+12. **Cached indicator** - If response has `cached: true`, show indicator
+
+## Implementation Plan
+
+### Step 1: HTML - Replace try-it placeholder
+- Replace the placeholder div (lines 279-290 of index.html) with:
+  - Form with URL input + Analyze button
+  - Loading state container (hidden)
+  - Results container (hidden)
+  - Error container (hidden)
+- Add a `<style>` block in `<head>` for spinner keyframe animation
+
+### Step 2: CSS via Tailwind classes
+- All styling via Tailwind utility classes (matching existing pattern)
+- Only custom CSS needed: `@keyframes spin` for the loading spinner
+- Score cards: grid layout, cream card styling like existing dimension cards
+- Risk badge: colored pill using `bg-risk-*` classes from Tailwind config
+- Dimension scores: flex layout with score number + summary text
+- Error states: red-tinted cards with appropriate messaging
+
+### Step 3: JavaScript implementation
+Add to `script.js`:
+- `API_BASE` constant (configurable, defaults to `http://localhost:8080`)
+- URL validation function (basic URL format check)
+- Form submit handler
+- `analyzeURL(url)` - fetch call to backend
+- `renderResults(data)` - builds results HTML from AnalysisResult
+- `renderError(message, suggestion)` - shows error with optional suggestion
+- `getRiskColor(riskLevel)` - maps risk level string to Tailwind color class
+- `getDimensionLabel(key)` - maps snake_case keys to display labels
+- `showLoading()` / `hideLoading()` - toggle loading state
+- `resetState()` - clears previous results/errors
+
+### Step 4: Error handling matrix
+| HTTP Status | Backend Error Message | Frontend Display |
+|-------------|----------------------|------------------|
+| Network error | N/A | "Could not reach the analysis server" |
+| 400 | "Invalid JSON..." / "url required" | Show backend message |
+| 429 | N/A (rate limiter) | "Too many requests, please wait and try again" |
+| 502 | "Failed to fetch..." | Backend message + "Try using the browser extension instead" |
+| 504 | "Analysis timed out..." | "Analysis timed out. Please try again." |
+| 200 + risk_level="not_policy" | N/A | "This page doesn't appear to be a privacy policy" |
+
+## Implementation Checklist
+- [ ] Add spinner keyframe CSS in `<head>`
+- [ ] Replace try-it placeholder HTML with form + results + error containers
+- [ ] Add `API_BASE` constant to script.js
+- [ ] Implement URL validation
+- [ ] Implement form submit handler with loading state
+- [ ] Implement API fetch call
+- [ ] Implement results rendering (overall score, risk badge, dimensions, concerns, summary, cached)
+- [ ] Implement error rendering (all error types)
+- [ ] Implement smooth scroll to results
+- [ ] Manual verification of all acceptance criteria
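The error handling matrix in Step 4 amounts to a mapping from response status to display text. The sketch below writes that mapping in Go for illustration only; the plan implements it in `script.js`. The function name `displayMessage` is hypothetical. Two conventions are assumptions that the plan does not specify: status `0` stands in for a network error, and statuses outside the matrix fall back to the backend message.

```go
package main

import (
	"fmt"
	"net/http"
)

// displayMessage returns the user-facing text for a response.
// backendMsg is the "error" field of the backend body; riskLevel is only
// consulted for 200 responses. An empty return value means "render results".
func displayMessage(status int, backendMsg, riskLevel string) string {
	switch status {
	case 0: // assumption: 0 marks a network error (no HTTP response at all)
		return "Could not reach the analysis server"
	case http.StatusOK:
		if riskLevel == "not_policy" {
			return "This page doesn't appear to be a privacy policy"
		}
		return ""
	case http.StatusBadRequest:
		return backendMsg
	case http.StatusTooManyRequests:
		return "Too many requests, please wait and try again"
	case http.StatusBadGateway:
		return backendMsg + " Try using the browser extension instead"
	case http.StatusGatewayTimeout:
		return "Analysis timed out. Please try again."
	default: // assumption: statuses outside the matrix show the backend text
		return backendMsg
	}
}

func main() {
	fmt.Println(displayMessage(429, "", ""))
	fmt.Println(displayMessage(200, "", "not_policy"))
}
```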
diff --git a/.agents/scratchpad/2026-02-15-smolterms/website-url-analysis/progress.md b/.agents/scratchpad/2026-02-15-smolterms/website-url-analysis/progress.md
@@ -0,0 +1,38 @@
+# Progress: Website URL Analysis Feature
+
+## Setup
+- [x] Created documentation directory structure
+- [x] Discovered project files (no CODEASSIST.md found)
+
+## Explore Phase
+- [x] Analyzed requirements and context
+- [x] Researched existing patterns (Tailwind-only styling, cream palette, card patterns)
+- [x] Created context document
+
+## Plan Phase
+- [x] Designed test strategy (manual verification - no test framework for static site)
+- [x] Created implementation plan
+
+## Code Phase
+- [x] Implement HTML form in try-it section (replaced placeholder with form + results + error containers)
+- [x] Add spinner CSS keyframe animation in `<head>`
+- [x] All styling via Tailwind utility classes (matching existing patterns)
+- [x] Implement JS: URL validation (`new URL()` + protocol check), API fetch, result rendering
+- [x] Implement error handling (429, 502, 504, network errors, not_policy)
+- [x] XSS protection via `escapeHTML()` for API response content
+- [x] Refactor and review - conventions match existing codebase
+
+## Validate
+- [x] JS syntax validation: passed
+- [x] HTML structure validation: passed
+- [x] All 10 acceptance criteria addressed in implementation
+
+## Decisions
+- Used `<style>` tag for spinner animation (only custom CSS needed; Tailwind handles everything else)
+- Used `new URL()` constructor for URL validation (browser-native, handles edge cases)
+- Used `escapeHTML()` via DOM text content assignment for XSS prevention
+- Dimension scores rendered in fixed order matching the scoring section of the landing page
+- "Analyze another" button scrolls back to form and resets state
+
+## Commit
+- [ ] Conventional commit created
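The `new URL()` plus protocol check recorded under Decisions has a close analogue in Go's `net/url` package. The sketch below shows the same idea; it is not the frontend code, and the name `isHTTPURL` is hypothetical. Requiring a non-empty host is an added assumption, since `url.Parse` accepts relative references that `new URL()` would reject.

```go
package main

import (
	"fmt"
	"net/url"
)

// isHTTPURL reports whether raw parses as an absolute http or https URL.
func isHTTPURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	// url.Parse accepts relative references, so require scheme and host explicitly.
	return (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}

func main() {
	for _, raw := range []string{"https://example.com/privacy", "not a url", "ftp://example.com"} {
		fmt.Printf("%q valid: %v\n", raw, isHTTPURL(raw))
	}
}
```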
diff --git a/backend/internal/api/fetcher.go b/backend/internal/api/fetcher.go
@@ -0,0 +1,89 @@
+package api
+
+import (
+	"context"
+	"fmt"
+	"io"
+	"net/http"
+	"strings"
+)
+
+const (
+	defaultMaxBodySize  = 5 * 1024 * 1024 // 5MB
+	defaultUserAgent    = "SmolTerms/1.0 (+https://github.com/Parth576/smolterms)"
+	defaultMaxRedirects = 5
+)
+
+// Fetcher handles server-side HTTP fetching with SSRF protection and safety limits.
+type Fetcher struct {
+	MaxBodySize  int64
+	UserAgent    string
+	MaxRedirects int
+	Client       *http.Client
+	// ValidateFunc validates a URL before fetching. Defaults to ValidateURL.
+	ValidateFunc func(string) error
+}
+
+// NewFetcher creates a Fetcher with default settings.
+func NewFetcher() *Fetcher {
+	f := &Fetcher{
+		MaxBodySize:  defaultMaxBodySize,
+		UserAgent:    defaultUserAgent,
+		MaxRedirects: defaultMaxRedirects,
+		ValidateFunc: ValidateURL,
+	}
+	f.Client = &http.Client{
+		CheckRedirect: func(req *http.Request, via []*http.Request) error {
+			if len(via) >= f.MaxRedirects {
+				return fmt.Errorf("too many redirects (max %d)", f.MaxRedirects)
+			}
+			if err := f.ValidateFunc(req.URL.String()); err != nil {
+				return fmt.Errorf("redirect blocked: %w", err)
+			}
+			return nil
+		},
+	}
+	return f
+}
+
+// Fetch downloads the page at rawURL and returns the HTML body.
+// It validates the URL for SSRF, checks content type, and limits body size.
+func (f *Fetcher) Fetch(ctx context.Context, rawURL string) (string, error) {
+	if err := f.ValidateFunc(rawURL); err != nil {
+		return "", fmt.Errorf("url validation failed: %w", err)
+	}
+
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, rawURL, nil)
+	if err != nil {
+		return "", fmt.Errorf("failed to create request: %w", err)
+	}
+	req.Header.Set("User-Agent", f.UserAgent)
+	req.Header.Set("Accept", "text/html")
+
+	resp, err := f.Client.Do(req)
+	if err != nil {
+		return "", fmt.Errorf("failed to fetch url: %w. Try using the browser extension instead", err)
+	}
+	defer resp.Body.Close()
+
+	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+		return "", fmt.Errorf("server returned status %d. The page may be restricted — try using the browser extension instead", resp.StatusCode)
+	}
+
+	ct := resp.Header.Get("Content-Type")
+	if !strings.Contains(strings.ToLower(ct), "text/html") {
+		return "", fmt.Errorf("unexpected content type %q: expected text/html", ct)
+	}
+
+	// Read up to MaxBodySize + 1 byte to detect oversized responses.
+	limited := io.LimitReader(resp.Body, f.MaxBodySize+1)
+	body, err := io.ReadAll(limited)
+	if err != nil {
+		return "", fmt.Errorf("failed to read response body: %w", err)
+	}
+	if int64(len(body)) > f.MaxBodySize {
+		return "", fmt.Errorf("response body too large (exceeds %d bytes)", f.MaxBodySize)
+	}
+
+	return string(body), nil
+}
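`Fetch` detects oversized responses by reading one byte more than `MaxBodySize`. The standalone sketch below isolates that technique so it can be run without a network; `readCapped` and `describe` are hypothetical helper names, not part of this diff.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readCapped reads at most limit bytes from r and returns an error when r
// holds more. Reading limit+1 bytes is what makes "exactly limit" and
// "more than limit" distinguishable.
func readCapped(r io.Reader, limit int64) ([]byte, error) {
	body, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, fmt.Errorf("failed to read body: %w", err)
	}
	if int64(len(body)) > limit {
		return nil, fmt.Errorf("body too large (exceeds %d bytes)", limit)
	}
	return body, nil
}

// describe reports how readCapped treats payload under the given limit.
func describe(payload string, limit int64) string {
	body, err := readCapped(strings.NewReader(payload), limit)
	if err != nil {
		return "rejected: " + err.Error()
	}
	return fmt.Sprintf("accepted %d bytes", len(body))
}

func main() {
	fmt.Println(describe("hello", 5))
	fmt.Println(describe("hello", 4))
}
```

Without the extra byte, a body of exactly `limit` bytes and a truncated larger body would look identical to the caller.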