jbeshir/alignment-research-feed

Alignment Research Feed API

A Go REST API that serves AI alignment research articles with personalized recommendations, semantic search, and user interaction tracking. It sits between the alignment-research-dataset ingestion pipeline and the alignment-research-feed-fe frontend, providing article discovery, vector-based recommendations, and an RSS feed.

Key Concepts

  • Article -- A research paper or blog post stored with metadata (title, authors, source, publication date, summary, key points, category) and user-specific state (read, thumbs up/down).
  • User Interest Cluster -- A k-means centroid computed from a user's positively-rated article vectors, representing a distinct area of interest. Multiple clusters capture diverse reading interests.
  • Temporal Weighting -- Exponential decay applied to rating vectors so recent preferences influence recommendations more than older ones. Configured via a half-life parameter.
  • Precomputed Recommendation -- A cached recommendation (article, score, source) generated either by the batch job or on demand, and stored in MySQL so that recommendations are not recomputed on every request.
  • API Token -- A user-created bearer token for programmatic access. Stored as a SHA-256 hash. Cannot be used for token management endpoints (only Auth0 sessions can manage tokens).
  • Null Driver -- A no-op implementation of Pinecone, VoyageAI, or Auth0 that allows the API to run without those services for local development.
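To make the temporal weighting idea concrete, here is a minimal sketch of exponential decay driven by a half-life. The function names, the use of days as the unit, the 30-day half-life, and the way weights are folded into a weighted centroid are all assumptions for illustration; the README does not specify how the service applies the weights.

```go
package main

import (
	"fmt"
	"math"
)

// decayWeight returns the weight of a rating that is ageDays old:
// 1.0 for a brand-new rating, 0.5 after one half-life, 0.25 after two.
// (Hypothetical helper; units and names are assumptions.)
func decayWeight(ageDays, halfLifeDays float64) float64 {
	if halfLifeDays <= 0 {
		return 1 // no decay configured
	}
	if ageDays < 0 {
		ageDays = 0 // treat future-dated ratings as brand new
	}
	return math.Pow(0.5, ageDays/halfLifeDays)
}

// weightedCentroid averages vectors, scaling each by its weight.
// It assumes all vectors have the same dimension.
func weightedCentroid(vectors [][]float64, weights []float64) []float64 {
	if len(vectors) == 0 {
		return nil
	}
	centroid := make([]float64, len(vectors[0]))
	var total float64
	for i, v := range vectors {
		for d, x := range v {
			centroid[d] += weights[i] * x
		}
		total += weights[i]
	}
	if total == 0 {
		return centroid
	}
	for d := range centroid {
		centroid[d] /= total
	}
	return centroid
}

func main() {
	const halfLifeDays = 30 // assumed value; the real half-life is configurable
	for _, age := range []float64{0, 30, 60} {
		fmt.Printf("age %.0f days -> weight %.2f\n", age, decayWeight(age, halfLifeDays))
	}
	// A recent rating pulls the centroid further than an old one.
	vectors := [][]float64{{1, 0}, {0, 1}}
	weights := []float64{decayWeight(0, halfLifeDays), decayWeight(30, halfLifeDays)}
	fmt.Println(weightedCentroid(vectors, weights))
}
```

With these weights, a rating one half-life old counts half as much as one made today, so the centroid leans toward the newer vector.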

Architecture Overview

The codebase follows a layered architecture. The transport layer handles HTTP concerns, the command layer implements business logic, the domain layer defines core types and algorithms, and datasource implementations provide storage and external service integration. As the diagram shows, the transport layer also queries MySQL directly for simple reads such as article listing.

graph TD
    HTTP[HTTP Transport<br/>Router, Controllers, Middleware]
    CMD[Command Layer<br/>Business Logic]
    DOM[Domain Layer<br/>Entities, Clustering, Temporal Decay]
    MYSQL[MySQL<br/>Articles, Ratings, Tokens, Cached Recommendations]
    PINE[Pinecone<br/>Vector Similarity Search]
    VOYAGE[VoyageAI<br/>Text Embedding]

    HTTP --> CMD
    CMD --> DOM
    CMD --> MYSQL
    CMD --> PINE
    CMD --> VOYAGE
    HTTP --> MYSQL

Three separate entrypoints share the same internal packages:

| Entrypoint | Purpose |
| --- | --- |
| cmd/app/ | Main HTTP API server |
| cmd/generate-recommendations/ | Batch job that precomputes recommendations for users who need regeneration |
| cmd/mcp/ | MCP (Model Context Protocol) server for AI agent integration |

External Dependencies

The API depends on several external services, all of which can be swapped for null drivers except MySQL.

graph LR
    API[Alignment Feed API]
    FE[Frontend]
    MCP[MCP Clients]

    API --> MySQL[(MySQL)]
    API --> Pinecone[(Pinecone)]
    API --> VoyageAI[VoyageAI]
    API --> Auth0[Auth0]
    FE --> API
    MCP --> API

| Service | Purpose | Required |
| --- | --- | --- |
| MySQL | Article storage, user interactions, recommendations, API tokens | Yes |
| Pinecone | Vector similarity search for recommendations and similar articles | No (SIMILARITY_DRIVER=null) |
| VoyageAI | Text-to-vector embeddings for semantic search | No (EMBEDDING_DRIVER=null) |
| Auth0 | JWT authentication for browser sessions | No (AUTH_DRIVERS=) |
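The null-driver pattern can be sketched as an interface with a no-op implementation chosen from configuration. The Embedder interface, the nullEmbedder type, and newEmbedder below are hypothetical names for illustration and are not taken from this repository; only the EMBEDDING_DRIVER=null setting comes from the table above.

```go
package main

import "fmt"

// Embedder turns text into a vector. Hypothetical interface for illustration.
type Embedder interface {
	Embed(text string) ([]float32, error)
}

// nullEmbedder is a no-op driver: it never contacts an external service
// and always returns an empty result.
type nullEmbedder struct{}

func (nullEmbedder) Embed(string) ([]float32, error) {
	return nil, nil
}

// newEmbedder picks a driver from the EMBEDDING_DRIVER value.
// Only the "null" driver is sketched here; a real driver would be
// another case returning a client for the embedding service.
func newEmbedder(driver string) (Embedder, error) {
	switch driver {
	case "null":
		return nullEmbedder{}, nil
	default:
		return nil, fmt.Errorf("embedding driver %q is not implemented in this sketch", driver)
	}
}

func main() {
	embedder, err := newEmbedder("null") // value of EMBEDDING_DRIVER in .env.dist
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	vec, err := embedder.Embed("corrigibility")
	fmt.Println(len(vec), err)
}
```

Because callers depend only on the interface, the rest of the API can run unchanged whether a real or a null driver is configured.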

Data Flow

Article Listing

sequenceDiagram
    participant C as Client
    participant R as Router
    participant Ctrl as ArticlesList Controller
    participant DB as MySQL

    C->>R: GET /v1/articles?page=1&pageSize=20
    R->>Ctrl: Route with filters and options
    Ctrl->>DB: ListLatestArticleIDs(filters, options)
    DB-->>Ctrl: Article hash IDs
    Ctrl->>DB: FetchArticlesByID(hashIDs)
    DB-->>Ctrl: Articles with user state
    Ctrl-->>C: JSON response with articles

Recommendation Generation

Recommendations combine interest clustering with temporal weighting and negative signal filtering. They are precomputed by a batch job and served from the cache; when the cached recommendations are stale or missing, the API generates them on demand and caches the result.

sequenceDiagram
    participant C as Client
    participant Cmd as RecommendArticles
    participant Gen as GenerateRecommendations
    participant DB as MySQL
    participant PC as Pinecone

    C->>Cmd: GET /v1/articles/recommended
    Cmd->>DB: GetPrecomputedRecommendations
    alt Fresh recommendations exist
        DB-->>Cmd: Cached recommendations
    else Stale or missing
        Cmd->>Gen: Generate on-demand
        Gen->>DB: Get thumbs-up and thumbs-down vectors
        Gen->>DB: Get/compute interest clusters via k-means
        Gen->>PC: Query similar articles per cluster
        Gen-->>Cmd: Ranked, deduplicated results
        Cmd->>DB: Cache recommendations
    end
    Cmd->>DB: FetchArticlesByID
    Cmd-->>C: Recommended articles
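The "ranked, deduplicated results" step in the diagram above might look roughly like the following sketch, which merges per-cluster query results. The candidate type, the mergeCandidates function, keeping the best score per article, and filtering by an exclusion set are assumptions for illustration; the README does not describe the service's actual scoring or negative-signal filtering.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is one similarity-search hit. Hypothetical type for illustration.
type candidate struct {
	ArticleID string
	Score     float64
}

// mergeCandidates combines per-cluster results into one ranked list:
// it drops excluded articles, keeps the best score seen for each article,
// sorts by score (highest first), and truncates to limit.
func mergeCandidates(perCluster [][]candidate, exclude map[string]bool, limit int) []candidate {
	best := make(map[string]float64)
	for _, results := range perCluster {
		for _, c := range results {
			if exclude[c.ArticleID] {
				continue
			}
			if s, seen := best[c.ArticleID]; !seen || c.Score > s {
				best[c.ArticleID] = c.Score
			}
		}
	}
	merged := make([]candidate, 0, len(best))
	for id, score := range best {
		merged = append(merged, candidate{ArticleID: id, Score: score})
	}
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].Score != merged[j].Score {
			return merged[i].Score > merged[j].Score
		}
		return merged[i].ArticleID < merged[j].ArticleID // stable tie-break
	})
	if limit >= 0 && len(merged) > limit {
		merged = merged[:limit]
	}
	return merged
}

func main() {
	// Made-up results for two interest clusters.
	perCluster := [][]candidate{
		{{ArticleID: "a", Score: 0.9}, {ArticleID: "b", Score: 0.5}},
		{{ArticleID: "b", Score: 0.8}, {ArticleID: "c", Score: 0.7}},
	}
	alreadyRated := map[string]bool{"c": true}
	for _, c := range mergeCandidates(perCluster, alreadyRated, 10) {
		fmt.Println(c.ArticleID, c.Score)
	}
}
```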

Rating and Regeneration

When a user rates an article, the API fetches the article's vector from Pinecone, stores the rating atomically (a thumbs up clears any thumbs down, and vice versa), and flags the user for recommendation regeneration.

sequenceDiagram
    participant C as Client
    participant Cmd as SetArticleRating
    participant DB as MySQL
    participant PC as Pinecone

    C->>Cmd: POST /v1/articles/{id}/thumbs_up/true
    Cmd->>PC: FetchArticleVector(id)
    PC-->>Cmd: Article vector
    Cmd->>DB: SetArticleRating (atomic, mutual exclusivity)
    Cmd->>DB: MarkUserNeedsRegeneration
    Cmd-->>C: 204 No Content
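The mutual exclusivity rule for ratings can be illustrated with a small in-memory sketch. The rating type and the two setter functions are hypothetical; the service applies the equivalent change atomically in MySQL, which this sketch does not attempt to model.

```go
package main

import "fmt"

// rating is a user's state for one article. Hypothetical type for illustration.
type rating struct {
	ThumbsUp   bool
	ThumbsDown bool
}

// setThumbsUp sets or clears the thumbs up. Setting it clears any thumbs
// down, so an article is never both liked and disliked.
func setThumbsUp(r rating, value bool) rating {
	r.ThumbsUp = value
	if value {
		r.ThumbsDown = false
	}
	return r
}

// setThumbsDown mirrors setThumbsUp for the opposite rating.
func setThumbsDown(r rating, value bool) rating {
	r.ThumbsDown = value
	if value {
		r.ThumbsUp = false
	}
	return r
}

func main() {
	r := setThumbsUp(rating{}, true)
	r = setThumbsDown(r, true) // clears the earlier thumbs up
	fmt.Printf("%+v\n", r)
}
```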

Getting Started

Prerequisites

  • Go 1.25+
  • Docker and Docker Compose (for local MySQL)
  • Make

Setup

  1. Install development tools:
     make setup-tools
  2. Start the local MySQL database:
     make docker-up
  3. Run database migrations:
     make docker-migrate
  4. Configure environment variables. make setup-tools copies .env.dist to .env if it does not exist. The defaults run the API with null drivers for Pinecone, VoyageAI, and Auth0:
     # .env (defaults from .env.dist)
     LOG_LEVEL=DEBUG
     HTTP_TLS_DISABLED=true
     PORT=3000
     MYSQL_URI=alignment_research_feed:pass@tcp(localhost:3306)/alignment_research_dataset
     SIMILARITY_DRIVER=null
     EMBEDDING_DRIVER=null
     AUTH_DRIVERS=
  5. Run the API server:
     go run ./cmd/app

Common Commands

| Command | Description |
| --- | --- |
| make test-short | Run unit tests |
| make lint | Run golangci-lint |
| make lint-openapi | Validate OpenAPI spec |
| make fmt | Format code (gofmt + goimports) |
| make generate | Run code generation (sqlc, mockery) |
| make docker-test | Run tests in Docker with a real MySQL instance |
| make docker-mysql | Open a MySQL CLI connected to the dev database |
| make build-mcp | Build the MCP server binary |

API Reference

The full API is documented in the OpenAPI spec. A rendered version can be built with make build-openapi-docs.

Articles

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | /v1/articles | Optional | Paginated article list with filters (source, date, title, author, category) |
| GET | /v1/articles/{article_id} | Optional | Single article by hash ID |
| GET | /v1/articles/{article_id}/similar | Optional | Up to 10 similar articles via vector similarity |
| POST | /v1/articles/semantic-search | Optional | Semantic search by text query |
| GET | /v1/articles/recommended | Required | Personalized recommendations (1-100 results) |
| GET | /v1/articles/unreviewed | Required | Articles not yet read or rated |
| GET | /v1/articles/liked | Required | Articles with thumbs up |
| GET | /v1/articles/disliked | Required | Articles with thumbs down |
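As a rough client-side illustration, the sketch below builds a request for the article list. The page and pageSize parameters come from the Article Listing flow above and the base URL from the default PORT=3000 with TLS disabled; the function name is hypothetical, and filter parameters are left out because their exact names are defined in the OpenAPI spec rather than here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
	"strings"
)

// newListArticlesRequest builds GET {baseURL}/v1/articles?page=..&pageSize=..
// If bearer is non-empty it is sent as the Authorization bearer credential.
func newListArticlesRequest(baseURL string, page, pageSize int, bearer string) (*http.Request, error) {
	u, err := url.Parse(strings.TrimRight(baseURL, "/") + "/v1/articles")
	if err != nil {
		return nil, err
	}
	q := url.Values{}
	q.Set("page", strconv.Itoa(page))
	q.Set("pageSize", strconv.Itoa(pageSize))
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	}
	return req, nil
}

func main() {
	// Listing works without authentication, so the bearer value is empty here.
	req, err := newListArticlesRequest("http://localhost:3000", 1, 20, "")
	if err != nil {
		fmt.Println("bad request:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```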

User Interactions

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| POST | /v1/articles/{article_id}/read/{read} | Required | Mark article as read/unread |
| POST | /v1/articles/{article_id}/thumbs_up/{thumbs_up} | Required | Set thumbs up (clears thumbs down) |
| POST | /v1/articles/{article_id}/thumbs_down/{thumbs_down} | Required | Set thumbs down (clears thumbs up) |

API Tokens

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | /v1/tokens | Auth0 only | List user's API tokens |
| POST | /v1/tokens | Auth0 only | Create a new API token (max 10 active) |
| DELETE | /v1/tokens/{token_id} | Auth0 only | Revoke a token |
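Since API tokens are stored only as SHA-256 hashes, verification means hashing the presented token and comparing it with the stored value. The sketch below shows one way this could work; the function names, the 32-byte random token, and the hex encoding of the hash are assumptions, not details taken from this repository.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// newToken returns a random token as 64 hex characters.
// (Assumed format; the real token format is not documented here.)
func newToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// hashToken returns the hex-encoded SHA-256 digest of the token.
// Only this value would be persisted, never the token itself.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// tokenMatches hashes the presented token and compares it with the stored
// hash in constant time.
func tokenMatches(presented, storedHash string) bool {
	return subtle.ConstantTimeCompare([]byte(hashToken(presented)), []byte(storedHash)) == 1
}

func main() {
	token, err := newToken()
	if err != nil {
		fmt.Println("could not generate token:", err)
		return
	}
	stored := hashToken(token) // what would be written to the database
	fmt.Println(tokenMatches(token, stored), tokenMatches("wrong", stored))
}
```

Storing only the hash means a leaked database row cannot be replayed as a credential, which is also why a token can be shown to the user only at creation time.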

RSS

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | /rss | No | RSS 2.0 feed (supports same filters as article listing) |
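Because /rss serves a standard RSS 2.0 document, it can be consumed with Go's encoding/xml package. The sketch below decodes only the standard title, link, and pubDate elements; the struct names and the sample feed are made up for illustration, and the real feed may carry additional elements.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rssFeed models the parts of an RSS 2.0 document used in this sketch.
type rssFeed struct {
	Channel struct {
		Title string    `xml:"title"`
		Items []rssItem `xml:"item"`
	} `xml:"channel"`
}

// rssItem holds standard RSS 2.0 item elements.
type rssItem struct {
	Title   string `xml:"title"`
	Link    string `xml:"link"`
	PubDate string `xml:"pubDate"`
}

// parseRSS decodes an RSS 2.0 document into rssFeed.
func parseRSS(data []byte) (*rssFeed, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, err
	}
	return &feed, nil
}

func main() {
	// Made-up sample document, not real output from the API.
	sample := []byte(`<rss version="2.0"><channel>
  <title>Example feed</title>
  <item>
    <title>Example article</title>
    <link>https://example.com/article</link>
    <pubDate>Mon, 02 Jan 2006 15:04:05 GMT</pubDate>
  </item>
</channel></rss>`)

	feed, err := parseRSS(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, item := range feed.Channel.Items {
		fmt.Println(item.Title, item.Link)
	}
}
```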

Authentication

Two authentication methods are supported, identified by the bearer token prefix:

  • Auth0 JWT: Authorization: Bearer auth0|<jwt_token> -- for browser sessions. Can access all endpoints including token management.
  • API Token: Authorization: Bearer user_api|<token> -- for programmatic access. Cannot manage tokens.

Unauthenticated requests can access public endpoints (article listing, single article, similar articles, semantic search, RSS).
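The prefix-based scheme above can be sketched as a small parser that classifies the Authorization header. The authKind type and function names are hypothetical, and the sketch only splits the header; validating the JWT or looking up the token hash would follow as a separate step.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// authKind identifies which authentication method a request used.
type authKind string

const (
	authNone     authKind = "none"     // no Authorization header
	authAuth0    authKind = "auth0"    // browser session JWT
	authAPIToken authKind = "user_api" // user-created API token
)

// parseAuthorization splits "Bearer <prefix>|<credential>" into its kind
// and credential. An empty header is treated as an anonymous request.
func parseAuthorization(header string) (authKind, string, error) {
	if header == "" {
		return authNone, "", nil
	}
	rest, ok := strings.CutPrefix(header, "Bearer ")
	if !ok {
		return "", "", errors.New("authorization header is not a bearer credential")
	}
	prefix, credential, ok := strings.Cut(rest, "|")
	if !ok || credential == "" {
		return "", "", errors.New("bearer credential is missing its prefix or value")
	}
	switch authKind(prefix) {
	case authAuth0:
		return authAuth0, credential, nil
	case authAPIToken:
		return authAPIToken, credential, nil
	default:
		return "", "", fmt.Errorf("unknown bearer prefix %q", prefix)
	}
}

// canManageTokens reports whether the method may call the /v1/tokens
// endpoints; only Auth0 sessions can.
func canManageTokens(kind authKind) bool {
	return kind == authAuth0
}

func main() {
	kind, _, err := parseAuthorization("Bearer user_api|example-token")
	fmt.Println(kind, canManageTokens(kind), err)
}
```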
