esutarosa/osu-mapper-indexer

Osu Mapper Indexer

osu-mapper-indexer is a worker that scans osu! beatmapsets in discovery mode, extracts mapper users, and stores results in PostgreSQL.

No local app runtime is required. Run everything through Docker.

API Rate Limit Policy

  • Default worker throttle is 50 requests per minute.
  • This project is designed to stay below the common safe ceiling of 60 requests per minute for ppy API usage.
  • If you plan to go above 60 RPM, contact ppy first.
  • Recommended policy: do not change the default 50 RPM unless you have a clear and justified reason.

Please respect API limits.
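
To make the 50 RPM figure concrete: spacing requests evenly at that rate means one request every 60 / 50 = 1.2 seconds. The sketch below shows one simple way such a throttle could work. It is an illustration only, not the worker's actual limiter; the class name `IntervalThrottle` and its parameters are hypothetical.

```python
import time


class IntervalThrottle:
    """Spaces calls evenly so at most `rpm` acquires start per minute.

    Hypothetical helper for illustration; not taken from the project.
    """

    def __init__(self, rpm=50, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.interval = 60.0 / rpm   # 50 RPM -> 1.2 s between requests
        self._clock = clock          # injectable so the logic can be tested
        self._sleep = sleep
        self._next_allowed = None    # earliest time the next request may start
        self.total_sleep = 0.0       # accumulated wait, in seconds

    def acquire(self):
        """Block until the next request slot is free; return seconds slept."""
        now = self._clock()
        wait = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            wait = self._next_allowed - now
            self._sleep(wait)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        self.total_sleep += wait
        return wait
```

Calling `acquire()` before each API request keeps the request rate at or below the configured value. Raising `rpm` above 60 in a sketch like this would exceed the ceiling described above.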

Quick Start With just

  1. Create env file:
cp .env.example .env
  2. Fill required variables in .env:
       • OSU_CLIENT_ID
       • OSU_CLIENT_SECRET
       • POSTGRES_PASSWORD
  3. Start stack (DB + migrations + worker):
just up
  4. View logs:
just logs
  5. Stop stack:
just down
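
Before running `just up`, it can help to confirm that the three required variables are actually set in `.env`. The following is a minimal sketch of such a check. It is not part of the project, the function names are hypothetical, and the parser only understands plain `KEY=VALUE` lines (no `export` prefix, inline comments, or multi-line values).

```python
REQUIRED = ("OSU_CLIENT_ID", "OSU_CLIENT_SECRET", "POSTGRES_PASSWORD")


def parse_env(text):
    """Parse simple KEY=VALUE lines; skips blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(text, required=REQUIRED):
    """Return the required variable names that are absent or empty."""
    values = parse_env(text)
    return [name for name in required if not values.get(name)]


# Example use (assumes a .env file in the current directory):
#   from pathlib import Path
#   print(missing_required(Path(".env").read_text()))
```

An empty list means all three required variables have a non-empty value.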

If just Is Not Installed

Use plain Docker Compose commands:

  1. Start PostgreSQL:
docker compose up -d --build postgres
  2. Run migrations:
docker compose run --rm migrator
  3. Start worker:
docker compose up -d --build indexer
  4. View logs:
docker compose logs -f --tail=727
  5. Stop all:
docker compose down

Rebuild without cache:

docker compose down
docker compose build --no-cache
docker compose up -d

Available Commands

Via just:

  • just up - start postgres, run migrations, start indexer
  • just migrate - run migrations only
  • just down - stop stack
  • just wipe - full cleanup of project Docker state (asks for confirmation; proceeds only if you answer yes)
  • just rebuild - down + no-cache build + up
  • just logs - follow service logs
  • just dump [dir] - create backup bundle (default directory: backups)
  • just health - check Rust and Docker tooling availability
  • just check - run cargo check and cargo clippy

Direct Docker equivalents:

  • docker compose up -d --build postgres
  • docker compose run --rm migrator
  • docker compose up -d --build indexer
  • docker compose down
  • docker compose logs -f --tail=727
  • ./scripts/dump.sh backups

Full cleanup equivalent:

docker compose down -v --remove-orphans --rmi local

Environment Variables

The app builds the database URL internally from the POSTGRES_* values.

| Variable | Required | Default | Description |
|---|---|---|---|
| OSU_CLIENT_ID | Yes | - | osu! OAuth client ID |
| OSU_CLIENT_SECRET | Yes | - | osu! OAuth client secret |
| POSTGRES_PASSWORD | Yes | - | PostgreSQL password |
| POSTGRES_HOST | No | localhost (postgres in container runtime) | PostgreSQL host |
| POSTGRES_PORT | No | 5432 | PostgreSQL port |
| POSTGRES_USER | No | postgres | PostgreSQL user |
| POSTGRES_DB | No | osu_mapper_indexer | PostgreSQL database name |
| SCAN_COUNTRY_CODES | No | UA | One or more ISO 3166-1 alpha-2 country codes (comma-separated for multiple) |
| SCAN_MODES | No | all | Comma-separated scan modes: all, osu, taiko, catch, mania |
| SCAN_OLDEST_FIRST | No | false | If true, scan from the oldest pages first |
| SCAN_RANKED_ONLY | No | false | If true, scan only ranked beatmapsets |
| SCAN_PAGE_DELAY_MS | No | 500 | Delay between page requests (ms) |
| SCAN_MAX_PAGES | No | empty | Limit on pages per run (empty = no limit) |
| SCAN_BATCH_SIZE | No | 50 | Batch size (users) for profile lookup |
| SCAN_FORCE_RESCAN | No | false | Ignore cutoff and rescan fully |
| SCAN_RESUME_FROM_CHECKPOINT | No | true | Resume from stored checkpoint |
| WORKER_PROGRESS_EVERY | No | 25 | Progress log interval in pages |
| RUST_LOG | No | info | Log level |
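
As an illustration of how a database URL can be assembled from the POSTGRES_* values and their documented defaults, here is a minimal sketch. The function name `build_database_url`, the `postgres://` scheme, and the percent-encoding of credentials are assumptions made for the example; the app's real logic may differ.

```python
from urllib.parse import quote


def build_database_url(env):
    """Assemble a postgres:// URL from POSTGRES_* values.

    `env` is any mapping, for example os.environ. Defaults mirror the
    table above; the URL layout itself is an assumption.
    """
    password = env.get("POSTGRES_PASSWORD")
    if not password:
        raise ValueError("POSTGRES_PASSWORD is required")
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    user = env.get("POSTGRES_USER", "postgres")
    db = env.get("POSTGRES_DB", "osu_mapper_indexer")
    # Percent-encode credentials so characters like '@' or '/' cannot
    # be mistaken for URL delimiters.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"postgres://{user_enc}:{pass_enc}@{host}:{port}/{db}"
```

With only `POSTGRES_PASSWORD` set, every other component falls back to the default listed in the table.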

SCAN_MODES examples:

  • SCAN_MODES=all
  • SCAN_MODES=osu,taiko
  • SCAN_MODES=std,tko,ctb

SCAN_COUNTRY_CODES examples:

  • SCAN_COUNTRY_CODES=UA (single country)
  • SCAN_COUNTRY_CODES=UA,PL,DE (multiple countries)
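
The comma-separated values above can be normalized roughly as follows. This is a sketch, not the worker's parser: the alias mapping (`std`, `tko`, `ctb`) is inferred from the SCAN_MODES examples, and case-insensitivity, de-duplication, and treating an empty value as `all` are all assumptions.

```python
MODES = ("osu", "taiko", "catch", "mania")
# Assumed alias mapping, inferred from the `std,tko,ctb` example.
ALIASES = {"std": "osu", "tko": "taiko", "ctb": "catch"}


def parse_modes(value="all"):
    """Return the canonical modes selected by a SCAN_MODES string."""
    tokens = [t.strip().lower() for t in value.split(",") if t.strip()]
    if not tokens or "all" in tokens:
        return list(MODES)
    selected = []
    for token in tokens:
        mode = ALIASES.get(token, token)
        if mode not in MODES:
            raise ValueError(f"unknown scan mode: {token!r}")
        if mode not in selected:  # keep order, drop duplicates
            selected.append(mode)
    return selected


def parse_country_codes(value="UA"):
    """Return upper-cased two-letter codes from SCAN_COUNTRY_CODES."""
    codes = []
    for token in value.split(","):
        code = token.strip().upper()
        if not code:
            continue
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"not an alpha-2 country code: {token!r}")
        if code not in codes:
            codes.append(code)
    return codes
```

Note that the country check only validates the shape of each code (two ASCII letters); it does not verify that the code is actually assigned in ISO 3166-1.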

Worker Log Fields

The worker emits three main log groups:

  1. Start log:
       • start discovery=on countries=... modes=... ranked_only=...
       • countries: country filter from SCAN_COUNTRY_CODES
       • modes: mode filter from SCAN_MODES (all or explicit list)
       • ranked_only: ranked filter from SCAN_RANKED_ONLY
  2. Progress log:
       • discovery page=<n> new=<n> existing=<n> skip=<n> <seconds>s
       • page: number of pages scanned in the current run
       • new: users inserted/updated as mapper rows in this run
       • existing: creators already present in the DB, so their user profile is not fetched
       • skip: users skipped because their country is not in SCAN_COUNTRY_CODES
       • <seconds>s: elapsed time since the run started
  3. Finish logs:
       • discovery done page=<n> new=<n> existing=<n> skip=<n> <seconds>s
       • done duration=... requests=... throttle_sleep_ms=... retries=...
       • requests: number of osu! API requests that went through the throttle
       • throttle_sleep_ms: total sleep time added by the rate limiter
       • retries: total number of osu! API retry attempts
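
If you want to pull the counters out of the progress and finish lines (for example, from `just logs` output), a small regular expression is enough. The sketch below assumes the field order shown above and that the elapsed time is an integer or decimal number of seconds; any timestamp or level prefix on the line is ignored. The function name is hypothetical.

```python
import re

PROGRESS_RE = re.compile(
    r"discovery (?P<done>done )?page=(?P<page>\d+) new=(?P<new>\d+) "
    r"existing=(?P<existing>\d+) skip=(?P<skip>\d+) "
    r"(?P<seconds>\d+(?:\.\d+)?)s"
)


def parse_progress(line):
    """Extract counters from a progress or finish line.

    Returns a dict, or None if the line is not a discovery counter line.
    """
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    return {
        "done": m.group("done") is not None,  # True for the finish line
        "page": int(m.group("page")),
        "new": int(m.group("new")),
        "existing": int(m.group("existing")),
        "skip": int(m.group("skip")),
        "seconds": float(m.group("seconds")),
    }
```

The start line (`start discovery=on ...`) does not match this pattern, so `parse_progress` returns `None` for it.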

Database Tables

Main tables:

  • mapper_discovery_mappers
  • mapper_discovery_scan_state

Service table:

  • seaql_migrations

Backups and Exports

Create backup bundle:

just dump

# also works as alias
just backup

or

./scripts/dump.sh backups

Each run creates a folder like backups/omi-YYYYMMDD-HHMMSS/ with:

  • plain SQL dump (.sql)
  • compressed SQL dump (.sql.gz)
  • custom PostgreSQL dump (.dump)
  • CSV exports for mapper_discovery_* tables (csv/*.csv)
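
Because the folder name encodes the creation time, the most recent bundle can be found by parsing the names. The following sketch assumes the `omi-YYYYMMDD-HHMMSS` pattern described above and compares timestamps as written, without any timezone handling; the helper names are hypothetical.

```python
import re
from datetime import datetime

BACKUP_RE = re.compile(r"^omi-(\d{8}-\d{6})$")


def backup_timestamp(name):
    """Return the datetime encoded in omi-YYYYMMDD-HHMMSS, or None."""
    m = BACKUP_RE.match(name)
    if m is None:
        return None
    try:
        return datetime.strptime(m.group(1), "%Y%m%d-%H%M%S")
    except ValueError:  # right shape, but not a real date or time
        return None


def latest_backup(names):
    """Pick the most recent backup folder name from an iterable of names."""
    dated = [(backup_timestamp(n), n) for n in names]
    dated = [(ts, n) for ts, n in dated if ts is not None]
    return max(dated)[1] if dated else None


# Example use (assumes the default backups directory exists):
#   from pathlib import Path
#   print(latest_backup(p.name for p in Path("backups").iterdir()))
```

Names that do not match the pattern are ignored, so unrelated files in the backup directory do not affect the result.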

License

This project is licensed under MIT. See LICENSE.