Discogs Sage - Album Recognition App

AI-powered album cover recognition system using PyTorch, FastAPI, and Next.js.

Features

🎵 Album cover recognition using deep learning
🖼️ Drag & drop image upload interface
📊 Confidence scores for top 5 predictions
🎨 Modern, responsive UI with dark mode support
🚀 FastAPI backend with PyTorch model serving
📦 Transfer learning with ResNet50

Project Structure

discogs-sage-app/
├── backend/          # FastAPI + PyTorch (works locally and in SageMaker)
│   ├── data/               # Data pipeline: XML parser, image downloader
│   ├── ml/                 # Training & inference (shared)
│   ├── scripts/            # CLI: build_data, train
│   ├── main.py             # FastAPI service
│   ├── train.py            # SageMaker training entry point
│   └── inference.py        # SageMaker inference entry point
├── frontend/         # Next.js frontend
├── infrastructure/   # CDK: API Gateway, Lambda, SageMaker
├── docs/             # Documentation
└── data/             # Manifest, images (generated by build_data)

Setup

Prerequisites: Python via pyenv

This project requires Python 3.10+ (for PyTorch). Using pyenv lets you install and switch Python versions easily.

Install pyenv

macOS (Homebrew):

brew install pyenv

Add to ~/.zshrc (or ~/.bash_profile):

export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

Then run exec "$SHELL" or open a new terminal.

Linux (Ubuntu/Debian):
```
curl https://pyenv.run | bash
```
Add the same three lines shown above to ~/.bashrc, then run exec "$SHELL".

Install and use Python 3.11

pyenv install 3.11.9
pyenv local 3.11.9   # Use 3.11 for this project

Verify: python --version should show 3.11.9.
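
The same check can be made from inside Python, for example at the top of a setup script. The sketch below is illustrative only: `meets_minimum` is a hypothetical helper, not part of this repository, and the 3.10 floor is taken from the requirement stated above.

```python
import sys

# Minimum version taken from the stated requirement (Python 3.10+).
MINIMUM = (3, 10)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (major, minor, ...) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} is new enough")
    else:
        print(f"Python {current} is too old; 3.10 or newer is required")
```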

Backend

Navigate to backend directory:

cd backend

Create virtual environment (uses Python from pyenv if set):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

If you get "No matching distribution found for torch", install PyTorch first:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt

Alternatively, run ./setup.sh, which installs PyTorch before the other dependencies.

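Build data and train (see docs/backend.md). The paths in these commands are relative to the repository root, so run cd .. first if you are still inside backend/: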

python backend/scripts/build_data.py --count 500
cd backend && python -m scripts.train --data-dir ../data --model-dir models

Start the backend:

python main.py

The API will be available at http://localhost:8000

Frontend

Navigate to frontend directory:

cd frontend

Install dependencies:

npm install

Start development server:

npm run dev

The app will be available at http://localhost:3000

Usage

1. Download Album Images

First, download album cover images from Discogs:

curl -X POST http://localhost:8000/api/download-images

This will download images for the first 50 releases from your manifest file.

2. Train the Model

Train the classification model:

curl -X POST http://localhost:8000/api/train

This trains a ResNet50 model using transfer learning on the downloaded images. Training takes a few minutes.

3. Use the Web Interface

Open http://localhost:3000 in your browser
Upload or drag & drop an album cover image
Click "Identify Album"
View the top 5 predictions with confidence scores

API Endpoints

GET / - API status
GET /api/releases - List all releases in dataset
GET /api/health - Health check
POST /api/download-images - Download images from Discogs
POST /api/train - Train classification model
POST /api/predict - Predict album from uploaded image
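
For quick scripted checks, the endpoint list above can be captured in a small table. This is a minimal sketch using only the standard library; `ENDPOINTS`, `build_request`, and `check_status` are hypothetical names, and the base URL assumes the local backend from the Setup section. It makes no assumption about response bodies and only reports the HTTP status.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # local backend address from the Setup section

# (method, path) pairs copied from the endpoint list above.
ENDPOINTS = [
    ("GET", "/"),
    ("GET", "/api/releases"),
    ("GET", "/api/health"),
    ("POST", "/api/download-images"),
    ("POST", "/api/train"),
    ("POST", "/api/predict"),
]


def build_request(method, path, base_url=BASE_URL):
    """Build (but do not send) a urllib request for one endpoint."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, method=method)


def check_status(path, base_url=BASE_URL, timeout=5):
    """Send a GET to a read-only endpoint and return the HTTP status code."""
    request = build_request("GET", path, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
```

With the backend running, `check_status("/api/health")` returns whatever status code the server sent. Only the GET endpoints are suitable for this kind of probe, since the POST endpoints start real work.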

Architecture

Backend

FastAPI: Modern Python web framework
PyTorch: Deep learning framework
ResNet50: Pre-trained CNN for image classification
Transfer Learning: Fine-tuned on album covers

Frontend

Next.js 15: React framework with App Router
TypeScript: Type-safe development
Tailwind CSS: Utility-first styling
Modern UI: Drag & drop, dark mode, responsive

Model Details

Base Model: ResNet50 pre-trained on ImageNet
Architecture: Transfer learning with frozen early layers
Input: 224x224 RGB images
Output: Softmax probabilities over 50 album classes
Training: Data augmentation with horizontal flips and color jitter
Optimizer: Adam with learning rate scheduling
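
To make the output format concrete, the sketch below shows how raw class scores (logits) become softmax probabilities and a top-5 list. It is plain Python for illustration; the backend presumably does this with PyTorch tensors, and the function names and toy scores here are assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k(probabilities, k=5):
    """Return the k highest (class_index, probability) pairs, best first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy scores for 50 classes; class 7 is given the highest score.
logits = [0.0] * 50
logits[7] = 4.0
logits[21] = 2.0

for class_index, probability in top_k(softmax(logits)):
    print(f"class {class_index}: {probability:.3f}")
```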

Environment Variables

Backend (.env.local)

DISCOGS_USER_TOKEN: (Preferred) Personal access token from Discogs Developers → Generate new token. Uses python3-discogs-client.
DISCOGS_CONSUMER_KEY + DISCOGS_CONSUMER_SECRET: Alternative authentication, using the key and secret from Discogs Developers → Create an application.
MODEL_PATH: Path to trained model (default: ./models/album_classifier.pth)
IMAGES_PATH: Path to album images (default: ./data/images)
BUCKET_NAME: S3 bucket for SageMaker
SAGEMAKER_ROLE: IAM role ARN for SageMaker
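
As a rough illustration of how these backend variables might be read, the sketch below loads them with the defaults listed above and applies the stated preference for a user token over a consumer key and secret. `BackendSettings` and `load_settings` are hypothetical names, and this is not the project's actual configuration code. It also assumes the variables are already present in the process environment, for example exported by whatever loads .env.local.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class BackendSettings:
    model_path: str
    images_path: str
    auth_mode: Optional[str]  # "user_token", "consumer_key", or None


def load_settings(env: Optional[Mapping[str, str]] = None) -> BackendSettings:
    """Read settings from `env` (defaults to the process environment)."""
    if env is None:
        env = os.environ

    # The user token is the preferred auth method; key + secret is the fallback.
    if env.get("DISCOGS_USER_TOKEN"):
        auth_mode = "user_token"
    elif env.get("DISCOGS_CONSUMER_KEY") and env.get("DISCOGS_CONSUMER_SECRET"):
        auth_mode = "consumer_key"
    else:
        auth_mode = None

    return BackendSettings(
        model_path=env.get("MODEL_PATH", "./models/album_classifier.pth"),
        images_path=env.get("IMAGES_PATH", "./data/images"),
        auth_mode=auth_mode,
    )


if __name__ == "__main__":
    print(load_settings())
```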

Frontend (.env.local)

NEXT_PUBLIC_API_URL: Backend API URL (default: http://localhost:8000)

Troubleshooting

NumPy "compiled with NumPy 1.x" error – Downgrade: pip install "numpy<2"

SSL certificate verify failed (when downloading ResNet50 weights on macOS):

# Option 1: Run Python's certificate installer (if using python.org installer)
/Applications/Python\ 3.11/Install\ Certificates.command

# Option 2: Use certifi (if using pyenv)
export SSL_CERT_FILE=$(python -c "import certifi; print(certifi.where())")

# Option 3: Install certs for pyenv Python
pip install certifi
python -m certifi
# Then: export SSL_CERT_FILE=/path/from/certifi/output

No matching distribution for torch – See the "Install dependencies" step in Backend setup above, which installs PyTorch from its own package index first.

DNS/network error when downloading ResNet50 (nodename nor servname provided, or not known):

On a machine with internet, run: python backend/scripts/download_resnet50_weights.py
Copy the file to your offline machine at ~/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
Or download from https://download.pytorch.org/models/resnet50-0676ba61.pth and set: export RESNET50_WEIGHTS_PATH=/path/to/resnet50-0676ba61.pth
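
The lookup order implied by these steps can be sketched as follows. This is an assumption about how the two locations relate, not the project's actual loading code; `find_resnet50_weights` is a hypothetical helper, while the environment variable name and cache path come from the steps above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

WEIGHTS_FILE = "resnet50-0676ba61.pth"


def find_resnet50_weights(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first existing local weights file, or None if none is found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    candidates = []
    override = env.get("RESNET50_WEIGHTS_PATH")
    if override:
        candidates.append(Path(override))  # explicit override is checked first
    # Cache location named in the troubleshooting steps.
    candidates.append(home / ".cache" / "torch" / "hub" / "checkpoints" / WEIGHTS_FILE)

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_resnet50_weights()
    print(found if found else "No local ResNet50 weights found")
```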

Development

Backend Development

cd backend
python main.py

Frontend Development

cd frontend
npm run dev

Testing the API

# Test prediction with curl
curl -X POST -F "file=@album_cover.jpg" http://localhost:8000/api/predict
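
A Python equivalent of the curl command, using only the standard library, might look like the sketch below. The form field name `file` is taken from the curl example; `encode_multipart` and `predict` are hypothetical helpers, and since the response format is not documented here, the raw response text is returned as-is.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(field, filename, data, boundary=None):
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, base_url="http://localhost:8000", timeout=30):
    """POST an image to /api/predict and return the response body as text."""
    path = Path(image_path)
    body, content_type = encode_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/predict",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the backend running, `print(predict("album_cover.jpg"))` sends the same request as the curl command.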

AWS SageMaker Deployment

This project includes full AWS SageMaker support for production deployment.

Quick Start

# 1. Upload data and code to S3
./prepare_for_studio.sh your-bucket-name

# 2. Train on SageMaker (from Studio notebook or CLI)
# See docs/SAGEMAKER_README.md

# 3. Deploy API Gateway + Lambda
cd infrastructure && npm run deploy -- --context endpointName=album-classifier

Documentation

How SageMaker Works - Architecture, flow, key files
SageMaker Quick Start - 5-step reference
Complete Setup - Full walkthrough
Deploy Inference - API Gateway + Lambda

Future Enhancements

Expand to full Discogs catalog
Add batch prediction support
Implement image similarity search
Add user feedback for model improvement
Deploy to AWS SageMaker for production
Add caching for faster predictions
Support for multi-image input

License

MIT
