normtronics/discogs-sagemaker

Discogs Sage - Album Recognition App

AI-powered album cover recognition system using PyTorch, FastAPI, and Next.js.

Features

  • 🎵 Album cover recognition using deep learning
  • 🖼️ Drag & drop image upload interface
  • 📊 Confidence scores for top 5 predictions
  • 🎨 Modern, responsive UI with dark mode support
  • 🚀 FastAPI backend with PyTorch model serving
  • 📦 Transfer learning with ResNet50

Project Structure

discogs-sage-app/
├── backend/          # FastAPI + PyTorch (works locally and in SageMaker)
│   ├── data/               # Data pipeline: XML parser, image downloader
│   ├── ml/                 # Training & inference (shared)
│   ├── scripts/            # CLI: build_data, train
│   ├── main.py             # FastAPI service
│   ├── train.py            # SageMaker training entry point
│   └── inference.py        # SageMaker inference entry point
├── frontend/         # Next.js frontend
├── infrastructure/   # CDK: API Gateway, Lambda, SageMaker
├── docs/             # Documentation
└── data/             # Manifest, images (generated by build_data)

Setup

Prerequisites: Python via pyenv

This project requires Python 3.10+ (for PyTorch). Using pyenv lets you install and switch Python versions easily.

Install pyenv

  • macOS (Homebrew):

    brew install pyenv

    Add to ~/.zshrc (or ~/.bash_profile):

    export PYENV_ROOT="$HOME/.pyenv"
    [[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
    eval "$(pyenv init -)"

    Then run exec "$SHELL" or open a new terminal.

  • Linux (Ubuntu/Debian):

    curl https://pyenv.run | bash

    Add the same three lines shown above to ~/.bashrc, then run exec "$SHELL".

Install and use Python 3.11

pyenv install 3.11.9
pyenv local 3.11.9   # Use 3.11 for this project

Verify: python --version should show 3.11.9.
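
The same check can be made from inside Python, for example at the top of a setup script. This is a minimal sketch based on the 3.10+ requirement stated above; the function name meets_minimum is illustrative and not part of the repository.

```python
import sys

# Minimum version stated above (Python 3.10+ for PyTorch).
MIN_VERSION = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {found}: {status}")
```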

Backend

  1. Navigate to backend directory:
cd backend
  2. Create virtual environment (uses Python from pyenv if set):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

If you get "No matching distribution found for torch", install PyTorch first:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt

Alternatively, run ./setup.sh, which installs PyTorch before the other dependencies.

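  4. Build data and train (see docs/backend.md). The paths in these two commands assume you run them from the repository root, not from backend/:
python backend/scripts/build_data.py --count 500
cd backend && python -m scripts.train --data-dir ../data --model-dir models
  5. Start the backend (from the backend directory):
python main.py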

The API will be available at http://localhost:8000

Frontend

  1. Navigate to frontend directory:
cd frontend
  2. Install dependencies:
npm install
  3. Start development server:
npm run dev

The app will be available at http://localhost:3000

Usage

1. Download Album Images

First, download album cover images from Discogs:

curl -X POST http://localhost:8000/api/download-images

This will download images for the first 50 releases from your manifest file.

2. Train the Model

Train the classification model:

curl -X POST http://localhost:8000/api/train

This trains a ResNet50 model using transfer learning on the downloaded images. Training takes a few minutes.

3. Use the Web Interface

  1. Open http://localhost:3000 in your browser
  2. Upload or drag & drop an album cover image
  3. Click "Identify Album"
  4. View the top 5 predictions with confidence scores

API Endpoints

  • GET / - API status
  • GET /api/releases - List all releases in dataset
  • GET /api/health - Health check
  • POST /api/download-images - Download images from Discogs
  • POST /api/train - Train classification model
  • POST /api/predict - Predict album from uploaded image

Architecture

Backend

  • FastAPI: Modern Python web framework
  • PyTorch: Deep learning framework
  • ResNet50: Pre-trained CNN for image classification
  • Transfer Learning: Fine-tuned on album covers

Frontend

  • Next.js 15: React framework with App Router
  • TypeScript: Type-safe development
  • Tailwind CSS: Utility-first styling
  • Modern UI: Drag & drop, dark mode, responsive

Model Details

  • Base Model: ResNet50 pre-trained on ImageNet
  • Architecture: Transfer learning with frozen early layers
  • Input: 224x224 RGB images
  • Output: Softmax probabilities over 50 album classes
  • Training: Data augmentation with horizontal flips and color jitter
  • Optimizer: Adam with learning rate scheduling
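
To make the output step concrete, the following pure-Python sketch shows how raw class scores become softmax probabilities and how the top 5 predictions are selected. It illustrates the math only and is not the repository's inference code, which presumably does this with PyTorch tensors; the scores and the album_N labels are made up.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical scores for 50 classes; real logits come from the model.
    logits = [0.1 * i for i in range(50)]
    labels = [f"album_{i}" for i in range(50)]
    for label, prob in top_k(logits, labels):
        print(f"{label}: {prob:.3f}")
```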

Environment Variables

Backend (.env.local)

  • DISCOGS_USER_TOKEN: (Preferred) Personal access token, created in the Discogs Developers settings via "Generate new token". Used with python3-discogs-client.
  • DISCOGS_CONSUMER_KEY + DISCOGS_CONSUMER_SECRET: Alternative authentication, using the key and secret issued via "Create an application" in the Discogs Developers settings
  • MODEL_PATH: Path to trained model (default: ./models/album_classifier.pth)
  • IMAGES_PATH: Path to album images (default: ./data/images)
  • BUCKET_NAME: S3 bucket for SageMaker
  • SAGEMAKER_ROLE: IAM role ARN for SageMaker

Frontend (.env.local)

  • NEXT_PUBLIC_API_URL: Backend API URL (default: http://localhost:8000)
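
As an illustration of how the backend variables and their documented defaults fit together, here is a minimal sketch of a settings loader. The BackendSettings class and load_settings function are hypothetical names, not the repository's code, and the sketch assumes the variables are already present in the process environment (it does not parse .env.local itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendSettings:
    model_path: str
    images_path: str
    discogs_user_token: Optional[str]
    bucket_name: Optional[str]
    sagemaker_role: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> BackendSettings:
    """Read backend settings, applying the defaults documented above."""
    return BackendSettings(
        model_path=env.get("MODEL_PATH", "./models/album_classifier.pth"),
        images_path=env.get("IMAGES_PATH", "./data/images"),
        discogs_user_token=env.get("DISCOGS_USER_TOKEN"),
        bucket_name=env.get("BUCKET_NAME"),
        sagemaker_role=env.get("SAGEMAKER_ROLE"),
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests, for example load_settings({"MODEL_PATH": "/tmp/model.pth"}).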

Troubleshooting

NumPy "compiled with NumPy 1.x" error – Downgrade: pip install "numpy<2"

SSL certificate verify failed (when downloading ResNet50 weights on macOS):

# Option 1: Run Python's certificate installer (if using python.org installer)
/Applications/Python\ 3.11/Install\ Certificates.command

# Option 2: Use certifi (if using pyenv)
export SSL_CERT_FILE=$(python -c "import certifi; print(certifi.where())")

# Option 3: Install certs for pyenv Python
pip install certifi
python -m certifi
# Then: export SSL_CERT_FILE=/path/from/certifi/output

No matching distribution for torch – See step 3 in Backend setup above.

DNS/network error when downloading ResNet50 (nodename nor servname provided, or not known):

  1. On a machine with internet, run: python backend/scripts/download_resnet50_weights.py
  2. Copy the file to your offline machine at ~/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
  3. Or download from https://download.pytorch.org/models/resnet50-0676ba61.pth and set: export RESNET50_WEIGHTS_PATH=/path/to/resnet50-0676ba61.pth
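
To confirm that the weights file ended up where it will be found, a small diagnostic like the one below can help. It is a sketch under assumptions: find_local_weights is a hypothetical helper, and the lookup order (RESNET50_WEIGHTS_PATH first, then the default torch hub cache) is a guess at how the backend resolves the file, not a description of its actual code.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

WEIGHTS_FILE = "resnet50-0676ba61.pth"


def find_local_weights(env: Mapping[str, str] = os.environ,
                       home: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing local copy of the ResNet50 weights, or None."""
    home = home or Path.home()
    candidates = []
    override = env.get("RESNET50_WEIGHTS_PATH")
    if override:
        # Assumed precedence: an explicit path wins over the cache.
        candidates.append(Path(override))
    candidates.append(home / ".cache" / "torch" / "hub" / "checkpoints" / WEIGHTS_FILE)
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_local_weights()
    print(found if found else "No local ResNet50 weights found")
```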

Development

Backend Development

cd backend
python main.py

Frontend Development

cd frontend
npm run dev

Testing the API

# Test prediction with curl
curl -X POST -F "file=@album_cover.jpg" http://localhost:8000/api/predict
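
The same upload can be sketched in Python with only the standard library. The form field name "file" and the URL come from the curl command above; the helper names build_multipart and predict are hypothetical, and the JSON response format is an assumption.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field, filename, payload, content_type="image/jpeg", boundary=None):
    """Encode one file as multipart/form-data; return (body, Content-Type header)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, url="http://localhost:8000/api/predict", timeout=60):
    """Upload an image under the form field "file", as the curl command does."""
    with open(image_path, "rb") as handle:
        body, header = build_multipart("file", os.path.basename(image_path), handle.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumes the endpoint answers with a JSON body.
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, predict("album_cover.jpg") sends the same request as the curl example.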

AWS SageMaker Deployment

This project includes full AWS SageMaker support for production deployment.

Quick Start

# 1. Upload data and code to S3
./prepare_for_studio.sh your-bucket-name

# 2. Train on SageMaker (from Studio notebook or CLI)
# See docs/SAGEMAKER_README.md

# 3. Deploy API Gateway + Lambda
cd infrastructure && npm run deploy -- --context endpointName=album-classifier

Documentation

  • docs/backend.md - Data pipeline and training
  • docs/SAGEMAKER_README.md - Training on SageMaker

Future Enhancements

  • Expand to full Discogs catalog
  • Add batch prediction support
  • Implement image similarity search
  • Add user feedback for model improvement
  • Deploy to AWS SageMaker for production
  • Add caching for faster predictions
  • Support for multi-image input

License

MIT
