jo-s-eph/RL-project

Quoridor RL

Reinforcement learning agents for the board game Quoridor, played on a 3x3 board with one wall per player.

Project Structure

Game engine:

  • game.py — game logic, Alpha-Beta, MCTS
  • quoridor_env.py — Gymnasium environment (flat/grid obs, sparse/dense reward)
  • wrappers.py — observation, reward, and action mask wrappers
  • opponents.py — baseline agents (Random, GreedyPath, Blocking, Minimax)
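
As an illustration of what an action mask is for, here is a minimal sketch of masked action selection in plain Python. It is not taken from wrappers.py: the helper names `masked_argmax` and `masked_random` and the list-of-booleans mask layout are assumptions made for this example only.

```python
import random


def legal_actions(mask):
    """Return the indices of actions whose mask entry is truthy."""
    legal = [i for i, ok in enumerate(mask) if ok]
    if not legal:
        raise ValueError("no legal actions available")
    return legal


def masked_argmax(scores, mask):
    """Pick the highest-scoring action among the legal ones (hypothetical helper)."""
    return max(legal_actions(mask), key=lambda i: scores[i])


def masked_random(mask, rng=random):
    """Pick a uniformly random legal action (hypothetical helper)."""
    return rng.choice(legal_actions(mask))


# Action 0 has the best raw score but is illegal, so action 2 is chosen.
scores = [0.9, 0.1, 0.5, 0.7]
mask = [False, True, True, False]
best = masked_argmax(scores, mask)  # 2
```

Masking at selection time keeps an agent from ever proposing an illegal pawn move or wall placement, which matters on a board this small where many actions are invalid in any given state.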

Value-based methods:

  • deep_q_network.py, double_deep_q_network.py, dueling_deep_q_network.py, categorical_deep_q_network.py, rainbow_deep_q_network.py — DQN variant implementations
  • train_all.py, run_train_all.py — training scripts
  • best_params.json — tuned hyperparameters per model
  • dqn_agents.py — unified loader for all DQN models
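
For readers unfamiliar with how the variants differ, the sketch below contrasts the standard DQN bootstrap target with the Double DQN target for a single transition. It shows the textbook formulas in plain Python and is not code from the files listed above; the function names and list-based Q-values are assumptions for illustration.

```python
def dqn_target(reward, next_q_target, gamma, done):
    """Standard DQN: bootstrap from the target network's own maximum."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Double DQN: the online network picks the action, the target network scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the next state (two actions).
next_q_online = [1.0, 2.0]   # online network prefers action 1
next_q_target = [0.5, 0.3]   # target network prefers action 0

y_dqn = dqn_target(0.0, next_q_target, gamma=0.9, done=False)
y_ddqn = double_dqn_target(0.0, next_q_online, next_q_target, gamma=0.9, done=False)
```

Here the Double DQN target is the smaller of the two, because the action is selected by one network and evaluated by the other; this decoupling is what reduces the overestimation bias of the plain max.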

Policy gradient methods:

  • train_pg.py — REINFORCE, A2C, PPO, TRPO training
  • policy_agents.py — unified loader for PG models
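
As background for the policy gradient methods, the sketch below computes the discounted return for each step of an episode, the quantity REINFORCE uses to weight log-probabilities. It is a generic illustration, not an excerpt from train_pg.py; the function name `discounted_returns` is hypothetical.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each step reuses the return of its successor.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# A sparse-reward episode: only the final (winning) move is rewarded.
episode_returns = discounted_returns([0.0, 0.0, 1.0], gamma=0.5)
# episode_returns == [0.25, 0.5, 1.0]
```

With a sparse win/loss reward, earlier moves receive exponentially smaller credit, which is one reason a dense reward option can speed up learning.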

Evaluation:

  • arena.py — round-robin tournament framework
  • eval_ppo.py — tournament runner
  • visualize.py — game replay and visualization
  • quoridor_research.ipynb — main analysis notebook with all results
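
The idea of a round-robin tournament can be sketched as follows. This is not the interface of arena.py: the `round_robin` function, the `play_game` callback, and its return convention (0 if the first mover wins, 1 otherwise) are assumptions made for this example. The starting player alternates between games so that neither side always moves first.

```python
from itertools import combinations


def round_robin(agents, play_game, games_per_pair=2):
    """Play every pair of agents against each other and count wins per agent.

    agents: dict mapping a name to an agent object.
    play_game(first, second): hypothetical callback returning 0 if the
    first mover wins and 1 if the second mover wins.
    """
    wins = {name: 0 for name in agents}
    for a, b in combinations(agents, 2):
        for game in range(games_per_pair):
            # Alternate who moves first to cancel out first-move advantage.
            first, second = (a, b) if game % 2 == 0 else (b, a)
            result = play_game(agents[first], agents[second])
            winner = first if result == 0 else second
            wins[winner] += 1
    return wins


# Toy usage: agents are plain numbers and the larger number always wins.
toy_agents = {"A": 3, "B": 2, "C": 1}
table = round_robin(toy_agents, lambda x, y: 0 if x > y else 1)
# table == {"A": 4, "B": 2, "C": 0}
```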

Trained models:

  • models/ — DQN models (20 variants, retrained with alternating starts)
  • pg_results/ — PG models and training curves

Setup

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install stable-baselines3 sb3-contrib torchrl

Reproducing

Train policy gradient models:

python train_pg.py

Train DQN models:

python run_train_all.py

Run the full tournament:

python arena.py

Analysis notebook:

jupyter notebook quoridor_research.ipynb
