diff --git a/README.md b/README.md
index 6417b97a6..eda138002 100644
--- a/README.md
+++ b/README.md
@@ -16,50 +16,31 @@ The entire SimPaths documentation is available on its [WikiPage](https://github.
\ No newline at end of file
+-->
diff --git a/SimPaths/documentation/README.md b/SimPaths/documentation/README.md
new file mode 100644
index 000000000..b36ace796
--- /dev/null
+++ b/SimPaths/documentation/README.md
@@ -0,0 +1,34 @@
+# SimPaths Documentation
+
+This documentation is structured to support both first-time users and contributors.
+
+## Recommended reading order
+
+1. [Getting Started](getting-started.md)
+2. [CLI Reference](cli-reference.md)
+3. [Configuration](configuration.md)
+4. [Scenario Cookbook](scenario-cookbook.md)
+5. [Data and Outputs](data-and-outputs.md)
+6. [Troubleshooting](troubleshooting.md)
+
+For contributors and advanced users:
+
+- [Architecture](architecture.md)
+- [Development and Testing](development.md)
+- [GUI Guide](gui-guide.md)
+
+## Scope
+
+These guides cover:
+
+- Building SimPaths with Maven
+- Running single-run and multi-run workflows
+- Configuring model, collector, and runtime behavior via YAML
+- Understanding expected input/output files and generated artifacts
+- Running unit and integration tests locally and in CI
+
+## Conventions
+
+- Commands are shown from the repository root.
+- Paths are relative to the repository root.
+- `default.yml` refers to `config/default.yml`.
diff --git a/SimPaths/documentation/architecture.md b/SimPaths/documentation/architecture.md
new file mode 100644
index 000000000..a0e168edf
--- /dev/null
+++ b/SimPaths/documentation/architecture.md
@@ -0,0 +1,44 @@
+# Architecture
+
+## High-level module map
+
+Core package layout under `src/main/java/simpaths/`:
+
+- `experiment/`: simulation entry points and orchestration
+- `model/`: core simulation entities and yearly process logic
+- `data/`: parameters, setup routines, filters, statistics helpers
+
+## Primary entry points
+
+- `simpaths.experiment.SimPathsStart`
+  - Builds/refreshes setup artifacts
+  - Launches single simulation run (GUI or headless)
+- `simpaths.experiment.SimPathsMultiRun`
+  - Loads YAML config
+  - Iterates runs with optional seed/innovation logic
+  - Supports persistence mode switching
+
+## Runtime managers
+
+The simulation engine registers:
+
+- `SimPathsModel`: state evolution and process scheduling
+- `SimPathsCollector`: statistics computation and export
+- `SimPathsObserver`: GUI observation layer (when GUI is enabled)
+
+## Data flow
+
+1. Setup stage prepares policy schedule and input database.
+2. Runtime model loads parameters and input maps.
+3. Collector computes and exports statistics at scheduled intervals.
+4. Output files are written to run folders under `output/`.
+
+## Configuration flow
+
+`SimPathsMultiRun` combines:
+
+- defaults in class fields
+- overrides from `config/.yml`
+- final CLI overrides at invocation time
+
+This layered strategy supports reproducible batch runs with targeted command-line changes.
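The layering described in the configuration flow (class-field defaults, then YAML values, then CLI flags) can be illustrated with a small shell sketch. This is not SimPaths code: `resolve_setting` is a hypothetical helper and the values are made up. It only shows the precedence rule, assuming that a later layer wins whenever it supplies a value.

```bash
#!/usr/bin/env bash
# Hypothetical illustration of layered precedence: default < YAML < CLI.
# resolve_setting is NOT part of SimPaths; it only mirrors the rule above.

# Usage: resolve_setting <default> <yaml_value> <cli_value>
# An empty string means "this layer did not set the value".
resolve_setting() {
  local default_value="$1" yaml_value="$2" cli_value="$3"
  if [ -n "$cli_value" ]; then
    # A CLI flag was given, so it overrides everything else.
    printf '%s\n' "$cli_value"
  elif [ -n "$yaml_value" ]; then
    # No CLI flag, but the YAML file set the value.
    printf '%s\n' "$yaml_value"
  else
    # Neither layer set it, so fall back to the built-in default.
    printf '%s\n' "$default_value"
  fi
}

# Made-up values standing in for a setting such as maxNumberOfRuns.
resolve_setting 1 2 5     # prints 5 (CLI wins)
resolve_setting 1 2 ""    # prints 2 (YAML wins)
resolve_setting 1 "" ""   # prints 1 (default)
```

Keeping the rule this simple is what makes a batch run reproducible: the YAML file records the scenario, and any one-off change is visible on the command line.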
diff --git a/SimPaths/documentation/cli-reference.md b/SimPaths/documentation/cli-reference.md
new file mode 100644
index 000000000..7535ba00b
--- /dev/null
+++ b/SimPaths/documentation/cli-reference.md
@@ -0,0 +1,90 @@
+# CLI Reference
+
+## `singlerun.jar` (`SimPathsStart`)
+
+Usage:
+
+```bash
+java -jar singlerun.jar [options]
+```
+
+### Options
+
+| Option | Meaning |
+|---|---|
+| `-c`, `--country ` | Country code (`UK` or `IT`) |
+| `-s`, `--startYear ` | Simulation start year |
+| `-Setup` | Setup only (do not run simulation) |
+| `-Run` | Run only (skip setup) |
+| `-r`, `--rewrite-policy-schedule` | Rebuild policy schedule from policy files |
+| `-g`, `--showGui ` | Enable or disable GUI |
+| `-h`, `--help` | Print help |
+
+Notes:
+
+- `-Setup` and `-Run` are mutually exclusive.
+- For non-GUI environments, use `-g false`.
+
+### Examples
+
+Setup only:
+
+```bash
+java -jar singlerun.jar -c UK -s 2019 -g false -Setup --rewrite-policy-schedule
+```
+
+Run only (after setup exists):
+
+```bash
+java -jar singlerun.jar -g false -Run
+```
+
+## `multirun.jar` (`SimPathsMultiRun`)
+
+Usage:
+
+```bash
+java -jar multirun.jar [options]
+```
+
+### Options
+
+| Option | Meaning |
+|---|---|
+| `-p`, `--popSize ` | Simulated population size |
+| `-s`, `--startYear ` | Start year |
+| `-e`, `--endYear ` | End year |
+| `-DBSetup` | Database setup mode |
+| `-n`, `--maxNumberOfRuns ` | Number of sequential runs |
+| `-r`, `--randomSeed ` | Seed for first run |
+| `-g`, `--executeWithGui ` | Enable or disable GUI |
+| `-config ` | Config file in `config/` (default: `default.yml`) |
+| `-f` | Write stdout and logs to `output/logs/` |
+| `-P`, `--persist ` | Persistence strategy for processed dataset |
+| `-h`, `--help` | Print help |
+
+Persistence modes:
+
+- `root` (default): persist to root input area for reuse
+- `run`: persist per run output folder
+- `none`: no processed-data persistence
+
+### Examples
+
+Create setup database using config:
+
+```bash
+java -jar multirun.jar -DBSetup -config test_create_database.yml
+```
+
+Run two simulations with root persistence:
+
+```bash
+java -jar multirun.jar -config test_run.yml -P root
+```
+
+Run without persistence and with file logging:
+
+```bash
+java -jar multirun.jar -config default.yml -P none -f
+```
diff --git a/SimPaths/documentation/configuration.md b/SimPaths/documentation/configuration.md
new file mode 100644
index 000000000..4e8a1426a
--- /dev/null
+++ b/SimPaths/documentation/configuration.md
@@ -0,0 +1,101 @@
+# Configuration
+
+SimPaths multi-run behavior is controlled by YAML files in `config/`.
+
+Examples in this repository include:
+
+- `default.yml`
+- `test_create_database.yml`
+- `test_run.yml`
+- `create database.yml`
+- `sc analysis*.yml`
+- `intertemporal elasticity.yml`
+- `labour supply elasticity.yml`
+
+For command-by-command guidance for each provided config, see [Scenario Cookbook](scenario-cookbook.md).
+
+## How config is applied
+
+`SimPathsMultiRun` loads `config/` and applies values in two stages:
+
+1. YAML values initialize runtime fields and argument maps.
+2. CLI flags override those values if provided.
+
+## Top-level keys
+
+### Core run arguments
+
+Common fields:
+
+- `countryString`
+- `maxNumberOfRuns`
+- `executeWithGui`
+- `randomSeed`
+- `startYear`
+- `endYear`
+- `popSize`
+- `integrationTest`
+
+### `model_args`
+
+Passed into `SimPathsModel` via reflection.
+
+Typical toggles include:
+
+- alignment flags (`alignPopulation`, `alignFertility`, `alignEmployment`, ...)
+- behavioral switches (`enableIntertemporalOptimisations`, `responsesToHealth`, ...)
+- persistence of behavioral grids (`saveBehaviour`, `useSavedBehaviour`, `readGrid`)
+
+### `collector_args`
+
+Controls output collection and export behavior (via `SimPathsCollector`), including:
+
+- `persistStatistics`, `persistStatistics2`, `persistStatistics3`
+- `persistPersons`, `persistBenefitUnits`, `persistHouseholds`
+- `exportToCSV`, `exportToDatabase`
+
+### `innovation_args`
+
+Controls iteration logic across runs, such as:
+
+- `randomSeedInnov`
+- `intertemporalElasticityInnov`
+- `labourSupplyElasticityInnov`
+- `flagDatabaseSetup`
+
+### `parameter_args`
+
+Overrides values from `Parameters` (paths and model-global flags).
+
+Common examples:
+
+- `trainingFlag`
+- `working_directory`
+- `input_directory`
+- `input_directory_initial_populations`
+- `euromod_output_directory`
+
+## Minimal example
+
+```yaml
+maxNumberOfRuns: 2
+executeWithGui: false
+randomSeed: 100
+startYear: 2019
+endYear: 2022
+popSize: 20000
+
+collector_args:
+  persistStatistics: true
+  persistStatistics2: true
+  persistStatistics3: true
+  persistPersons: false
+  persistBenefitUnits: false
+  persistHouseholds: false
+```
+
+Run it:
+
+```bash
+java -jar multirun.jar -config test_run.yml
+```
diff --git a/SimPaths/documentation/data-and-outputs.md b/SimPaths/documentation/data-and-outputs.md
new file mode 100644
index 000000000..0e7ef0d13
--- /dev/null
+++ b/SimPaths/documentation/data-and-outputs.md
@@ -0,0 +1,56 @@
+# Data and Outputs
+
+## Data availability model
+
+- Source code and documentation are open.
+- Full research input datasets are not freely redistributable.
+- Training data is included to support development, local testing, and CI.
+
+## Input directory layout
+
+Key paths:
+
+- `input/`:
+  - regression and scenario Excel files (`reg_*.xlsx`, `scenario_*.xlsx`, `align_*.xlsx`)
+  - generated setup files (`input.mv.db`, `EUROMODpolicySchedule.xlsx`, `DatabaseCountryYear.xlsx`)
+- `input/InitialPopulations/`:
+  - `training/population_initial_UK_2019.csv`
+  - `compile/` scripts for preparing initial-population inputs
+- `input/EUROMODoutput/`:
+  - `training/*.txt` policy outputs and schedule artifacts
+
+## Setup-generated artifacts
+
+Running setup mode (`singlerun` setup or `multirun -DBSetup`) creates or refreshes:
+
+- `input/input.mv.db`
+- `input/EUROMODpolicySchedule.xlsx`
+- `input/DatabaseCountryYear.xlsx`
+
+## Output directory layout
+
+Simulation runs produce timestamped folders under `output/`, typically with:
+
+- `csv/` generated statistics and exported entities
+- `database/` run-specific persistence output
+- `input/` copied or persisted run input artifacts
+
+Common CSV files include:
+
+- `Statistics1.csv`
+- `Statistics21.csv`
+- `Statistics31.csv`
+- `EmploymentStatistics1.csv`
+- `HealthStatistics1.csv`
+
+## Logging output
+
+If `-f` is enabled with `multirun.jar`, logs are written to:
+
+- `output/logs/run_.txt` (stdout capture)
+- `output/logs/run_.log` (log4j output)
+
+## Validation and analysis assets
+
+- `validation/` contains validation artifacts and graph assets.
+- `analysis/` contains `.do` scripts and spreadsheets used for downstream analysis.
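As a rough aid for navigating the output layout described in this file, the sketch below lists the CSV files of the most recent run folder. It assumes that run folder names sort chronologically, which the documentation does not confirm, and `latest_run_dir` and `list_run_csvs` are hypothetical helper names rather than anything shipped with SimPaths.

```bash
#!/usr/bin/env bash
# Sketch: list CSV files in the most recent run folder under an output root.
# ASSUMPTION: run folder names sort chronologically (not confirmed by the docs).
# latest_run_dir and list_run_csvs are hypothetical helpers.

# Print the last run folder in sorted order, skipping the shared logs folder.
latest_run_dir() {
  local output_root="${1:-output}"
  local dir latest=""
  for dir in "$output_root"/*/; do
    [ -d "$dir" ] || continue
    # output/logs/ is written by `-f` and is not a run folder.
    [ "$(basename "$dir")" = "logs" ] && continue
    latest="${dir%/}"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Print every CSV file found in the csv/ subfolder of the latest run.
list_run_csvs() {
  local run_dir file
  run_dir="$(latest_run_dir "${1:-output}")" || return 1
  for file in "$run_dir"/csv/*.csv; do
    [ -e "$file" ] && printf '%s\n' "$file"
  done
}

# Example call from the repository root; prints a note if nothing is found.
list_run_csvs output || echo "no run folders or CSV files found under output/" >&2
```

If the chronological-sort assumption does not hold for a given setup, sorting by modification time would be a reasonable alternative.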
diff --git a/SimPaths/documentation/development.md b/SimPaths/documentation/development.md
new file mode 100644
index 000000000..c5f5c4da9
--- /dev/null
+++ b/SimPaths/documentation/development.md
@@ -0,0 +1,61 @@
+# Development and Testing
+
+## Build
+
+Compile and package:
+
+```bash
+mvn clean package
+```
+
+## Tests
+
+### Unit tests
+
+Run unit tests (Surefire):
+
+```bash
+mvn test
+```
+
+### Integration tests
+
+Run integration tests (Failsafe):
+
+```bash
+mvn verify
+```
+
+Integration tests exercise setup and run flows and compare generated CSV outputs to expected files in:
+
+- `src/test/java/simpaths/integrationtest/expected/`
+
+## CI workflows
+
+GitHub workflows in `.github/workflows/` run:
+
+- build and package on pull requests to `main` and `develop`
+- integration tests (`mvn verify`)
+- smoke runs for `singlerun.jar` and `multirun.jar` with persistence variants
+- Javadoc generation and publish (on `develop` pushes)
+
+## Javadoc
+
+Generate locally:
+
+```bash
+mvn javadoc:javadoc
+```
+
+## Typical contributor flow
+
+1. Create a feature branch in your fork.
+2. Implement and test changes.
+3. Run `mvn verify` before opening a PR.
+4. Open a PR against `develop` (or `main` for stable fixes, when appropriate).
+
+## Debugging tips
+
+- Use `-g false` on headless systems.
+- Use `-f` with `multirun.jar` to capture logs in `output/logs/`.
+- Start from `config/test_create_database.yml` and `config/test_run.yml` when reproducing CI behavior.
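The last debugging tip can be turned into a small local script. The sketch below chains the two documented test configs and adds `-f` for log capture. It only approximates CI, since the actual workflow steps are not shown in this diff. `run_step`, `reproduce_ci_locally`, and the `DRY_RUN` variable are hypothetical names, and the jars are assumed to sit in the current directory after `mvn clean package`.

```bash
#!/usr/bin/env bash
# Sketch: replay the documented test configs locally, with an optional dry run.
# ASSUMPTIONS: the jars were built with `mvn clean package` and are in the
# current directory; DRY_RUN, run_step and reproduce_ci_locally are
# hypothetical helpers, not SimPaths features.

# Run a command, or only print it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Database setup first, then the short test run; stop at the first failure.
reproduce_ci_locally() {
  run_step java -jar multirun.jar -DBSetup -config test_create_database.yml &&
    run_step java -jar multirun.jar -config test_run.yml -P root -f
}

# Preview the commands without executing them.
DRY_RUN=1 reproduce_ci_locally
```

Calling `reproduce_ci_locally` without `DRY_RUN=1` executes the two commands for real, so a preview first is a cheap way to confirm the flags.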
diff --git a/SimPaths/documentation/figures/Chart Properties.png b/SimPaths/documentation/figures/Chart Properties.png
new file mode 100644
index 000000000..757653b1e
Binary files /dev/null and b/SimPaths/documentation/figures/Chart Properties.png differ
diff --git a/SimPaths/documentation/figures/Charts.png b/SimPaths/documentation/figures/Charts.png
new file mode 100644
index 000000000..1791402d8
Binary files /dev/null and b/SimPaths/documentation/figures/Charts.png differ
diff --git a/SimPaths/documentation/figures/Output stream.png b/SimPaths/documentation/figures/Output stream.png
new file mode 100644
index 000000000..01bdfa14f
Binary files /dev/null and b/SimPaths/documentation/figures/Output stream.png differ
diff --git a/SimPaths/documentation/figures/SimPaths GUI.png b/SimPaths/documentation/figures/SimPaths GUI.png
new file mode 100644
index 000000000..369771f9b
Binary files /dev/null and b/SimPaths/documentation/figures/SimPaths GUI.png differ
diff --git a/SimPaths/documentation/figures/SimPaths parameters.png b/SimPaths/documentation/figures/SimPaths parameters.png
new file mode 100644
index 000000000..96a4eab5a
Binary files /dev/null and b/SimPaths/documentation/figures/SimPaths parameters.png differ
diff --git a/SimPaths/documentation/figures/SimPaths-Buttons.png b/SimPaths/documentation/figures/SimPaths-Buttons.png
new file mode 100644
index 000000000..65dd272c5
Binary files /dev/null and b/SimPaths/documentation/figures/SimPaths-Buttons.png differ
diff --git a/SimPaths/documentation/figures/SimPaths-Chart-Zoom.png b/SimPaths/documentation/figures/SimPaths-Chart-Zoom.png
new file mode 100644
index 000000000..05bec29c0
Binary files /dev/null and b/SimPaths/documentation/figures/SimPaths-Chart-Zoom.png differ
diff --git a/SimPaths/documentation/getting-started.md b/SimPaths/documentation/getting-started.md
new file mode 100644
index 000000000..6a93e977d
--- /dev/null
+++ b/SimPaths/documentation/getting-started.md
@@ -0,0 +1,65 @@
+# Getting Started
+
+## Prerequisites
+
+- Java 19
+- Maven 3.8+
+- Optional IDE: IntelliJ IDEA (import as a Maven project)
+
+## Build
+
+From repository root:
+
+```bash
+mvn clean package
+```
+
+Artifacts produced at the root:
+
+- `singlerun.jar`
+- `multirun.jar`
+
+## Understand run modes
+
+SimPaths supports two entry points:
+
+- `singlerun.jar` (`SimPathsStart`): setup and single simulation execution
+- `multirun.jar` (`SimPathsMultiRun`): repeated runs across seeds/scenarios
+
+## First run (headless)
+
+### Step 1: setup input artifacts
+
+```bash
+java -jar singlerun.jar -c UK -s 2019 -g false -Setup --rewrite-policy-schedule
+```
+
+This prepares required setup files such as:
+
+- `input/input.mv.db`
+- `input/EUROMODpolicySchedule.xlsx`
+- `input/DatabaseCountryYear.xlsx`
+
+### Step 2: execute a multi-run configuration
+
+```bash
+java -jar multirun.jar -config default.yml -g false
+```
+
+Results are written under `output//`.
+
+## Training vs full data mode
+
+- The repository includes training data under:
+  - `input/InitialPopulations/training/`
+  - `input/EUROMODoutput/training/`
+- If no initial-population CSV files are found in the main input location, SimPaths automatically switches to training mode.
+- Training mode supports development and CI, but is not intended for research interpretation.
+
+## GUI usage
+
+Use `-g true` (default behavior in several flows) to run with GUI components.
+
+In headless/remote environments, set `-g false`.
+
+See [GUI Guide](gui-guide.md) for screenshots.
diff --git a/SimPaths/documentation/gui-guide.md b/SimPaths/documentation/gui-guide.md
new file mode 100644
index 000000000..40ad53d96
--- /dev/null
+++ b/SimPaths/documentation/gui-guide.md
@@ -0,0 +1,51 @@
+# GUI Guide
+
+The GUI is available in single-run and multi-run workflows when enabled.
+
+## Enable GUI
+
+Single run:
+
+```bash
+java -jar singlerun.jar -g true
+```
+
+Multi run:
+
+```bash
+java -jar multirun.jar -config default.yml -g true
+```
+
+## Screenshots
+
+Main GUI:
+
+![SimPaths GUI](figures/SimPaths%20GUI.png)
+
+Control buttons:
+
+![SimPaths Buttons](figures/SimPaths-Buttons.png)
+
+Parameter selection:
+
+![SimPaths Parameters](figures/SimPaths%20parameters.png)
+
+Charts overview:
+
+![Charts](figures/Charts.png)
+
+Chart properties:
+
+![Chart Properties](figures/Chart%20Properties.png)
+
+Chart zoom example:
+
+![Chart Zoom](figures/SimPaths-Chart-Zoom.png)
+
+Output stream panel:
+
+![Output Stream](figures/Output%20stream.png)
+
+## Headless note
+
+In remote servers or CI, run with `-g false`.
diff --git a/SimPaths/documentation/scenario-cookbook.md b/SimPaths/documentation/scenario-cookbook.md
new file mode 100644
index 000000000..1d8576068
--- /dev/null
+++ b/SimPaths/documentation/scenario-cookbook.md
@@ -0,0 +1,171 @@
+# Scenario Cookbook
+
+This guide maps every provided YAML scenario in `config/` to its intended use.
+
+All commands below assume you are running from repository root after building jars.
+
+## Baseline and testing scenarios
+
+### `default.yml`
+
+Use when you want the standard baseline run with conservative defaults.
+
+Command:
+
+```bash
+java -jar multirun.jar -config default.yml -g false
+```
+
+### `test_create_database.yml`
+
+Use for test-oriented database setup with training data (`trainingFlag: true`).
+
+Command:
+
+```bash
+java -jar multirun.jar -DBSetup -config test_create_database.yml
+```
+
+### `test_run.yml`
+
+Use for integration-style short runs (2 runs, test settings).
+
+Command:
+
+```bash
+java -jar multirun.jar -config test_run.yml -P root
+```
+
+### `programming test.yml`
+
+Use for quick developer smoke runs with smaller population and simplified behavior flags.
+
+Command:
+
+```bash
+java -jar multirun.jar -config "programming test.yml" -g false
+```
+
+## Setup-focused scenario
+
+### `create database.yml`
+
+Use to build a full database object set for UK long-horizon work. This file sets `flagDatabaseSetup: true` in `innovation_args`, so it runs setup mode.
+
+Command:
+
+```bash
+java -jar multirun.jar -config "create database.yml"
+```
+
+## Sensitivity and robustness scenarios
+
+### `random seed.yml`
+
+Use to run multiple replications with random-seed iteration enabled.
+
+Command:
+
+```bash
+java -jar multirun.jar -config "random seed.yml" -g false
+```
+
+### `intertemporal elasticity.yml`
+
+Use for intertemporal elasticity sensitivity (3 runs with interest-rate innovation pattern).
+
+Command:
+
+```bash
+java -jar multirun.jar -config "intertemporal elasticity.yml" -g false
+```
+
+### `labour supply elasticity.yml`
+
+Use for labour-supply elasticity sensitivity (3 runs with labour-income innovation pattern).
+
+Command:
+
+```bash
+java -jar multirun.jar -config "labour supply elasticity.yml" -g false
+```
+
+## Targeted output scenarios
+
+### `employmentTransStats.yml`
+
+Use when you mainly want employment transition statistics and minimal other persisted outputs.
+
+Command:
+
+```bash
+java -jar multirun.jar -config employmentTransStats.yml -g false
+```
+
+## Social care scenario family
+
+### `sc calibration.yml`
+
+Use to calibrate preference parameters for social care analysis.
+
+Command:
+
+```bash
+java -jar multirun.jar -config "sc calibration.yml" -g false
+```
+
+### `sc analysis0.yml`
+
+Base social care analysis run with social care enabled and alignment on.
+
+Command:
+
+```bash
+java -jar multirun.jar -config "sc analysis0.yml" -g false
+```
+
+### `sc analysis1.yml`
+
+Main social care analysis run with named behavioral grid output (`saveBehaviour: true`, `readGrid: "sc analysis1"`).
+
+Command:
+
+```bash
+java -jar multirun.jar -config "sc analysis1.yml" -g false
+```
+
+### `sc analysis1b.yml`
+
+Variant of analysis1 with `alignPopulation: false` and `useSavedBehaviour: true` for comparison.
+
+Command:
+
+```bash
+java -jar multirun.jar -config "sc analysis1b.yml" -g false
+```
+
+### `sc analysis2.yml`
+
+Zero-costs social care scenario (`flagSuppressChildcareCosts: true`, `flagSuppressSocialCareCosts: true`).
+
+Command:
+
+```bash
+java -jar multirun.jar -config "sc analysis2.yml" -g false
+```
+
+### `sc analysis3.yml`
+
+Ignore-costs response scenario that reuses behavior from analysis2 (`useSavedBehaviour: true`, `readGrid: "sc analysis2"`).
+
+Command:
+
+```bash
+java -jar multirun.jar -config "sc analysis3.yml" -g false
+```
+
+## Practical notes
+
+- Use quotes around config filenames that contain spaces.
+- Add `-f` to write run logs to `output/logs/`.
+- Override config values via CLI flags when needed (for example `-n`, `-r`, `-P`, `-g`).
diff --git a/SimPaths/documentation/troubleshooting.md b/SimPaths/documentation/troubleshooting.md
new file mode 100644
index 000000000..d9e69082c
--- /dev/null
+++ b/SimPaths/documentation/troubleshooting.md
@@ -0,0 +1,83 @@
+# Troubleshooting
+
+## `Config file not found`
+
+Cause:
+
+- `-config` points to a file not present in `config/`.
+
+Fix:
+
+- Verify filename and extension.
+- Example:
+
+```bash
+java -jar multirun.jar -config default.yml
+```
+
+## Missing `EUROMODpolicySchedule.xlsx`
+
+Cause:
+
+- Setup has not generated schedule files yet.
+
+Fix:
+
+- Re-run setup with rewrite enabled:
+
+```bash
+java -jar singlerun.jar -c UK -s 2019 -g false --rewrite-policy-schedule -Setup
+```
+
+## GUI errors on server or CI
+
+Cause:
+
+- Running GUI mode in headless environment.
+
+Fix:
+
+- Disable GUI:
+
+```bash
+-g false
+```
+
+## Start year rejected or inconsistent
+
+Cause:
+
+- Chosen year is outside available input/training data bounds.
+
+Fix:
+
+- Use a year covered by available input files.
+- For training-only mode, use the provided training start year (2019 in this repository setup).
+
+## Expected CSV files not found after run
+
+Cause:
+
+- Collector settings disabled certain exports.
+- Run failed before collector dump phase.
+
+Fix:
+
+- Check `collector_args` in YAML.
+- Re-run with `-f` and inspect `output/logs/run_.txt` and `.log`.
+
+## Integration test output mismatch
+
+Cause:
+
+- Simulation behavior changed or output schema changed.
+
+Fix:
+
+1. Confirm differences are intended.
+2. Replace expected files in `src/test/java/simpaths/integrationtest/expected/` with verified new outputs.
+3. Re-run:
+
+```bash
+mvn verify
+```
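Two of the failure modes on this page, a missing config file and missing setup artifacts, can be caught before a run is launched. The sketch below is one possible pre-flight check. `check_config` and `check_setup_artifacts` are hypothetical helpers, not part of SimPaths, and the file names are taken from the setup-generated artifacts listed in the documentation.

```bash
#!/usr/bin/env bash
# Sketch: pre-flight check for two failure modes covered on this page.
# check_config and check_setup_artifacts are hypothetical helpers.

# Succeeds if config/<name> exists under the given repository root.
check_config() {
  local repo_root="$1" config_name="$2"
  if [ ! -f "$repo_root/config/$config_name" ]; then
    echo "missing config: config/$config_name" >&2
    return 1
  fi
}

# Succeeds only if every setup-generated file named in the docs is present.
check_setup_artifacts() {
  local repo_root="$1" missing=0 name
  for name in input.mv.db EUROMODpolicySchedule.xlsx DatabaseCountryYear.xlsx; do
    if [ ! -f "$repo_root/input/$name" ]; then
      echo "missing setup artifact: input/$name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the current directory before launching a run.
check_config . default.yml && check_setup_artifacts . ||
  echo "pre-flight failed; see messages above" >&2
```

If the artifact check fails, re-running setup with `-Setup --rewrite-policy-schedule` as described above is the documented remedy.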