Commit ecec105

Merge pull request #384 from centreformicrosimulation/improve-documentation
upgrade
2 parents 7915ab2 + f6ab75e commit ecec105

18 files changed

Lines changed: 756 additions & 0 deletions

.DS_Store

4 KB
Binary file not shown.

documentation/README.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
# SimPaths Documentation

This documentation is structured to support both first-time users and contributors.

## Recommended reading order

1. [Getting Started](getting-started.md)
2. [CLI Reference](cli-reference.md)
3. [Configuration](configuration.md)
4. [Scenario Cookbook](scenario-cookbook.md)
5. [Data and Outputs](data-and-outputs.md)
6. [Troubleshooting](troubleshooting.md)

For contributors and advanced users:

- [Architecture](architecture.md)
- [Development and Testing](development.md)
- [GUI Guide](gui-guide.md)

## Scope

These guides cover:

- Building SimPaths with Maven
- Running single-run and multi-run workflows
- Configuring model, collector, and runtime behavior via YAML
- Understanding expected input/output files and generated artifacts
- Running unit and integration tests locally and in CI

## Conventions

- Commands are shown from the repository root.
- Paths are relative to the repository root.
- `default.yml` refers to `config/default.yml`.

documentation/architecture.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Architecture

## High-level module map

Core package layout under `src/main/java/simpaths/`:

- `experiment/`: simulation entry points and orchestration
- `model/`: core simulation entities and yearly process logic
- `data/`: parameters, setup routines, filters, statistics helpers

## Primary entry points

- `simpaths.experiment.SimPathsStart`
  - Builds/refreshes setup artifacts
  - Launches single simulation run (GUI or headless)
- `simpaths.experiment.SimPathsMultiRun`
  - Loads YAML config
  - Iterates runs with optional seed/innovation logic
  - Supports persistence mode switching

## Runtime managers

The simulation engine registers:

- `SimPathsModel`: state evolution and process scheduling
- `SimPathsCollector`: statistics computation and export
- `SimPathsObserver`: GUI observation layer (when GUI is enabled)

## Data flow

1. Setup stage prepares policy schedule and input database.
2. Runtime model loads parameters and input maps.
3. Collector computes and exports statistics at scheduled intervals.
4. Output files are written to run folders under `output/`.

## Configuration flow

`SimPathsMultiRun` combines:

- defaults in class fields
- overrides from `config/<file>.yml`
- final CLI overrides at invocation time

This layered strategy supports reproducible batch runs with targeted command-line changes.
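
The precedence order can be sketched in a few lines of shell. This is only an illustration of the layering, not the logic in `SimPathsMultiRun`; the helper name `resolve_setting` and the sample values are hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative only: resolve one setting from three layers, where a later
# layer wins whenever it supplies a non-empty value.
# resolve_setting is a hypothetical helper, not part of SimPaths.
resolve_setting() {
  local class_default="$1" yaml_value="$2" cli_value="$3"
  # ${a:-b} expands to b when a is unset or empty, so the CLI value is
  # preferred, then the YAML value, then the class default.
  printf '%s\n' "${cli_value:-${yaml_value:-$class_default}}"
}

# Made-up values: class default 25000, YAML 20000, CLI 5000.
resolve_setting 25000 20000 5000   # prints 5000 (CLI wins)
resolve_setting 25000 20000 ""     # prints 20000 (YAML wins)
resolve_setting 25000 "" ""        # prints 25000 (default remains)
```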

documentation/cli-reference.md

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
# CLI Reference

## `singlerun.jar` (`SimPathsStart`)

Usage:

```bash
java -jar singlerun.jar [options]
```

### Options

| Option | Meaning |
|---|---|
| `-c`, `--country <CC>` | Country code (`UK` or `IT`) |
| `-s`, `--startYear <year>` | Simulation start year |
| `-Setup` | Setup only (do not run simulation) |
| `-Run` | Run only (skip setup) |
| `-r`, `--rewrite-policy-schedule` | Rebuild policy schedule from policy files |
| `-g`, `--showGui <true/false>` | Enable or disable GUI |
| `-h`, `--help` | Print help |

Notes:

- `-Setup` and `-Run` are mutually exclusive.
- For non-GUI environments, use `-g false`.

### Examples

Setup only:

```bash
java -jar singlerun.jar -c UK -s 2019 -g false -Setup --rewrite-policy-schedule
```

Run only (after setup exists):

```bash
java -jar singlerun.jar -g false -Run
```
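
The setup-only and run-only invocations are naturally chained: build the setup artifacts once, then run against them. The sketch below wraps that sequence in a hypothetical helper, `setup_then_run`, using only the flags documented in this section. The `DRY_RUN` variable is an assumption added for illustration (it prints the commands instead of executing them); it is not a SimPaths feature.

```bash
#!/usr/bin/env bash
# run: execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# setup_then_run <country> <startYear>
# Runs setup first; the simulation run starts only if setup succeeded.
setup_then_run() {
  local country="$1" start_year="$2"
  run java -jar singlerun.jar -c "$country" -s "$start_year" -g false \
      -Setup --rewrite-policy-schedule &&
    run java -jar singlerun.jar -g false -Run
}

# Preview the two commands without launching Java:
DRY_RUN=1 setup_then_run UK 2019
```

Dropping `DRY_RUN=1` executes the commands for real, which assumes `singlerun.jar` is in the current directory.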

## `multirun.jar` (`SimPathsMultiRun`)

Usage:

```bash
java -jar multirun.jar [options]
```

### Options

| Option | Meaning |
|---|---|
| `-p`, `--popSize <int>` | Simulated population size |
| `-s`, `--startYear <year>` | Start year |
| `-e`, `--endYear <year>` | End year |
| `-DBSetup` | Database setup mode |
| `-n`, `--maxNumberOfRuns <int>` | Number of sequential runs |
| `-r`, `--randomSeed <int>` | Seed for first run |
| `-g`, `--executeWithGui <true/false>` | Enable or disable GUI |
| `-config <file>` | Config file in `config/` (default: `default.yml`) |
| `-f` | Write stdout and logs to `output/logs/` |
| `-P`, `--persist <root\|run\|none>` | Persistence strategy for processed dataset |
| `-h`, `--help` | Print help |

Persistence modes:

- `root` (default): persist to root input area for reuse
- `run`: persist per run output folder
- `none`: no processed-data persistence

### Examples

Create setup database using config:

```bash
java -jar multirun.jar -DBSetup -config test_create_database.yml
```

Run two simulations with root persistence:

```bash
java -jar multirun.jar -config test_run.yml -P root
```

Run without persistence and with file logging:

```bash
java -jar multirun.jar -config default.yml -P none -f
```
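
Because `-P` accepts only three values, a wrapper script can reject typos before the JVM starts. The following is a minimal sketch: `multirun_cmd` is a hypothetical helper that only prints the command line it would run, built from the flags documented in this section.

```bash
#!/usr/bin/env bash
# multirun_cmd <config-file> [persist-mode]
# Prints the multirun command line, or fails on an unknown persist mode.
multirun_cmd() {
  local config="$1" mode="${2:-root}"   # root is the documented default
  case "$mode" in
    root|run|none) ;;
    *)
      printf 'unknown persist mode: %s (expected root, run, or none)\n' "$mode" >&2
      return 1
      ;;
  esac
  printf 'java -jar multirun.jar -config %s -P %s\n' "$config" "$mode"
}

multirun_cmd test_run.yml run
multirun_cmd test_run.yml rooot || echo "rejected"
```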

documentation/configuration.md

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
# Configuration

SimPaths multi-run behavior is controlled by YAML files in `config/`.

Examples in this repository include:

- `default.yml`
- `test_create_database.yml`
- `test_run.yml`
- `create database.yml`
- `sc analysis*.yml`
- `intertemporal elasticity.yml`
- `labour supply elasticity.yml`

For command-by-command guidance for each provided config, see [Scenario Cookbook](scenario-cookbook.md).

## How config is applied

`SimPathsMultiRun` loads `config/<file>` and applies values in two stages:

1. YAML values initialize runtime fields and argument maps.
2. CLI flags override those values if provided.

## Top-level keys

### Core run arguments

Common fields:

- `countryString`
- `maxNumberOfRuns`
- `executeWithGui`
- `randomSeed`
- `startYear`
- `endYear`
- `popSize`
- `integrationTest`

### `model_args`

Passed into `SimPathsModel` via reflection.

Typical toggles include:

- alignment flags (`alignPopulation`, `alignFertility`, `alignEmployment`, ...)
- behavioral switches (`enableIntertemporalOptimisations`, `responsesToHealth`, ...)
- persistence of behavioral grids (`saveBehaviour`, `useSavedBehaviour`, `readGrid`)

### `collector_args`

Controls output collection and export behavior (via `SimPathsCollector`), including:

- `persistStatistics`, `persistStatistics2`, `persistStatistics3`
- `persistPersons`, `persistBenefitUnits`, `persistHouseholds`
- `exportToCSV`, `exportToDatabase`

### `innovation_args`

Controls iteration logic across runs, such as:

- `randomSeedInnov`
- `intertemporalElasticityInnov`
- `labourSupplyElasticityInnov`
- `flagDatabaseSetup`

### `parameter_args`

Overrides values from `Parameters` (paths and model-global flags).

Common examples:

- `trainingFlag`
- `working_directory`
- `input_directory`
- `input_directory_initial_populations`
- `euromod_output_directory`

## Minimal example

```yaml
maxNumberOfRuns: 2
executeWithGui: false
randomSeed: 100
startYear: 2019
endYear: 2022
popSize: 20000

collector_args:
  persistStatistics: true
  persistStatistics2: true
  persistStatistics3: true
  persistPersons: false
  persistBenefitUnits: false
  persistHouseholds: false
```

Run it:

```bash
java -jar multirun.jar -config test_run.yml
```
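
To experiment without editing the shipped configs, one option is to generate a small file of your own. The sketch below writes a subset of the keys from the minimal example to a file whose name (`my_run.yml`) is hypothetical; `write_config` is likewise a made-up helper, and whether a given scenario needs further keys is not covered here.

```bash
#!/usr/bin/env bash
# write_config <path> <runs> <seed>
# Writes a small multi-run config using keys from the minimal example.
write_config() {
  local path="$1" runs="$2" seed="$3"
  mkdir -p "$(dirname "$path")"
  # Nested keys under collector_args must stay indented in the YAML output.
  cat > "$path" <<EOF
maxNumberOfRuns: $runs
executeWithGui: false
randomSeed: $seed
startYear: 2019
endYear: 2022
popSize: 20000

collector_args:
  persistStatistics: true
  persistPersons: false
EOF
}

# Example (file name is hypothetical):
#   write_config config/my_run.yml 2 100
#   java -jar multirun.jar -config my_run.yml
```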

documentation/data-and-outputs.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# Data and Outputs

## Data availability model

- Source code and documentation are open.
- Full research input datasets are not freely redistributable.
- Training data is included to support development, local testing, and CI.

## Input directory layout

Key paths:

- `input/`:
  - regression and scenario Excel files (`reg_*.xlsx`, `scenario_*.xlsx`, `align_*.xlsx`)
  - generated setup files (`input.mv.db`, `EUROMODpolicySchedule.xlsx`, `DatabaseCountryYear.xlsx`)
- `input/InitialPopulations/`:
  - `training/population_initial_UK_2019.csv`
  - `compile/` scripts for preparing initial-population inputs
- `input/EUROMODoutput/`:
  - `training/*.txt` policy outputs and schedule artifacts

## Setup-generated artifacts

Running setup mode (`singlerun` setup or `multirun -DBSetup`) creates or refreshes:

- `input/input.mv.db`
- `input/EUROMODpolicySchedule.xlsx`
- `input/DatabaseCountryYear.xlsx`

## Output directory layout

Simulation runs produce timestamped folders under `output/`, typically with:

- `csv/` generated statistics and exported entities
- `database/` run-specific persistence output
- `input/` copied or persisted run input artifacts

Common CSV files include:

- `Statistics1.csv`
- `Statistics21.csv`
- `Statistics31.csv`
- `EmploymentStatistics1.csv`
- `HealthStatistics1.csv`
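
Since each run gets its own folder, a small helper for locating the newest one can save some typing. The sketch below is an assumption-laden convenience, not part of SimPaths: `latest_run_dir` is a hypothetical name, it picks the folder by modification time rather than by parsing the timestamp in the folder name (whose exact format is not specified here), and it skips `output/logs`.

```bash
#!/usr/bin/env bash
# latest_run_dir [output-root]
# Prints the most recently modified run folder under the output root.
latest_run_dir() {
  local root="${1:-output}" dir newest=""
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue                    # glob matched nothing
    [ "${dir%/}" = "$root/logs" ] && continue    # shared log folder, not a run
    if [ -z "$newest" ] || [ "$dir" -nt "$newest" ]; then
      newest="$dir"
    fi
  done
  # Returns non-zero without printing when no run folder exists.
  [ -n "$newest" ] && printf '%s\n' "${newest%/}"
}

# Example: list the CSV files of the latest run, if any.
#   ls "$(latest_run_dir output)/csv"
```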

## Logging output

If `-f` is enabled with `multirun.jar`, logs are written to:

- `output/logs/run_<seed>.txt` (stdout capture)
- `output/logs/run_<seed>.log` (log4j output)
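
Given the naming pattern above, the logs for a particular seed can be inspected with a short helper. This is a minimal sketch: `show_run_logs` is a hypothetical function, and the default of 20 lines is an arbitrary choice.

```bash
#!/usr/bin/env bash
# show_run_logs <seed> [log-dir] [lines]
# Prints the tail of each log file that exists for the given seed.
show_run_logs() {
  local seed="$1" log_dir="${2:-output/logs}" lines="${3:-20}" file found=1
  for file in "$log_dir/run_${seed}.txt" "$log_dir/run_${seed}.log"; do
    if [ -f "$file" ]; then
      printf '==> %s <==\n' "$file"
      tail -n "$lines" "$file"
      found=0
    fi
  done
  return "$found"   # non-zero when neither file exists
}

# Example: last 50 lines of both logs for seed 100.
#   show_run_logs 100 output/logs 50
```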

## Validation and analysis assets

- `validation/` contains validation artifacts and graph assets.
- `analysis/` contains `.do` scripts and spreadsheets used for downstream analysis.

documentation/development.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
# Development and Testing

## Build

Compile and package:

```bash
mvn clean package
```

## Tests

### Unit tests

Run unit tests (Surefire):

```bash
mvn test
```

### Integration tests

Run integration tests (Failsafe):

```bash
mvn verify
```

Integration tests exercise setup and run flows and compare generated CSV outputs to expected files in:

- `src/test/java/simpaths/integrationtest/expected/`
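
When an integration test fails, it can help to repeat the comparison by hand. The sketch below is a manual analogue, not the logic used by the Failsafe tests: `compare_csv_dirs` is a hypothetical helper that does a byte-for-byte comparison with `cmp`, which may be stricter or looser than what the tests actually check.

```bash
#!/usr/bin/env bash
# compare_csv_dirs <expected-dir> <actual-dir>
# Reports every expected CSV that is missing or differs in the actual dir.
compare_csv_dirs() {
  local expected_dir="$1" actual_dir="$2" expected name status=0
  for expected in "$expected_dir"/*.csv; do
    [ -f "$expected" ] || continue            # no CSV files at all
    name="$(basename "$expected")"
    if [ ! -f "$actual_dir/$name" ]; then
      echo "MISSING $name"
      status=1
    elif ! cmp -s "$expected" "$actual_dir/$name"; then
      echo "DIFFERS $name"
      status=1
    fi
  done
  return "$status"   # 0 only when every expected file matched
}

# Example (the run folder is a placeholder):
#   compare_csv_dirs src/test/java/simpaths/integrationtest/expected "output/<run-folder>/csv"
```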

## CI workflows

GitHub workflows in `.github/workflows/` run:

- build and package on pull requests to `main` and `develop`
- integration tests (`mvn verify`)
- smoke runs for `singlerun.jar` and `multirun.jar` with persistence variants
- Javadoc generation and publish (on `develop` pushes)

## Javadoc

Generate locally:

```bash
mvn javadoc:javadoc
```

## Typical contributor flow

1. Create a feature branch in your fork.
2. Implement and test changes.
3. Run `mvn verify` before opening a PR.
4. Open a PR against `develop` (or `main` for stable fixes, when appropriate).

## Debugging tips

- Use `-g false` on headless systems.
- Use `-f` with `multirun.jar` to capture logs in `output/logs/`.
- Start from `config/test_create_database.yml` and `config/test_run.yml` when reproducing CI behavior.
