diff --git a/fbfmaproom/developer-docs.md b/fbfmaproom/developer-docs.md
new file mode 100644
index 00000000..82a64468
--- /dev/null
+++ b/fbfmaproom/developer-docs.md
@@ -0,0 +1,461 @@
+
+# FbF Maproom — Developer Documentation
+
+> **Audience:** developers contributing to or maintaining the AA Design tool.
+> This document covers codebase structure, module responsibilities, data flow, and how the key abstractions fit together. For user-facing setup and environment instructions, see `README.md`.
+
+---
+
+## Table of Contents
+
+1. [High-level overview](#1-high-level-overview)
+2. [Repository layout](#2-repository-layout)
+3. [Module reference](#3-module-reference)
+   - [fbfmaproom.py](#fbfmaproompy)
+   - [fbflayout.py](#fbflayoutpy)
+   - [fbftable.py](#fbftablepy)
+   - [fbf-update-data.py](#fbf-update-datapy)
+   - [\_\_about\_\_.py](#__about__py)
+4. [Configuration system](#4-configuration-system)
+5. [Data model](#5-data-model)
+6. [Request / callback flow](#6-request--callback-flow)
+7. [API endpoints](#7-api-endpoints)
+8. [Key algorithms](#8-key-algorithms)
+9. [Dependencies](#9-dependencies)
+10. [Environment and packaging](#10-environment-and-packaging)
+11. [Docker and deployment](#11-docker-and-deployment)
+12. [Known limitations and TODOs](#12-known-limitations-and-todos)
+
+---
+
+## 1. High-level overview
+
+The FbF ("Forecast-based Financing") Maproom is a [Dash](https://dash.plotly.com/) web application that lets users evaluate climate forecast skill for anticipatory action (AA) design. It is served by a Flask server and deployed inside a Docker container in production.
+
+The core workflow is:
+
+1. A user selects a **country**, **season**, **forecast dataset**, **observation dataset**, a geographic **region or pixel**, and a **trigger frequency**.
+2. The app retrieves historical forecast and observation data (stored as [Zarr](https://zarr.readthedocs.io/) arrays on disk) and computes a **threshold** — the forecast value at or above which the worst `freq`% of years would have been triggered.
+3. The app renders an interactive **table** showing year-by-year trigger outcomes and a **map** showing the current season's forecast raster.
+4. A "Set trigger" button links out to a companion Gantt tool pre-filled with the calculated threshold. Note: the Gantt integration has been deprecated. The "Set trigger" button needs to be either removed or updated before it can be considered functional.
+
+
+
+---
+
+## 2. Repository layout
+
+```
+fbfmaproom/
+├── __about__.py              # Package metadata (name, version, author)
+├── fbfmaproom.py             # Main application: server, Dash app, callbacks, Flask routes
+├── fbflayout.py              # Dash component tree (pure layout, no logic)
+├── fbftable.py               # HTML table construction helpers
+├── fbf-update-data.py        # CLI script: pull data from the IRI Data Library → Zarr
+├── fbfmaproom-sample.yaml    # Reference configuration file (never used in prod as-is)
+├── pixi.toml                 # Dependency spec (pixi / conda-forge)
+├── pixi.lock                 # Locked dependency tree
+├── pyproject.toml            # Pytest configuration
+├── Dockerfile                # Production container definition
+└── release_container_image   # Shell script: build and push Docker image
+```
+
+---
+
+## 3. Module reference
+
+### fbfmaproom.py
+
+The entry point and by far the largest file (~1 800 lines). It contains:
+
+#### Server and app setup
+
+```python
+SERVER = flask.Flask(__name__)
+APP = FbfDash(__name__, server=SERVER, ...)
+```
+
+`FbfDash` is a thin subclass of `dash.Dash` that overrides `index()` to return a 404 for any URL path that doesn't correspond to a configured country. This prevents Dash from silently serving the layout for an arbitrary path.
+
+Config is loaded at module import time from the file(s) named in the `CONFIG` environment variable (colon-separated, later files override earlier ones):
+
+```python
+CONFIG = pingrid.load_config(os.environ["CONFIG"])
+fill_config(CONFIG)
+```
+
+`fill_config()` replaces raw `dict` entries under `countries[*].datasets.*` with typed `ObsDataset` / `ForecastDataset` objects so the rest of the code can call methods on them rather than indexing dicts.
+
+---
+
+#### Dataset classes
+
+```
+Dataset (base)
+├── ObsDataset       — observation data; has lower_is_worse flag
+└── ForecastDataset  — forecast data; always higher = worse; has is_poe flag
+```
+
+Both classes lazily open their Zarr stores via `open_data_array()` and carry metadata needed for the UI (label, units, colormap, value range, number formatter).
+
+The `units` property uses a sentinel `DEFAULT` object to distinguish "not yet loaded" from `None`, so it can read the value from the file the first time it's needed and cache it.
+
+---
+
+#### Data access functions
+
+| Function | What it does |
+|---|---|
+| `open_data_array(cfg)` | Opens a Zarr store, renames coordinates according to config `var_names`, attaches colormap and scale metadata. |
+| `open_forecast(country_key, forecast_key)` | Thin wrapper; sets scale 0–100 (forecasts are probabilities). |
+| `open_obs(country_key, obs_key)` | Thin wrapper; converts `timedelta64` arrays to float days. |
+| `select_forecast(...)` | Filters forecast array to a single issue month, optionally a single year, and a single percentile level. Handles the PoE vs non-PoE direction convention. |
+| `select_obs(...)` | Filters observation array to a target calendar month across all years, optionally a single year. |
+| `retrieve_shapes(country_key, level, ...)` | Fetches admin-boundary polygons either from PostGIS (via psycopg2) or from a zipped shapefile (via Fiona), returning a DataFrame with `key`, `label`, and `the_geom` (Shapely geometry) columns. |
+
+---
+
+#### Table generation pipeline
+
+```
+fundamental_table_data()
+  → For each FORECAST column: calls select_forecast() at the
+    requested issue month and freq percentile, renames the
+    target_date coordinate to "time".
+  → For each OBS column: calls select_obs() across all years
+    for the target calendar month.
+  → Resolves each dataset to a scalar per year for the selected
+    region by calling value_for_geom() (spatial average for
+    gridded data, direct key lookup for pre-aggregated data).
+  → Merges forecast and obs datasets on the shared "time" axis,
+    drops years before start_year, and sorts descending by time.
+  → Returns a single xr.Dataset with one variable per table column.
+
+augment_table_data()
+  → Converts to a pandas DataFrame, calculates percentile ranks,
+    identifies worst-year flags, computes thresholds and
+    hits/misses summary.
+
+generate_tables()
+  → Orchestrates the above two functions; called from both the
+    Dash callback and the /export endpoint.
+```
+
+`hits_and_misses()` is a simple confusion-matrix calculation — it takes a boolean "triggered" series and a boolean "bad year" series and returns `[true_pos, false_pos, false_neg, true_neg, accuracy]`, labeled in the UI as worthy-action, act-in-vain, fail-to-act, worthy-inaction, and rate.
+
+---
+
+#### Dash callbacks
+
+All callbacks follow the standard Dash pattern. The main ones are:
+
+| Callback | Trigger(s) | Purpose |
+|---|---|---|
+| `initial_setup` | URL pathname change | Populate all dropdowns (season, mode, predictors, etc.) from config; restore state from query string. |
+| `forecast_selectors` | Season or map column change | Determine available years and issue months by inspecting the actual forecast Zarr file. |
+| `start_year_selector` | Season or pathname change | Populate the start-year filter dropdown. |
+| `map_click` | Map click or page load | Update the draggable marker position. |
+| `update_selected_region` | Marker position or mode change | Find which admin region (or pixel bounding box) contains the marker; store its key; draw the outline on the map. |
+| `update_popup` | Marker position or mode change | Update the marker popup with region name or lat/lon coordinates. |
+| `table_cb` | Most control changes | The main callback: run the full table pipeline and render the HTML table + summary. |
+| `tile_url_callback` | Year, issue month, freq, map column changes | Constructs the tile URL for the forecast or obs raster layer and updates the colorbar. Also calls `select_forecast` / `select_obs` upfront to detect missing data and show a warning banner if the requested data doesn't exist yet. |
+| `borders` | Pathname or mode change | Fetches all admin-boundary polygons for the current country and admin level and sends them to the GeoJSON borders overlay. Returns empty in pixel mode. |
+| `validate_upload` | CSV file upload | Decodes the base64 upload, runs `validate_csv()` (checks column structure, year range, and whether all region keys are known to the app), and displays a pass/fail modal with itemised errors and notes. |
+| Query-string sync (clientside) | Most control changes | Serialises all current control values into the URL query string so the page state is bookmarkable and survives reload. Runs entirely client-side. |
+
+There is also one **clientside callback** (JavaScript) for toggling the map / table panels — purely a CSS class toggle, no server round-trip needed.
+
+---
+
+#### Flask routes (non-Dash)
+
+See [Section 7](#7-api-endpoints) for full details. These are REST endpoints used by the Gantt tool and potentially other consumers.
+
+---
+
+### fbflayout.py
+
+Defines the Dash component tree. Contains **no callbacks, no data access, and no business logic** — just component definitions. If you want to add, remove, or reorganise UI controls, this is where to do it.
+
+Key layout sections:
+
+- `app_layout()` — top-level `html.Div`; assembles control bar, map column, and table column, plus modals and a disclaimer banner.
+- `control_layout()` — the horizontal control bar: dropdowns for mode, forecast, issue month, season, year, severity; the frequency slider; the map/table toggle checklist; and the CSV upload button.
+- `map_layout()` — a `dash_leaflet.Map` with tile layers (Street and Topo basemaps), an admin-borders GeoJSON overlay, the forecast raster tile layer, a draggable marker with a popup, a scale bar, and a colorbar.
+- `table_layout()` — the reference dataset and dataset dropdowns, the "include upcoming" checkbox, the start-year filter, and a loading wrapper around the table container.
+- `control(label, tooltip, component)` — a small helper that wraps any control in a `html.Div` with consistent padding and a tooltip.
+
+---
+
+### fbftable.py
+
+Builds the HTML table from DataFrames produced by the main pipeline. Keeps rendering logic separate from the data logic in `fbfmaproom.py`.
+
+| Function | Purpose |
+|---|---|
+| `gen_table(tcs, dfs, data, thresholds, severity, final_season)` | Entry point. Produces a complete `html.Table` with header and body. |
+| `gen_head(tcs, dfs)` | Builds `<thead>`: one row for column names (with tooltips), then one row per row in the summary DataFrame (`dfs`). |
+| `gen_body(tcs, data, thresholds, severity, final_season)` | Builds `<tbody>`: one `<tr>` per year in `data`. |
+| `cell_class(col_name, row, severity, thresh, lower_is_worse, final_season)` | Returns a CSS class name based on whether the cell is a triggered year, an excluded (upcoming) year, both, or neither. Four possible classes: `''`, `'cell-excluded'`, `'cell-severity-{0,1,2}'`, `'cell-excluded-severity-{0,1,2}'`. |
+
+`gen_select_header()` renders a `