16 changes: 16 additions & 0 deletions .github/workflows/pull_request_tests.yml
@@ -5,9 +5,25 @@ on: pull_request
jobs:
  ci_test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: gdm_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd "pg_isready -U postgres -d gdm_test"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    strategy:
      matrix:
        python-version: ["3.12", "3.13"]
    env:
      GDM_TEST_POSTGRES_DSN: postgresql+psycopg://postgres:postgres@localhost:5432/gdm_test

    steps:
      - uses: actions/checkout@v6
1 change: 1 addition & 0 deletions .gitignore
@@ -9,6 +9,7 @@ __pycache__/
*.sqlite
*.db
*.sql
!src/gdm/db/distribution_schema.sql
*.ruff_cache/

# Distribution / packaging
1 change: 1 addition & 0 deletions docs/_toc.yml
@@ -15,6 +15,7 @@ parts:
- file: gdm_intro/units
- file: gdm_intro/timeseries
- file: dist_system/import_export
- file: dist_system/sql_persistence
- file: dist_system/plotting
- caption: Advanced Usage
chapters:
138 changes: 138 additions & 0 deletions docs/dist_system/sql_persistence.md
@@ -0,0 +1,138 @@
# SQL Persistence

GDM supports persisting systems through high-level APIs on `DistributionSystem` and `CatalogSystem`.

Supported database targets:

- SQLite files (via `db_path` or SQLite `db_url`)
- PostgreSQL servers (via `db_url` DSN)

This is useful for:

- storing complete model snapshots,
- preserving component UUID identity across save/load cycles,
- loading distribution systems with `prefer_normalized=True`,
- keeping normalized distribution table structure aligned across SQLite and PostgreSQL.

## DistributionSystem: write and load

```python
from gdm.distribution import DistributionSystem

# write using a file path
system: DistributionSystem = ...
system.to_db("distribution.sqlite")

# load (default snapshot path)
loaded = DistributionSystem.from_db("distribution.sqlite")

# write/load using SQLite URL
sqlite_url = "sqlite:///distribution.sqlite"
system.to_db(db_url=sqlite_url)
loaded = DistributionSystem.from_db(db_url=sqlite_url)

# write/load using PostgreSQL DSN
postgres_url = "postgresql+psycopg://user:password@host:5432/database"
system.to_db(db_url=postgres_url)
loaded = DistributionSystem.from_db(db_url=postgres_url)
```

Comment on lines +22 to +30

Copilot AI Apr 9, 2026

The examples use relative SQLite targets (system.to_db("distribution.sqlite") and sqlite:///distribution.sqlite). With the current sqlite_path_from_target() implementation, these resolve to an absolute path at the filesystem root (leading /), so the examples won’t work as written. Either fix path resolution so these examples work, or update the docs to specify the URL/path form that is actually supported.

Suggested change
- # write using a file path
- system: DistributionSystem = ...
- system.to_db("distribution.sqlite")
- # load (default snapshot path)
- loaded = DistributionSystem.from_db("distribution.sqlite")
- # write/load using SQLite URL
- sqlite_url = "sqlite:///distribution.sqlite"
+ # write using a working-directory-relative file path
+ system: DistributionSystem = ...
+ system.to_db("./distribution.sqlite")
+ # load (default snapshot path)
+ loaded = DistributionSystem.from_db("./distribution.sqlite")
+ # write/load using a working-directory-relative SQLite URL
+ sqlite_url = "sqlite:./distribution.sqlite"

By default, `to_db` writes snapshot payloads. For distribution systems, normalized tables are also persisted.

## Backend behavior

### Distribution systems

- SQLite: writes snapshot payload + normalized distribution tables.
- PostgreSQL: writes snapshot payload + normalized distribution tables, with table names and relational layout aligned to SQLite.

`prefer_normalized=True` is supported on both backends for `DistributionSystem.from_db(...)`.

### Catalog systems

- SQLite and PostgreSQL both use snapshot storage.

### Load from normalized representation

Use `prefer_normalized=True` to reconstruct from normalized topology/component tables first.

```python
loaded = DistributionSystem.from_db(
    db_url="postgresql+psycopg://user:password@host:5432/database",
    prefer_normalized=True,
)
```

If normalized rows are unavailable for the stored system, loading falls back to snapshot reconstruction.
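
The fallback order described above can be sketched with the standard-library `sqlite3` module. This illustrates only the control flow and is not GDM's loader: the helper `load_prefer_normalized` and the column names (`name`, `system_kind`, `payload`) are assumptions made for the example. Only the table names `distribution_buses` and `gdm_system_snapshots` appear in this document.

```python
import json
import sqlite3


def load_prefer_normalized(conn: sqlite3.Connection) -> dict:
    """Hypothetical loader: try normalized rows first, then the snapshot."""
    rows = conn.execute("SELECT name FROM distribution_buses").fetchall()
    if rows:
        return {"source": "normalized", "buses": [name for (name,) in rows]}
    # No normalized rows: fall back to the stored snapshot payload.
    row = conn.execute(
        "SELECT payload FROM gdm_system_snapshots WHERE system_kind = ?",
        ("distribution",),
    ).fetchone()
    if row is None:
        raise LookupError("No persisted distribution system found.")
    return {"source": "snapshot", **json.loads(row[0])}


# Assumed, simplified schema for the demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distribution_buses (name TEXT)")
conn.execute("CREATE TABLE gdm_system_snapshots (system_kind TEXT, payload TEXT)")
conn.execute(
    "INSERT INTO gdm_system_snapshots VALUES (?, ?)",
    ("distribution", json.dumps({"buses": ["bus_1"]})),
)
from_snapshot = load_prefer_normalized(conn)  # normalized table is still empty
conn.execute("INSERT INTO distribution_buses VALUES ('bus_1')")
from_normalized = load_prefer_normalized(conn)  # normalized rows now win
```

In this sketch both paths return the same bus list; the `source` key only records which path produced it.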

## CatalogSystem: write and load

```python
from gdm.distribution import CatalogSystem

catalog: CatalogSystem = ...
catalog.to_db("catalog.sqlite")

loaded_catalog = CatalogSystem.from_db("catalog.sqlite")

catalog.to_db(db_url="postgresql+psycopg://user:password@host:5432/database")
loaded_catalog = CatalogSystem.from_db(
    db_url="postgresql+psycopg://user:password@host:5432/database"
)
```

`CatalogSystem` persistence uses snapshot storage.

## Replace semantics and schema initialization

For both system types, writes replace existing records for that `system_kind` by default.

- `replace=True` (default): replace previously persisted record(s) for that system kind.
- `initialize_schema=True` (default): bootstrap schema/tables when needed.

For repeated writes to an existing database, `initialize_schema=False` can be passed once the schema is already present.
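
How the two flags interact can be illustrated with a small `sqlite3` sketch. This is not GDM's implementation: the function `write_snapshot`, the table name `snapshots`, and its columns are hypothetical. Only the flag names `replace` and `initialize_schema` and their defaults are taken from the text above.

```python
import sqlite3


def write_snapshot(
    conn: sqlite3.Connection,
    system_kind: str,
    payload: str,
    replace: bool = True,
    initialize_schema: bool = True,
) -> None:
    """Hypothetical writer illustrating the two flags; not GDM's code."""
    if initialize_schema:
        # Idempotent bootstrap: safe to run on every write.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS snapshots (system_kind TEXT, payload TEXT)"
        )
    if replace:
        # Drop earlier record(s) for this system kind before inserting.
        conn.execute("DELETE FROM snapshots WHERE system_kind = ?", (system_kind,))
    conn.execute("INSERT INTO snapshots VALUES (?, ?)", (system_kind, payload))
    conn.commit()


conn = sqlite3.connect(":memory:")
write_snapshot(conn, "distribution", "v1")
# The schema already exists, so the bootstrap step can be skipped on later writes.
write_snapshot(conn, "distribution", "v2", initialize_schema=False)
stored = conn.execute("SELECT payload FROM snapshots").fetchall()
```

After the second call only the newer payload remains for that system kind, which is the behavior the default `replace=True` describes.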

## Table-structure parity notes (SQLite vs PostgreSQL)

For distribution persistence, PostgreSQL now materializes the same normalized table set used in SQLite (for example, `distribution_buses`, `distribution_loads`, `matrix_impedance_branches`, and related component/equipment tables). This keeps SQL inspection and downstream table-based workflows consistent across backends.

GDM additive tables (`gdm_system_snapshots`, `gdm_metadata`, `gdm_component_uuid_map`) remain backend-managed and are available on both SQLite and PostgreSQL.

## Time series behavior

When persisting a `DistributionSystem`, time-series associations are stored in DB metadata tables, and loading restores component time-series attachments from persisted snapshot data.

Copilot AI Apr 9, 2026

The doc says “time-series associations are stored in DB metadata tables”, but the implementation writes them to the time_series_associations table (with some metadata also stored in gdm_metadata). Consider rewording to avoid implying the associations live in gdm_metadata (e.g., “stored in the time_series_associations table, and loading restores attachments from the snapshot payload”).

Suggested change
When persisting a `DistributionSystem`, time-series associations are stored in DB metadata tables, and loading restores component time-series attachments from persisted snapshot data.
When persisting a `DistributionSystem`, time-series associations are stored in the `time_series_associations` table, and loading restores component time-series attachments from the persisted snapshot payload.

## Inspecting stored snapshot payloads

For diagnostics, the raw snapshot payload can be inspected via `gdm.db.load_snapshot_payload`.

```python
from gdm.db import load_snapshot_payload

payload = load_snapshot_payload(
    db_url="postgresql+psycopg://user:password@host:5432/database",
    system_kind="distribution",
)
```

For metadata-only inspection, `gdm.db.inspect_snapshot_metadata` is also available.

```python
from gdm.db import inspect_snapshot_metadata

metadata = inspect_snapshot_metadata(db_path="distribution.sqlite")
```

## Local PostgreSQL test setup

If you run persistence tests locally against PostgreSQL, set a DSN environment variable used by the test fixtures:

```bash
export GDM_TEST_POSTGRES_DSN='postgresql+psycopg://postgres:postgres@localhost:5432/gdm_test'
```

Then run DB persistence tests:

```bash
pytest -q tests/test_db_io.py -k postgres_dsn
```
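
One way a test helper might consume that variable is sketched below, using only the standard library. The helper name `postgres_dsn_or_skip` is hypothetical, and how the PR's actual fixtures behave when the variable is unset is not shown in this diff; skipping is an assumption made for the example.

```python
import os
import unittest

ENV_VAR = "GDM_TEST_POSTGRES_DSN"


def postgres_dsn_or_skip() -> str:
    """Hypothetical helper: return the DSN, or skip the calling test."""
    dsn = os.environ.get(ENV_VAR)
    if not dsn:
        # Assumed behavior: PostgreSQL tests are skipped rather than failed.
        raise unittest.SkipTest(f"{ENV_VAR} is not set; skipping PostgreSQL tests.")
    return dsn
```

Skipping keeps the SQLite tests runnable on machines without a PostgreSQL server.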
7 changes: 7 additions & 0 deletions docs/intro.md
@@ -29,6 +29,13 @@ To reduce code duplication and provide client packages with a standard interface

```{tableofcontents}
``` -->

## Persistence Guide

For SQL persistence workflows (SQLite files and PostgreSQL DSNs), see:

- {doc}`dist_system/sql_persistence`

## License

BSD 3-Clause License
6 changes: 6 additions & 0 deletions pyproject.toml
@@ -30,6 +30,8 @@ dependencies = [
"pandas~=2.2.3",
"geopandas",
"plotly",
"SQLAlchemy>=2.0",
"psycopg[binary]>=3.2",
]

[project.optional-dependencies]
@@ -98,6 +100,10 @@ allow-direct-references = true
[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["E402", "F401"]
"**/{tests,docs,tools}/*" = ["E402"]
"src/gdm/db/sqlite_store.py" = ["C901"]
"src/gdm/db/sqlite_store_cap_voltage_xfmr_reg.py" = ["C901"]
"src/gdm/db/sqlite_store_geometry.py" = ["C901"]
"src/gdm/db/sqlite_store_load_solar_battery.py" = ["C901"]

[tool.hatch.build.targets.wheel]
packages = ["src/gdm"]
19 changes: 19 additions & 0 deletions src/gdm/db/__init__.py
@@ -0,0 +1,19 @@
"""Database adapters for Grid Data Models."""

from gdm.db.store import (
    DEFAULT_DB_FORMAT_VERSION,
    inspect_snapshot_metadata,
    load_snapshot_payload,
    load_system_from_db,
    write_system_to_db,
)
from gdm.db.store import default_schema_path

__all__ = [
    "DEFAULT_DB_FORMAT_VERSION",
    "default_schema_path",
    "inspect_snapshot_metadata",
    "load_snapshot_payload",
    "load_system_from_db",
    "write_system_to_db",
]
97 changes: 97 additions & 0 deletions src/gdm/db/connection.py
@@ -0,0 +1,97 @@
"""Database connection target helpers.

This module centralizes validation and resolution logic for DB targets while the
storage layer transitions from path-based SQLite APIs to DSN-based backends.
"""

from __future__ import annotations

from pathlib import Path
from urllib.parse import urlparse


def resolve_db_url(db_path: str | Path | None = None, db_url: str | None = None) -> str:
    """Resolve a canonical DB URL from compatibility inputs.

    Parameters
    ----------
    db_path : str | Path | None
        Legacy path input for SQLite files.
    db_url : str | None
        DSN/URL input. Examples: ``sqlite:////tmp/system.db``,
        ``postgresql+psycopg://user:pass@host:5432/db``.
    """

    if db_url and db_path:
        raise ValueError("Provide either 'db_path' or 'db_url', not both.")

    if db_url:
        return db_url

    if db_path is None:
        raise ValueError("A database target is required. Provide 'db_url' or 'db_path'.")

    return f"sqlite:///{Path(db_path)}"


def sqlite_path_from_target(db_path: str | Path | None = None, db_url: str | None = None) -> Path:
    """Return a filesystem path for SQLite targets.

    This helper supports both direct file paths and SQLite URLs. Non-SQLite
    URLs are intentionally rejected in this module because PostgreSQL support is
    added incrementally in subsequent milestones.
    """

    resolved = resolve_db_url(db_path=db_path, db_url=db_url)
    parsed = urlparse(resolved)

    if parsed.scheme in {"", "sqlite"}:
        if parsed.scheme == "":
            return Path(resolved)

        if parsed.netloc not in {"", "localhost"}:
            raise ValueError(f"Unsupported SQLite URL host in '{resolved}'.")

        if not parsed.path:
            raise ValueError("SQLite URL must include a file path.")

        return Path(parsed.path)
Comment on lines +31 to +58

Copilot AI Apr 9, 2026

sqlite_path_from_target() currently converts db_path to a sqlite:///... URL via resolve_db_url(), then returns Path(parsed.path). For relative paths this yields an absolute path with a leading / (e.g., db_path='distribution.sqlite' becomes Path('/distribution.sqlite')), which will write/read from the filesystem root and likely fail. Similarly, a user-provided db_url='sqlite:///distribution.sqlite' will resolve to /distribution.sqlite instead of a relative path. Consider special-casing the db_path input to return Path(db_path) directly, or parsing SQLite URLs in a way that preserves relative paths (e.g., stripping the leading slash only for relative-URL forms).

    raise NotImplementedError(
        f"Database backend '{parsed.scheme}' is not supported yet in this module."
    )


def get_backend_name(db_path: str | Path | None = None, db_url: str | None = None) -> str:
    """Return normalized backend name from DB target.

    Returns
    -------
    str
        One of ``sqlite``, ``postgresql``, or the parsed scheme string for
        other DSN types.
    """

    resolved = resolve_db_url(db_path=db_path, db_url=db_url)
    parsed = urlparse(resolved)
    scheme = parsed.scheme or "sqlite"

    if scheme == "sqlite":
        return "sqlite"

    if scheme.startswith("postgresql"):
        return "postgresql"

    return scheme


def create_db_engine(db_path: str | Path | None = None, db_url: str | None = None):
    """Create a SQLAlchemy engine for the provided DB target."""

    try:
        from sqlalchemy import create_engine
    except ImportError as exc:
        raise ImportError("SQLAlchemy is required for DSN-based database engine support.") from exc

    resolved = resolve_db_url(db_path=db_path, db_url=db_url)
    return create_engine(resolved)
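
The path-resolution behavior flagged in the review comment on `sqlite_path_from_target()` can be reproduced with the standard library alone. The sketch below records what `urlparse` returns for the two SQLite URL forms and shows one possible way to preserve relative paths, assuming the SQLAlchemy convention that `sqlite:///file.db` is relative and `sqlite:////abs/file.db` is absolute. The helper name `sqlite_url_to_path` is hypothetical and not part of this PR.

```python
from pathlib import Path
from urllib.parse import urlparse


def sqlite_url_to_path(url: str) -> Path:
    """Hypothetical helper: map a SQLite URL to a filesystem path.

    Assumes the SQLAlchemy convention: three slashes for a relative path,
    four slashes for an absolute one.
    """
    parsed = urlparse(url)
    if parsed.scheme != "sqlite":
        raise ValueError(f"Not a SQLite URL: {url!r}")
    if not parsed.path or parsed.path == "/":
        raise ValueError("SQLite URL must include a file path.")
    # Drop exactly one leading slash: it only separates the (empty) host
    # from the path and is not part of the file location.
    return Path(parsed.path[1:])


# What the code under review sees before wrapping the value in Path(...):
relative_seen = urlparse("sqlite:///distribution.sqlite").path
absolute_seen = urlparse("sqlite:////tmp/system.db").path

relative_path = sqlite_url_to_path("sqlite:///distribution.sqlite")
absolute_path = sqlite_url_to_path("sqlite:////tmp/system.db")
```

Stripping a single slash is only one option; special-casing the `db_path` input, as the review comment suggests, would avoid the URL round trip entirely.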