Smart Grid Transformer Integration & Automated Scenario Generation Pipeline #288

Draft
Rohith-Kanathur wants to merge 30 commits into IBM:main from Rohith-Kanathur:feat/scenario-generator

@Rohith-Kanathur
Contributor

Overview

This PR delivers two major contributions to AssetOpsBench:

  1. A new Smart Grid Transformer asset class with four FMSR diagnostic tools grounded in IEC standards
  2. A multi-phase automated scenario generation pipeline (ScenarioGeneratorAgent) that scales benchmark creation to new asset classes without manual authoring

1. Smart Grid Transformer Integration

CouchDB Data

  • Added Smart Grid Transformer asset documents to CouchDB, including dissolved gas analysis (DGA) readings, winding temperature profiles, and electrical load profiles as new data sources.

Four New FMSR Tools

| Tool | Standard | Description |
| --- | --- | --- |
| predict_health_index | | Supervised regression model trained on the Mendeley Transformer Health Dataset; predicts a 0–100 health score across five condition bands (Very Good → Very Poor) |
| interpret_dga | IEC 60599 | Applies the Rogers Ratio method to classify transformer fault type from dissolved gas concentrations, with a confidence rating |
| assess_winding_temperature | IEC 60076-7 | Computes insulation ageing rate and thermal risk from winding/oil temperature inputs |
| assess_load_profile | IEC 60076-7 | Derives three-phase apparent load and classifies loading status against cyclic loading limits |
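
To illustrate the quantities the last two tools report, here is a minimal sketch; it is not the PR's implementation. The ageing-rate expressions are the relative ageing rate formulas from IEC 60076-7. The function names, the balanced-voltage assumption, the status labels, and the `limit_pu` default are all hypothetical.

```python
import math


def relative_ageing_rate(hot_spot_c: float, thermally_upgraded: bool = False) -> float:
    """Relative insulation ageing rate V per IEC 60076-7.

    V equals 1.0 at the reference hot-spot temperature: 98 °C for
    non-upgraded paper, 110 °C for thermally upgraded paper.
    """
    if thermally_upgraded:
        return math.exp(15000.0 / (110.0 + 273.0) - 15000.0 / (hot_spot_c + 273.0))
    return 2.0 ** ((hot_spot_c - 98.0) / 6.0)


def apparent_load_kva(line_voltage_kv: float, phase_currents_a: list[float]) -> float:
    """Three-phase apparent power in kVA.

    Assumes a balanced line-to-line voltage and sums V_LN * I over the
    per-phase currents (kV * A = kVA).
    """
    v_ln_kv = line_voltage_kv / math.sqrt(3.0)
    return sum(v_ln_kv * current for current in phase_currents_a)


def loading_status(load_kva: float, rated_kva: float, limit_pu: float = 1.5) -> str:
    """Classify the load in per-unit terms.

    limit_pu is an illustrative threshold, not necessarily the value the
    tool uses.
    """
    load_pu = load_kva / rated_kva
    if load_pu <= 1.0:
        return "normal"
    if load_pu <= limit_pu:
        return "overload_within_limit"
    return "overload_exceeds_limit"
```

With the non-upgraded formula the ageing rate doubles for every 6 K rise in hot-spot temperature, so `relative_ageing_rate(104.0)` is twice `relative_ageing_rate(98.0)`.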

Tests

  • Added unit tests for all four FMSR tools covering valid inputs, edge cases, and schema compliance.

2. Scenario Generation Pipeline (ScenarioGeneratorAgent)

A three-phase automated pipeline that generates physically plausible, causally consistent, and tool-reachable benchmark scenarios for any onboarded asset class.

Phase 1: Asset Profiling

Discovers live asset instances, sensors, and failure mappings from CouchDB, retrieves and synthesizes domain literature from ArXiv or Semantic Scholar, and merges everything with MCP tool schemas into a single structured AssetProfile that grounds all downstream generation.

Phase 2: Budget Allocation

Distributes the total scenario count across focuses (iot, fmsr, tsfm, wo, vibration, multiagent) proportionally to the asset's available data modalities and tool coverage, with multiagent capped at 75% of the total budget to preserve lane diversity.
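
A capped proportional split of this kind could look roughly like the sketch below. It assumes the per-focus weights have already been computed (how the pipeline derives them is not shown here), uses largest-remainder rounding, and hands any excess above the cap to the other focuses in order of weight. The function name and the redistribution rule are hypothetical.

```python
def allocate_budget(
    total: int,
    weights: dict[str, float],
    cap_focus: str = "multiagent",
    cap_fraction: float = 0.75,
) -> dict[str, int]:
    """Split `total` across focuses proportionally to `weights`, capping one focus."""
    weight_sum = sum(weights.values())
    if total < 0 or weight_sum <= 0:
        raise ValueError("total must be >= 0 and weights must sum to a positive value")

    # Largest-remainder rounding: floor every share, then give the leftover
    # units to the focuses with the largest fractional parts.
    exact = {focus: total * w / weight_sum for focus, w in weights.items()}
    counts = {focus: int(share) for focus, share in exact.items()}
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda f: exact[f] - counts[f], reverse=True)
    for focus in by_remainder[:leftover]:
        counts[focus] += 1

    # Enforce the cap and hand the excess to the other focuses, heaviest first.
    cap = int(total * cap_fraction)
    excess = counts.get(cap_focus, 0) - cap
    others = sorted(
        (f for f in counts if f != cap_focus), key=lambda f: weights[f], reverse=True
    )
    if excess > 0 and others:
        counts[cap_focus] = cap
        for i in range(excess):
            counts[others[i % len(others)]] += 1
    return counts
```

For example, with weights `{"iot": 1, "fmsr": 1, "multiagent": 18}` and a total of 20, the uncapped split would give multiagent 18 scenarios; the cap reduces that to 15 and the remaining 3 go to `iot` and `fmsr`.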

Phase 3: Scenario Generation & Validation

Generates per-focus scenarios conditioned on the asset profile, then runs each candidate through an LLM-based validation and repair step.

Output

Each run produces a timestamped directory with scenarios.json. Each scenario object contains an id, type, text, category, and characteristic_form.
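
A consumer of scenarios.json might check the five fields with something like the following sketch. It assumes the file holds a top-level JSON array of scenario objects, which the description above does not state explicitly; the helper names are hypothetical, and the field types are not checked because they are not specified here.

```python
import json

# Field names taken from the PR description.
REQUIRED_FIELDS = ("id", "type", "text", "category", "characteristic_form")


def missing_fields(scenario: dict) -> list[str]:
    """Return the required field names that are absent from one scenario."""
    return [field for field in REQUIRED_FIELDS if field not in scenario]


def validate_scenarios(raw_json: str) -> list[dict]:
    """Parse the file contents and fail loudly on the first incomplete scenario."""
    scenarios = json.loads(raw_json)
    if not isinstance(scenarios, list):
        raise ValueError("expected a JSON array of scenario objects")
    for index, scenario in enumerate(scenarios):
        missing = missing_fields(scenario)
        if missing:
            raise ValueError(f"scenario {index} is missing fields: {missing}")
    return scenarios
```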

CLI

# Closed-form generation
uv run python -m scenarios.generator "Transformer" --num-scenarios 50

# Grounded open-form with live CouchDB data
uv run python -m scenarios.generator "Transformer" --data-in-couchdb --num-scenarios 50

Rohith-Kanathur and others added 30 commits April 11, 2026 23:32
…ine for benchmark scenario creation

Introduces a fully automated 4-phase scenario generation pipeline driven by LiteLLM:

PHASE 1 — Asset Profile Construction
- LLM generates targeted ArXiv search queries from the asset's canonical academic name
- Fetches PDFs via ArXiv API (up to 2 per query, first 5 pages extracted via pypdf)
- Synthesises sensor mappings, failure modes, ISO standards, and relevant tool mappings
  into an AssetProfile (Pydantic model); fatal if unparseable
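
As a rough illustration of the query and dedup steps in Phase 1 (not the code in utils.py), the sketch below builds an arXiv API request URL and merges result batches from several queries. The endpoint and the `search_query`, `start`, and `max_results` parameters belong to the public arXiv API; the function names and the `id` key on each paper dict are assumptions, and fetching and PDF extraction are left out.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def arxiv_query_url(query: str, max_results: int = 2) -> str:
    """Build an arXiv API URL that searches all fields for `query`."""
    params = {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"


def dedupe_papers(batches: list[list[dict]]) -> list[dict]:
    """Merge per-query result batches, keeping the first copy of each paper id."""
    seen: set[str] = set()
    unique: list[dict] = []
    for batch in batches:
        for paper in batch:
            if paper["id"] not in seen:
                seen.add(paper["id"])
                unique.append(paper)
    return unique
```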

PHASE 2 — Scenario Budget Allocation
- LLM dynamically distributes the total scenario count across 5 subagent categories
  (iot, fmsr, tsfm, wo, multiagent) based on the AssetProfile
- Multiagent capped at 50% of total; fatal if allocation is unparseable

PHASE 3 — Individual Agent Generation & Validation (iot / fmsr / tsfm / wo)
- Fetches 2 typed few-shot examples from ibm-research/AssetOpsBench on HuggingFace
- SCENARIO_GENERATOR_PROMPT produces raw scenario dicts per subagent
- VALIDATE_REPAIR_PROMPT corrects schema, tool alignment, and characteristic_form quality
- Validation diffs (before/after) written to numbered log files when --log is active
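
A before/after diff of a repaired scenario can be produced with the standard library alone. The sketch below is one hypothetical way to do it, using `difflib.unified_diff` over pretty-printed JSON; the function name is an assumption, and writing the result to the numbered log files is omitted.

```python
import difflib
import json


def scenario_diff(before: dict, after: dict) -> str:
    """Return a unified diff between two scenario dicts (empty if identical)."""
    before_lines = json.dumps(before, indent=2, sort_keys=True).splitlines()
    after_lines = json.dumps(after, indent=2, sort_keys=True).splitlines()
    diff = difflib.unified_diff(
        before_lines, after_lines, fromfile="before", tofile="after", lineterm=""
    )
    return "\n".join(diff)
```

Sorting keys before diffing keeps the output stable when the repair step reorders fields without changing them.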

PHASE 4 — Multi-Agent Combiner
- MULTIAGENT_COMBINER_PROMPT seeds from up to 10 single-agent scenarios to produce
  complex cross-subagent orchestration scenarios (e.g. IoT → FMSR → WO)

CLI (python -m scenarios.generator):
  --num-scenarios N     Total scenarios to generate (default: 50)
  --output PATH         Output JSON path (default: generated_scenarios.json)
  --model-id MODEL      LiteLLM model override
  --show-workflow       Granular phase-by-phase terminal output with diffs
  --log                 Dump all raw prompts + responses to logs/<asset>_<ts>/
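
For readers who want to see how these flags map onto a parser, here is a minimal `argparse` sketch. The flag names and defaults come from the list above; the positional `asset` argument is inferred from the usage examples in the PR description, and the help strings and `build_parser` name are hypothetical. It is not the PR's actual entry point.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI flags."""
    parser = argparse.ArgumentParser(prog="scenarios.generator")
    parser.add_argument("asset", help="asset class name, e.g. Transformer")
    parser.add_argument("--num-scenarios", type=int, default=50,
                        help="total scenarios to generate")
    parser.add_argument("--output", default="generated_scenarios.json",
                        help="output JSON path")
    parser.add_argument("--model-id", default=None,
                        help="LiteLLM model override")
    parser.add_argument("--show-workflow", action="store_true",
                        help="print phase-by-phase output with diffs")
    parser.add_argument("--log", action="store_true",
                        help="dump raw prompts and responses to a log directory")
    return parser
```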

Supporting additions:
- models.py: AssetProfile, ScenarioBudget, Scenario Pydantic models
- prompts.py: 6 prompt templates (PROFILE_BUILDER, SCENARIO_GENERATOR,
  VALIDATE_REPAIR, MULTIAGENT_COMBINER, RESEARCH_QUERY_GENERATOR, BUDGET_ALLOCATOR)
- utils.py: fetch_arxiv_studies() with multi-query dedup + PDF extraction;
  fetch_hf_fewshot() with type-filtered HuggingFace loading + mock fallback
- Log header includes ArXiv paper titles and PDF URLs for full traceability
- src/scenarios/README.md: full usage docs, pipeline breakdown, output schema,
  troubleshooting table, and log file structure reference
@DhavalRepo18
Collaborator

@DhavalRepo18 DhavalRepo18 left a comment

Please see comment.

@ShuxinLin
Collaborator

Hi @Sagar-CK @Rohith-Kanathur, I would like to know why bulk_docs_transformer.json contains only 5 data points.

@Rohith-Kanathur
Contributor Author

Hi @ShuxinLin

The current bulk_docs_transformer.json contains only 5 data points because it was initially added as lightweight mock/test data to verify that CouchDB ingestion was working correctly.

The intent wasn't to model a full production-level smart grid dataset, but rather to validate the loader + CouchDB integration flow first. We can definitely expand the dataset with additional transformer documents.

I was thinking about using this dataset since it contains around 400 transformer data points: https://data.mendeley.com/datasets/rz75w3fkxy/1

Please let me know what you think.

Thank you.

@DhavalRepo18
Collaborator

@Rohith-Kanathur What did you need this sample data for? Is this PR for generating scenarios or for testing scenarios?
Any advice on how to collect the data would be useful.

@Rohith-Kanathur
Contributor Author

@DhavalRepo18 I was talking about this GitHub issue: #303
(I was NOT referring to this PR, which is about scenario generation.)
To load smart grid transformer data into CouchDB, we will need a transformer dataset (in JSON or CSV format).
This is similar to how it is done for the Chiller asset.
I suggested using a transformer dataset I found online (https://data.mendeley.com/datasets/rz75w3fkxy/1) since it has a good number of data points and contains relevant sensor readings.
