Data Platform Modernization Agent

Agentic AI for legacy-to-cloud data platform modernization. An agent pipeline moves enterprises off legacy systems and onto a modern cloud warehouse, with a human approving every AI-generated change.

Status: Working prototype / demo-stage · mock data · solo build · AIBoomi Startup Weekend

Live demo: https://jemathew.github.io/data-platform-modernization-agent/ · Demo video: {DEMO_VIDEO} · Repo: https://github.com/JEMathew/data-platform-modernization-

Overview

Why now. Legacy data platforms (Oracle, Teradata, SQL Server, Hadoop, Informatica) carry fixed licensing and hardware costs, can't feed modern AI/analytics, don't scale, depend on a shrinking PL/SQL talent pool, and eventually reach end-of-life, so enterprises must modernize. But ~83% of migrations run over budget or fail, with legacy complexity the #1 cause. A project is usually kicked off by a concrete trigger (a license renewal, an AI mandate, an end-of-life deadline, a capacity wall, M&A, or a stalled prior attempt), and each is a buying moment.

{PRODUCT_NAME} is a unified agent console that runs a pipeline of AI agents — Profiler, Mapper, Code-gen, Validator — to assess a legacy estate, map it to a modern target (Snowflake, BigQuery, Databricks, Fabric), generate the migration code, and validate the result. A deterministic engine does the schema translation; an optional LLM handles the ambiguous procedural code — and a human approves every draft before anything ships.
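The approval gate described above can be sketched in a few lines. This is a minimal illustration, not the prototype's code: the `Draft` shape, the `LOW_CONFIDENCE` threshold of 0.8, and the function names are all assumptions made for the example. The point it shows is that nothing becomes shippable until a human has explicitly approved it, and that low-confidence drafts surface first in the review queue.

```python
from dataclasses import dataclass
from enum import Enum

# Assumed cut-off for flagging; the prototype's real threshold is not documented.
LOW_CONFIDENCE = 0.8


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Draft:
    """One generated artifact awaiting human review (hypothetical shape)."""
    name: str
    sql: str
    confidence: float              # 0.0 to 1.0, assumed to come from the generator
    status: Status = Status.PENDING

    @property
    def flagged(self) -> bool:
        return self.confidence < LOW_CONFIDENCE


def approve(draft, edited_sql=None):
    """Approve a draft, optionally replacing its SQL with the reviewer's edit."""
    if edited_sql is not None:
        draft.sql = edited_sql
    draft.status = Status.APPROVED


def reject(draft):
    draft.status = Status.REJECTED


def review_queue(drafts):
    """Pending drafts, lowest confidence first so flagged items come up on top."""
    pending = (d for d in drafts if d.status is Status.PENDING)
    return sorted(pending, key=lambda d: d.confidence)


def shippable(drafts):
    """Only explicitly approved drafts may ship; pending and rejected never do."""
    return [d for d in drafts if d.status is Status.APPROVED]
```

Keeping `shippable` as a filter on an explicit status, rather than on confidence, is one way to make sure a high-confidence draft still cannot bypass the human.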

The idea / why now

End-to-end automated migration already exists — but as services-heavy, six-figure accelerator engagements from specialist vendors. Frontier LLMs now make the hardest part — procedural-code translation — automatable as a self-serve product instead of a consulting project. Our wedge is the product-led experience and the in-app human-in-the-loop approval gate, not "vendor-neutral end-to-end" (which is now table stakes).

Why modernize (drivers). Legacy platforms carry fixed licensing/hardware cost, can't feed modern AI/analytics, don't scale, depend on a shrinking PL/SQL talent pool, slow time-to-insight, and reach end-of-life. Cloud warehouses/lakehouses fix all of these — but ~83% of migrations run over budget or fail, with legacy complexity the #1 cause.

What starts a project now (triggers). A license renewal or hardware refresh, an AI mandate, a cloud-first program, product end-of-life, a performance/capacity wall, M&A consolidation, a regulatory change, attrition of the last engineer who knows the legacy procs, FinOps cost-cutting, or a stalled prior attempt. Each trigger is also a buying moment — exactly when a team goes looking for this tool.

What it does

Step	Agent	What happens
Assess	Profiler	Surfaces schema messiness, complexity, and risk on the source
Map	Mapper	Proposes source→target field mapping (rule-driven, clickable)
Generate	Code-gen	Transpiles to target DDL, migration SQL, and stored-procedure hand-off (offline engine; optional LLM)
Review	— (human gate)	Low-confidence output flagged for approve / edit / reject
Validate	Validator	Deterministic row- and cell-level reconciliation scorecard

Plus console views for ROI, lineage, and AI-readiness.
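The Validate step in the table above is described as deterministic row- and cell-level reconciliation. A rough sketch of what such a scorecard could compute is shown here; the function name `reconcile`, the dictionary-per-row input, and the scorecard fields are assumptions for illustration, and the sketch assumes each row has a unique key.

```python
def reconcile(source_rows, target_rows, key):
    """Compare two row sets exactly and return a reconciliation scorecard.

    Rows are dicts; `key` names the column that uniquely identifies a row.
    """
    src = {row[key]: row for row in source_rows}
    tgt = {row[key]: row for row in target_rows}

    # Row level: keys present on one side only.
    missing = sorted(src.keys() - tgt.keys())
    extra = sorted(tgt.keys() - src.keys())

    # Cell level: for rows present on both sides, compare every source column.
    cells_compared = 0
    mismatches = []
    for k in sorted(src.keys() & tgt.keys()):
        for column, expected in src[k].items():
            cells_compared += 1
            actual = tgt[k].get(column)
            if actual != expected:
                mismatches.append((k, column, expected, actual))

    return {
        "source_rows": len(src),
        "target_rows": len(tgt),
        "missing_in_target": missing,
        "extra_in_target": extra,
        "cells_compared": cells_compared,
        "cell_mismatches": mismatches,
        "passed": not (missing or extra or mismatches),
    }
```

Because every comparison is an exact equality check, the same inputs always produce the same scorecard, which is the property the README contrasts with an LLM guess.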

Tech stack & tools

Front end: standalone HTML / CSS / JavaScript agent console (runs offline; no backend required to view).
Code-gen engine: a deterministic transpiler runs fully in-browser (no model, no network) — it parses a pasted legacy schema and generates target DDL, migration SQL, type mappings, and confidence-flagged fields in real time, including on inputs it has never seen. Procedural code (PL/SQL, cursors) is detected and routed to the human review gate rather than auto-converted.
Optional LLM step: the Code-gen UI accepts an Anthropic API key at runtime; if provided, ambiguous translation is sent to a model (Claude) instead of the offline engine. No key is bundled and none is required to run the demo.
Deterministic by design: profiling, orchestration, and validation are rule-based, not model-driven — validation is exact reconciliation, not an LLM guess.
Targets: Snowflake, BigQuery, Databricks, Microsoft Fabric. Sources: Oracle, Teradata, SQL Server, Hadoop, Informatica.
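To make the deterministic Code-gen idea concrete, here is a small sketch of rule-based DDL translation with confidence flags. It is written in Python for brevity, whereas the prototype's engine runs in the browser, and it is not the prototype's code: the Oracle-to-Snowflake `TYPE_MAP`, the `transpile` function, and the result shape are assumptions. It also assumes a single unqualified `CREATE TABLE` containing only column definitions (no table-level constraints).

```python
import re

# Illustrative Oracle -> Snowflake mapping; an assumption, not the prototype's rule table.
TYPE_MAP = {
    "VARCHAR2": "VARCHAR",
    "NVARCHAR2": "VARCHAR",
    "NUMBER": "NUMBER",
    "DATE": "TIMESTAMP_NTZ",   # Oracle DATE carries a time component
    "CLOB": "VARCHAR",
    "BLOB": "BINARY",
}
KEEPS_ARGS = {"VARCHAR2", "NVARCHAR2", "NUMBER"}   # types whose (length/precision) carries over

PROCEDURAL = re.compile(
    r"\b(CREATE\s+(OR\s+REPLACE\s+)?(PROCEDURE|FUNCTION|PACKAGE|TRIGGER)|CURSOR)\b",
    re.IGNORECASE,
)


def split_columns(body):
    """Split a column list on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def transpile(ddl):
    """Return target DDL plus the columns a human should check."""
    if PROCEDURAL.search(ddl):
        # Procedural code is routed to review instead of being auto-converted.
        return {"route": "human_review", "ddl": None, "flagged": []}
    m = re.search(r"CREATE\s+TABLE\s+(\w+)\s*\((.*)\)", ddl, re.IGNORECASE | re.DOTALL)
    if not m:
        raise ValueError("no CREATE TABLE statement found")
    lines, flagged = [], []
    for col in split_columns(m.group(2)):
        cm = re.match(r"(\w+)\s+(\w+)\s*(\([^)]*\))?\s*(.*)", col, re.DOTALL)
        if not cm:
            raise ValueError(f"cannot parse column: {col!r}")
        name, src_type = cm.group(1), cm.group(2).upper()
        args, rest = cm.group(3) or "", cm.group(4)
        if src_type in TYPE_MAP:
            target = TYPE_MAP[src_type] + (args if src_type in KEEPS_ARGS else "")
            # A bare Oracle NUMBER can hold fractions; Snowflake NUMBER defaults to scale 0.
            confident = not (src_type == "NUMBER" and not args)
        else:
            target, confident = "VARCHAR", False   # placeholder until a human picks a type
        if not confident:
            flagged.append(name)
        lines.append(f"{name} {target}" + (f" {rest}" if rest else ""))
    ddl_out = f"CREATE TABLE {m.group(1)} (\n  " + ",\n  ".join(lines) + "\n);"
    return {"route": "auto", "ddl": ddl_out, "flagged": flagged}
```

Calling `transpile` on a pasted `CREATE TABLE` returns the translated DDL together with the names of columns whose mapping is uncertain (an unmapped type, or a `NUMBER` with no precision), which is the kind of output a review gate can then put in front of a human.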

What's real vs. roadmap (honest)

Real in the prototype: agent console; Profiler / Mapper / Code-gen / Validator; Assessment, Mapping, Code (DDL/SQL/proc), Review (human-in-the-loop), Validation, AI-readiness; a working offline transpiler that turns pasted legacy DDL into target-cloud code live (optional LLM step if a key is supplied).

Roadmap (not built): auto-discovery across many databases, dependency graph, deployment / cutover, documentation agent, real source/target connectors, ETL-pipeline and BI/report migration, security & access control, pluggable agent registry.

Run it / view it

Clone: git clone {REPO_URL}
Open migration-agent-app.html in a modern browser — the full console runs offline on mock data.
(Optional) To enable live Code-gen, set {ANTHROPIC_API_KEY} per the note in /config, then re-run the Code module.

Demo

Video walkthrough: {DEMO_VIDEO}
Hosted app: {DEMO_LINK}
Screenshots: add console-overview.png, code-gen-live.png, review-gate.png to /docs/img.

Roadmap

Migrate (now) → real connectors + Discovery / dependency graph → deployment & cutover → Documentation & Architecture agents → pluggable agent platform → broader modernization (governance, quality, AI-readiness).

Competitive note

Mature specialists (Next Pathway, LeapLogic) already deliver vendor-neutral, end-to-end, ~95%-automated migration as consulting-led engagements. We don't claim to out-feature them; we compete on a self-serve, product-led experience with a human-in-the-loop gate for teams who can't or won't run a six-figure program.

Author

Jincen E. Mathew — https://www.linkedin.com/in/jincenmathew/

Built for AIBoomi Startup Weekend. Demo runs on synthetic data; no customer or licensed data is used.

