ThreatGraph is a cybersecurity analysis app built for the LangChain x SurrealDB hackathon.
It combines:
- SurrealDB as the persistent structured memory layer
- LangGraph as the workflow/orchestration engine
- LangChain model wrappers for optional LLM synthesis
- MITRE ATT&CK as the threat knowledge base
- seeded enterprise asset data, topology, controls, and threat vectors
- CVE correlation from NVD and KEV enrichment from CISA
The result is an app that can answer questions like:
- Which asset should I patch first?
- Which of my internal assets are relevant to APT29?
- What attack paths exist from exposed systems to crown jewels?
- What evidence supports that answer?
This project is not just a chat UI on top of text files.
It demonstrates the exact things the hackathon is scoring:
SurrealDB stores:
- ATT&CK threat groups, techniques, software, mitigations
- enterprise assets, software versions, CVEs
- controls, network segments, threat vectors
- attack relationships and evidence chains
- investigation history and checkpoints
LangGraph runs the query workflow as explicit nodes and state transitions:
- classify the question
- route to the correct graph/CVE logic
- collect evidence
- synthesize the result
- generate remediation guidance
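The node sequence above can be sketched in plain Python. This is only a stand-in for the idea of explicit nodes passing a shared state along; the real workflow lives in LangGraph, and every name here (`InvestigationState`, `run_workflow`, the keyword rules in `classify`) is a hypothetical illustration, not the project's code.

```python
from typing import Callable, List, TypedDict


class InvestigationState(TypedDict, total=False):
    question: str
    route: str
    evidence: List[str]
    synthesis: str
    remediation: str


def classify(state: InvestigationState) -> InvestigationState:
    # Hypothetical keyword routing; the real classifier may work differently.
    q = state["question"].lower()
    if "patch" in q or "cve" in q:
        route = "cve"
    elif "path" in q:
        route = "attack_path"
    else:
        route = "threat_group"
    return {**state, "route": route}


def collect_evidence(state: InvestigationState) -> InvestigationState:
    # Placeholder evidence; the real node would query SurrealDB here.
    return {**state, "evidence": [f"evidence for route '{state['route']}'"]}


def synthesize(state: InvestigationState) -> InvestigationState:
    count = len(state["evidence"])
    return {**state, "synthesis": f"{count} evidence item(s) support the answer"}


def remediate(state: InvestigationState) -> InvestigationState:
    return {**state, "remediation": f"review assets flagged by the {state['route']} route"}


# Each node takes the state and returns an updated copy.
NODES: List[Callable[[InvestigationState], InvestigationState]] = [
    classify,
    collect_evidence,
    synthesize,
    remediate,
]


def run_workflow(question: str) -> InvestigationState:
    state: InvestigationState = {"question": question}
    for node in NODES:
        state = node(state)
    return state


if __name__ == "__main__":
    print(run_workflow("Which asset should I patch first?"))
```

Keeping each step as a separate function that returns a new state makes the transitions easy to inspect and test one node at a time.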
The app keeps an investigation thread ID.
That ID is used to:
- write checkpoints
- save investigation summaries in SurrealDB
- reload prior context for follow-up questions
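As a rough illustration of thread-scoped memory, the sketch below keeps summaries in a dictionary keyed by thread ID. It is an in-memory stand-in, assuming nothing about the actual SurrealDB tables or the checkpoint format; `InvestigationMemory` and its methods are hypothetical names.

```python
from typing import Dict, List


class InvestigationMemory:
    """In-memory stand-in for the SurrealDB-backed investigation store."""

    def __init__(self) -> None:
        self._summaries: Dict[str, List[str]] = {}

    def save_summary(self, thread_id: str, summary: str) -> None:
        # Append so earlier turns in the same thread are preserved.
        self._summaries.setdefault(thread_id, []).append(summary)

    def load_context(self, thread_id: str, limit: int = 3) -> List[str]:
        # Return the most recent summaries for follow-up questions.
        return self._summaries.get(thread_id, [])[-limit:]


if __name__ == "__main__":
    memory = InvestigationMemory()
    memory.save_summary("thread-1", "Ranked assets for patching")
    memory.save_summary("thread-1", "Mapped assets relevant to APT29")
    print(memory.load_context("thread-1"))
```

A follow-up question in the same thread would call `load_context` first, so the earlier findings can be passed into the next workflow run.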
The use case is concrete:
- security exposure analysis
- threat-group relevance mapping
- patch prioritization
- attack-path visualization
The app exposes:
- investigation thread state
- evidence bundle counts
- recent investigation history
- LangSmith tracing status
The system has four practical layers:
From MITRE ATT&CK:
- threat_group
- technique
- software
- mitigation
- edges like uses, employs, mitigates, belongs_to
From the seeded internal environment:
- asset
- software_version
- cve
- network_segment
- security_control
- threat_vector
This is the important glue:
- software_version -> linked_to_software -> software
- software_version -> has_cve -> cve
- cve -> affects -> asset
- asset evidence bundles built from those relationships
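To make the relationship chain concrete, here is a minimal sketch that walks those three edge types backwards from an asset to assemble an evidence bundle. The edges are held in a plain list rather than in SurrealDB, and the record IDs (`asset:web01`, `cve:EXAMPLE-0001`, and so on) are made-up placeholders; `build_evidence_bundle` is a hypothetical helper, not a function from the repository.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)

# Placeholder records; real IDs would come from the ingest step.
EDGES: List[Edge] = [
    ("software_version:nginx_1_18", "linked_to_software", "software:nginx"),
    ("software_version:nginx_1_18", "has_cve", "cve:EXAMPLE-0001"),
    ("cve:EXAMPLE-0001", "affects", "asset:web01"),
    ("cve:EXAMPLE-0002", "affects", "asset:db01"),
]


def build_evidence_bundle(asset_id: str, edges: List[Edge]) -> Dict[str, object]:
    # cve -> affects -> asset
    cves = sorted({s for s, r, t in edges if r == "affects" and t == asset_id})
    # software_version -> has_cve -> cve
    versions = sorted({s for s, r, t in edges if r == "has_cve" and t in cves})
    # software_version -> linked_to_software -> software
    software = sorted(
        {t for s, r, t in edges if r == "linked_to_software" and s in versions}
    )
    return {
        "asset": asset_id,
        "cves": cves,
        "software_versions": versions,
        "software": software,
    }


if __name__ == "__main__":
    print(build_evidence_bundle("asset:web01", EDGES))
```

In the app itself this traversal would presumably be a graph query against SurrealDB; the list comprehension version only shows which edges connect which record types.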
- LangGraph workflow in src/agents/workflow.py
- Streamlit UI in app.py
- graph visualization in src/tools/graph_viz.py
The user types a question.
The workflow:
- classifies the question
- queries SurrealDB for the right evidence
- optionally enriches with CVE lookup data
- produces a synthesis and remediation output
- saves investigation context under the current thread ID
The app ranks assets using actual evidence bundles, not only static metadata.
The score considers:
- CVE severity
- KEV presence
- criticality
- criticality score
- crown-jewel status
- network zone
- controls
- threat vectors
- ATT&CK relevance
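One way such a score could be combined is shown below. The weights, field names, and the `internet_facing` flag used as a proxy for network zone are all assumptions made for illustration; the app's real scoring formula is not documented here and may differ substantially.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AssetEvidence:
    name: str
    max_cvss: float              # highest CVE severity on the asset, 0-10
    kev_count: int               # CVEs listed in the CISA KEV catalog
    criticality_score: int       # hypothetical 1-5 business criticality
    is_crown_jewel: bool
    internet_facing: bool        # simplified stand-in for network zone
    control_count: int           # security controls protecting the asset
    threat_vector_count: int
    attack_technique_count: int  # ATT&CK techniques relevant to the asset


def priority_score(e: AssetEvidence) -> float:
    # Illustrative weights only: risk factors add, controls subtract.
    score = e.max_cvss * 2.0
    score += 10.0 if e.kev_count > 0 else 0.0
    score += e.criticality_score * 2.0
    score += 8.0 if e.is_crown_jewel else 0.0
    score += 5.0 if e.internet_facing else 0.0
    score += e.threat_vector_count * 1.0
    score += e.attack_technique_count * 0.5
    score -= e.control_count * 1.5
    return max(score, 0.0)


def rank_assets(assets: List[AssetEvidence]) -> List[AssetEvidence]:
    return sorted(assets, key=priority_score, reverse=True)


if __name__ == "__main__":
    web = AssetEvidence("web01", 9.8, 1, 5, False, True, 1, 2, 4)
    laptop = AssetEvidence("laptop07", 5.0, 0, 2, False, False, 3, 1, 0)
    for asset in rank_assets([laptop, web]):
        print(asset.name, round(priority_score(asset), 1))
```

Subtracting for controls reflects the idea that compensating defenses lower urgency, while KEV presence gets a flat bonus because known exploitation matters more than the number of listed CVEs.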
The attack graph shows:
- enterprise assets
- software and CVEs
- controls and threat vectors
- optional ATT&CK threat layer
- exploit-sequence attack paths from entry to crown jewels
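Attack paths from an entry point to crown jewels can be enumerated with a breadth-first search over the topology. The sketch below assumes a simple adjacency dictionary and made-up asset names; `find_attack_paths` is a hypothetical helper and does not reflect how src/tools/graph_viz.py actually computes paths.

```python
from collections import deque
from typing import Dict, List, Set


def find_attack_paths(
    topology: Dict[str, List[str]],
    entry: str,
    crown_jewels: Set[str],
    max_hops: int = 5,
) -> List[List[str]]:
    """Return simple paths from entry to any crown jewel, shortest first."""
    paths: List[List[str]] = []
    queue = deque([[entry]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in crown_jewels and len(path) > 1:
            paths.append(path)
            continue  # stop at the first crown jewel reached on this path
        if len(path) - 1 >= max_hops:
            continue  # hop limit reached
        for neighbor in topology.get(node, []):
            if neighbor not in path:  # avoid cycles
                queue.append(path + [neighbor])
    return paths


if __name__ == "__main__":
    # Placeholder topology: which asset can reach which.
    topology = {
        "vpn_gateway": ["jump_host", "web01"],
        "jump_host": ["app01"],
        "web01": ["app01"],
        "app01": ["db01"],
    }
    for path in find_attack_paths(topology, "vpn_gateway", {"db01"}):
        print(" -> ".join(path))
```

The hop limit keeps the search bounded on dense topologies, and skipping nodes already on the path prevents loops.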
The asset deep dive shows:
- software inventory
- CVEs
- controls
- threat vectors
- per-software vulnerability evidence
If you want the full deep explanation, read these in order:
Use the verified local file-backed DB path:
export SURREALDB_URL="file://$(pwd)/.context/dev-core-clean.db"
export SURREALDB_NS="threatgraph"
export SURREALDB_DB="main"
streamlit run app.py --server.address 127.0.0.1 --server.port 8501
If you want a fresh ingest:
export SURREALDB_URL="file://$(pwd)/.context/app.db"
export SURREALDB_NS="threatgraph"
export SURREALDB_DB="main"
python3 ingest.py
streamlit run app.py --server.address 127.0.0.1 --server.port 8501
The core graph, workflow, persistence, and evidence-backed queries are now real.
The remaining simplifications are mostly presentational:
- some instructional UI text
- some fallback narrative when no LLM key is configured
- the app is still a hackathon prototype, not a production SOC platform