jasonwang1211/security-ai-agent

Sentinel Project - AI-Assisted Blue-Team Security Triage

Sentinel Project is an AI-assisted blue-team security triage prototype. It implements a SOC-style Streamlit analyst console where supported inputs are classified by rule-based logic, assigned deterministic Risk Level / Decision values, and enriched with optional AI/RAG advisory context. The AI features are visible in the workflow, but they do not own the verdict path.

The repository is written for project review, demo walkthroughs, and portfolio discussion. It is not a production IDS/IPS, not a red-team tool, and not an autonomous response system.

Screenshot Showcase

Analyst Console (overview)

[Screenshot: Sentinel Project analyst console home]

The console is the main demo surface: scenario cards, language and mode controls, and visible safety framing. BLOCK / MONITOR / ALLOW are simulated; no real enforcement is executed.

Command Injection Result (overview)

[Screenshot: Command Injection deterministic result]

Running the input "test; rm -rf /tmp/test" produces a deterministic verdict: Command Injection, Risk HIGH, simulated Decision BLOCK, backed by rule evidence CMD-001.
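The rule-based path behind this result can be sketched roughly as follows. This is a minimal illustration, not the repository's detector: the regular expression, the `Rule` and `Verdict` types, the `classify` function, and the "No Match" / LOW / ALLOW default are all assumptions. Only the rule ID CMD-001 and the HIGH / BLOCK outcome come from the text above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    rule_id: str
    category: str
    pattern: re.Pattern
    risk: str
    decision: str  # simulated only; nothing is enforced


# Hypothetical rule table. The pattern is an illustrative guess at what a
# command-injection rule might look for: a shell separator followed by a
# commonly abused command name.
RULES = [
    Rule(
        rule_id="CMD-001",
        category="Command Injection",
        pattern=re.compile(r"(;|&&|\|\|?)\s*(rm|cat|wget|curl|nc|bash|sh)\b"),
        risk="HIGH",
        decision="BLOCK",
    ),
]


@dataclass(frozen=True)
class Verdict:
    category: str
    risk: str
    decision: str
    rule_id: Optional[str]


def classify(payload: str) -> Verdict:
    """Return the verdict of the first matching rule, else an assumed default."""
    for rule in RULES:
        if rule.pattern.search(payload):
            return Verdict(rule.category, rule.risk, rule.decision, rule.rule_id)
    # Assumed fallback for inputs that match no rule.
    return Verdict("No Match", "LOW", "ALLOW", None)


print(classify("test; rm -rf /tmp/test"))
```

Because the verdict depends only on the payload and the rule table, the same input always yields the same Risk Level and Decision, which is the property the console relies on.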

Evidence-Grounded AI Brief — Official Verdict

[Screenshot: Evidence-Grounded AI Brief official verdict detail]

The brief copies the official deterministic verdict (Risk HIGH / Decision BLOCK) and is advisory only. It reports llm_status: not_used_deterministic_fallback because no live LLM is wired.
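One way to picture the fallback behavior is a brief builder that copies the official verdict verbatim and only fills in wording. The sketch below is an assumption about shape, not the project's code: `build_brief`, the dictionary keys, and the "used" status value are hypothetical. Only the status string `not_used_deterministic_fallback` is taken from the README.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Verdict:
    category: str
    risk: str
    decision: str
    rule_id: str


def build_brief(
    verdict: Verdict,
    llm: Optional[Callable[[Verdict], str]] = None,
) -> dict:
    """Build an advisory brief that never changes the official verdict."""
    brief = {
        # Copied from the deterministic verdict, never recomputed here.
        "official_risk": verdict.risk,
        "official_decision": verdict.decision,
        "rule_evidence": [verdict.rule_id],
        "advisory_only": True,
    }
    if llm is None:
        # No live LLM client: produce a templated summary instead.
        brief["summary"] = f"{verdict.category} matched rule {verdict.rule_id}."
        brief["llm_status"] = "not_used_deterministic_fallback"
    else:
        # Hypothetical branch; the README states no live LLM is wired.
        brief["summary"] = llm(verdict)
        brief["llm_status"] = "used"
    return brief


brief = build_brief(Verdict("Command Injection", "HIGH", "BLOCK", "CMD-001"))
print(brief["llm_status"])
```

Even in the hypothetical LLM branch, the model only supplies the summary text; the risk and decision fields are still copied from the verdict.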

Evidence-Grounded AI Brief — Advisory Context

[Screenshot: Evidence-Grounded AI Brief advisory context detail]

After Find Similar Cases, the brief cites structured advisory context: an approved similar case (case-001) that is not proof of compromise, and graph relationship context (graph-001) that is not a detection source.
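A structured citation could carry its caveat alongside the reference so that the limitation travels with the ID. The following is a sketch under assumptions: the `Citation` type, its field names, and `cite_advisory_context` are hypothetical, while the IDs case-001 and graph-001 and the two caveats come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Citation:
    ref_id: str
    kind: str    # "similar_case" or "graph_context" (assumed labels)
    caveat: str  # limitation that must accompany the citation


def cite_advisory_context(
    similar_case_ids: Iterable[str],
    graph_ids: Iterable[str],
) -> List[Citation]:
    """Attach the fixed caveat for each kind of advisory context."""
    citations = [
        Citation(cid, "similar_case", "not proof of compromise")
        for cid in similar_case_ids
    ]
    citations += [
        Citation(gid, "graph_context", "not a detection source")
        for gid in graph_ids
    ]
    return citations


for c in cite_advisory_context(["case-001"], ["graph-001"]):
    print(f"{c.ref_id}: {c.caveat}")
```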

Markdown Export — Evidence-Grounded Section

[Screenshot: Markdown export Evidence-Grounded section]

The Markdown export includes the Evidence-Grounded AI Brief section with schema version, official Risk Level / Decision, and case-001 / graph-001 citations. (Rendered from the real export markdown.)
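As a rough illustration of how such a section might be rendered, the sketch below turns a brief dictionary into Markdown. The function name `render_brief_section`, the dictionary keys, the heading level, and the schema version value `example-1` are placeholders chosen for this example; the real export layout may differ.

```python
from typing import Dict, List


def render_brief_section(brief: Dict) -> str:
    """Render the brief as a Markdown section with verdict and citations."""
    lines: List[str] = [
        "## Evidence-Grounded AI Brief",
        "",
        f"- Schema version: {brief['schema_version']}",
        f"- Official Risk Level: {brief['official_risk']}",
        f"- Official Decision (simulated): {brief['official_decision']}",
        "- Citations:",
    ]
    for ref_id, caveat in brief["citations"]:
        # Each citation keeps its caveat so the export cannot overstate it.
        lines.append(f"  - {ref_id} ({caveat})")
    return "\n".join(lines) + "\n"


example = {
    "schema_version": "example-1",  # placeholder, not the real version
    "official_risk": "HIGH",
    "official_decision": "BLOCK",
    "citations": [
        ("case-001", "not proof of compromise"),
        ("graph-001", "not a detection source"),
    ],
}
print(render_brief_section(example))
```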

HTTP/2 Resource Exhaustion Safe Demo (overview)

[Screenshot: HTTP/2 Resource Exhaustion safe synthetic demo]

A safe synthetic incident: deterministic verdict HTTP/2 Resource Exhaustion Suspicion, Risk MEDIUM, simulated Decision MONITOR (rule HTTP2-RES-001). No traffic is generated and no real enforcement occurs.
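A rule of this kind can be evaluated over a plain description of an incident, with no network activity at all. The sketch below assumes a synthetic incident record with `rst_stream_count` and `window_seconds` fields and a reset-rate threshold of 50 per second; those names and numbers are invented for illustration. The rule ID HTTP2-RES-001 and the MEDIUM / MONITOR outcome are from the text above.

```python
from typing import Dict, Optional

# Assumed threshold: stream resets per second that count as suspicious.
RESET_RATE_THRESHOLD = 50.0


def evaluate_http2_incident(incident: Dict[str, float]) -> Dict[str, Optional[str]]:
    """Classify a synthetic incident record; reads counters only."""
    resets = incident.get("rst_stream_count", 0)
    window = max(incident.get("window_seconds", 1), 1)  # avoid divide by zero
    rate = resets / window
    if rate >= RESET_RATE_THRESHOLD:
        return {
            "category": "HTTP/2 Resource Exhaustion Suspicion",
            "risk": "MEDIUM",
            "decision": "MONITOR",  # simulated; nothing is enforced
            "rule_id": "HTTP2-RES-001",
        }
    # Assumed default for records below the threshold.
    return {"category": "No Match", "risk": "LOW", "decision": "ALLOW", "rule_id": None}


synthetic = {"rst_stream_count": 600, "window_seconds": 10}
print(evaluate_http2_incident(synthetic))
```

The input is a hand-written dictionary rather than captured traffic, which keeps the demo safe to run anywhere.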

Core Capabilities

| Capability | What it shows | Authority level |
| --- | --- | --- |
| Rule-Based Detector | Reproducible classification for supported payload and incident patterns. | Detection authority |
| Deterministic Risk / Decision | Deterministic Risk Level plus simulated BLOCK / MONITOR / ALLOW. | Decision authority |
| Fast deterministic mode | Quick demo path without optional AI/RAG warm-up. | Deterministic path |
| Full AI-assisted mode | Optional AI/RAG explanation path. | Advisory only |
| AI Analyst Brief | Event summary, why it matters, next steps, unsafe assumptions. | Advisory only |
| Evidence-Grounded AI Brief | Cited, structured brief over deterministic evidence, gaps, and optional similar-case / graph context, with a deterministic fallback. | Advisory only |
| Evidence Gap Analyzer | Confirmed facts, missing evidence, recommended checks. | Advisory only |
| Knowledge Q&A / RAG | Defensive knowledge answers from approved context. | Advisory only |
| Approved Similar Cases | Read-only comparison against hand-curated approved seed cases; not proof of compromise. | Advisory only |
| Relationship Graph | Visual context for event, rule, risk, decision, and case links; not a detection source. | Advisory only |
| Case Draft / Markdown Export | Human-reviewed report material. | Human review required |

Quick Start

git clone https://github.com/jasonwang1211/security-ai-agent.git
cd security-ai-agent
python -m venv venv
.\venv\Scripts\Activate.ps1   # Windows PowerShell; on macOS/Linux use: source venv/bin/activate
pip install -r requirements.txt
python -m streamlit run ui/streamlit_app.py --server.fileWatcherType none

Recommended first demo path:

  1. Select Fast deterministic mode.
  2. Load Command Injection Demo or HTTP/2 Resource Exhaustion Suspicion.
  3. Click Run input.
  4. Review deterministic classification, Risk Level, and simulated Decision.
  5. Open AI Analyst, Case Intelligence, Draft / Export, and the screenshot gallery as needed.

Documentation

Start with the documentation hub: docs/README.md.

| Need | Read |
| --- | --- |
| Formal project report | REPORT.md |
| Demo operation and troubleshooting | User operation guide |
| Step-by-step UI walkthrough | UI walkthrough |
| Screenshots / feature gallery | Screenshot gallery |
| Validation evidence | Test report, v2.9 release gate, and v2.9 release notes |
| Technical architecture notes | Technical notes |
| Roadmap | Roadmap |
| Traditional Chinese materials | zh-TW overview and zh-TW report |

Validation Summary

Last recorded v2.9 release-gate validation summary:

  • pytest: 1236 passed
  • ruff: passed
  • mypy: passed, no issues found in 172 source files
  • git diff --check: passed
  • AppTest UI smoke: Run -> Find Similar Cases -> case-001 / graph-001, 0 exceptions

These checks validate demo behavior and safety-boundary regressions. They do not claim production IDS/IPS effectiveness.

Safety Boundary

  • Rule-Based Detector is the detection authority.
  • Risk Level / Decision are deterministic.
  • BLOCK / MONITOR / ALLOW are simulated decisions only.
  • RAG / LLM / AI Analyst Brief / Evidence-Grounded AI Brief / Evidence Gap Analyzer / Similar Cases / Relationship Graph provide advisory context only and do not override the official Risk Level or Decision.
  • Approved Similar Cases are comparison context only and do not prove current compromise or successful execution.
  • Relationship Graph context is for explanation only and is not a detection source.
  • No live LLM client is wired; the Evidence-Grounded AI Brief runs as a deterministic fallback.
  • No real firewall / WAF / EDR / account / cloud / SIEM / SOAR action is performed.
  • No exploit code, PoC generation, traffic generation, or offensive automation is provided.
  • Human review is required.
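The boundary between the official verdict and advisory context can be expressed as a data-structure invariant. The sketch below is one possible way to do that and is not taken from the repository: `OfficialVerdict`, `TriageResult`, and `try_override` are hypothetical names. It uses frozen dataclasses so that advisory components can append notes but cannot rewrite the Risk Level or Decision.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import List


@dataclass(frozen=True)
class OfficialVerdict:
    risk: str
    decision: str  # simulated BLOCK / MONITOR / ALLOW


@dataclass(frozen=True)
class TriageResult:
    verdict: OfficialVerdict
    # The list itself stays appendable; only the attribute binding is frozen.
    advisory: List[str] = field(default_factory=list)

    def add_advisory(self, note: str) -> None:
        """Advisory components may add context, never change the verdict."""
        self.advisory.append(note)


def try_override(result: TriageResult, new_risk: str) -> bool:
    """Attempt to overwrite the official risk; report whether it succeeded."""
    try:
        result.verdict.risk = new_risk  # type: ignore[misc]
    except FrozenInstanceError:
        return False
    return True


result = TriageResult(OfficialVerdict("HIGH", "BLOCK"))
result.add_advisory("Similar to case-001; not proof of compromise.")
print(try_override(result, "LOW"), result.verdict.risk)
```

In this sketch the override attempt fails and the verdict stays HIGH, while the advisory note is still recorded for the human reviewer.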

Limitations

Sentinel Project is not a production IDS/IPS, not a real blocking engine, not an exploit generator, and not a replacement for SIEM, SOAR, EDR, vulnerability management, or incident response approval.

Future work is tracked in docs/ROADMAP.md.
