ML-IDS: Multi-Class Network Intrusion Detection System

A LightGBM-based IDS trained on UNSW-NB15, deployed on Raspberry Pi 4 in passive monitoring mode.

Quick start

From the repo root (.venv is not tracked):

uv sync --extra dev

Point your editor at .venv/bin/python. For notebooks, optionally register a kernel: .venv/bin/python -m ipykernel install --user --name=ml-ids --display-name="Python (ML-IDS)".

Train the full-feature LightGBM on UNSW-NB15 (expects CSVs under data/ per src/config.py). Saves the model, preprocessor, labels, metrics, and hyperparams.json under the output directory (default models/full/):

uv run python scripts/train.py

Train with tuned hyperparameters (after hyperparameter tuning):

uv run python scripts/train.py --output-dir models/full_tuned --hyperparams models/full/best_params.json

Use --feature-cols pi for the 30-feature Pi subset, or --feature-cols nfstream for the 13-feature nfstream-verified subset (Phase B).

Train the 13-feature nfstream model:

uv run python scripts/train.py --feature-cols nfstream --output-dir models/nfstream

Test the inference pipeline offline against a PCAP:

uv run python scripts/test_pcap.py --pcap capture.pcap

Run the live detection daemon (requires nfstream):

uv run python scripts/deploy.py --iface wlan0 --model-dir models/nfstream

Hyperparameter tuning (optional)

Run Optuna on the training split; writes study.db, best_params.json, and tuning_history.json under the output directory (default models/full/):

uv run python scripts/tune_hyperparams.py --n-trials 100 --output-dir models/full

Then train the final model:

uv run python scripts/train.py --output-dir models/full_tuned --hyperparams models/full/best_params.json

Multiple processes can share the same --output-dir / study.db for parallel trials.

Implementation Plans

This project is split into two phases to validate feature availability and model performance across training and deployment environments.

Phase A: Full-Feature Training - Completed

Trains LightGBM on all 38 UNSW-NB15 features as a performance ceiling for comparison with Phase B. Training writes the model, preprocessor, label map, and test-set metrics under the output directory (default models/full/).

Outputs: Full-feature model, preprocessor, metrics.json, optional hyperparams.json / tuning artifacts if you run hyperparameter tuning.

Latest results (test set, from models/full/metrics.json):

Metric	Value
Accuracy	69.76%
Macro F1	0.5265

Produced with uv run python scripts/train.py (38-feature default).

Phase B: 13-Feature nfstream Training & Pi Deployment

Plan: docs/plans/phase-b-deployment.md

Trains LightGBM on the 13 features that nfstream can actually extract on the Pi (NFSTREAM_FEATURE_COLS in src/data/loader.py). The model is deployed for real-time passive flow classification on wlan0, with a configurable confidence threshold (default 0.7), JSON Lines alert logging, and human-readable stdout output.

Model results (test set, from models/nfstream/metrics.json):

Metric	Value
Accuracy	69.23%
Macro F1	0.4983
Features	13 (11 numeric + 2 categorical: proto, service)

Deployment architecture:

wlan0 → NFStreamer → FeatureExtractor → preprocessor.transform()
       → model.predict_proba() → threshold gate → AlertLogger (jsonl + stdout)
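
The threshold gate in the pipeline above can be sketched as follows. This is a minimal illustration, not the code in src/inference/engine.py: the function name gate_prediction, the example label list, and the assumption that the "Normal" class never raises an alert are all hypothetical. Only the 0.7 default comes from this README.

```python
from typing import Optional, Sequence


def gate_prediction(
    probabilities: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.7,
    normal_label: str = "Normal",  # assumed name of the benign class
) -> Optional[dict]:
    """Return an alert dict if the top class is an attack at or above threshold."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    # Pick the class with the highest predicted probability.
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    label = labels[best_index]
    confidence = float(probabilities[best_index])
    # Suppress benign traffic and low-confidence attack predictions.
    if label == normal_label or confidence < threshold:
        return None
    return {"label": label, "confidence": confidence}


example_labels = ["Normal", "DoS", "Exploits"]
print(gate_prediction([0.10, 0.85, 0.05], example_labels))  # confident attack
print(gate_prediction([0.30, 0.45, 0.25], example_labels))  # below threshold
print(gate_prediction([0.90, 0.05, 0.05], example_labels))  # benign
```

In this sketch a higher threshold trades missed attacks for fewer false alerts, which is why it is exposed as a parameter rather than hard-coded.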

Key files:

File	Role
`src/inference/engine.py`	Load artifacts, predict with threshold gate
`src/inference/feature_extractor.py`	NFlow → 13-feature dict
`src/inference/logger.py`	JSON lines file + human stdout
`scripts/deploy.py`	Foreground daemon
`scripts/test_pcap.py`	Offline PCAP replay validation
`deploy/ml-ids.service`	systemd unit
`deploy/setup_pi.sh`	One-command Pi provisioning
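
As a rough illustration of the "JSON lines file + human stdout" role listed for src/inference/logger.py, the sketch below formats one alert both ways. The field names (timestamp, src_ip, dst_ip, label, confidence) and the helper names format_alert and write_alert are assumptions made for this example; the real AlertLogger may use a different schema.

```python
import json
from datetime import datetime, timezone


def format_alert(src_ip, dst_ip, label, confidence, timestamp=None):
    """Build one JSON Lines record and one human-readable line for an alert."""
    ts = timestamp or datetime.now(timezone.utc)
    record = {
        "timestamp": ts.isoformat(),
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "label": label,
        "confidence": round(confidence, 4),
    }
    # One compact JSON object per line keeps the log easy to tail and parse.
    json_line = json.dumps(record, sort_keys=True)
    human_line = f"[{ts:%H:%M:%S}] ALERT {label} ({confidence:.0%}) {src_ip} -> {dst_ip}"
    return json_line, human_line


def write_alert(path, json_line):
    """Append a single record to the alert log, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json_line + "\n")


json_line, human_line = format_alert("10.0.0.5", "10.0.0.1", "DoS", 0.85)
print(human_line)
```

Because each line is an independent JSON object, the log can be processed incrementally without loading the whole file.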

Phase C: Attack Simulator (Visual Demo)

Plan: docs/plans/attack-simulator.md

Visual demo tool that generates synthetic network flows resembling real attacks, runs them through the trained ML-IDS model, and displays detections in a live React dashboard. Designed for non-technical audience demonstrations.

Architecture:

Browser ←WebSocket→ FastAPI Server → FlowGenerator → InferenceEngine
                 (port 8000)          (synthetic)     (same as live daemon)
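
The FlowGenerator stage can be pictured with the sketch below, which draws synthetic feature values from per-class statistics. This README does not describe the layout of simulator_prototypes.json, so the structure here (a mean and standard deviation per feature per class), the numbers, and the name sample_flow are all hypothetical. The feature names dur and sbytes are UNSW-NB15 columns used purely for illustration.

```python
import random

# Hypothetical prototype statistics: (mean, std) per feature per class.
# The real simulator_prototypes.json may be organised differently.
PROTOTYPES = {
    "Normal": {"dur": (1.2, 0.4), "sbytes": (800.0, 200.0)},
    "DoS": {"dur": (0.01, 0.005), "sbytes": (60000.0, 5000.0)},
}


def sample_flow(attack_class, prototypes, rng):
    """Draw one synthetic flow for the given class from Gaussian prototypes."""
    stats = prototypes[attack_class]
    flow = {"true_class": attack_class}
    for feature, (mean, std) in stats.items():
        # Clamp at zero because durations and byte counts cannot be negative.
        flow[feature] = max(0.0, rng.gauss(mean, std))
    return flow


rng = random.Random(42)  # seeded so a demo run is repeatable
flow = sample_flow("DoS", PROTOTYPES, rng)
print(sorted(flow))
```

A generated flow like this would then be passed to the same inference path as a captured one, which is what lets the demo exercise the real model.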

One-time setup:

# 1. Install simulator Python deps
uv sync --extra simulator

# 2. Build prototype statistics from training data
uv run python scripts/build_simulator_prototypes.py

# 3. Install and build React frontend
cd web && npm install && npm run build && cd ..

Development (two terminals):

# Terminal 1: FastAPI backend
uv run python -m src.simulator.server --model-dir models/nfstream

# Terminal 2: Vite dev server (proxies /ws and /api to :8000)
cd web && npm run dev
# Open http://localhost:5173

Production (single command):

uv run python -m src.simulator.server \
    --model-dir models/nfstream \
    --static-dir web/dist \
    --host 0.0.0.0 --port 8000
# Open http://<ip>:8000

Dashboard features:

▶ Auto-Mix mode: scripted normal traffic punctuated by attacks
Manual attack triggers: [Fuzzers] [DoS] [Exploits] [Generic] [Recon]
Speed slider (0.5x–5x)
Live flow feed with color-coded rows
Alert panel with confidence bars
Pie chart (normal vs attack) + bar chart (alerts by type)
Togglable ML probability view per flow

Pi deployment:

Three artifacts must be present on the Pi. The model (models/nfstream/lgbm_ids.pkl) is already there if the live daemon is installed. The other two are built on a Mac and synced:

# On Mac: build both artifacts
uv run python scripts/build_simulator_prototypes.py
cd web && npm run build && cd ..

# Sync both to Pi
rsync -avz models/nfstream/simulator_prototypes.json pi@<pi-ip>:/home/pi/ml-ids/models/nfstream/
rsync -avz web/dist/ pi@<pi-ip>:/home/pi/ml-ids/web/dist/


# On Pi: verify artifacts are present
ls models/nfstream/simulator_prototypes.json
ls web/dist/index.html

# Run server (no Node.js needed)
uv run python -m src.simulator.server \
    --model-dir models/nfstream \
    --static-dir web/dist \
    --host 0.0.0.0 --port 8000

Alternatively, build prototypes on the Pi itself if the training data lives there: uv run python scripts/build_simulator_prototypes.py.

What gets synced vs. what's already on the Pi:

Artifact	How it reaches Pi
`models/nfstream/lgbm_ids.pkl`	Already present (installed with live daemon)
`models/nfstream/simulator_prototypes.json`	Manual: `rsync` from Mac, or build on Pi
`web/dist/`	Manual: `rsync` from Mac (Node.js not needed on Pi)

The simulator imports the same InferenceEngine as the live daemon, so the dashboard shows the same classification logic that runs in live deployment.

Live capture mode (real traffic):

# Start simulator with live wlan0 capture alongside synthetic
uv run python -m src.simulator.server \
    --model-dir models/nfstream \
    --static-dir web/dist \
    --live-iface wlan0 \
    --live-log-file /var/log/ml-ids/alerts-live.jsonl \
    --host 0.0.0.0 --port 8000

The dashboard shows a Source toggle: Live / Synthetic / Both. Attack buttons inject synthetic attacks into the live feed for demos.

Note: For now, the simulator runs its own nfstream capture independently of deploy.py. Both processes can coexist on wlan0. Future direction: have the simulator consume flow events from deploy.py (via its alert log or a local socket) rather than running a second capture.

Future Directions

Unify simulator and daemon capture: Have the simulator consume flows from deploy.py via IPC instead of running its own nfstream instance. See docs/plans/live-capture-mode.md.
Hyperparameter tuning on nfstream features: Run Optuna study on the 13-feature set to optimize for Pi deployment. See docs/plans/nfstream-13-feature-training.md.
Deriving state from TCP flags: Could reconstruct TCP connection state from nfstream's SYN/FIN/RST/ACK counters, recovering a third categorical feature.
iptables reactive blocking (v2): Auto-block high-confidence attack source IPs.
Metrics export: Prometheus endpoint for flow rates, alert counts, prediction distributions.
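
For the "Deriving state from TCP flags" idea above, one possible starting point is a simple rule-based mapping from flag counters to a state label. Everything in this sketch is an assumption: the function takes plain integer counts rather than an nfstream flow object, the rule order is a guess, and the labels (RST, FIN, CON, REQ) are borrowed from values that appear in the UNSW-NB15 state column. "OTHER" is a made-up catch-all. Any such mapping would need validating against the training data before use.

```python
def derive_state(syn: int, fin: int, rst: int, ack: int) -> str:
    """Heuristically map TCP flag packet counts to a coarse connection state."""
    if min(syn, fin, rst, ack) < 0:
        raise ValueError("flag counts must be non-negative")
    if rst > 0:
        return "RST"  # connection was reset
    if fin > 0:
        return "FIN"  # at least one side closed the connection
    if syn > 0 and ack > 0:
        return "CON"  # handshake progressed, no teardown seen yet
    if syn > 0:
        return "REQ"  # connection requested but never acknowledged
    return "OTHER"    # hypothetical bucket for flows with no useful flags


print(derive_state(syn=2, fin=2, rst=0, ack=10))
print(derive_state(syn=1, fin=0, rst=0, ack=0))
```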

Feature Mismatch Validation

Model	Features	Accuracy	Macro F1
Full (Phase A)	38	69.76%	0.5265
nfstream (Phase B)	13	69.23%	0.4983
Delta	-25	-0.53pp	-0.0282

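The 13-feature nfstream model retains 99.2% of the full-feature model's accuracy and 94.6% of its macro F1. On this test set, the 25 dropped features (connection-time windows, TCP state, TTL, loss, load, window size, TCP base sequence numbers, etc.) appear to add little marginal signal for classification, although the larger relative drop in macro F1 suggests some per-class performance is affected more than overall accuracy.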
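
The delta and retention figures can be reproduced from the two reported metric sets. The snippet below is just the arithmetic, using the values from the table. The dictionary layout and the helper name retention_pct are illustrative and do not reflect the structure of metrics.json.

```python
# Reported test-set metrics (accuracy in percent, macro F1 as a fraction).
full = {"accuracy": 69.76, "macro_f1": 0.5265}
nfstream = {"accuracy": 69.23, "macro_f1": 0.4983}


def retention_pct(reduced: float, baseline: float) -> float:
    """Share of the baseline metric kept by the reduced model, in percent."""
    return 100.0 * reduced / baseline


for metric in full:
    delta = nfstream[metric] - full[metric]
    retained = retention_pct(nfstream[metric], full[metric])
    print(f"{metric}: delta={delta:+.4f}, retained={retained:.1f}%")
# accuracy: delta=-0.5300, retained=99.2%
# macro_f1: delta=-0.0282, retained=94.6%
```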

Tech Stack

Python 3.11
LightGBM (multi-class classifier)
scikit-learn (preprocessing)
pandas (data loading)
nfstream (flow capture on Pi)
joblib (model serialization)
pytest (testing)
