FYC23/ML-IDS

ML-IDS: Multi-Class Network Intrusion Detection System

A LightGBM-based IDS trained on UNSW-NB15, deployed on Raspberry Pi 4 in passive monitoring mode.

Quick start

From the repo root (.venv is not tracked):

uv sync --extra dev

Point your editor at .venv/bin/python. For notebooks, optionally register a kernel: .venv/bin/python -m ipykernel install --user --name=ml-ids --display-name="Python (ML-IDS)".

Train the full-feature LightGBM on UNSW-NB15 (expects CSVs under data/ per src/config.py). Saves the model, preprocessor, labels, metrics, and hyperparams.json under the output directory (default models/full/):

uv run python scripts/train.py

Train with tuned hyperparameters (after hyperparameter tuning):

uv run python scripts/train.py --output-dir models/full_tuned --hyperparams models/full/best_params.json

Use --feature-cols pi for the 30-feature Pi subset, or --feature-cols nfstream for the 13-feature nfstream-verified subset (Phase B).

Train the 13-feature nfstream model:

uv run python scripts/train.py --feature-cols nfstream --output-dir models/nfstream

Test the inference pipeline offline against a PCAP:

uv run python scripts/test_pcap.py --pcap capture.pcap

Run the live detection daemon (requires nfstream):

uv run python scripts/deploy.py --iface wlan0 --model-dir models/nfstream

Hyperparameter tuning (optional)

Run Optuna on the training split; writes study.db, best_params.json, and tuning_history.json under the output directory (default models/full/):

uv run python scripts/tune_hyperparams.py --n-trials 100 --output-dir models/full

Then train the final model:

uv run python scripts/train.py --output-dir models/full_tuned --hyperparams models/full/best_params.json

Multiple processes can share the same --output-dir / study.db for parallel trials.

Implementation Plans

This project is split into two phases to validate feature availability and model performance across training and deployment environments.

Phase A: Full-Feature Training - Completed

Trains LightGBM on all 38 UNSW-NB15 features as a performance ceiling for comparison with Phase B. Training writes the model, preprocessor, label map, and test-set metrics under the output directory (default models/full/).

Outputs: Full-feature model, preprocessor, metrics.json, optional hyperparams.json / tuning artifacts if you run hyperparameter tuning.

Latest results (test set, from models/full/metrics.json):

| Metric | Value |
| --- | --- |
| Accuracy | 69.76% |
| Macro F1 | 0.5265 |

Produced with uv run python scripts/train.py (38-feature default).

Phase B: 13-Feature nfstream Training & Pi Deployment

Plan: docs/plans/phase-b-deployment.md

Trains LightGBM on the 13 features that nfstream can actually extract on the Pi (NFSTREAM_FEATURE_COLS in src/data/loader.py). The model is deployed for real-time passive flow classification on wlan0, with a configurable confidence threshold (default 0.7), JSON Lines alert logging, and human-readable output on stdout.

Model results (test set, from models/nfstream/metrics.json):

| Metric | Value |
| --- | --- |
| Accuracy | 69.23% |
| Macro F1 | 0.4983 |
| Features | 13 (11 numeric + 2 categorical: proto, service) |

Deployment architecture:

wlan0 → NFStreamer → FeatureExtractor → preprocessor.transform()
       → model.predict_proba() → threshold gate → AlertLogger (jsonl + stdout)
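
The threshold gate in the pipeline above can be sketched as follows. This is a minimal illustration, not the code in src/inference/engine.py: the function name `threshold_gate`, the `"Normal"` label, and the choice to suppress below-threshold and normal predictions are all assumptions.

```python
from typing import Mapping, Optional, Tuple


def threshold_gate(
    probs: Mapping[str, float],
    threshold: float = 0.7,          # README default confidence threshold
    normal_label: str = "Normal",    # assumed name of the benign class
) -> Optional[Tuple[str, float]]:
    """Return (label, confidence) for a confident attack prediction, else None."""
    # Pick the class with the highest predicted probability.
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    # Suppress benign traffic and low-confidence predictions (assumed policy).
    if label == normal_label or confidence < threshold:
        return None
    return label, confidence


print(threshold_gate({"Normal": 0.05, "DoS": 0.90, "Exploits": 0.05}))
print(threshold_gate({"Normal": 0.40, "DoS": 0.60}))
```

Raising the threshold trades missed detections for fewer false alerts, which is why it is exposed as a configurable value rather than hard-coded.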

Key files:

| File | Role |
| --- | --- |
| src/inference/engine.py | Load artifacts, predict with threshold gate |
| src/inference/feature_extractor.py | NFlow → 13-feature dict |
| src/inference/logger.py | JSON lines file + human stdout |
| scripts/deploy.py | Foreground daemon |
| scripts/test_pcap.py | Offline PCAP replay validation |
| deploy/ml-ids.service | systemd unit |
| deploy/setup_pi.sh | One-command Pi provisioning |
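
To illustrate the dual output described for src/inference/logger.py, here is a hedged sketch of writing one alert both as a JSON line and as a human-readable line. The function `log_alert` and the field names (`ts`, `label`, `confidence`, `src_ip`, `dst_ip`) are hypothetical; the real alert schema may differ.

```python
import io
import json
from datetime import datetime, timezone
from typing import Any, Dict, TextIO


def log_alert(alert: Dict[str, Any], jsonl_out: TextIO, human_out: TextIO) -> Dict[str, Any]:
    """Write one alert as a JSON line and as a human-readable line."""
    # Add a UTC timestamp; all other fields come from the caller (assumed schema).
    record = {"ts": datetime.now(timezone.utc).isoformat(), **alert}
    # JSON Lines: exactly one JSON object per line.
    jsonl_out.write(json.dumps(record) + "\n")
    human_out.write(
        f"[ALERT] {record['label']} ({record['confidence']:.0%}) "
        f"{record['src_ip']} -> {record['dst_ip']}\n"
    )
    return record


# In-memory streams stand in for the alert file and stdout.
jsonl_buf, human_buf = io.StringIO(), io.StringIO()
log_alert(
    {"label": "DoS", "confidence": 0.91, "src_ip": "192.0.2.5", "dst_ip": "192.0.2.1"},
    jsonl_buf,
    human_buf,
)
print(human_buf.getvalue(), end="")
```

One object per line keeps the log appendable and lets downstream tools parse it line by line without loading the whole file.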

Phase C: Attack Simulator (Visual Demo)

Plan: docs/plans/attack-simulator.md

Visual demo tool that generates synthetic network flows resembling real attacks, runs them through the trained ML-IDS model, and displays detections in a live React dashboard. Designed for non-technical audience demonstrations.

Architecture:

Browser ←WebSocket→ FastAPI Server → FlowGenerator → InferenceEngine
                 (port 8000)          (synthetic)     (same as live daemon)
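
One way a flow generator could turn prototype statistics into synthetic flows is to sample each feature from a per-class distribution. The sketch below assumes a hypothetical prototype layout of per-class, per-feature (mean, std) pairs with made-up numbers; the actual format of simulator_prototypes.json and the sampling strategy used by FlowGenerator are not documented here.

```python
import random
from typing import Dict, Tuple

# Hypothetical prototypes: class -> feature -> (mean, std). Values are illustrative only.
PROTOTYPES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "Normal": {"dur": (0.5, 0.2), "sbytes": (800.0, 300.0)},
    "DoS": {"dur": (0.01, 0.005), "sbytes": (200.0, 50.0)},
}


def sample_flow(
    attack_class: str,
    prototypes: Dict[str, Dict[str, Tuple[float, float]]],
    rng: random.Random,
) -> Dict[str, float]:
    """Draw one synthetic flow by sampling each feature from a Gaussian."""
    stats = prototypes[attack_class]
    # Clamp at zero because durations and byte counts cannot be negative.
    return {name: max(0.0, rng.gauss(mean, std)) for name, (mean, std) in stats.items()}


rng = random.Random(42)  # seeded so demo runs are repeatable
print(sample_flow("DoS", PROTOTYPES, rng))
```

Independent per-feature Gaussians ignore correlations between features, so a real generator may need a richer model to produce flows the classifier finds convincing.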

One-time setup:

# 1. Install simulator Python deps
uv sync --extra simulator

# 2. Build prototype statistics from training data
uv run python scripts/build_simulator_prototypes.py

# 3. Install and build React frontend
cd web && npm install && npm run build && cd ..

Development (two terminals):

# Terminal 1: FastAPI backend
uv run python -m src.simulator.server --model-dir models/nfstream

# Terminal 2: Vite dev server (proxies /ws and /api to :8000)
cd web && npm run dev
# Open http://localhost:5173

Production (single command):

uv run python -m src.simulator.server \
    --model-dir models/nfstream \
    --static-dir web/dist \
    --host 0.0.0.0 --port 8000
# Open http://<ip>:8000

Dashboard features:

  • ▶ Auto-Mix mode: scripted normal traffic punctuated by attacks
  • Manual attack triggers: [Fuzzers] [DoS] [Exploits] [Generic] [Recon]
  • Speed slider (0.5x–5x)
  • Live flow feed with color-coded rows
  • Alert panel with confidence bars
  • Pie chart (normal vs attack) + bar chart (alerts by type)
  • Togglable ML probability view per flow

Pi deployment:

Three artifacts must reach the Pi. The model (models/nfstream/lgbm_ids.pkl) is already present if the live daemon is installed. The other two are built on a Mac and synced:

# On Mac: build both artifacts
uv run python scripts/build_simulator_prototypes.py
cd web && npm run build && cd ..

# Sync both to Pi
rsync -avz models/nfstream/simulator_prototypes.json pi@<pi-ip>:/home/pi/ml-ids/models/nfstream/
rsync -avz web/dist/ pi@<pi-ip>:/home/pi/ml-ids/web/dist/


# On Pi: verify artifacts are present
ls models/nfstream/simulator_prototypes.json
ls web/dist/index.html

# Run server (no Node.js needed)
uv run python -m src.simulator.server \
    --model-dir models/nfstream \
    --static-dir web/dist \
    --host 0.0.0.0 --port 8000

Alternatively, build prototypes on the Pi itself if the training data lives there: uv run python scripts/build_simulator_prototypes.py.

What gets synced vs. what's already on the Pi:

| Artifact | How it reaches Pi |
| --- | --- |
| models/nfstream/lgbm_ids.pkl | Already present (installed with live daemon) |
| models/nfstream/simulator_prototypes.json | Manual: rsync from Mac, or build on Pi |
| web/dist/ | Manual: rsync from Mac (Node.js not needed on Pi) |

The simulator imports the same InferenceEngine as the live daemon, so for identical flow features the demo shows the same classifications the daemon would produce.

Live capture mode (real traffic):

# Start simulator with live wlan0 capture alongside synthetic
uv run python -m src.simulator.server \
    --model-dir models/nfstream \
    --static-dir web/dist \
    --live-iface wlan0 \
    --live-log-file /var/log/ml-ids/alerts-live.jsonl \
    --host 0.0.0.0 --port 8000

The dashboard shows a Source toggle: Live / Synthetic / Both. Attack buttons inject synthetic attacks into the live feed for demos.

Note: For now, the simulator runs its own nfstream capture independently of deploy.py. Both processes can coexist on wlan0. Future direction: have the simulator consume flow events from deploy.py (via its alert log or a local socket) rather than running a second capture.

Future Directions

  • Unify simulator and daemon capture: Have the simulator consume flows from deploy.py via IPC instead of running its own nfstream instance. See docs/plans/live-capture-mode.md.
  • Hyperparameter tuning on nfstream features: Run Optuna study on the 13-feature set to optimize for Pi deployment. See docs/plans/nfstream-13-feature-training.md.
  • Deriving state from TCP flags: Could reconstruct TCP connection state from nfstream's SYN/FIN/RST/ACK counters, recovering a third categorical feature.
  • iptables reactive blocking (v2): Auto-block high-confidence attack source IPs.
  • Metrics export: Prometheus endpoint for flow rates, alert counts, prediction distributions.
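
The TCP-flag idea in the list above could look roughly like the sketch below. It is a heuristic under stated assumptions: the function `coarse_tcp_state`, its rule order, and its output labels are hypothetical, and mapping these labels onto the UNSW-NB15 state vocabulary is left open.

```python
def coarse_tcp_state(syn: int, ack: int, fin: int, rst: int) -> str:
    """Derive a coarse connection state from per-flow TCP flag counters."""
    if rst > 0:
        return "reset"        # connection was aborted
    if fin > 0:
        return "closed"       # at least one side closed gracefully
    if syn > 0 and ack > 0:
        return "established"  # handshake progressed past the initial SYN
    if syn > 0:
        return "attempted"    # SYN seen with no acknowledgement
    return "other"            # non-TCP flow or capture started mid-stream


print(coarse_tcp_state(syn=2, ack=10, fin=2, rst=0))
print(coarse_tcp_state(syn=1, ack=0, fin=0, rst=0))
```

Counters alone lose packet ordering, so this cannot tell, for example, a reset during the handshake from a reset after data transfer.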

Feature Mismatch Validation

| Model | Features | Accuracy | Macro F1 |
| --- | --- | --- | --- |
| Full (Phase A) | 38 | 69.76% | 0.5265 |
| nfstream (Phase B) | 13 | 69.23% | 0.4983 |
| Delta | -25 | -0.53 pp | -0.0282 |

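The 13-feature nfstream model retains 99.2% of full-feature accuracy and 94.6% of macro F1. The 25 dropped features (connection-time windows, TCP state, TTL, loss, load, window size, TCP base sequence numbers, etc.) therefore add little to overall accuracy. The larger relative drop in macro F1, which weights every class equally, suggests they matter somewhat more for some individual classes.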
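
The delta and retention figures can be reproduced from the two reported metric sets. The numbers below are copied from the results above; the helper `retention` is just illustrative arithmetic.

```python
full = {"accuracy": 69.76, "macro_f1": 0.5265}      # Phase A, 38 features
nfstream = {"accuracy": 69.23, "macro_f1": 0.4983}  # Phase B, 13 features


def retention(reduced: float, baseline: float) -> float:
    """Percentage of the baseline metric kept by the reduced model."""
    return 100.0 * reduced / baseline


for metric in full:
    delta = nfstream[metric] - full[metric]
    kept = retention(nfstream[metric], full[metric])
    print(f"{metric}: delta={delta:+.4f}, retained={kept:.1f}%")
# accuracy: delta=-0.5300, retained=99.2%
# macro_f1: delta=-0.0282, retained=94.6%
```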

Tech Stack

  • Python 3.11
  • LightGBM (multi-class classifier)
  • scikit-learn (preprocessing)
  • pandas (data loading)
  • nfstream (flow capture on Pi)
  • joblib (model serialization)
  • pytest (testing)
