A hybrid intrusion detection system (IDS) that combines signature-based rules with machine learning–based anomaly detection trained on UNSW-NB15, and surfaces explainable alerts in a SOC-style dashboard.
- Detects known attacks via signature rules
- Detects unknown/zero-day behaviors via ML
- Generates explainable security alerts
- Visualizes alerts in a streamlined SOC dashboard
Raw Network Traffic (UNSW-NB15) → Preprocessing → Hybrid Detection Engine (Signature + ML) → alerts.csv → SOC Dashboard (Streamlit)
- Signature-based: rule matches for known patterns (e.g., port scanning, ICMP flood, brute-force)
- Machine learning: Random Forest on UNSW-NB15 to spot anomalous traffic (unknown, subtle attacks)
- Hybrid decision logic:
  - Signature ✅ / ML ❌ → High severity
  - Signature ❌ / ML ✅ → Medium severity
  - Signature ✅ / ML ✅ → High severity
  - Signature ❌ / ML ❌ → Normal traffic
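The decision table above can be sketched as a small pure function. This is only an illustration, not the code in src.hybrid_detector: the names `Verdict` and `combine` and the `source` labels are assumptions made for this example.

```python
from typing import NamedTuple


class Verdict(NamedTuple):
    severity: str  # "High", "Medium", or "Normal"
    source: str    # which detector(s) fired (label strings are hypothetical)


def combine(signature_hit: bool, ml_hit: bool) -> Verdict:
    """Map the two detector outputs onto the severity table."""
    if signature_hit and ml_hit:
        return Verdict("High", "signature+ml")
    if signature_hit:
        return Verdict("High", "signature")
    if ml_hit:
        return Verdict("Medium", "ml")
    return Verdict("Normal", "none")


print(combine(signature_hit=False, ml_hit=True))
```

A signature match always yields High severity, so the ML result only changes the outcome when no rule fires.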
- UNSW-NB15 (modern, realistic traffic; normal + malicious flows)
- Files: UNSW_NB15_training-set.csv, UNSW_NB15_testing-set.csv
- Python, pandas, NumPy
- scikit-learn, joblib
- Streamlit (dashboard)
python -m src.preprocess --input data/raw/UNSW_NB15_training-set.csv --output data/processed/UNSW_NB15_processed.csv
python -m src.train_model
python -m src.hybrid_detector --raw data/raw/UNSW_NB15_training-set.csv --processed data/processed/UNSW_NB15_processed.csv --clf models/rf_classifier.joblib --output outputs/alerts.csv
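python -m src.preprocess --input data/raw/UNSW_NB15_testing-set.csv --output data/processed/UNSW_NB15_testing_processed.csv
python -m src.hybrid_detector --raw data/raw/UNSW_NB15_testing-set.csv --processed data/processed/UNSW_NB15_testing_processed.csv --clf models/rf_classifier.joblib --output outputs/alerts_test.csv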
streamlit run streamlit_app.py
- Upload alerts.csv or alerts_test.csv
- View severity distribution, contextual fields (attack category, protocol, service, duration), detection source, and confidence
- Dashboard is read-only (visualization only; detection runs offline)
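As a rough illustration of the severity distribution view, the counts can be computed from an alerts file with the standard library alone. The column names `severity` and `detection_source` are assumptions (the real alerts.csv schema may differ), and the rows below are made-up sample data.

```python
import csv
import io
from collections import Counter


def severity_distribution(csv_file, column="severity"):
    """Count alerts per severity level from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Tiny in-memory stand-in for outputs/alerts.csv (invented rows).
sample = io.StringIO(
    "severity,detection_source\n"
    "High,signature\n"
    "Medium,ml\n"
    "High,signature+ml\n"
)
counts = severity_distribution(sample)
print(dict(counts))
```

For a real file, the same function could be called as `severity_distribution(open("outputs/alerts.csv", newline=""))`, with `column` adjusted to match the actual header.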
- Accuracy: ~96%
- Attack recall: ~97%
- Balanced detection vs. false positives
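For reference, these metrics follow from confusion-matrix counts as sketched below. The function names are assumptions, and the counts in the example are invented to show the arithmetic; they are not this project's results.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all flows classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def attack_recall(tp, fn):
    """Fraction of actual attacks that were detected."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Fraction of normal flows wrongly flagged as attacks."""
    return fp / (fp + tn)


# Illustrative counts only: 100 attack flows and 100 normal flows.
tp, fn, tn, fp = 90, 10, 80, 20
print(accuracy(tp, tn, fp, fn))     # 0.85
print(attack_recall(tp, fn))        # 0.9
print(false_positive_rate(fp, tn))  # 0.2
```

Reporting the false positive rate next to recall makes the detection versus false-positive trade-off explicit.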
- Port scanning
- DoS / DDoS
- Brute force
- Web attacks
- Malware / backdoor
- Unknown / suspicious
- Normal traffic
- Real-time traffic ingestion
- Automated response actions
- Geo-IP visualization
- Threat intelligence integration
- Keep data/, models/, and outputs/ out of version control (see .gitignore).
- Ensure the UNSW-NB15 CSVs are present under data/raw/ before preprocessing.
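A quick pre-flight check for the raw files might look like the following sketch. The helper `missing_raw_files` is hypothetical and not part of the repository; the file names are the two dataset CSVs listed earlier.

```python
from pathlib import Path

REQUIRED_FILES = (
    "UNSW_NB15_training-set.csv",
    "UNSW_NB15_testing-set.csv",
)


def missing_raw_files(raw_dir="data/raw"):
    """Return the required dataset files that are not present in raw_dir."""
    raw = Path(raw_dir)
    return [name for name in REQUIRED_FILES if not (raw / name).is_file()]


missing = missing_raw_files()
if missing:
    print("Missing under data/raw/:", ", ".join(missing))
else:
    print("All UNSW-NB15 CSVs found.")
```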