Explainable Telecom Fraud Detection Platform using Machine Learning, Rule-Based Intelligence, SHAP, FastAPI, and Streamlit
TeleSentry AI is an end-to-end Telecom Fraud Detection Platform designed to identify suspicious calling behavior using a combination of:
- Rule-Based Fraud Intelligence
- Isolation Forest Anomaly Detection
- Random Forest Classification
- SHAP Explainability
- Interactive Streamlit Dashboard
- FastAPI Prediction Service
The system simulates realistic telecom users and fraudsters, engineers behavioral telecom features, detects suspicious activities, explains predictions, and exposes results through a dashboard and API.
Telecommunication fraud has become increasingly sophisticated.
Common fraud patterns include:
- Digital Arrest Scams
- Mass Calling Operations
- Long Distance Fraud Rings
- Social Engineering Networks
- Automated Calling Bots
Traditional rule-based systems often miss fraud patterns they were not written for, while pure machine learning systems often lack interpretability.
TeleSentry AI combines both approaches to deliver:
- High detection accuracy
- Transparent predictions
- Real-time fraud assessment
┌─────────────────────────────────────────┐
│  Synthetic Data Generator               │
│  (Telecom User Simulation)              │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  Raw Synthetic Dataset                  │
│  generated_dataset.csv (13k+)           │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  Data Preprocessing Layer               │
│                                         │
│  • Cleaning                             │
│  • Validation                           │
│  • Train/Test Split                     │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  Feature Engineering Layer              │
│                                         │
│  • call_intensity                       │
│  • distance_per_call                    │
│  • contact_circle_ratio                 │
│  • delivery_pattern                     │
│  • high_freq_long_distance              │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  Rule Engine Layer                      │
│                                         │
│  • Digital Arrest Detection             │
│  • Mass Calling Detection               │
│  • Long Distance Scam Detection         │
│  • Traveler Detection                   │
│  • Business User Detection              │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  ML Layer                               │
│                                         │
│  Isolation Forest                       │
│  Random Forest                          │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  Evaluation Layer                       │
│                                         │
│  Accuracy                               │
│  Precision                              │
│  Recall                                 │
│  F1 Score                               │
│  ROC-AUC                                │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│  Explainability Layer                   │
│                                         │
│  SHAP Summary                           │
│  SHAP Waterfall                         │
│  Feature Importance                     │
└────────────────────┬────────────────────┘
                     │
         ┌───────────┴───────────┐
         ▼                       ▼
┌────────────────┐     ┌──────────────────┐
│ Streamlit UI   │     │ FastAPI API      │
│                │     │                  │
│ Dashboard      │     │ /predict         │
│ Analytics      │     │ /health          │
│ Live Predict   │     │ Swagger Docs     │
└────────────────┘     └──────────────────┘
Synthetic Data Generation
↓
Data Preprocessing
↓
Feature Engineering
↓
Rule Engine
↓
Machine Learning Layer
↓
Evaluation Layer
↓
SHAP Explainability
↓
Streamlit Dashboard + FastAPI
TeleSentry-AI/
│
├── api/
├── dashboard/
├── data/
├── notebooks/
├── reports/
├── saved_models/
├── src/
├── tests/
│
├── README.md
├── requirements.txt
├── requirements-lock.txt
├── LICENSE
├── VERSION
└── .env.example
Generates realistic telecom profiles:
- Delivery Partners
- Business Users
- Regular Subscribers
- Traveling Professionals
- Digital Arrest Bots
- Traditional Scammers
- Low Volume Fraudsters
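As a rough illustration of profile-based simulation, the sketch below draws raw call metrics from per-profile ranges. Only three of the profiles are shown, and every range, field name, and the `generate_users` helper are assumptions made for this example; they are not taken from the project's generator.

```python
import random
from dataclasses import dataclass

# Hypothetical (min, max) ranges per profile, in arbitrary units.
# Illustrative assumptions only, not the project's real settings.
PROFILE_RANGES = {
    "regular_subscriber": {"call_frequency": (5, 30), "unique_contacts": (3, 20)},
    "business_user": {"call_frequency": (40, 120), "unique_contacts": (20, 80)},
    "digital_arrest_bot": {"call_frequency": (100, 300), "unique_contacts": (80, 250)},
}
FRAUD_PROFILES = {"digital_arrest_bot"}


@dataclass
class UserRecord:
    user_type: str
    call_frequency: int
    unique_contacts: int
    is_fraud: int


def generate_users(n, seed=42):
    """Return n simulated users; a fixed seed keeps the output reproducible."""
    rng = random.Random(seed)
    profiles = sorted(PROFILE_RANGES)
    records = []
    for _ in range(n):
        user_type = rng.choice(profiles)
        ranges = PROFILE_RANGES[user_type]
        records.append(
            UserRecord(
                user_type=user_type,
                call_frequency=rng.randint(*ranges["call_frequency"]),
                unique_contacts=rng.randint(*ranges["unique_contacts"]),
                is_fraud=int(user_type in FRAUD_PROFILES),
            )
        )
    return records


users = generate_users(1000)
```

Labeling fraud directly from the profile keeps ground truth exact, which is one reason synthetic data is convenient for evaluating a detector.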
Generated telecom intelligence features:
| Feature | Description |
|---|---|
| call_intensity | Calling activity level |
| distance_per_call | Average call distance ratio |
| contact_circle_ratio | Contact diversity ratio |
| delivery_pattern | Delivery behavior pattern |
| high_freq_long_distance | Suspicious high-frequency, long-distance calling |
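A minimal sketch of how such features might be derived from the raw metrics that appear in the API request example. The formulas and thresholds here are guesses based on the feature names, not the project's actual definitions, and `delivery_pattern` is omitted because the source does not say how it is computed.

```python
def engineer_features(call_frequency, avg_duration, unique_contacts,
                      avg_distance, circle_diversity):
    """Derive behavioral features from raw metrics (formulas are assumptions)."""
    return {
        # Assumed: many calls with short durations give a high intensity.
        "call_intensity": call_frequency / max(avg_duration, 1),
        # Assumed: distance normalized by call volume.
        "distance_per_call": avg_distance / max(call_frequency, 1),
        # Assumed: contacts per telecom circle reached.
        "contact_circle_ratio": unique_contacts / max(circle_diversity, 1),
        # Assumed thresholds (100 calls, distance 500) for the combined flag.
        "high_freq_long_distance": int(call_frequency > 100 and avg_distance > 500),
    }


features = engineer_features(
    call_frequency=150, avg_duration=5, unique_contacts=100,
    avg_distance=600, circle_diversity=8,
)
```

The `max(..., 1)` guards avoid division by zero for users with no recorded activity.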
Fraud intelligence layer:
- Digital Arrest Detection
- Mass Calling Detection
- Long Distance Scam Detection
- Traveler Detection
- Business User Detection
- Delivery Pattern Detection
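One way to structure rules like these is a table of named predicates evaluated against each record. The sketch below is hypothetical: the rule names, conditions, and thresholds are illustrative assumptions and do not reproduce the project's rule engine.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]

# Each rule maps a record to True/False. All thresholds are assumptions.
RULES: Dict[str, Callable[[Record], bool]] = {
    "mass_calling": lambda r: r["call_frequency"] > 100 and r["unique_contacts"] > 50,
    "long_distance_scam": lambda r: r["avg_distance"] > 500 and r["avg_duration"] < 10,
    "business_user": lambda r: r["call_frequency"] > 100 and r["avg_distance"] < 50,
}


def triggered_rules(record: Record) -> List[str]:
    """Return the names of all rules that fire for a record, in table order."""
    return [name for name, rule in RULES.items() if rule(record)]


record = {
    "avg_duration": 5,
    "call_frequency": 150,
    "unique_contacts": 100,
    "avg_distance": 600,
    "circle_diversity": 8,
}
fired = triggered_rules(record)
```

Returning rule names rather than a single boolean lets a dashboard show which triggers fired, and lets benign rules (such as a business-user match) offset suspicious ones downstream.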
Isolation Forest purpose:
- Unsupervised anomaly detection
- Detection of unusual telecom behavior
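For illustration, unsupervised anomaly detection with scikit-learn's `IsolationForest` might look like the sketch below. The two-column synthetic data, the `contamination` value, and the other hyperparameters are assumptions chosen for the example, not the project's configuration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic stand-in data; columns are [call_frequency, avg_distance].
normal = rng.normal(loc=[20, 50], scale=[5, 15], size=(500, 2))
unusual = rng.normal(loc=[200, 700], scale=[20, 50], size=(10, 2))
X = np.vstack([normal, unusual])

# contamination is the assumed share of anomalies in the data.
iso = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
labels = iso.fit_predict(X)          # -1 = anomaly, 1 = normal
scores = iso.decision_function(X)    # lower scores are more anomalous
```

No fraud labels are used here, which is why this kind of model can flag behavior that no rule or labeled example anticipated.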
Random Forest purpose:
- Supervised fraud classification
- Fraud probability estimation
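A minimal sketch of supervised classification and probability estimation with scikit-learn's `RandomForestClassifier`. The synthetic two-feature data and the hyperparameters are assumptions for the example only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data; columns are [call_frequency, avg_distance].
legit = rng.normal(loc=[20, 50], scale=[5, 15], size=(300, 2))
fraud = rng.normal(loc=[150, 600], scale=[20, 60], size=(300, 2))
X = np.vstack([legit, fraud])
y = np.array([0] * 300 + [1] * 300)  # 1 = fraud

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Column 1 of predict_proba is the estimated probability of the fraud class.
fraud_probability = clf.predict_proba(X_test)[:, 1]
```

The probability column, rather than the hard label, is what a risk-level mapping or a custom decision threshold would consume.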
| Metric | Score |
|---|---|
| Accuracy | 98%+ |
| Precision | 97%+ |
| Recall | 98%+ |
| F1 Score | 98%+ |
| ROC-AUC | 99%+ |
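Metrics of this kind can be computed with `sklearn.metrics`. The sketch below uses a tiny hand-made set of labels and probabilities (not project data) so each value can be checked by hand.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy values for illustration: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.9, 0.8, 0.7, 0.4]
y_pred = [int(p >= 0.5) for p in y_prob]  # 0.5 decision threshold

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    # ROC-AUC is computed from probabilities, not thresholded labels.
    "roc_auc": roc_auc_score(y_true, y_prob),
}
```

With 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, accuracy, precision, recall, and F1 all equal 0.75, while ROC-AUC is 15/16 because 15 of the 16 fraud/legitimate pairs are ranked correctly.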
TeleSentry AI uses SHAP (SHapley Additive Explanations).
Generated explanations include:
- SHAP Summary Plot
- SHAP Waterfall Plot
- Feature Importance Analysis
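SHAP values are additive, which is what makes a waterfall plot readable: for a single prediction, the model output equals a base value plus one contribution per feature. The notation below is introduced here for illustration and does not come from the project.

$$
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i(x), \qquad \phi_0 = \mathbb{E}[f(X)]
$$

Here $f(x)$ is the model output being explained for one user, $\phi_0$ is the average output over the background data, $M$ is the number of features, and $\phi_i(x)$ is the contribution of feature $i$. A positive $\phi_i(x)$ pushes the prediction toward fraud; a negative one pushes it away.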
Top fraud indicators:
- avgCallDistance
- circleDiversity
- call_intensity
- avgDuration
- high_freq_long_distance
The interactive Streamlit dashboard provides:
- Dataset statistics
- Fraud distribution
- User type analysis
- Operator analysis
- Accuracy
- Precision
- Recall
- F1 Score
- ROC Curve
- Confusion Matrix
Predict fraud risk using telecom activity metrics.
Visualize fraud intelligence triggers.
Interpret model decisions.
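Both the dashboard's live prediction and the API's `/predict` endpoint turn a model probability into a labeled risk assessment. The sketch below shows one way that mapping could work. The cut-offs, the intermediate risk levels, the `LEGITIMATE` label, and the `build_response` helper are all assumptions; the source only shows that a probability of 0.98 yields `FRAUD` and `CRITICAL`.

```python
from dataclasses import asdict, dataclass

# Hypothetical (floor, level) bands, checked from highest to lowest.
RISK_BANDS = [(0.9, "CRITICAL"), (0.7, "HIGH"), (0.4, "MEDIUM"), (0.0, "LOW")]


@dataclass
class PredictionResponse:
    prediction: str
    fraud_probability: float
    risk_level: str


def build_response(fraud_probability: float, threshold: float = 0.5) -> dict:
    """Map a fraud probability to a response dict (shape mirrors the API example)."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be in [0, 1]")
    # First band whose floor the probability reaches.
    risk_level = next(
        level for floor, level in RISK_BANDS if fraud_probability >= floor
    )
    prediction = "FRAUD" if fraud_probability >= threshold else "LEGITIMATE"
    return asdict(
        PredictionResponse(prediction, round(fraud_probability, 2), risk_level)
    )


response = build_response(0.98)
```

Keeping the bands in a single table makes the cut-offs easy to tune without touching the mapping logic.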
Endpoints:
- GET /
- GET /health
- POST /predict
Example Request:
{
  "avg_duration": 5,
  "call_frequency": 150,
  "unique_contacts": 100,
  "avg_distance": 600,
  "circle_diversity": 8
}
Example Response:
{
  "prediction": "FRAUD",
  "fraud_probability": 0.98,
  "risk_level": "CRITICAL"
}
git clone https://github.com/7vik2005/TeleSentry-AI.git
cd TeleSentry-AI
pip install -r requirements.txt
python -m src.data_generation.generator
python -m src.rule_engine.rules
python -m src.models.random_forest
python -m src.explainability.shap_explainer
python -m streamlit run dashboard/app.py
python -m uvicorn api.app:app --reload
- Python
- Pandas
- NumPy
- Scikit-Learn
- SHAP
- FastAPI
- Streamlit
- Plotly
- Matplotlib
- Faker
- XGBoost Integration
- Real Telecom Data Support
- Real-Time Streaming Detection
- Docker Deployment
- Cloud Deployment
- Automated Retraining Pipeline
- MLOps Integration
Machine Learning | Data Science | AI Engineering
This project is licensed under the MIT License.