An end-to-end machine learning project for detecting fraudulent credit card transactions with ensemble learning techniques. It benchmarks five tree-based ensemble models on a highly imbalanced dataset; XGBoost and LightGBM achieve the highest ROC-AUC scores (0.9771 and 0.9682).
Credit card fraud detection is a highly imbalanced binary classification problem where fraudulent transactions represent a very small percentage of total transactions.
The objective of this project is to:
- Build a robust fraud detection pipeline
- Compare multiple ensemble learning models
- Evaluate performance using ROC-AUC and classification metrics
- Identify the best performing model for real-world deployment
- Real-world credit card transaction dataset
- Highly imbalanced (~0.17% fraud cases)
- Features:
  - PCA-transformed features (V1–V28)
  - Time
  - Amount
- Target variable: Class (0 = Legitimate, 1 = Fraud)
- Dataset link: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
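To make the imbalance concrete, the sketch below computes the fraud rate, the accuracy of a trivial classifier that labels every transaction legitimate, and a negative-to-positive ratio of the kind often used as a class weight in boosted-tree libraries. The counts (492 frauds among 284,807 transactions) are taken from the Kaggle dataset description; the function name is illustrative, and the notebook may handle imbalance differently.

```python
def imbalance_summary(n_total: int, n_fraud: int) -> dict:
    """Summarize class imbalance for a binary fraud dataset."""
    if not 0 < n_fraud < n_total:
        raise ValueError("n_fraud must be between 1 and n_total - 1")
    n_legit = n_total - n_fraud
    return {
        # Share of transactions labeled Class = 1
        "fraud_rate": n_fraud / n_total,
        # Accuracy obtained by always predicting Class = 0
        "majority_baseline_accuracy": n_legit / n_total,
        # Legitimate-to-fraud ratio, a common starting point for class weights
        "neg_pos_ratio": n_legit / n_fraud,
    }


# Counts from the Kaggle dataset description (assumed to match the CSV used here)
summary = imbalance_summary(n_total=284_807, n_fraud=492)
for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

Because always predicting "legitimate" already scores about 99.8% accuracy, accuracy alone says little here, which is why the project reports ROC-AUC, precision, recall, and F1 instead.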
- Python
- NumPy
- Pandas
- Scikit-learn
- XGBoost
- LightGBM
- CatBoost
- Matplotlib
- Seaborn
The following ensemble models were trained and evaluated:
- Random Forest
- AdaBoost
- CatBoost
- LightGBM
- XGBoost
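A benchmarking loop for models like these might look like the sketch below. It is not the notebook's code: the data is a synthetic imbalanced stand-in generated with scikit-learn, the hyperparameters are placeholders, and only the two scikit-learn models are included. The XGBoost, LightGBM, and CatBoost classifiers expose the same `fit` / `predict_proba` interface, so they could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def benchmark(models, X_train, y_train, X_test, y_test):
    """Fit each model and return {name: ROC-AUC}, sorted best first."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Probability of the positive (fraud) class for each test row
        fraud_proba = model.predict_proba(X_test)[:, 1]
        scores[name] = roc_auc_score(y_test, fraud_proba)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Synthetic stand-in for the real data: roughly 2% positive class
X, y = make_classification(
    n_samples=5000, n_features=20, weights=[0.98], random_state=42
)
# Stratify so the train and test sets keep the same class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Placeholder hyperparameters, not the values used in the notebook
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=42),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=42),
}

results = benchmark(models, X_train, y_train, X_test, y_test)
for name, auc in results.items():
    print(f"{name}: {auc:.4f}")
```

Scores from this synthetic data are not comparable to the results reported for the real dataset.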
| Model | ROC-AUC |
|---|---|
| 🥇 XGBoost | 0.9771 |
| 🥈 LightGBM | 0.9682 |
| CatBoost | 0.8578 |
| Random Forest | 0.8529 |
| AdaBoost | 0.8135 |
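For readers new to the metric, ROC-AUC can be read as the probability that a randomly chosen fraud case receives a higher score than a randomly chosen legitimate one, with ties counted as half. The pure-Python sketch below computes it that way on a toy example. The function name is illustrative, and the quadratic pairwise loop is meant for explanation only; in practice a library routine such as scikit-learn's `roc_auc_score` would be used.

```python
def pairwise_roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (fraud, legitimate) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # fraud case scored above legitimate case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: two legitimate (0) and two fraud (1) transactions
toy_auc = pairwise_roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

In the toy example, three of the four fraud/legitimate pairs are ordered correctly, so the result is 0.75.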
- XGBoost achieved the highest ROC-AUC (0.977)
- Excellent class separation capability
- Strong performance on the imbalanced dataset
- A strong candidate for production-level fraud detection systems
- ROC Curve & AUC Score
- Confusion Matrix
- Precision
- Recall
- F1-Score
Special focus was given to Recall to minimize false negatives (missed fraud cases).
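The trade-off behind a recall focus can be sketched in a few lines: lowering the decision threshold catches more fraud (fewer false negatives) at the cost of more false alarms. The labels, scores, and thresholds below are made up for illustration, and threshold tuning is only one way to favor recall; the notebook may use a different approach.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Precision, recall, and F1 for the fraud class at a given threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels and model scores: three fraud cases, five legitimate ones
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

default = classification_metrics(y_true, y_score, threshold=0.5)
lowered = classification_metrics(y_true, y_score, threshold=0.25)
print("threshold 0.50:", default)
print("threshold 0.25:", lowered)
```

With these toy numbers, the lower threshold raises recall from 2/3 to 1.0 while precision drops from 2/3 to 0.6.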
Fraud_Detection_Model/
│
├── credit_card_fraud_detection.ipynb
├── README.md
└── requirements.txt
1️⃣ Clone the repository

git clone https://github.com/ShrutiPatel263/Fraud_Detection_Model.git

2️⃣ Install dependencies

pip install -r requirements.txt

3️⃣ Run the notebook

jupyter notebook credit_card_fraud_detection.ipynb

- Implemented complete ML pipeline
- Compared 5 ensemble learning algorithms
- Handled severe class imbalance
- Achieved high ROC-AUC (0.977)
- Conducted systematic model benchmarking
Shruti Patel, Machine Learning & AI Enthusiast
GitHub: https://github.com/ShrutiPatel263