This project implements a system for CAPTCHA generation, classification, and optical character recognition (OCR). The codebase includes scripts for generating CAPTCHA images with different difficulty levels, training a CNN for classification, and implementing a CRNN for OCR.
- Task 0
-- Captcha Generator
-- Dataset Generator
- Task 1
-- Classification Model
-- Confusion Matrix and Evaluation
- Task 2
-- OCR Dataset Generator
-- CRNN Model
-- Training the Model
This project was implemented in Google Colab. To run it:
- Upload necessary font files to Google Drive.
- Mount Google Drive in Colab:
    from google.colab import drive
    drive.mount('/content/drive')
- Python 3.7+
- OpenCV
- Pillow
- NLTK
- PyTorch
- Torchvision
- scikit-learn
- matplotlib
- seaborn
All of these packages come preinstalled in Google Colab, so no additional installation is needed.
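When running outside Colab, it can help to confirm that the listed packages are importable before starting. The following is a minimal sketch using only the standard library; the `REQUIRED_MODULES` mapping and the `missing_packages` helper are illustrative names, not part of the project code.

```python
from importlib.util import find_spec

# Import names for the packages listed above (for example, OpenCV is imported as cv2).
REQUIRED_MODULES = {
    "OpenCV": "cv2",
    "Pillow": "PIL",
    "NLTK": "nltk",
    "PyTorch": "torch",
    "Torchvision": "torchvision",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be located."""
    return [name for name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
print("Missing packages:", missing or "none")
```

`find_spec` only locates each module without importing it, so the check stays fast even for large packages such as PyTorch.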
CAPTCHA Generation: Uses OpenCV and PIL to generate CAPTCHA images with easy, hard, and bonus variations. The bonus dataset is generated but not fully utilized in the OCR task.
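The generation step can be sketched roughly as follows. This is an illustration rather than the project's generator: it uses Pillow only, substitutes Pillow's built-in default font for the uploaded font files, and assumes that "hard" means per-character vertical jitter plus a few noise lines. The function name `make_captcha`, the image size, and the spacing values are all hypothetical.

```python
import random

from PIL import Image, ImageDraw, ImageFont


def make_captcha(text, size=(200, 60), hard=False, seed=None):
    """Render `text` on a white canvas; `hard` adds jitter and noise lines."""
    rng = random.Random(seed)
    width, height = size
    image = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(image)
    font = ImageFont.load_default()  # assumption: stands in for the uploaded fonts

    x = 10
    for char in text:
        # Easy mode keeps a fixed baseline; hard mode shifts each character.
        y = height // 3 + (rng.randint(-8, 8) if hard else 0)
        draw.text((x, y), char, fill="black", font=font)
        x += 14

    if hard:
        # Assumed distortion: a handful of random gray lines across the image.
        for _ in range(5):
            start = (rng.randrange(width), rng.randrange(height))
            end = (rng.randrange(width), rng.randrange(height))
            draw.line([start, end], fill="gray", width=1)
    return image


easy = make_captcha("hello", seed=0)
hard = make_captcha("hello", hard=True, seed=0)
```

Passing a `seed` makes each image reproducible, which is convenient when regenerating a dataset.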
Classification: Implements a CNN with two convolutional layers, max pooling, and two fully connected layers to classify CAPTCHA images into 100 word classes. Experiments were conducted with various learning rates and batch sizes.
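A network of this shape might look like the sketch below. Only the overall layout (two convolutional layers, max pooling, two fully connected layers, 100 outputs) comes from the description above; the class name `CaptchaCNN`, the channel counts, the hidden size, and the 64x128 grayscale input are assumptions chosen for illustration.

```python
import torch
from torch import nn


class CaptchaCNN(nn.Module):
    """Two conv layers with max pooling, followed by two fully connected layers."""

    def __init__(self, num_classes=100, in_channels=1, image_size=(64, 128)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # halves height and width
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # halves height and width again
        )
        height, width = image_size
        flat_features = 64 * (height // 4) * (width // 4)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat_features, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        # Returns raw logits, suitable for nn.CrossEntropyLoss.
        return self.classifier(self.features(x))


model = CaptchaCNN()
logits = model(torch.zeros(2, 1, 64, 128))  # batch of two blank images
```

The first linear layer's input size depends on the image size, so `image_size` has to match whatever resolution the dataset generator actually produces.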
OCR: Implements a CRNN with CTC loss for extracting text from CAPTCHA images. An advanced decoding function was attempted but could not be completed due to time constraints.
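For context, a CTC-trained model emits one score vector per time step, and the simplest way to turn those into text is greedy (best-path) decoding: take the top class at each step, collapse consecutive repeats, and drop blanks. The sketch below shows that baseline in plain Python. It is not the advanced decoder mentioned above, and the function name, the alphabet layout, and the choice of index 0 as the blank are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding.

    frame_scores: one list of class scores per time step.
    alphabet: characters for class indices 1..len(alphabet); index `blank` is the CTC blank.
    """
    # Pick the highest-scoring class at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]

    chars = []
    previous = None
    for index in best_path:
        # Emit a character only when the class changes and is not the blank.
        if index != previous and index != blank:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)


def one_hot(index, size=4):
    """Helper for the demo: a score vector with a single winning class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Best path a a _ a b b _ c, where _ is the blank.
frames = [one_hot(i) for i in [1, 1, 0, 1, 2, 2, 0, 3]]
decoded = ctc_greedy_decode(frames, alphabet="abc")
print(decoded)  # aabc
```

The blank between the two runs of `a` is what allows a repeated letter to survive the collapse step.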
CAPTCHA Generation (easy, hard, and bonus datasets)
CAPTCHA Classification (CNN model trained and evaluated)
OCR (CRNN with CTC loss implemented and evaluated)
Bonus OCR Task (not completed)
- Classification Task: Accuracy, Precision, Recall, F1-score, Confusion Matrix.
- OCR Task: CTC Loss, Loss Curve Over Epochs, Predicted vs Actual Text.
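To make the classification metrics concrete, the sketch below computes accuracy and macro-averaged precision, recall, and F1 from lists of true and predicted labels in plain Python. The helper name `macro_metrics` and the toy labels are illustrative. The project lists scikit-learn as a dependency, and its `sklearn.metrics` module provides equivalent functions.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    # confusion[(t, p)] counts samples of true class t predicted as class p.
    confusion = Counter(zip(y_true, y_pred))

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = confusion[(label, label)]
        fp = sum(confusion[(other, label)] for other in labels if other != label)
        fn = sum(confusion[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    correct = sum(confusion[(label, label)] for label in labels)
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


metrics = macro_metrics(
    y_true=["cat", "cat", "dog", "dog"],
    y_pred=["cat", "dog", "dog", "dog"],
)
print(metrics)
```

Macro averaging weights every class equally, which suits a dataset where each of the word classes has a similar number of images.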