A deep learning-powered letter 'P' recognition system that detects and highlights instances of the letter 'P' in images using a trained CNN model.
This project uses a Convolutional Neural Network (CNN) to identify the letter 'P' in images. The system can scan an image, detect regions containing potential letters, and classify whether each region contains the letter 'P', drawing red bounding boxes around detected instances.
- Binary Image Classification: CNN model trained to distinguish 'P' from other letters
- Automatic Region Detection: Uses OpenCV contour detection to identify potential letter regions
- Visual Results: Draws red bounding boxes around detected 'P' letters
- Confidence Scoring: Reports confidence percentage for each detection
p_recognizer/
├── dataset/
│ ├── p/ # Training images containing letter 'P'
│ └── non_p/ # Training images containing other letters
├── train.py # Model training script
├── main.py # P letter detection script
├── p_recognizer_model.keras # Trained model file
├── ulaz.png # Sample input image
└── result.png # Output image with detected P's highlighted
- Python 3.x
- TensorFlow
- OpenCV (cv2)
- Pillow (PIL)
- NumPy
- scikit-learn
Install dependencies:
pip install tensorflow opencv-python pillow numpy scikit-learn
To train a new model on your dataset:
python train.py
The script will:
- Load images from the dataset/p/ and dataset/non_p/ directories
- Preprocess images to 28x28 grayscale
- Train a CNN model for 20 epochs
- Save the trained model as p_recognizer_model.keras
Training output includes:
- Total images loaded
- Model architecture summary
- Training progress with accuracy metrics
- Validation accuracy
To detect letter 'P' in an image:
python main.py
By default, this processes ulaz.png and saves the result to result.png. To use a different image, change the image_path argument in main.py:
result = find_all_p_letters_cv(
    model_path='p_recognizer_model.keras',
    image_path='your_image.png'
)
The CNN model consists of:
- Input Layer: 28x28x1 grayscale images
- Conv2D Layer 1: 32 filters, 3x3 kernel, ReLU activation
- MaxPooling2D Layer 1: 2x2 pool size
- Conv2D Layer 2: 64 filters, 3x3 kernel, ReLU activation
- MaxPooling2D Layer 2: 2x2 pool size
- Flatten Layer
- Dense Layer: 64 neurons, ReLU activation
- Dropout Layer: 0.5 dropout rate
- Output Layer: 1 neuron, sigmoid activation (binary classification)
Optimizer: Adam
Loss Function: Binary Crossentropy
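The layer list above can be checked with a little arithmetic. The sketch below traces the feature-map size through the network and counts trainable parameters. It assumes the Keras defaults of 'valid' padding and stride 1 for the Conv2D layers and a pool stride equal to the pool size, which the README does not state explicitly; the helper names are made up for illustration.

```python
# Shape and parameter arithmetic for the listed layers.
# Assumptions (not confirmed by the project): 'valid' padding, stride 1
# for convolutions, and pooling stride equal to the pool size.

def conv_out(size, kernel=3):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of max pooling (partial windows are dropped)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

size = 28
size = pool_out(conv_out(size))   # 28 -> 26 -> 13
size = pool_out(conv_out(size))   # 13 -> 11 -> 5
flat_features = size * size * 64  # 5 * 5 * 64 = 1600

total_params = (
    conv_params(3, 1, 32)             # Conv2D layer 1
    + conv_params(3, 32, 64)          # Conv2D layer 2
    + dense_params(flat_features, 64) # Dense layer
    + dense_params(64, 1)             # Output layer
)
print(flat_features, total_params)
```

If the actual model uses 'same' padding, the flattened size and the Dense layer's parameter count will differ; the model summary printed by train.py is the authoritative reference.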
- Loads images from dataset/p/ (labeled as 1) and dataset/non_p/ (labeled as 0)
- Resizes all images to 28x28 pixels and converts to grayscale
- Normalizes pixel values to [0, 1] range
- Splits data into training (80%) and validation (20%) sets
- Trains the CNN model
- Saves the trained model
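The labeling, normalization, and 80/20 split steps above can be sketched in plain Python. This is an illustration only: the project lists scikit-learn as a dependency, so train.py may well use its splitting utilities instead, and the function names, the fixed seed, and the use of the standard library's random module here are assumptions.

```python
import random

def label_and_split(p_items, non_p_items, val_fraction=0.2, seed=42):
    """Label 'P' samples as 1 and other samples as 0, shuffle, then split.

    Returns (train, validation), each a list of (item, label) pairs.
    """
    samples = [(item, 1) for item in p_items]
    samples += [(item, 0) for item in non_p_items]
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(samples)
    n_val = int(round(len(samples) * val_fraction))
    return samples[n_val:], samples[:n_val]

def normalize(pixels):
    """Scale 8-bit pixel values (0-255) into the [0, 1] range."""
    return [value / 255.0 for value in pixels]

train, val = label_and_split(
    [f"p_{i}.png" for i in range(6)],
    [f"other_{i}.png" for i in range(4)],
)
print(len(train), len(val))
```

With ten samples this yields eight training and two validation items. A stratified split, which keeps the ratio of 'P' to non-'P' samples the same in both sets, is a common refinement for small or imbalanced datasets.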
- Loads the trained model and input image
- Converts image to grayscale and applies binary thresholding
- Finds contours (potential letter regions) using OpenCV
- Filters regions by size (10-200 pixels width/height)
- For each region:
  - Resizes to 28x28 pixels
  - Normalizes and feeds to the model
  - If confidence > 50%, marks as 'P'
- Draws red rectangles around detected 'P' letters with 10-pixel padding
- Saves and displays the result
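The per-region decision in the steps above boils down to a simple loop. The sketch below shows that loop with the model abstracted away as a callable; `detect_p` and `predict` are hypothetical names, not functions from main.py, and in the real script the prediction would come from the loaded Keras model applied to the resized, normalized crop.

```python
def detect_p(regions, predict, threshold=0.5):
    """Keep the regions that the classifier scores above the threshold.

    regions: iterable of ((x, y, w, h), crop) pairs
    predict: hypothetical callable returning a confidence in [0, 1]
    Returns a list of ((x, y, w, h), confidence) for detected letters.
    """
    detections = []
    for box, crop in regions:
        confidence = predict(crop)
        if confidence > threshold:  # strictly greater, as described above
            detections.append((box, confidence))
    return detections

# Stand-in classifier for demonstration: each "crop" is already a score.
fake_regions = [
    ((120, 45, 30, 40), 0.87),
    ((160, 45, 28, 40), 0.12),
    ((230, 45, 30, 40), 0.50),
]
found = detect_p(fake_regions, predict=lambda crop: crop)
print(found)
```

Only the first region is kept here: the second scores too low, and the third sits exactly on the threshold, which a strict greater-than comparison excludes.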
The detection process filters regions based on:
- Minimum size: 10x10 pixels
- Maximum size: 200x200 pixels
- Confidence threshold: 50% (0.5)
- Bounding box padding: 10 pixels
Adjust these in main.py as needed for your use case.
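For orientation, the size filter and padding parameters could look like the sketch below. The constant and function names are hypothetical, and two details are assumptions not stated above: that both width and height must fall within the limits, and that the padded box is clamped to the image borders.

```python
# Hypothetical names; the values mirror the parameters listed above.
MIN_SIZE = 10    # pixels
MAX_SIZE = 200   # pixels
PADDING = 10     # pixels added on every side of a detected box

def is_letter_sized(w, h):
    """Assumed rule: both width and height must lie within the limits."""
    return MIN_SIZE <= w <= MAX_SIZE and MIN_SIZE <= h <= MAX_SIZE

def padded_box(x, y, w, h, image_w, image_h, pad=PADDING):
    """Grow a box by `pad` pixels per side, clamped to the image (assumed).

    Returns (left, top, right, bottom) corner coordinates.
    """
    left = max(x - pad, 0)
    top = max(y - pad, 0)
    right = min(x + w + pad, image_w)
    bottom = min(y + h + pad, image_h)
    return left, top, right, bottom

print(is_letter_sized(30, 40))               # True
print(is_letter_sized(5, 40))                # False: too narrow
print(padded_box(5, 45, 30, 40, 400, 100))   # (0, 35, 45, 95)
```

Raising MIN_SIZE helps reject specks and punctuation, while lowering MAX_SIZE helps reject page borders or large graphics that the contour step picks up.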
When running the detection script, you'll see console output like:
Found 25 potential letter regions
P detected at (120, 45) - confidence: 87.34%
P detected at (230, 45) - confidence: 92.15%
P detected at (340, 45) - confidence: 95.67%
Total P letters found: 3
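A report in this format can be produced with ordinary string formatting. The helper below is a hypothetical sketch rather than the code in main.py; it assumes confidences are stored as fractions between 0 and 1 and printed as percentages with two decimals.

```python
def format_report(total_regions, detections):
    """Build the console report from a list of ((x, y), confidence) pairs."""
    lines = [f"Found {total_regions} potential letter regions"]
    for (x, y), confidence in detections:
        # The '.2%' format multiplies by 100 and appends a percent sign.
        lines.append(f"P detected at ({x}, {y}) - confidence: {confidence:.2%}")
    lines.append(f"Total P letters found: {len(detections)}")
    return "\n".join(lines)

report = format_report(25, [((120, 45), 0.8734), ((230, 45), 0.9215)])
print(report)
```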
The output image (result.png) will show red bounding boxes around each detected 'P'.
To train the model with your own data:
- Create two directories: dataset/p/ and dataset/non_p/
- Add images containing letter 'P' to dataset/p/
- Add images containing other letters to dataset/non_p/
- Supported formats: PNG, JPG, JPEG, BMP
- Run train.py to train a new model
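Before training, it can help to confirm that both directories exist and contain usable files. The check below is a small standalone sketch, not part of the project; `count_training_images` is a hypothetical helper, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}

def count_training_images(dataset_dir="dataset"):
    """Count supported image files in the p/ and non_p/ subdirectories."""
    counts = {}
    for class_name in ("p", "non_p"):
        class_dir = Path(dataset_dir) / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing directory: {class_dir}")
        counts[class_name] = sum(
            1
            for entry in class_dir.iterdir()
            if entry.is_file() and entry.suffix.lower() in SUPPORTED_EXTENSIONS
        )
    return counts

if __name__ == "__main__":
    try:
        print(count_training_images())
    except FileNotFoundError as error:
        print(error)
```

A large gap between the two counts is worth noticing, since a heavily imbalanced dataset can let a binary classifier reach high accuracy by always predicting the majority class.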
This project is open source and available for educational and research purposes.
Contributions are welcome! Feel free to submit issues or pull requests.