dusanvelickovic/p_recognizer

P Recognizer

A deep learning system that detects and highlights instances of the letter 'P' in images using a trained CNN model.

Overview

This project uses a Convolutional Neural Network (CNN) to identify the letter 'P' in images. The system scans an image, detects regions that may contain letters, classifies whether each region contains the letter 'P', and draws red bounding boxes around the regions classified as 'P'.

Features

  • Binary Image Classification: CNN model trained to distinguish 'P' from other letters
  • Automatic Region Detection: Uses OpenCV contour detection to identify potential letter regions
  • Visual Results: Draws red bounding boxes around detected 'P' letters
  • Confidence Scoring: Reports confidence percentage for each detection

Project Structure

p_recognizer/
├── dataset/
│   ├── p/           # Training images containing letter 'P'
│   └── non_p/       # Training images containing other letters
├── train.py         # Model training script
├── main.py          # P letter detection script
├── p_recognizer_model.keras  # Trained model file
├── ulaz.png         # Sample input image
└── result.png       # Output image with detected P's highlighted

Requirements

  • Python 3.x
  • TensorFlow
  • OpenCV (cv2)
  • Pillow (PIL)
  • NumPy
  • scikit-learn

Install dependencies:

pip install tensorflow opencv-python pillow numpy scikit-learn

Usage

Training the Model

To train a new model on your dataset:

python train.py

The script will:

  1. Load images from dataset/p/ and dataset/non_p/ directories
  2. Preprocess images to 28x28 grayscale
  3. Train a CNN model for 20 epochs
  4. Save the trained model as p_recognizer_model.keras

Training output includes:

  • Total images loaded
  • Model architecture summary
  • Training progress with accuracy metrics
  • Validation accuracy

Detecting P Letters

To detect letter 'P' in an image:

python main.py

By default, this processes ulaz.png and saves the result to result.png. To use a different image, edit the image_path argument of the call in main.py:

result = find_all_p_letters_cv(
    model_path='p_recognizer_model.keras',
    image_path='your_image.png'
)

Model Architecture

The CNN model consists of:

  • Input Layer: 28x28x1 grayscale images
  • Conv2D Layer 1: 32 filters, 3x3 kernel, ReLU activation
  • MaxPooling2D Layer 1: 2x2 pool size
  • Conv2D Layer 2: 64 filters, 3x3 kernel, ReLU activation
  • MaxPooling2D Layer 2: 2x2 pool size
  • Flatten Layer
  • Dense Layer: 64 neurons, ReLU activation
  • Dropout Layer: 0.5 dropout rate
  • Output Layer: 1 neuron, sigmoid activation (binary classification)

Optimizer: Adam
Loss Function: Binary Crossentropy
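
To see how the layer sizes fit together, the sketch below walks the 28x28 input through the stack and counts trainable parameters. It assumes the Keras defaults for Conv2D and MaxPooling2D (stride 1 and 'valid' padding for convolutions, pooling stride equal to the pool size), which this README does not state. The helper names are illustrative and are not part of train.py.

```python
def conv_output(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1 (assumed defaults)."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


size = 28
size = pool_output(conv_output(size))  # Conv2D (32) + MaxPooling2D: 28 -> 26 -> 13
size = pool_output(conv_output(size))  # Conv2D (64) + MaxPooling2D: 13 -> 11 -> 5
flattened = size * size * 64           # Flatten: 5 * 5 * 64 features

# Pooling, Flatten and Dropout layers have no trainable parameters.
total_params = (
    conv_params(1, 32)
    + conv_params(32, 64)
    + dense_params(flattened, 64)
    + dense_params(64, 1)
)
print(size, flattened, total_params)  # 5 1600 121345
```

If train.py uses different padding or strides, the numbers printed by its model summary will differ from these.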

How It Works

Training Process (train.py)

  1. Loads images from dataset/p/ (labeled as 1) and dataset/non_p/ (labeled as 0)
  2. Resizes all images to 28x28 pixels and converts to grayscale
  3. Normalizes pixel values to [0, 1] range
  4. Splits data into training (80%) and validation (20%) sets
  5. Trains the CNN model
  6. Saves the trained model
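
Steps 3 and 4 can be illustrated with a minimal, standard-library-only sketch. The actual script presumably relies on NumPy and scikit-learn (both are listed under Requirements), but its code is not shown here, so the function names, the seed, and the shuffling strategy below are assumptions made for illustration.

```python
import random


def normalize(pixels):
    """Scale 8-bit pixel values (0-255) to the [0, 1] range."""
    return [value / 255.0 for value in pixels]


def split_train_val(samples, val_fraction=0.2, seed=42):
    """Shuffle samples and split them into (train, validation) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Ten dummy samples: (flattened 28x28 image, label), where 1 = 'P' and 0 = other letter.
samples = [(normalize([0, 255] * 392), i % 2) for i in range(10)]
train, val = split_train_val(samples)
print(len(train), len(val))  # 8 2
```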

Detection Process (main.py)

  1. Loads the trained model and input image
  2. Converts image to grayscale and applies binary thresholding
  3. Finds contours (potential letter regions) using OpenCV
  4. Filters regions by size (width and height between 10 and 200 pixels)
  5. For each region:
    • Resizes to 28x28 pixels
    • Normalizes and feeds to the model
    • If confidence > 50%, marks as 'P'
  6. Draws red rectangles around detected 'P' letters with 10-pixel padding
  7. Saves and displays the result
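
The filtering and classification loop in steps 4 and 5 can be sketched without OpenCV or TensorFlow by passing in the bounding boxes and a stand-in classifier. The function name detect_p, the classify callable, the inclusive size bounds, and reporting the top-left corner as the position are all assumptions; main.py may differ.

```python
def detect_p(boxes, classify, min_size=10, max_size=200, threshold=0.5):
    """Return (x, y, confidence) for every box classified as 'P'.

    boxes    -- iterable of (x, y, w, h) bounding rectangles
    classify -- hypothetical callable mapping a box to a confidence in [0, 1]
    """
    detections = []
    for x, y, w, h in boxes:
        # Skip regions that are too small or too large to be a letter.
        if not (min_size <= w <= max_size and min_size <= h <= max_size):
            continue
        confidence = classify((x, y, w, h))
        if confidence > threshold:
            detections.append((x, y, confidence))
    return detections


# Stand-in for the CNN: fixed scores keyed by box, for illustration only.
fake_scores = {
    (120, 45, 30, 40): 0.8734,  # valid size, high confidence -> kept
    (200, 45, 30, 40): 0.12,    # valid size, low confidence -> rejected
    (5, 5, 4, 4): 0.99,         # too small -> never classified
}
found = detect_p(fake_scores, fake_scores.get)
for x, y, confidence in found:
    print(f"  P detected at ({x}, {y}) - confidence: {confidence:.2%}")
print(f"Total P letters found: {len(found)}")
```

In the real pipeline the boxes would come from the contour step and the confidence from the model's sigmoid output.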

Detection Parameters

The detection process uses the following parameters:

  • Minimum size: 10x10 pixels
  • Maximum size: 200x200 pixels
  • Confidence threshold: 50% (0.5)
  • Bounding box padding: 10 pixels

Adjust these in main.py as needed for your use case.
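
One way to keep these values together is a small configuration object, shown below as a sketch. DetectionParams and padded_box are hypothetical names that do not appear in main.py, and clamping the padded box to the image bounds is an assumption about how the padding behaves near the edges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionParams:
    min_size: int = 10      # minimum region width/height in pixels
    max_size: int = 200     # maximum region width/height in pixels
    threshold: float = 0.5  # confidence above which a region counts as 'P'
    padding: int = 10       # extra pixels around each drawn box


def padded_box(x, y, w, h, image_w, image_h, params=DetectionParams()):
    """Grow a box by params.padding on every side, clamped to the image."""
    left = max(x - params.padding, 0)
    top = max(y - params.padding, 0)
    right = min(x + w + params.padding, image_w - 1)
    bottom = min(y + h + params.padding, image_h - 1)
    return left, top, right, bottom


print(padded_box(120, 45, 30, 40, image_w=640, image_h=480))  # (110, 35, 160, 95)
print(padded_box(5, 5, 30, 40, image_w=640, image_h=480))     # (0, 0, 45, 55)
```

The returned corners are in the (left, top, right, bottom) form that a rectangle-drawing call would typically take.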

Example Output

When running the detection script, you'll see console output like:

Found 25 potential letter regions
  P detected at (120, 45) - confidence: 87.34%
  P detected at (230, 45) - confidence: 92.15%
  P detected at (340, 45) - confidence: 95.67%

Total P letters found: 3

The output image (result.png) will show red bounding boxes around each detected 'P'.

Dataset Preparation

To train the model with your own data:

  1. Create two directories: dataset/p/ and dataset/non_p/
  2. Add images containing letter 'P' to dataset/p/
  3. Add images containing other letters to dataset/non_p/
  4. Use a supported image format: PNG, JPG, JPEG, or BMP
  5. Run train.py to train a new model
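
Before training, it can help to check which files will be picked up from each folder. The sketch below is an assumption about how such a check could look, not code from train.py: collect_labeled_paths is a hypothetical helper, and matching extensions case-insensitively is a guess. It builds a throwaway directory so that it runs without a real dataset.

```python
import tempfile
from pathlib import Path

SUPPORTED = {".png", ".jpg", ".jpeg", ".bmp"}


def collect_labeled_paths(dataset_root):
    """Return (path, label) pairs: label 1 for p/, label 0 for non_p/."""
    root = Path(dataset_root)
    pairs = []
    for folder, label in (("p", 1), ("non_p", 0)):
        # Raises FileNotFoundError if the folder is missing.
        for path in sorted((root / folder).iterdir()):
            if path.is_file() and path.suffix.lower() in SUPPORTED:
                pairs.append((path, label))
    return pairs


# Demo on a temporary dataset layout; the .txt file is ignored.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("p/a.png", "p/notes.txt", "non_p/b.JPG"):
        target = Path(tmp, name)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    demo = [(path.name, label) for path, label in collect_labeled_paths(tmp)]
print(demo)  # [('a.png', 1), ('b.JPG', 0)]
```

For a real dataset, call collect_labeled_paths("dataset") from the project root and compare the counts per label to spot a heavily imbalanced set.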

License

This project is open source and available for educational and research purposes.

Contributing

Contributions are welcome! Feel free to submit issues or pull requests.
