Tomato Disease Classifier - README

🌱 Overview

A deep learning system that classifies tomato leaf diseases using DenseNet121 with transfer learning. The project uses Keras 3 and TensorFlow 2.13+ to sort tomato leaf images into 10 classes: nine disease or pest conditions plus healthy leaves.

✨ Features

✅ Transfer Learning - DenseNet121 pretrained on ImageNet
✅ Two-Phase Training - Feature extraction + Fine-tuning
✅ Data Augmentation - RandomFlip, RandomRotation, RandomZoom, RandomBrightness
✅ Out-of-Distribution Detection - Confidence-based filtering
✅ Mobile Deployment - TFLite quantization (float16)
✅ Comprehensive Evaluation - Confusion matrix, per-class metrics, confidence analysis
✅ Reproducible - Fixed random seeds for consistent results

🎯 Disease Classes (10)

Bacterial Spot - Bacterial infection causing dark spots
Early Blight - Fungal disease with target-like spots
Healthy - No disease present
Late Blight - Aggressive disease caused by the oomycete (water mold) Phytophthora infestans
Leaf Mold - Fungal disease on leaf undersides
Septoria Leaf Spot - Fungal disease with circular spots
Spider Mites (Two-spotted) - Pest damage causing yellowing
Target Spot - Fungal disease with concentric rings
Tomato Mosaic Virus - Viral disease causing mottling
Tomato Yellow Leaf Curl Virus - Viral disease causing leaf curl

🔧 Technical Stack

Component	Version
Python	3.9+ (Keras 3 does not support Python 3.8)
TensorFlow	2.13.0+
Keras	3.0.0+
NumPy	1.24.0+
Scikit-learn	1.3.0+
Matplotlib	3.7.0+
Seaborn	0.12.0+

📊 Model Architecture

Input (224x224x3)
    ↓
Data Augmentation (RandomFlip, Rotation, Zoom, Brightness)
    ↓
Preprocessing (DenseNet preprocess_input)
    ↓
DenseNet121 (ImageNet weights, frozen initially)
    ↓
GlobalAveragePooling2D
    ↓
Dense(256, ReLU)
    ↓
Dropout(0.3)
    ↓
Dense(10, Softmax) → Output

🚀 Quick Start

1. Installation

pip install -r requirements.txt

2. Prepare Dataset (Local)

dataset/
├── train/   (70% of data)
│   ├── Bacterial_spot/
│   ├── Early_blight/
│   └── ... (10 disease folders)
├── val/     (10% of data)
│   └── ... (10 disease folders)
└── test/    (20% of data)
    └── ... (10 disease folders)
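
If the raw images start out as one folder per class, a 70/10/20 split like the one above can be produced with a short script. The sketch below is one way to do it; split_dataset, its arguments, and the fixed seed are hypothetical and not part of the repository.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, dest_dir, percentages=(70, 10, 20), seed=42):
    """Copy source_dir/<class>/* into dest_dir/{train,val,test}/<class>/.

    percentages are whole numbers for (train, val, test); whatever is
    left after train and val goes to test.
    """
    rng = random.Random(seed)
    counts = {"train": 0, "val": 0, "test": 0}
    class_dirs = sorted(p for p in Path(source_dir).iterdir() if p.is_dir())

    for class_dir in class_dirs:
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)  # shuffle per class so each split keeps every class

        n_train = len(files) * percentages[0] // 100
        n_val = len(files) * percentages[1] // 100
        parts = {
            "train": files[:n_train],
            "val": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],
        }

        for split, items in parts.items():
            target = Path(dest_dir) / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in items:
                shutil.copy2(f, target / f.name)
            counts[split] += len(items)

    return counts
```

Splitting per class keeps the class proportions roughly equal across train, val, and test.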

3. Run Training (Kaggle)

# In Kaggle notebook
%run tomato_classifier.py /kaggle/input/tomato-dataset

4. Local Training

python tomato_classifier.py /path/to/dataset

5. Dry Run (Test - 1 epoch each)

python tomato_classifier.py /path/to/dataset --dry-run

📈 Training Pipeline

Phase 1: Feature Extraction (20 epochs)

Objective: Train custom head with frozen base model
Learning Rate: 0.001
Optimizer: Adam
Loss: Categorical Crossentropy
Checkpoint: best_model_feature_extraction.keras

Phase 2: Fine-Tuning (50 epochs)

Objective: Fine-tune last 100 DenseNet layers with low learning rate
Learning Rate: 0.00001
Unfrozen Layers: Last 100 layers of DenseNet121
Checkpoint: best_model_fine_tuning.keras
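
Moving from Phase 1 to Phase 2 usually means unfreezing part of the base model and recompiling with the lower learning rate. A minimal sketch, assuming the DenseNet121 base is available as its own model object; prepare_fine_tuning is a hypothetical helper, and reusing Adam, categorical crossentropy, and the accuracy metric in Phase 2 is an assumption.

```python
import keras


def prepare_fine_tuning(model, base_model, n_layers=100, learning_rate=1e-5):
    """Unfreeze the last n_layers of base_model, then recompile model."""
    base_model.trainable = True

    # Layers before the cutoff stay frozen; the rest become trainable.
    cutoff = max(len(base_model.layers) - n_layers, 0)
    for layer in base_model.layers[:cutoff]:
        layer.trainable = False

    # Keras only picks up changed trainable flags after compile().
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

The low learning rate matters here: large updates to pretrained layers can quickly erase the ImageNet features that Phase 1 relied on.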

Callbacks

EarlyStopping: Stop training when val_accuracy has not improved for 10 epochs (patience=10)
ModelCheckpoint: Save the best model based on val_accuracy
ReduceLROnPlateau: Halve the learning rate when val_loss has not improved for 5 epochs (factor=0.5, patience=5)
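
These three callbacks map directly onto the Keras callback classes. The sketch below shows one way to assemble them; make_callbacks is a hypothetical helper, and only the monitored metrics, patience values, and factor come from the list above.

```python
import keras


def make_callbacks(checkpoint_path):
    """Return the three training callbacks for one phase (sketch)."""
    return [
        # Stop when validation accuracy has not improved for 10 epochs.
        keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=10),
        # Keep only the checkpoint with the best validation accuracy.
        keras.callbacks.ModelCheckpoint(
            checkpoint_path,
            monitor="val_accuracy",
            save_best_only=True,
        ),
        # Halve the learning rate after 5 epochs without val_loss improvement.
        keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss",
            factor=0.5,
            patience=5,
        ),
    ]
```

A typical call would pass the phase-specific file name, for example make_callbacks("best_model_feature_extraction.keras"), and hand the result to model.fit(..., callbacks=...).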

📁 Project Structure

tomato-classifier/
├── tomato_classifier.py         # Main training script
├── inference.py                 # Inference/prediction script
├── requirements.txt             # Dependencies
├── config.yaml                  # Configuration file
├── README.md                    # This file
├── QUICKSTART.md               # Quick start guide
├── CONTRIBUTING.md             # Contribution guidelines
├── ROADMAP.md                  # Future improvements
├── .copilot-instructions.md    # GitHub Copilot instructions
│
├── train/                       # Training dataset (70%)
│   └── [10 disease folders]
├── val/                         # Validation dataset (10%)
│   └── [10 disease folders]
├── test/                        # Test dataset (20%)
│   └── [10 disease folders]
│
└── outputs/
    ├── best_model_feature_extraction.keras
    ├── best_model_fine_tuning.keras
    ├── final_model.keras
    ├── tomato_disease_model.tflite
    ├── training_history_*.png
    ├── confusion_matrix.png
    ├── per_class_metrics.png
    ├── confidence_distribution.png
    └── model_architecture.png

⚙️ Configuration Parameters

Edit in tomato_classifier.py:

IMG_SIZE = 224                  # Input image size
BATCH_SIZE = 32                 # Batch size for training
INITIAL_LR = 0.001              # Feature extraction learning rate
FINE_TUNE_LR = 0.00001          # Fine-tuning learning rate
OOD_THRESHOLD = 0.7             # Confidence threshold for OOD detection

Alternatively, set them in config.yaml for centralized management.
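
To illustrate how OOD_THRESHOLD is meant to act, the sketch below rejects a prediction whose top softmax probability falls under the threshold. It is an assumption about the general technique, not the logic in inference.py; classify_with_rejection and the is_ood key are hypothetical.

```python
OOD_THRESHOLD = 0.7


def classify_with_rejection(probabilities, class_names, threshold=OOD_THRESHOLD):
    """Return the top class, or flag the input as out-of-distribution.

    probabilities is one softmax output vector (a sequence of floats).
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = probabilities[best]

    if confidence < threshold:
        # Too uncertain: treat as "not a recognizable tomato leaf".
        return {"predicted_class": None, "confidence": confidence, "is_ood": True}

    return {
        "predicted_class": class_names[best],
        "confidence": confidence,
        "is_ood": False,
    }


if __name__ == "__main__":
    names = ["Early_blight", "Healthy", "Late_blight"]
    print(classify_with_rejection([0.05, 0.90, 0.05], names))
    print(classify_with_rejection([0.40, 0.35, 0.25], names))
```

Softmax confidence is only a coarse signal: an unrelated image can still receive a high score, so the threshold reduces false predictions rather than eliminating them.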

📊 Expected Performance

Indicative values for DenseNet121 with fine-tuning:

Metric	Typical Value
Test Accuracy	~95% or higher
Per-class F1-score	0.92-0.98
Average Precision	~0.96
Average Recall	~0.96

Note: Actual values depend on dataset quality and augmentation parameters.

📤 Model Export

TFLite Conversion

Automatically performed during training:

tomato_disease_model.tflite  # float16 quantized, roughly 15 MB (about 7.3M parameters at 2 bytes each)
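
The export code itself is not shown in this README. With the standard TensorFlow Lite converter, float16 post-training quantization typically looks like the sketch below; convert_to_tflite_fp16 is a hypothetical helper name.

```python
import tensorflow as tf


def convert_to_tflite_fp16(model, output_path):
    """Convert a Keras model to a float16-quantized TFLite file (sketch)."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Enable default optimizations and store weights as float16.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]

    tflite_bytes = converter.convert()
    with open(output_path, "wb") as f:
        f.write(tflite_bytes)
    return len(tflite_bytes)  # size in bytes, useful for a sanity check
```

A call such as convert_to_tflite_fp16(model, "tomato_disease_model.tflite") would write the file and return its size.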

For Mobile Deployment:

Android: Use TFLite Support Library
iOS: Use TensorFlow Lite for iOS
File size: about 50% of the float32 model after float16 quantization
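
The halving follows from storing each weight in 2 bytes instead of 4. The arithmetic below estimates the weight storage for the architecture in this README. It is a back-of-the-envelope sketch: the headless DenseNet121 parameter count is the figure Keras reports, quoted here as an assumption to verify with model.count_params().

```python
# Parameter counts for the architecture described in this README.
DENSENET121_NO_TOP = 7_037_504     # assumed; verify with count_params()
HEAD_DENSE_256 = 1024 * 256 + 256  # pooled features (1024) -> Dense(256)
HEAD_DENSE_10 = 256 * 10 + 10      # Dense(256) -> Dense(10)


def weight_size_mb(n_params, bytes_per_param):
    """Storage needed for the weights alone, in megabytes (10^6 bytes)."""
    return n_params * bytes_per_param / 1_000_000


total_params = DENSENET121_NO_TOP + HEAD_DENSE_256 + HEAD_DENSE_10

print(f"parameters: {total_params:,}")
print(f"float32 weights: {weight_size_mb(total_params, 4):.1f} MB")
print(f"float16 weights: {weight_size_mb(total_params, 2):.1f} MB")
```

Real files add graph and metadata overhead, so measure the exported file rather than relying on this estimate.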

🔍 Evaluation Metrics

The training script generates:

Overall Metrics
- Accuracy
- Precision, Recall, F1-score
Per-Class Metrics
- Per-disease accuracy
- Per-disease precision/recall/F1
Visualizations
- Training curves (loss & accuracy)
- Confusion matrix (normalized)
- Per-class performance bar chart
- Confidence distribution histogram
- Model architecture diagram
OOD Detection Analysis
- Confidence statistics
- Below-threshold predictions
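
For reference, the per-class numbers can all be derived from the confusion matrix. The sketch below shows the arithmetic in plain Python, assuming rows are true classes and columns are predicted classes (the scikit-learn convention); per_class_metrics is a hypothetical helper, and the training script may well use scikit-learn directly instead.

```python
def per_class_metrics(confusion, class_names):
    """Compute precision, recall, and F1 per class from a confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(class_names)
    metrics = {}
    for i, name in enumerate(class_names):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                       # missed samples of class i
        fp = sum(confusion[r][i] for r in range(n)) - tp  # others predicted as i

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


if __name__ == "__main__":
    cm = [[8, 2], [1, 9]]
    for name, m in per_class_metrics(cm, ["Healthy", "Late_blight"]).items():
        print(name, {k: round(v, 3) for k, v in m.items()})
```

Recall here is what the list above calls per-disease accuracy when it is computed per true class.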

💻 Usage Examples

Training from Kaggle

# Kaggle notebook cell
!pip install -q tensorflow keras scikit-learn
%run tomato_classifier.py /kaggle/input/tomato-dataset

Single Image Prediction

from inference import TomatoDiseasePredictor

# Load model
predictor = TomatoDiseasePredictor('final_model.keras')

# Predict
result = predictor.predict_single('path/to/image.jpg')
predictor.print_result(result)

Batch Prediction

from inference import TomatoDiseasePredictor

predictor = TomatoDiseasePredictor('final_model.keras')
results = predictor.predict_batch('path/to/image/directory')

# Access results
for result in results:
    print(f"{result['image']}: {result['predicted_class']} ({result['confidence']:.2%})")

Using Command Line

# Single image
python inference.py --model final_model.keras --image path/to/image.jpg

# Directory of images
python inference.py --model final_model.keras --directory path/to/images --output results.json

🐛 Troubleshooting

GPU Not Detected

import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")

Out of Memory

Reduce batch size in tomato_classifier.py:

BATCH_SIZE = 16  # or 8

Slow Training

Enable GPU acceleration in Kaggle notebook settings
Reduce number of epochs for testing
Use --dry-run flag for quick test

Dataset Not Found

Verify dataset structure:

python -c "from tomato_classifier import check_dataset_path; check_dataset_path('path/to/dataset')"
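
The behavior of check_dataset_path is not documented here. If it is unavailable, a standalone check along the same lines might look like the sketch below; find_layout_problems and its parameters are hypothetical, and the expected layout is the train/val/test structure from the Quick Start.

```python
from pathlib import Path


def find_layout_problems(dataset_dir, splits=("train", "val", "test"),
                         expected_classes=10):
    """Return a list of human-readable problems; an empty list means OK."""
    root = Path(dataset_dir)
    problems = []
    class_sets = {}

    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing split folder: {split_dir}")
            continue
        classes = {p.name for p in split_dir.iterdir() if p.is_dir()}
        class_sets[split] = classes
        if len(classes) != expected_classes:
            problems.append(
                f"{split}: found {len(classes)} class folders, "
                f"expected {expected_classes}"
            )

    # Every split should contain the same class folder names.
    if len({frozenset(c) for c in class_sets.values()}) > 1:
        problems.append("class folders differ between splits")

    return problems
```

Printing the returned list before training starts makes a misnamed or missing folder obvious early.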

📚 Documentation

QUICKSTART.md - Quick setup and first run
CONTRIBUTING.md - How to contribute
ROADMAP.md - Future improvements
.copilot-instructions.md - GitHub Copilot guidelines

🤝 Contributing

See CONTRIBUTING.md for guidelines on:

Code style and conventions
Development setup
Making changes and testing
Pull request process

🗺️ Roadmap

See ROADMAP.md for planned features:

v1.1: Model ensemble, TFLite optimization
v1.2: Web/mobile deployment
v1.3: Explainability, active learning
v2.0: Vision Transformers, advanced architectures

📞 Support

For issues or questions:

Check QUICKSTART.md
Review troubleshooting section above
Open an issue on GitHub

🙏 Acknowledgments

DenseNet121: Huang et al., 2016
Tomato Dataset: Kaggle Tomato Disease Dataset
TensorFlow & Keras: Google AI

Last Updated: December 2025
Status: Production Ready ✅
Framework: Keras 3 + TensorFlow 2.13+
