Deep learning-based defect detection system for wind turbine blade inspection using drone imagery and Faster R-CNN.
This project fine-tunes a Faster R-CNN model with a ResNet-50 FPN backbone on the DTU Wind Turbine dataset to detect 5 types of blade defects from drone-captured images.
| ID | Code | Description |
|---|---|---|
| 1 | VG;MT | Vortex Generator / Missing Tape |
| 2 | LE;ER | Leading Edge Erosion |
| 3 | LR;DA | Lightning Receptor Damage |
| 4 | LE;CR | Leading Edge Crack |
| 5 | SF;PO | Surface Pollution |
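
The class table above can be kept as a small lookup for decoding a detector's output. The following is a minimal sketch, not code from this repository: `DEFECT_CLASSES`, `decode_detections`, and the `min_score` threshold of 0.5 are hypothetical names and values chosen for illustration. It assumes label 0 is reserved for background, as in torchvision detection models.

```python
# Class IDs and codes copied from the table above.
# Assumption: ID 0 is background and is therefore not listed.
DEFECT_CLASSES = {
    1: ("VG;MT", "Vortex Generator / Missing Tape"),
    2: ("LE;ER", "Leading Edge Erosion"),
    3: ("LR;DA", "Lightning Receptor Damage"),
    4: ("LE;CR", "Leading Edge Crack"),
    5: ("SF;PO", "Surface Pollution"),
}


def decode_detections(labels, scores, min_score=0.5):
    """Hypothetical helper: pair class codes with scores, dropping weak detections."""
    decoded = []
    for label, score in zip(labels, scores):
        if score < min_score or label not in DEFECT_CLASSES:
            continue  # skip low-confidence boxes and background/unknown IDs
        code, _description = DEFECT_CLASSES[label]
        decoded.append((code, score))
    return decoded


# Made-up label IDs and scores standing in for a model's output lists.
example = decode_detections([2, 5, 3], [0.91, 0.32, 0.67])
print(example)
```

The second detection is dropped because its score is below the threshold; a lower `min_score` trades precision for recall.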
Results after 30 epochs of training on 340 images:
| Epoch | Total Loss | Avg Loss | mAP | mAP@0.5 | mAP@0.75 |
|---|---|---|---|---|---|
| 1 | 44.38 | 0.326 | - | - | - |
| 5 | 21.70 | 0.160 | - | - | - |
| 10 | 12.52 | 0.092 | - | - | - |
| 20 | 5.83 | 0.043 | - | - | - |
| 30 | 4.17 | 0.031 | 0.420 | 0.757 | 0.509 |
The model reaches an mAP@0.5 of 75.7% and an mAP@0.75 of 50.9% after 30 epochs. The total loss decreased by over 90% during training (44.38 to 4.17), indicating good convergence. The final mAP of 0.420, averaged across all IoU thresholds, shows that localization remains reasonably tight at stricter overlap requirements, not only at the 0.5 threshold.
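
The loss reduction quoted above can be checked directly from the first and last rows of the results table. This is a small arithmetic sketch; the variable names are illustrative.

```python
# Total loss values taken from the results table (epoch 1 and epoch 30).
first_epoch_loss = 44.38
last_epoch_loss = 4.17

# Relative reduction = (start - end) / start
relative_reduction = (first_epoch_loss - last_epoch_loss) / first_epoch_loss
print(f"Total loss reduction: {relative_reduction:.1%}")
```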
Below are example detections comparing ground truth annotations (left) with model predictions (right):
To generate new sample detection images:

    python main.py --generate_samples

To install, clone the repository, create a virtual environment, and install the dependencies:

    git clone https://github.com/cthadeufaria/computer-vision-defect-detection.git
    cd computer-vision-defect-detection
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install torch torchvision torchmetrics matplotlib pillow pycocotools tensorboard

The dataset is automatically downloaded when you first run training. It includes:
- DTU Drone Inspection Images (~2.5GB)
  - Source: Mendeley Data
- DTU Annotations (COCO format)

Directory layout of the downloaded data:

    DTU - Drone inspection images of wind turbine/
    ├── DTU - Drone inspection images of wind turbine/
    │   ├── Nordtank 2017/    # 286 images
    │   └── Nordtank 2018/    # 398 images (used for training)
    DTU-annotations-main/
    ├── re-annotation/
    │   └── D3/
    │       ├── train.json    # Training annotations
    │       ├── test.json     # Test annotations
    │       └── val.json      # Validation annotations
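
Since the annotations are in COCO format, a quick way to sanity-check a split is to count annotations per category. The sketch below relies only on the standard COCO keys (`images`, `categories`, `annotations`). The sample data is made up, and it is an assumption that the category names in the real files match the defect codes used in this README.

```python
import json
from collections import Counter


def count_annotations_per_category(coco):
    """Count annotations per category name in a COCO-style dictionary."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])


# Tiny made-up example in COCO layout. Real data would be loaded with
# json.load() from DTU-annotations-main/re-annotation/D3/train.json.
sample = json.loads("""
{
  "images": [{"id": 1, "file_name": "a.JPG"}, {"id": 2, "file_name": "b.JPG"}],
  "categories": [{"id": 2, "name": "LE;ER"}, {"id": 4, "name": "LE;CR"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 2, "bbox": [5, 5, 20, 10]},
    {"id": 11, "image_id": 1, "category_id": 4, "bbox": [40, 8, 12, 30]},
    {"id": 12, "image_id": 2, "category_id": 2, "bbox": [0, 0, 15, 15]}
  ]
}
""")
counts = count_annotations_per_category(sample)
print(counts)
```

Note that COCO `bbox` values are `[x, y, width, height]`, whereas the model's predictions use `[x1, y1, x2, y2]`.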

To start training:

    source venv/bin/activate
    python train_wind_turbine.py

Configuration options (edit train_wind_turbine.py):
- num_epochs: Number of training epochs (default: 30)
- batch_size: Batch size (default: 2, increase if you have more memory)
- lr: Learning rate (default: 0.0001)
Expected training time:
- CPU: ~30-60 min/epoch
- CUDA GPU: ~5-10 min/epoch
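
To put the per-epoch figures in perspective, the total time for a default 30-epoch run can be estimated as follows. This is a rough arithmetic sketch using only the ranges quoted above; actual times depend on hardware and batch size.

```python
# Default epoch count and per-epoch ranges quoted in this README.
NUM_EPOCHS = 30
MINUTES_PER_EPOCH = {"CPU": (30, 60), "CUDA GPU": (5, 10)}


def total_hours(num_epochs, minutes_range):
    """Convert a (min, max) minutes-per-epoch range into total hours."""
    low, high = minutes_range
    return (num_epochs * low / 60, num_epochs * high / 60)


estimates = {
    device: total_hours(NUM_EPOCHS, minutes)
    for device, minutes in MINUTES_PER_EPOCH.items()
}
for device, (low, high) in estimates.items():
    print(f"{device}: {low:.1f}-{high:.1f} hours for {NUM_EPOCHS} epochs")
```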
Training metrics are automatically logged to TensorBoard:

    tensorboard --logdir ./runs

Then open http://localhost:6006 in your browser to view:
- Loss/epoch_total: Total loss per epoch
- Loss/batch_*: Individual loss components (classifier, box_reg, objectness, rpn_box_reg)
- Metrics/mAP: Mean Average Precision
- Metrics/mAP_50: mAP at 50% IoU threshold
- Learning_Rate: Learning rate schedule
Run inference on images using the trained model:

    python main.py --image path/to/image.jpg

Or run on a directory of images:

    python main.py --input_dir path/to/images/ --output_dir path/to/results/

To call the model from Python:

    from model import FasterRCNNModel
    from PIL import Image
    import torch
    import torchvision.transforms as T

    # Load model (6 classes: 5 defect types plus background)
    model = FasterRCNNModel(num_classes=6)
    # map_location='cpu' lets a checkpoint saved on a GPU load on a CPU-only machine
    model.load_state_dict(torch.load('./models/faster_rcnn_wind_turbine_v1.pth', map_location='cpu'))
    model.eval()

    # Prepare image
    transform = T.Compose([
        T.Resize((1024, 1024)),
        T.ToTensor(),
        T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])
    image = transform(Image.open('image.jpg').convert('RGB'))

    # Run inference
    with torch.no_grad():
        predictions = model([image])

    # Results
    boxes = predictions[0]['boxes']    # Bounding boxes [x1, y1, x2, y2]
    labels = predictions[0]['labels']  # Class IDs (1-5)
    scores = predictions[0]['scores']  # Confidence scores (0-1)

Trained models are saved with versioning to prevent overwrites:

    models/
    ├── faster_rcnn_wind_turbine_v1.pth
    ├── faster_rcnn_wind_turbine_v2.pth
    └── ...
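
One way such versioning could work is to scan the directory for existing `_vN.pth` files and pick the next free number. The sketch below illustrates that idea; `next_model_path` is a hypothetical helper, not necessarily how `trainer.py` implements it.

```python
import re
import tempfile
from pathlib import Path

# Matches the checkpoint naming scheme shown above and captures the version.
PATTERN = re.compile(r"faster_rcnn_wind_turbine_v(\d+)\.pth$")


def next_model_path(models_dir):
    """Hypothetical helper: return the first unused versioned checkpoint path."""
    models_dir = Path(models_dir)
    versions = [
        int(match.group(1))
        for path in models_dir.glob("faster_rcnn_wind_turbine_v*.pth")
        if (match := PATTERN.search(path.name))
    ]
    next_version = max(versions, default=0) + 1
    return models_dir / f"faster_rcnn_wind_turbine_v{next_version}.pth"


# Demonstration in a temporary directory with two existing checkpoints.
with tempfile.TemporaryDirectory() as tmp:
    for version in (1, 2):
        (Path(tmp) / f"faster_rcnn_wind_turbine_v{version}.pth").touch()
    next_name = next_model_path(tmp).name
print(next_name)
```

Taking the maximum existing version, and not the file count, avoids reusing a number if an older checkpoint has been deleted.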
| Loss | Description | Good Value |
|---|---|---|
| loss_classifier | Defect type classification | < 0.1 |
| loss_box_reg | Bounding box accuracy | < 0.1 |
| loss_objectness | Object vs background | < 0.1 |
| loss_rpn_box_reg | Region proposal accuracy | < 0.1 |

| Metric | Description | Good Value |
|---|---|---|
| mAP | Mean Average Precision (all IoU) | > 0.3 |
| mAP@0.5 | mAP at an IoU threshold of 0.5 | > 0.6 |
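
The IoU (intersection over union) thresholds behind these metrics decide whether a predicted box counts as a match for a ground-truth box. The sketch below computes IoU for two boxes in the `[x1, y1, x2, y2]` format; `box_iou` and the example boxes are illustrative and not part of this repository.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = [0, 0, 10, 10]
prediction = [2, 0, 12, 10]  # same size, shifted 2 px to the right
iou = box_iou(ground_truth, prediction)
print(f"IoU = {iou:.3f}, match at 0.5: {iou >= 0.5}, match at 0.75: {iou >= 0.75}")
```

Here the overlap is 80 of 120 units of area, so the prediction counts as a match at the 0.5 threshold but not at 0.75. This is why mAP@0.75 is typically lower than mAP@0.5.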

    computer-vision-defect-detection/
    ├── dataset.py              # Dataset loader with image slicing
    ├── model.py                # Faster R-CNN model definition
    ├── trainer.py              # Training and evaluation logic
    ├── train_wind_turbine.py   # Main training script
    ├── main.py                 # Inference script
    ├── models/                 # Saved model weights (versioned)
    ├── runs/                   # TensorBoard logs (versioned)
    ├── figures/                # Visualization outputs
    └── missing_images.txt      # Log of missing dataset images

- MPS (Apple Silicon) not supported: Faster R-CNN hangs on Apple's Metal GPU. Use CPU or CUDA instead.
- Missing images: 83 images referenced in the annotations are located in Nordtank 2017, but the annotations expect them in Nordtank 2018. Training uses the 340 valid images.
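
A check for missing images can be sketched as below: each annotated file name is looked up in the image directory, and the ones that are absent are written to a log. `find_missing_images` and the file names are hypothetical; the actual logic in `dataset.py` may differ.

```python
import tempfile
from pathlib import Path


def find_missing_images(file_names, image_dir):
    """Hypothetical helper: split annotated file names into found and missing."""
    image_dir = Path(image_dir)
    found, missing = [], []
    for name in file_names:
        (found if (image_dir / name).is_file() else missing).append(name)
    return found, missing


# Demonstration with made-up file names in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "present.JPG").touch()
    found, missing = find_missing_images(["present.JPG", "absent.JPG"], tmp)
    # Write one missing file name per line, mirroring a missing_images.txt log.
    log_path = Path(tmp) / "missing_images.txt"
    log_path.write_text("\n".join(missing) + "\n")
    logged = log_path.read_text()
print(found, missing)
```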
MIT License - see LICENSE file.
- DTU Wind Turbine Dataset: Mendeley Data
- Annotations: imadgohar/DTU-annotations


