Live at https://coder0304-wafershield-ai.streamlit.app
Real-time wafer defect classification using EfficientNet-Lite0 with ONNX deployment.
Reaches about 90% test accuracy at under 10 ms per image on CPU, and is optimized for edge environments.
- Test Accuracy: 90.34%
- ONNX Accuracy: 89.77%
- Latency: 8.55 ms per image (CPU)
- Model Size: 6.76 MB
WaferShield AI is an edge-optimized deep learning system designed to classify semiconductor wafer defects in real time. The system addresses practical constraints in fabrication environments, including latency, compute limitations, and scalability.
The solution combines a lightweight architecture with an efficient deployment pipeline to enable high-throughput inspection without reliance on centralized infrastructure.
- Lightweight EfficientNet-Lite0 architecture optimized for edge deployment
- ONNX FP16 model for fast and portable inference
- Real-time performance with sub-10 ms latency per image
- Class-balanced dataset with a stratified train/validation/test evaluation pipeline
- Grad-CAM based explainability for model interpretability
- Streamlit-based interactive interface for live inference
Semiconductor fabrication produces large volumes of wafer inspection data. Traditional inspection workflows rely on manual analysis or centralized systems, leading to:
- Increased latency
- High infrastructure costs
- Bandwidth constraints
- Limited scalability
WaferShield AI enables localized, real-time defect classification suitable for deployment on edge devices.
- Dataset: WM-811K (LSWMD)
- Source: https://www.kaggle.com/datasets/qingyi/wm811k-wafer-map
- Total available samples: 811,457
- Center
- Clean
- Donut
- Edge-Loc
- Edge-Ring
- Loc
- Random
- Scratch
- Samples per class: 149
- Total dataset size: 1,192 images
- Train/Validation/Test split: 70/15/15
- Stratified and balanced across all classes
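The balanced subset and stratified split described above can be sketched as follows. This is a minimal illustration, not the project's actual data pipeline: the `stratified_split` helper, the placeholder sample IDs, the fixed seed, and the choice to round the validation and test counts down per class are all assumptions.

```python
import random

CLASSES = ["Center", "Clean", "Donut", "Edge-Loc",
           "Edge-Ring", "Loc", "Random", "Scratch"]
SAMPLES_PER_CLASS = 149  # from the dataset description


def stratified_split(samples_by_class, val_frac=0.15, test_frac=0.15, seed=0):
    """Split each class separately so every split keeps the class balance.

    Hypothetical helper: per-class validation/test counts are rounded down,
    and the remainder goes to the training split.
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, items in samples_by_class.items():
        items = list(items)
        rng.shuffle(items)
        n_val = int(len(items) * val_frac)
        n_test = int(len(items) * test_frac)
        splits["val"] += [(x, label) for x in items[:n_val]]
        splits["test"] += [(x, label) for x in items[n_val:n_val + n_test]]
        splits["train"] += [(x, label) for x in items[n_val + n_test:]]
    return splits


# Placeholder sample IDs stand in for the wafer map images.
data = {c: [f"{c}_{i}" for i in range(SAMPLES_PER_CLASS)] for c in CLASSES}
splits = stratified_split(data)
sizes = {name: len(rows) for name, rows in splits.items()}
print(sizes)  # {'train': 840, 'val': 176, 'test': 176}
```

Under these assumptions each class contributes 105 training, 22 validation, and 22 test samples, for 1,192 images in total.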
The system uses EfficientNet-Lite0 with transfer learning.
Key considerations:
- Optimized for mobile and embedded systems
- Strong accuracy-to-compute tradeoff
- Suitable for real-time inference scenarios
- Accuracy: 90.34%
- Macro F1 Score: ~0.90
| Class | Precision | Recall | F1 Score |
|---|---|---|---|
| Center | 0.92 | 1.00 | 0.96 |
| Clean | 0.91 | 0.91 | 0.91 |
| Donut | 0.88 | 1.00 | 0.94 |
| Edge-Loc | 0.67 | 0.91 | 0.77 |
| Edge-Ring | 1.00 | 0.86 | 0.93 |
| Loc | 1.00 | 0.64 | 0.78 |
| Random | 1.00 | 1.00 | 1.00 |
| Scratch | 1.00 | 0.91 | 0.95 |
The Loc class remains the most challenging (recall 0.64) due to its similarity with Edge-Loc patterns; Edge-Loc correspondingly has the lowest precision (0.67).
The model shows strong class separation with minor confusion between spatially similar defect types.
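As a worked check of the table above, the per-class F1 scores and the macro F1 can be recomputed from the listed precision and recall values. This sketch uses only the rounded numbers from the table, so it illustrates the formulas rather than reproducing the project's evaluation script.

```python
# Precision and recall copied from the per-class results table.
TABLE = {
    "Center": (0.92, 1.00),
    "Clean": (0.91, 0.91),
    "Donut": (0.88, 1.00),
    "Edge-Loc": (0.67, 0.91),
    "Edge-Ring": (1.00, 0.86),
    "Loc": (1.00, 0.64),
    "Random": (1.00, 1.00),
    "Scratch": (1.00, 0.91),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


per_class_f1 = {name: f1(p, r) for name, (p, r) in TABLE.items()}
# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

for name, score in per_class_f1.items():
    print(f"{name:<10} F1 = {score:.2f}")
print(f"Macro F1 = {macro_f1:.2f}")
```

Because the inputs are already rounded to two decimals, a recomputed score can differ from the table by 0.01 (Edge-Ring comes out as 0.92 here versus 0.93 in the table). The macro average still rounds to 0.90.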
- Format: FP16 ONNX
- Model Size: 6.76 MB
- ONNX Accuracy: 89.77%
The ONNX model enables efficient cross-platform deployment and edge inference.
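To put the gap between the test accuracy (90.34%) and the ONNX accuracy (89.77%) in perspective, the percentages can be converted back to counts of correctly classified images. The sketch assumes both figures were measured on the same 176 test images reported in the benchmark; `correct_count` is a hypothetical helper.

```python
TEST_IMAGES = 176  # assumed to be the evaluation set for both figures


def correct_count(accuracy_pct, n_images=TEST_IMAGES):
    """Convert an accuracy percentage into a whole number of correct predictions."""
    return round(accuracy_pct / 100 * n_images)


baseline_correct = correct_count(90.34)  # reported test accuracy
onnx_correct = correct_count(89.77)      # reported FP16 ONNX accuracy

print(baseline_correct, onnx_correct, baseline_correct - onnx_correct)
# 159 158 1
```

Under that assumption, the FP16 ONNX model gets one fewer image right than the original model, on net.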
- Total Test Images: 176
- Total Inference Time: 1.5056 seconds
- Average Latency: 8.55 ms per image
- Throughput: 116.9 images per second
- Runtime: ONNX Runtime (CPUExecutionProvider)
These results indicate that CPU-only inference is fast enough for real-time use in high-volume inspection systems.
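The average latency and throughput above follow directly from the image count and the total inference time. The sketch below shows that arithmetic, plus a generic timing loop. `benchmark` and `summarize` are hypothetical helpers, and `predict` stands in for whatever callable runs a single inference (for example a wrapper around an ONNX Runtime session).

```python
import time


def summarize(n_images, total_seconds):
    """Derive average latency (ms per image) and throughput (images per second)."""
    latency_ms = total_seconds / n_images * 1000.0
    throughput = n_images / total_seconds if total_seconds > 0 else float("inf")
    return {"latency_ms": latency_ms, "throughput_ips": throughput}


def benchmark(predict, inputs):
    """Time `predict` over all inputs with a monotonic high-resolution clock."""
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    return summarize(len(inputs), time.perf_counter() - start)


# Reproduce the reported figures from the reported totals.
reported = summarize(176, 1.5056)
print(f"{reported['latency_ms']:.2f} ms per image")            # 8.55 ms per image
print(f"{reported['throughput_ips']:.1f} images per second")   # 116.9 images per second
```

In practice a few warm-up runs before timing help keep one-off initialization costs out of the average.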
Grad-CAM visualizations highlight regions contributing to predictions:
- Center defects show strong central activation
- Edge-Loc focuses on boundary regions
- Random defects show distributed activation
- Loc shows subtle localized activation
This suggests the model attends to meaningful defect patterns rather than background noise.
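For reference, the core Grad-CAM computation can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic arrays, not the project's implementation: it assumes the activations of the last convolutional layer and the gradients of the predicted class score with respect to them have already been extracted, each with shape `(channels, height, width)`.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap from (C, H, W) arrays."""
    # Channel weights: global average of the gradients over the spatial dims.
    weights = gradients.mean(axis=(1, 2))              # shape (C,)
    # Weighted sum of the activation maps across channels.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # ReLU keeps only regions that push the class score up.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Synthetic example: channel 0 fires at the center and supports the class,
# channel 1 fires along the top row and counts against it.
activations = np.zeros((2, 4, 4))
activations[0, 1:3, 1:3] = 1.0
activations[1, 0, :] = 1.0
gradients = np.stack([np.full((4, 4), 0.5), np.full((4, 4), -0.5)])

heatmap = grad_cam(activations, gradients)
print(heatmap)
```

In this toy case the heatmap is 1.0 on the central 2x2 block and 0.0 elsewhere, which mirrors the "strong central activation" pattern described for Center defects.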
- Python
- PyTorch (training)
- ONNX Runtime (inference)
- Streamlit (deployment interface)
- NumPy, OpenCV, Pillow
WaferShield-AI/
│
├── app.py # Streamlit application
├── requirements.txt # Deployment dependencies
├── README.md # Project documentation
│
├── results/ # Visual outputs and evaluation artifacts
│ ├── confusion_matrix.png
│ ├── gradcam_Center.png
│ ├── gradcam_Edge-Loc.png
│ ├── gradcam_Loc.png
│ └── gradcam_Random.png
streamlit run app.py
python src/train.py
python src/evaluate.py
- Edge-optimized model under 7 MB
- Real-time inference with sub-10 ms latency
- Balanced multi-class dataset
- Robust evaluation and benchmarking
- ONNX deployment pipeline
- Explainability with Grad-CAM
Planned future work:
- Validation on external datasets
- Integration with embedded AI toolchains (e.g., NXP eIQ)
- Hardware-specific optimizations
- Real-time industrial deployment pipeline
This project is intended for academic and research purposes.
