An automated defect classification system for semiconductor wafer maps using Convolutional Neural Networks (CNN) to identify and categorize spatial defect patterns.
This project develops an automated defect classification system that uses Convolutional Neural Networks (CNNs) to categorize spatial defect patterns on semiconductor wafer maps. The system aims to:
- Replace manual inspection processes
- Reduce human error in defect classification
- Accelerate identification of process anomalies
- Enable faster Root Cause Analysis (RCA)
In semiconductor manufacturing, wafer map patterns provide critical insights into fabrication process health. This automated system delivers:
Rapid identification of defect clusters (e.g., Scratch, Edge-Ring) enables process engineers to perform Root Cause Analysis faster.
Automating classification reduces the "man-to-machine" ratio and minimizes misclassification risks due to operator fatigue.
Detecting systematic patterns vs. random defects helps pinpoint specific faulty process steps, such as:
- Etching uniformity issues
- CMP (Chemical Mechanical Planarization) handling errors
- Equipment-specific failures
Source: WM811K Wafer Map Dataset
- Scale: 811,000+ wafer maps
- Format: 2D images with pixel values representing die status
  - 0: Background
  - 1: Good Die
  - 2: Defective Die
- Classes: 9 categories
- 8 defect patterns: Center, Donut, Edge-Loc, Edge-Ring, Loc, Random, Scratch, Near-full
- 1 normal class: None
- Challenge: Natural class imbalance (common in manufacturing data)
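The pixel encoding above can be read directly from an array. The following is a minimal sketch, assuming each wafer map is available as a 2D integer array; the helper name `die_stats` and the toy map are hypothetical and not taken from the project code.

```python
import numpy as np

# Pixel codes as listed in the dataset description above.
BACKGROUND, GOOD_DIE, DEFECTIVE_DIE = 0, 1, 2

def die_stats(wafer_map):
    """Count good and defective dies and return the defective fraction."""
    wafer_map = np.asarray(wafer_map)
    good = int(np.count_nonzero(wafer_map == GOOD_DIE))
    defective = int(np.count_nonzero(wafer_map == DEFECTIVE_DIE))
    total = good + defective  # background pixels are not dies
    ratio = defective / total if total else 0.0
    return {"good": good, "defective": defective, "defect_ratio": ratio}

# Toy 4x4 map (hypothetical data, not from WM811K).
toy_map = [
    [0, 1, 1, 0],
    [1, 2, 1, 1],
    [1, 1, 2, 1],
    [0, 1, 1, 0],
]
stats = die_stats(toy_map)
print(stats)
```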
Please be aware that the original dataset (WM-811K) contains a typo in a column header: `trianTestLabel` is used instead of `trainTestLabel`. To maintain compatibility with the raw data, this project uses the original spelling (`trianTestLabel`) throughout the codebase.
The dataset exhibits significant class imbalance, with Edge-Ring and Edge-Loc being the most common defect types.
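One common way to compensate for this kind of imbalance is to weight each class inversely to its frequency. The sketch below uses the `n_samples / (n_classes * class_count)` heuristic (the same formula as scikit-learn's "balanced" mode). It is an illustration only: this README does not state which technique the project actually uses, and the function name and toy labels are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Toy label list (hypothetical counts, not the real WM811K distribution).
toy_labels = ["Edge-Ring"] * 6 + ["Edge-Loc"] * 3 + ["Donut"] * 1
weights = balanced_class_weights(toy_labels)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

Rare classes receive larger weights. If such weights were passed to Keras through the `class_weight` argument of `model.fit`, the keys would need to be integer class indices rather than label strings.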
- Python 3.11+
- pip package manager
- Clone the repository
git clone https://github.com/NUSSETO/Semiconductor_Project.git
cd Semiconductor_Project
- Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Download dataset
  - Download the WM811K dataset from Kaggle
  - Place LSWMD.pkl in the project root directory
- Analyze class distribution
- Visualize spatial defect patterns
- Understand data characteristics
- Resize wafer maps to standardized dimensions (64×64 and 96×96 for comparative analysis)
- Apply denoising techniques
- Handle class imbalance
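The resizing step can be sketched as follows. This is a minimal example under two assumptions that the README does not confirm: nearest-neighbor sampling is used so that pixel values stay in {0, 1, 2}, and the three input channels are a one-hot encoding of those values. Both function names are hypothetical, and denoising is not covered here.

```python
import numpy as np

def resize_nearest(wafer_map, size=96):
    """Resize a 2D wafer map to size x size with nearest-neighbor sampling."""
    wafer_map = np.asarray(wafer_map)
    h, w = wafer_map.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return wafer_map[np.ix_(rows, cols)]

def one_hot_channels(wafer_map, num_values=3):
    """Turn an integer map with values 0..num_values-1 into H x W x num_values."""
    return np.eye(num_values, dtype=np.float32)[np.asarray(wafer_map)]

# Toy 2x2 map upscaled to 4x4 (hypothetical data).
toy_map = [[0, 1], [2, 1]]
resized = resize_nearest(toy_map, size=4)
encoded = one_hot_channels(resized)
print(resized)
print(encoded.shape)
```

Interpolating methods such as bilinear resizing would produce fractional values between the die-status codes, which is why nearest-neighbor sampling is assumed here.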
Design and train a CNN tailored for spatial pattern recognition:
- Convolutional layers for feature extraction
- MaxPooling for dimensionality reduction
- Dense layers for classification
- Dropout for regularization
- Monitor training/validation metrics
- Optimize hyperparameters
- Implement data augmentation
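As one possible form of data augmentation, a wafer map can be rotated in 90° steps and mirrored, which yields up to eight variants per sample. This sketch assumes the defect label is unchanged by rotation and reflection; the README does not specify which augmentations the project applies, and the function name is hypothetical.

```python
import numpy as np

def dihedral_variants(wafer_map):
    """Return the 8 rotations and reflections of a square wafer map."""
    wafer_map = np.asarray(wafer_map)
    variants = []
    for k in range(4):
        rotated = np.rot90(wafer_map, k)      # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))   # mirrored copy of that rotation
    return variants

# Toy 2x2 map with no symmetry (hypothetical data).
toy_map = [[0, 1], [2, 1]]
variants = dihedral_variants(toy_map)
print(len(variants))
```

Augmenting only the under-represented classes is one way to combine this with class-imbalance handling, though that choice is a design decision rather than something stated in this document.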
Investigate model performance through confusion matrices:
| Metric | Value |
|---|---|
| Overall Accuracy | ~78% |
| Macro Recall | ~79% |
| Training Time | ~28 minutes |
| Model Size | ~19 MB |
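Overall accuracy and macro recall can both be derived from a confusion matrix. The sketch below shows the arithmetic on a toy 3-class matrix; the numbers are hypothetical and unrelated to the results in the table, and the function name is an assumption.

```python
def accuracy_and_macro_recall(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    # Per-class recall: correct predictions divided by samples of that class.
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return correct / total, sum(recalls) / len(recalls)

# Toy imbalanced confusion matrix (hypothetical data).
toy_cm = [
    [70, 30, 0],
    [1, 9, 0],
    [0, 1, 9],
]
accuracy, macro_recall = accuracy_and_macro_recall(toy_cm)
print(f"accuracy={accuracy:.3f}, macro_recall={macro_recall:.3f}")
```

Macro recall averages the per-class recalls with equal weight, so on imbalanced data it can differ noticeably from overall accuracy, which is dominated by the majority class.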
Sequential Model:
├── Input (96 × 96 × 3)
├── Conv2D (32 filters, 5×5)
├── MaxPooling2D (2×2)
├── Conv2D (64 filters, 3×3)
├── MaxPooling2D (2×2)
├── Conv2D (128 filters, 3×3)
├── MaxPooling2D (2×2)
├── Conv2D (256 filters, 3×3)
├── MaxPooling2D (2×2)
├── Flatten
├── Dense (128 units)
├── Dropout (0.5)
└── Dense (8 units, softmax)
Key Features:
- Input shape: 96×96×3
- Activation: ReLU for hidden layers, Softmax for output
- Optimizer: Adam
- Loss function: Categorical Crossentropy
- Note on Input Size: While the exploratory phase and initial training (as seen in main.ipynb) experimented with 64×64 resolution for efficiency, the final deployed model architecture has been optimized for 96×96 resolution to capture finer details of defect patterns.
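The layer list above is enough to work out feature-map sizes and parameter counts by hand. The sketch below does that arithmetic in plain Python. The padding mode is not stated in this README, so it is a parameter here with "same" as an assumed default; stride 1 convolutions and bias terms in every layer are also assumptions, and all function names are hypothetical.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units

def count_parameters(input_size=96, in_ch=3, padding="same"):
    """Return (flattened units, total trainable parameters) for the stack above."""
    conv_layers = [(32, 5), (64, 3), (128, 3), (256, 3)]  # (filters, kernel size)
    size, channels, total = input_size, in_ch, 0
    for filters, kernel in conv_layers:
        total += conv_params(kernel, channels, filters)
        if padding == "valid":
            size -= kernel - 1  # unpadded convolution shrinks the map
        size //= 2              # 2x2 max pooling halves each spatial dimension
        channels = filters
    flat = size * size * channels
    total += dense_params(flat, 128) + dense_params(128, 8)
    return flat, total

flat_units, total_params = count_parameters()
print(flat_units, total_params)  # 9216 1570760
```

Under the "same" padding assumption the spatial size goes 96 → 48 → 24 → 12 → 6, giving 9,216 flattened units and about 1.57 million parameters, most of them in the first Dense layer. With "valid" padding the same function returns 4,096 flattened units and 915,400 parameters.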
- Start Jupyter Notebook
jupyter notebook
- Open main.ipynb
- Run all cells or execute step-by-step:
# Load the trained model
import numpy as np
from tensorflow.keras.models import load_model
# Option 1: Load the modern Keras format (Recommended)
model = load_model('model/wafer_defect_model.keras')
# Option 2: Load the legacy H5 format
# model = load_model('model/wafer_defect_model.h5')
# Predict on new wafer maps (expects a batch of preprocessed 96×96×3 inputs)
prediction = model.predict(preprocessed_wafer_map)
# Index of the most probable class for each wafer map in the batch
defect_class = np.argmax(prediction, axis=-1)
Semiconductor_Project/
├── main.ipynb # Main analysis notebook
├── main.html # For easy access
├── LICENSE # MIT License
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore rules
├── README.md # This file
├── img/ # Visualization images
│   ├── Defect_distribution.png
│   ├── Defect_size.png
│   ├── Resize_example.png
│   ├── Training_history.png
│   ├── Original_CM.png
│   ├── Second_CM.png
│   └── Final_CM.png
├── model/
│   ├── wafer_defect_model.h5
│   └── wafer_defect_model.keras
└── data/
    └── LSWMD.pkl # Dataset (not tracked in git)
- Dataset: WM811K Wafer Map Dataset
- Inspired by semiconductor manufacturing quality control practices
- Built with TensorFlow/Keras/Antigravity/Gemini
This project is open-source and available under the MIT License.
Author: Jason Huang
Focus: Semiconductor Manufacturing Quality Control, Machine Learning, Data Analysis