Anand Kumar • Kavinder Roghit Kanthen • Josna John
This research aims to develop a more effective and accurate automated diagnostic tool for skin cancer analysis by simultaneously addressing lesion segmentation and classification tasks. Traditional methods typically handle these tasks separately, which can lead to inefficiencies and reduced accuracy.
The GS-TransUNet model aims to integrate these tasks into a cohesive framework, leveraging advanced machine learning techniques to improve diagnostic precision and speed.
By combining 2D Gaussian Splatting with a Transformer UNet architecture, this study seeks to enhance the consistency and accuracy of segmentation masks, which are crucial for the reliable classification of skin lesions. This integrated approach improves diagnostic accuracy and reduces the computational cost of running separate processing stages, paving the way for real-time applications in clinical settings.
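To give a rough intuition for how 2D Gaussians can represent a smooth segmentation mask, here is a minimal, generic sketch that rasterises a few anisotropic Gaussians onto a pixel grid. It is not the GS-TransUNet implementation: the function name `splat_gaussians`, the dictionary keys, and the compositing rule are all assumptions chosen for illustration.

```python
import math


def splat_gaussians(height, width, gaussians):
    """Accumulate 2D Gaussians into a height x width soft mask with values in [0, 1].

    Each Gaussian is a dict (hypothetical layout) with a centre (cx, cy),
    standard deviations (sx, sy), a rotation theta in radians, and an opacity.
    """
    mask = [[0.0] * width for _ in range(height)]
    for g in gaussians:
        cos_t, sin_t = math.cos(g["theta"]), math.sin(g["theta"])
        for y in range(height):
            for x in range(width):
                dx, dy = x - g["cx"], y - g["cy"]
                # Rotate the pixel offset into the Gaussian's own axes.
                u = cos_t * dx + sin_t * dy
                v = -sin_t * dx + cos_t * dy
                density = math.exp(-0.5 * ((u / g["sx"]) ** 2 + (v / g["sy"]) ** 2))
                alpha = g["opacity"] * density
                # "Over" compositing keeps the accumulated value inside [0, 1].
                mask[y][x] += (1.0 - mask[y][x]) * alpha
    return mask


if __name__ == "__main__":
    blob = {"cx": 4, "cy": 4, "sx": 2.0, "sy": 1.0, "theta": 0.0, "opacity": 1.0}
    soft_mask = splat_gaussians(9, 9, [blob])
    for row in soft_mask:
        print(" ".join(f"{value:.2f}" for value in row))
```

Because every pixel value is a smooth function of the Gaussian parameters, a mask built this way has soft, continuous boundaries rather than hard per-pixel decisions, which is one plausible reading of the consistency benefit described above.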
Get the pre-trained Vision Transformer models from Google:
- Available Models: R50-ViT-B_16, ViT-B_16, ViT-L_16
- Source: Google Cloud Storage - ViT Models
# Download and set up the ViT model (replace {MODEL_NAME} with one of the models listed above)
wget https://storage.googleapis.com/vit_models/imagenet21k/{MODEL_NAME}.npz
mkdir -p ../model/vit_checkpoint/imagenet21k
mv {MODEL_NAME}.npz ../model/vit_checkpoint/imagenet21k/{MODEL_NAME}.npz

📁 Required Datasets
- ISIC-2017 Dataset
  - 📥 Download ISIC-2017
  - 🔬 Comprehensive skin lesion dataset
- PH-2 Dataset
  - 📥 Download PH-2
  - 🏥 Dermoscopy image database
⚠️ Important: Update the root paths in dataset.py to match your local setup.
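Before training, it may help to confirm that the checkpoint and dataset directories are where the code expects them. The sketch below is a hypothetical pre-flight helper, not part of the repository: the model names, URL pattern, and checkpoint layout come from the download instructions above, while the function names and the example dataset roots are placeholders to replace with the paths you set in dataset.py.

```python
from pathlib import Path

# Model names and URL pattern are taken from the download instructions above.
VIT_MODELS = ("R50-ViT-B_16", "ViT-B_16", "ViT-L_16")
VIT_BASE_URL = "https://storage.googleapis.com/vit_models/imagenet21k"


def checkpoint_url(model_name):
    """Return the download URL for one of the listed ViT checkpoints."""
    if model_name not in VIT_MODELS:
        raise ValueError(f"unknown model {model_name!r}; expected one of {VIT_MODELS}")
    return f"{VIT_BASE_URL}/{model_name}.npz"


def checkpoint_path(model_name, model_dir="../model"):
    """Return where the setup commands place the downloaded checkpoint."""
    return Path(model_dir) / "vit_checkpoint" / "imagenet21k" / f"{model_name}.npz"


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [Path(p) for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # Placeholder dataset roots: substitute the values used in dataset.py.
    required = [
        checkpoint_path("R50-ViT-B_16"),
        Path("/path/to/ISIC-2017"),
        Path("/path/to/PH2"),
    ]
    for path in missing_paths(required):
        print(f"missing: {path}")
```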
🛠️ Installation Steps
- Python Environment
  # Requires Python 3.7+
  python --version
- Install Dependencies
  pip install -r model/requirements.txt
🚀 Start Training
# Run training with GPU acceleration
CUDA_VISIBLE_DEVICES=0 python train.py --xp_name gauss

🔄 Note: The script automatically runs testing after training completion.
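If you prefer to launch runs from Python (for example, to queue several experiment names), the shell command above can be wrapped as follows. This is a minimal sketch: the helper names are hypothetical, and `--xp_name` is the only `train.py` flag used because it is the only one documented here.

```python
import os
import subprocess
import sys


def training_command(xp_name):
    """Build the argument list for one training run of train.py."""
    return [sys.executable, "train.py", "--xp_name", xp_name]


def training_env(gpu_id=0):
    """Copy the current environment and pin the run to a single GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_training(xp_name, gpu_id=0):
    """Run train.py and wait for it to finish."""
    # check=True raises CalledProcessError if train.py exits with a non-zero status.
    return subprocess.run(training_command(xp_name), env=training_env(gpu_id), check=True)
```

Calling `run_training("gauss")` from the directory that contains `train.py` mirrors the command shown above.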
| Feature | Description |
|---|---|
| 🔬 Dual Task Learning | Simultaneous segmentation and classification |
| ⚡ 2D Gaussian Splatting | Enhanced feature representation |
| 🤖 Transformer UNet | Advanced attention mechanisms |
| 🚀 Real-time Inference | Optimized for clinical deployment |
| 📊 SOTA Performance | State-of-the-art accuracy on medical datasets |
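As an illustration of what dual-task learning usually involves, the sketch below combines a segmentation loss and a classification loss into one training objective. It is a generic example under stated assumptions, not the loss used by GS-TransUNet: the choice of soft Dice plus cross-entropy, the function names, and the `seg_weight` parameter are all hypothetical.

```python
import math


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two flat sequences of values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw class logits and an integer label."""
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(pred_mask, true_mask, logits, label, seg_weight=0.5):
    """Weighted sum of the segmentation and classification losses."""
    seg = dice_loss(pred_mask, true_mask)
    cls = cross_entropy(logits, label)
    return seg_weight * seg + (1.0 - seg_weight) * cls


if __name__ == "__main__":
    pred_mask = [0.9, 0.8, 0.1, 0.2]  # flattened predicted mask probabilities
    true_mask = [1.0, 1.0, 0.0, 0.0]  # flattened ground-truth mask
    logits = [2.0, 0.5, -1.0]         # raw scores for three lesion classes
    print(joint_loss(pred_mask, true_mask, logits, label=0))
```

Optimising a single weighted objective lets both tasks update the shared encoder in one backward pass, which is the usual motivation for handling segmentation and classification together.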
If you find our work helpful in your research, please consider citing:
📋 BibTeX Citation
@inproceedings{kumar2024gstransunet,
author = {Kumar, Anand and Kavinder Roghit, Kanthen and John, Josna},
booktitle = {Medical Imaging 2025: AI/ML},
organization = {SPIE},
title = {GS-TransUNet: integrated 2D Gaussian splatting and transformer UNet for accurate skin lesion analysis},
month = nov,
year = {2024},
doi = {10.1117/12.3046869}
}