This repository contains all code and documentation for my thesis project, "Optimizing Class-Imbalanced Chest X-Ray Disease Classification with Class-Balanced Learning", supervised by Prof. Rafael de Andrade Moral (Department of Mathematics & Statistics, Maynooth University, 2025). The project addresses automated multi-label thoracic disease classification in chest X-ray images, focusing on class imbalance and on reducing missed diagnoses (false negatives).

| Experiment | Configuration | Total FP | Total FN | Mean AUC | Total Images |
|---|---|---|---|---|---|
| Baseline | No class/pos weights | 17,105 | 21,879 | 0.7098 | 108,492 |
| Effective Weight | β = 0.999, included No Finding | 14,573 | 22,419 | 0.7032 | 108,492 |
| Effective Weight | β = 0.999, excluded No Finding | 1,044 | 22,974 | 0.7874 | 48,131 |
| Inverse Frequency | Excluded No Finding | 36,676 | 10,791 | 0.7433 | 48,131 |
| Inverse Frequency | Included No Finding | 40,125 | 16,001 | 0.7487 | 108,492 |

| Class | Probability |
|---|---|
| Atelectasis | 0.0735 |
| Cardiomegaly | 0.0003 |
| Consolidation | 0.0377 |
| Edema | 0.0086 |
| Effusion | 0.8868 |
| Emphysema | 0.0001 |
| Infiltration | 0.3509 |
| Mass | 0.9483 |
| Nodule | 0.8672 |
| Pleural Thickening | 0.7313 |
| Pneumothorax | 0.0032 |
| No Finding | 0.2788 |

Corresponding Grad-CAM visualization example:
These interpretable visualizations highlight model attention for each class, offering transparency for real-world usage.
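
The per-class probabilities in the example table above can be turned into predicted labels by thresholding each class independently. The sketch below is illustrative only: the 0.5 cut-off and the helper name `predicted_labels` are assumptions, not values or functions taken from the thesis code.

```python
# Per-class probabilities copied from the example table above.
probs = {
    "Atelectasis": 0.0735,
    "Cardiomegaly": 0.0003,
    "Consolidation": 0.0377,
    "Edema": 0.0086,
    "Effusion": 0.8868,
    "Emphysema": 0.0001,
    "Infiltration": 0.3509,
    "Mass": 0.9483,
    "Nodule": 0.8672,
    "Pleural Thickening": 0.7313,
    "Pneumothorax": 0.0032,
    "No Finding": 0.2788,
}

THRESHOLD = 0.5  # assumed cut-off; the README does not state the one actually used


def predicted_labels(probabilities, threshold=THRESHOLD):
    """Return classes whose probability meets the threshold, highest first."""
    hits = [(name, p) for name, p in probabilities.items() if p >= threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


labels = predicted_labels(probs)
print(labels)  # ['Mass', 'Effusion', 'Nodule', 'Pleural Thickening']
```

Lowering the threshold trades false negatives for false positives; at 0.3, for instance, Infiltration (0.3509) would also be reported.
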
- Dataset: NIH ChestX-ray14 (released by the National Institutes of Health, USA)
  - 108,948 frontal chest X-rays from 32,717 unique patients, labeled across 14 disease categories plus "No Finding"
  - Acquired from NIH Clinical Center (1992-2015), includes AP/PA views in PNG format, originally 1024x1024 px (resized to 224x224 for model input)
  - Substantial class imbalance (over half are "No Finding"); many rare conditions occur in <1% of cases
- Task: Multi-label disease prediction (one image can have multiple diseases)
- Purpose: Focus on reducing missed diagnoses (false negatives) with clinically practical sensitivity/specificity trade-off
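
Because one image can carry several labels, false positives and false negatives are counted per image/label pair rather than per image. The sketch below assumes the totals reported in the results table are summed over all labels in this way, which this README does not confirm; the function name and the toy arrays are hypothetical.

```python
def count_errors(y_true, y_pred):
    """Count false positives and false negatives over all image/label pairs."""
    fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 0:
                fp += 1  # predicted a disease that is not present
            elif p == 0 and t == 1:
                fn += 1  # missed a disease that is present
    return fp, fn


# Hypothetical toy data: 3 images x 4 labels (not taken from the thesis).
y_true = [[1, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]
y_pred = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

total_fp, total_fn = count_errors(y_true, y_pred)
print(total_fp, total_fn)  # 2 2
```
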
- Backbone: DenseNet-121 (transfer learning from ImageNet)
- Key methods:
  - Class re-weighting (inverse frequency, effective number)
  - Binary cross-entropy loss and variants (PyTorch implementation)
  - SmoothGradCAM/Grad-CAM for heatmaps (explainable X-ray decision visualization)
  - Sigmoid activation (multi-label classification)
- Hyperparameters: Adam (lr=1e-4), batch size 32, 15 epochs, input 224x224
- Acceleration: CUDA-enabled GPU
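
The two re-weighting schemes named above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the class counts are made up, the normalization (weights averaging to 1) is one common convention, and the function names are hypothetical; the thesis notebooks may compute and apply the weights differently. The effective-number weight for a class with `n` samples is `(1 - beta) / (1 - beta ** n)`, using the β = 0.999 listed in the results table.

```python
def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}


def effective_number_weights(counts, beta=0.999):
    """Weight each class by the inverse of its effective number of samples."""
    raw = {name: (1.0 - beta) / (1.0 - beta ** n) for name, n in counts.items()}
    scale = len(raw) / sum(raw.values())  # rescale so the weights sum to num_classes
    return {name: w * scale for name, w in raw.items()}


# Hypothetical class counts, for illustration only (not the real dataset counts).
counts = {"No Finding": 60000, "Effusion": 10000, "Edema": 2000}

inv = inverse_frequency_weights(counts)
eff = effective_number_weights(counts, beta=0.999)

for name in counts:
    print(f"{name}: inverse={inv[name]:.3f}, effective={eff[name]:.3f}")
```

In PyTorch, per-class weights like these are commonly passed as a tensor through the `pos_weight` or `weight` argument of `torch.nn.BCEWithLogitsLoss`; which argument the thesis code uses is not stated in this README.
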
- Sensitivity vs. Specificity: Inverse frequency weights with "No Finding" included delivered the best balance, reducing false negatives by about 27% versus baseline (21,879 to 16,001) at the cost of more than twice as many false positives (17,105 to 40,125), an acceptable trade-off for screening use cases.
- Imbalance Mitigation: Effective number weighting worked best to minimize false positives, but missed more actual cases (higher false negative count).
- General Limits: Results reflect the source institution's demographics/conditions; external validation is necessary for deployment elsewhere. Original images are downscaled, possibly losing subtle features.
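
The trade-offs described above can be checked directly against the results table. The sketch below compares only the runs evaluated on the same 108,492 images, since the "excluded No Finding" runs use a different image set; the helper name `percent_change` is hypothetical.

```python
# FP/FN totals copied from the results table (runs on 108,492 images only).
results = {
    "Baseline": {"fp": 17105, "fn": 21879},
    "Effective Weight (incl. No Finding)": {"fp": 14573, "fn": 22419},
    "Inverse Frequency (incl. No Finding)": {"fp": 40125, "fn": 16001},
}


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


base = results["Baseline"]
for name, r in results.items():
    if name == "Baseline":
        continue
    fp_delta = percent_change(r["fp"], base["fp"])
    fn_delta = percent_change(r["fn"], base["fn"])
    print(f"{name}: FP {fp_delta:+.1f}%, FN {fn_delta:+.1f}%")
```
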
- Open a terminal in the `chest_xray` folder: `pip install -r requirements.txt`
- (Optional, for reproducing full metrics): `python download_data.py` to download and resize data
- Pretrained model checkpoints (all strategies) in `saved_models`
- Example images for CAM viz: `data/sample_images`
- All analysis: `notebooks/`
  - `chesnet.ipynb`: EDA, data prep
  - `excluding_no_finding.ipynb`: Train without 'No Finding'
  - `including_no_finding.ipynb`: Verify best performing model/metrics
- Author: Shishir Ashoka Chandra Mouli
- Supervisor: Prof. Rafael de Andrade Moral
- Institution: Maynooth University, Ireland
- Year: 2025
This repository/code is released for academic/research use.



