Chest CT Scan Cancer Classification
4-class lung cancer classification on chest CT slices (613 training images). Best model: MobileNetV2 at 66.03% test accuracy. 16 models compared: 8 classical models on HOG features, custom CNNs, and transfer learning (TL). MC-Dropout uncertainty flags cases for radiologist review.
613 train / 72 val / 315 test CT slices, 4 classes (3 cancer types + normal)
HOG+classical → custom CNNs → 2-phase TL → MC-Dropout uncertainty
Multi-method pipeline for 4-class chest CT cancer classification.
Dataset
- 613 train / 72 val / 315 test CT scan slices
- 4 classes (training-set counts): adenocarcinoma (195), large cell carcinoma (115), normal (148), squamous cell carcinoma (155)
- Small dataset: the primary constraint limiting deep model performance
Phase 1 — Classical ML (HOG + 8 models)
- HOG: 9 orientations, 8×8 pixels per cell, computed on 64×64 grayscale images
- PCA: 100 components retain 95% of the variance
- Best: Extra Trees (56.83%), followed by SVM-RBF (55.56%)
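The Phase 1 steps can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes scikit-image and scikit-learn, the 2×2 block size and the Extra Trees settings are guesses (neither is stated above), and the images are random stand-ins for CT slices. The helper names `hog_features` and `build_classifier` are hypothetical.

```python
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import make_pipeline


def hog_features(images):
    """Compute one HOG descriptor per 64x64 grayscale image."""
    return np.array([
        hog(
            img,
            orientations=9,           # from the settings listed above
            pixels_per_cell=(8, 8),   # from the settings listed above
            cells_per_block=(2, 2),   # assumption: block size is not stated
        )
        for img in images
    ])


def build_classifier(seed=0):
    """PCA keeping 95% of the variance, followed by Extra Trees."""
    return make_pipeline(
        PCA(n_components=0.95, svd_solver="full"),
        # Assumed hyperparameters; the source does not list them.
        ExtraTreesClassifier(n_estimators=200, random_state=seed),
    )


# Synthetic stand-in data: random 64x64 "slices" with 4 class labels.
rng = np.random.default_rng(0)
demo_images = rng.random((40, 64, 64))
demo_labels = np.arange(40) % 4

demo_X = hog_features(demo_images)
demo_clf = build_classifier().fit(demo_X, demo_labels)
demo_pred = demo_clf.predict(demo_X[:5])
```

With these assumed settings each image yields a 1764-dimensional descriptor (7×7 blocks × 4 cells × 9 orientations), which is why a PCA step before the classifier is useful on only 613 training images.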
Phase 2 — Custom CNNs
All custom CNNs performed poorly (28–34% accuracy): 613 images are insufficient for training CNNs from scratch.
Phase 3 — Transfer Learning
| Model | Test Accuracy |
|---|---|
| EfficientNetV2S | 22.86% |
| ResNet50 | 55.24% |
| Ensemble (MobileNetV2 + ResNet50) | 62.54% |
| VGG16 | 64.76% |
| MobileNetV2 | 66.03% |
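
The table does not say how the ensemble combines its two members. One common option is soft voting, meaning the class probabilities are averaged before taking the argmax. The sketch below assumes that scheme; the function name `soft_vote`, the equal weighting, and the probability values are all illustrative.

```python
import numpy as np


def soft_vote(prob_a, prob_b, weight_a=0.5):
    """Blend two (n_samples, n_classes) probability arrays and pick the argmax."""
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    blended = weight_a * prob_a + (1.0 - weight_a) * prob_b
    return blended.argmax(axis=1), blended


# Hypothetical softmax outputs for two CT slices (4 classes each).
mobilenet_probs = np.array([[0.70, 0.10, 0.10, 0.10],
                            [0.30, 0.40, 0.20, 0.10]])
resnet_probs = np.array([[0.40, 0.30, 0.20, 0.10],
                         [0.10, 0.20, 0.60, 0.10]])

labels, blended = soft_vote(mobilenet_probs, resnet_probs)
```

Under equal weighting a weaker member can pull a correct prediction toward the wrong class, as happens for the second slice here. That is one plausible reading of why the ensemble (62.54%) scores below MobileNetV2 alone (66.03%).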

Training ran in two phases: the pretrained base was frozen for 10–12 epochs, then fine-tuned for 15–25 epochs. Training used label smoothing of 0.1 and focal loss with γ=2.
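
To make the loss settings concrete, here is a NumPy sketch of a categorical focal loss applied to smoothed labels. It assumes the two were combined into a single loss, which the source does not confirm; the function names and example probabilities are illustrative.

```python
import numpy as np


def smooth_labels(one_hot, epsilon=0.1):
    """Move epsilon of the probability mass from the true class to all classes."""
    n_classes = one_hot.shape[1]
    return one_hot * (1.0 - epsilon) + epsilon / n_classes


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean categorical focal loss: -sum_c t_c * (1 - p_c)^gamma * log(p_c)."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_class = -targets * (1.0 - probs) ** gamma * np.log(probs)
    return per_class.sum(axis=1).mean()


# One confident and one uncertain prediction for a class-0 sample.
one_hot = np.array([[1.0, 0.0, 0.0, 0.0]])
confident = np.array([[0.90, 0.05, 0.03, 0.02]])
uncertain = np.array([[0.40, 0.30, 0.20, 0.10]])

targets = smooth_labels(one_hot, epsilon=0.1)
loss_confident = focal_loss(confident, targets, gamma=2.0)
loss_uncertain = focal_loss(uncertain, targets, gamma=2.0)
```

The `(1 - p)^gamma` factor shrinks the contribution of classes the model already predicts confidently, so harder examples dominate the gradient. With `gamma=0` and `epsilon=0` the expression reduces to ordinary cross-entropy.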
MC-Dropout Uncertainty Quantification
Each input is run through 30 forward passes with dropout active, which yields a distribution of confidences instead of a single score. High-variance predictions are flagged for radiologist review. This matters because medical AI should express uncertainty rather than force a prediction when the evidence is weak.
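
The procedure can be sketched with a toy model in NumPy. Everything here is illustrative: `TinyDropoutClassifier` is a hypothetical stand-in for the real network, and the 0.15 standard-deviation threshold is an assumed value, since the source does not state the flagging rule.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


class TinyDropoutClassifier:
    """Toy stand-in for a network head: dropout on features, then a linear layer."""

    def __init__(self, weights, drop_rate=0.5, seed=0):
        self.weights = weights
        self.drop_rate = drop_rate
        self.rng = np.random.default_rng(seed)

    def forward(self, features):
        # Dropout stays active at inference time, so every pass differs.
        keep = self.rng.random(features.shape) >= self.drop_rate
        dropped = features * keep / (1.0 - self.drop_rate)
        return softmax(dropped @ self.weights)


def mc_dropout_predict(model, features, n_passes=30, std_threshold=0.15):
    """Run n_passes stochastic passes and flag samples with unstable predictions."""
    passes = np.stack([model.forward(features) for _ in range(n_passes)])  # (T, N, C)
    mean_probs = passes.mean(axis=0)
    predicted = mean_probs.argmax(axis=1)
    # Spread of the predicted class's probability across the passes.
    std_pred = passes.std(axis=0)[np.arange(len(predicted)), predicted]
    needs_review = std_pred > std_threshold  # assumed flagging rule
    return predicted, mean_probs, std_pred, needs_review


# Random features and weights standing in for 5 CT slices and 4 classes.
rng = np.random.default_rng(1)
demo_features = rng.normal(size=(5, 16))
demo_weights = rng.normal(size=(16, 4))
demo_model = TinyDropoutClassifier(demo_weights, drop_rate=0.5)
predicted, mean_probs, std_pred, needs_review = mc_dropout_predict(demo_model, demo_features)
```

In Keras the analogous step is calling the model with `training=True`, which keeps dropout active but also switches batch normalization layers to batch statistics.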
Why MobileNetV2 Wins
MobileNetV2's lightweight architecture is less prone to overfitting on 613 training examples. The heavier models (ResNet50 at 55.24%, VGG16 at 64.76%) overfit despite 2-phase training.