Augmentation Hierarchy (Most to Least Impactful)
Tier 1: Always Do This
- Random horizontal flip (add vertical flip only when orientation carries no label information, e.g. aerial or microscopy images)
- Random rotation (±15°)
- Random crop and resize
- Color jitter (brightness, contrast, saturation)
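A minimal NumPy-only sketch of part of Tier 1, assuming images are H x W x C float arrays in [0, 1]. The function name `augment_basic` and the parameters `crop_frac` and `max_brightness` are illustrative, not from any library. Rotation, contrast, and saturation are left out to keep the example dependency-free; in practice an image library would usually handle these.

```python
import numpy as np

def augment_basic(img, rng, crop_frac=0.8, max_brightness=0.2):
    """Flip, random crop + resize, and brightness jitter for an HxWxC float image in [0, 1]."""
    h, w = img.shape[:2]
    # Random horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        img = img[:, ::-1]
    # Random crop covering crop_frac of each side (assumed value, tune per dataset).
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    # Nearest-neighbour resize back to (h, w) via index lookup.
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    img = crop[rows][:, cols]
    # Brightness jitter: scale by a factor in [1 - max_brightness, 1 + max_brightness].
    factor = 1.0 + rng.uniform(-max_brightness, max_brightness)
    return np.clip(img * factor, 0.0, 1.0)

rng = np.random.default_rng(0)
augmented = augment_basic(rng.random((32, 48, 3)), rng)
```

Applying the function anew each epoch gives the model a different view of every image, which is where the regularizing effect comes from.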
Tier 2: Usually Helps
- Mixup: blend two images and their labels
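    import numpy as np
    # lam is the mixing weight; Beta(0.2, 0.2) puts most of its mass near 0 or 1
    lam = np.random.beta(0.2, 0.2)
    # x1, x2: image arrays of equal shape; y1, y2: one-hot (or soft) label vectors
    x_mix = lam * x1 + (1 - lam) * x2
    y_mix = lam * y1 + (1 - lam) * y2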
- CutMix: paste a patch from one image onto another and mix the labels in proportion to the patch area
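The CutMix bullet can be sketched in the same NumPy style as the Mixup lines. This is an illustrative version, assuming H x W x C image arrays of equal shape and one-hot label vectors; the function name `cutmix` and the default `alpha=1.0` are assumptions, not values given in this guide.

```python
import numpy as np

def cutmix(x1, y1, x2, y2, rng, alpha=1.0):
    """Paste a random box from x2 into x1 and mix the labels by pasted area."""
    h, w = x1.shape[:2]
    lam = rng.beta(alpha, alpha)
    # Box side lengths chosen so the box covers roughly (1 - lam) of the image.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    # Random box centre; the box is clipped at the image border.
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    x_mix = x1.copy()
    x_mix[top:bottom, left:right] = x2[top:bottom, left:right]
    # Recompute lam from the area actually pasted, since clipping may shrink the box.
    lam = 1.0 - (bottom - top) * (right - left) / (h * w)
    y_mix = lam * y1 + (1.0 - lam) * y2
    return x_mix, y_mix

rng = np.random.default_rng(0)
x_mix, y_mix = cutmix(np.zeros((32, 32, 3)), np.array([1.0, 0.0]),
                      np.ones((32, 32, 3)), np.array([0.0, 1.0]), rng)
```

Recomputing `lam` after clipping keeps the label weights consistent with the pixels that were actually replaced.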
Tier 3: For Very Small Datasets (<200 samples)
- Elastic transformations (for medical images)
- Grid distortion
- Test-time augmentation (TTA): at inference, average the predictions over 8 augmented versions of each input
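One possible reading of "8 augmented versions" is the 8 flip-and-rotate variants of a square image (4 rotations, each with and without a horizontal flip). The sketch below assumes that choice, and assumes a hypothetical `model` callable that maps an H x W x C array to a vector of class probabilities. It only suits tasks where orientation does not change the label.

```python
import numpy as np

def tta_predict(model, img):
    """Average model outputs over the 8 flip/rotation variants of a square HxWxC image."""
    preds = []
    for flip in (False, True):
        base = img[:, ::-1] if flip else img
        for k in range(4):
            # np.rot90 rotates by k * 90 degrees in the (H, W) plane.
            preds.append(model(np.rot90(base, k, axes=(0, 1))))
    return np.mean(preds, axis=0)

# Stand-in model for illustration only: "probabilities" derived from mean intensity.
def toy_model(x):
    m = float(x.mean())
    return np.array([m, 1.0 - m])

probs = tta_predict(toy_model, np.random.default_rng(0).random((16, 16, 3)))
```

For classification the averaged output can be used directly; for dense outputs such as segmentation masks, each prediction would need the inverse transform applied before averaging.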
Tier 4: Synthetic Data
- Train a GAN or use Stable Diffusion to generate additional training samples
- Most useful for rare, domain-specific classes where real samples are scarce