10/Case Study

ML Image Classification

ResNet-50 Fine-Tuning & Interpretability

Role: ML Engineer

Timeline: 2024

Stack: Python, TensorFlow, ResNet-50

Status: Shipped

Impact

Improved model accuracy from 81% to 94.2% (a gain of 13.2 percentage points) through systematic fine-tuning and data augmentation.

Overview

Context

A from-the-fundamentals computer-vision project: fine-tune ResNet-50, then raise accuracy the disciplined way, one controlled experiment at a time, and use Grad-CAM to see what the model was looking at, not just whether it was right. The gain from 81% to 94.2% came from reading the failures, not from throwing more data at the problem.

Challenge

The problem

Baseline model accuracy was 81%, which was insufficient for production use. The challenge was to improve accuracy through principled techniques while maintaining interpretability: understanding not just whether the model was right, but why.

Approach

How I built it

Implemented targeted data augmentation based on failure analysis of misclassified samples

Applied cosine learning-rate scheduling for stable convergence

Used class-weighted loss to handle class imbalance without oversampling

Added strategic dropout regularization to prevent overfitting

Built Grad-CAM visualization pipeline for model interpretability and failure diagnosis
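Two of the steps above, cosine learning-rate scheduling and class-weighted loss, reduce to short formulas. The sketch below shows them in plain Python so the arithmetic is visible. It is an illustration, not the project's training code: the function names, the default learning rates, and the inverse-frequency weighting rule are all assumptions.

```python
import math
from collections import Counter


def cosine_lr(step, total_steps, base_lr=1e-3, min_lr=0.0):
    """Cosine-annealed learning rate at `step` (hypothetical defaults)."""
    # Clamp progress to [0, 1] so steps past the end stay at min_lr.
    progress = min(max(step / total_steps, 0.0), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count[c]).

    This weighting rule is an assumption; the source only says the loss
    was class-weighted.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


# Toy imbalanced label set: three samples of class 0, one of class 1.
weights = balanced_class_weights([0, 0, 0, 1])

# The same prediction confidence costs more on the rare class.
loss_common = weighted_cross_entropy([0.5, 0.5], 0, weights)
loss_rare = weighted_cross_entropy([0.5, 0.5], 1, weights)
```

In a TensorFlow pipeline, a dictionary like `weights` can usually be passed to `Model.fit` through its `class_weight` argument, and `tf.keras.optimizers.schedules.CosineDecay` offers a built-in version of the schedule.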

Technical Decisions

Why these choices

Grad-CAM for interpretability

Accuracy metrics alone don't reveal whether a model is learning meaningful features. Grad-CAM visualizations showed where the model was attending, enabling targeted improvements.
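The core of Grad-CAM is small enough to sketch in NumPy. The version below assumes the feature maps of one convolutional layer and the gradient of the class score with respect to them have already been extracted (in TensorFlow this is typically done with `tf.GradientTape`). The function name and the toy inputs are illustrative, not taken from the project.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Grad-CAM heatmap for one image and one conv layer.

    activations: (H, W, K) feature maps of the chosen layer
    gradients:   (H, W, K) gradient of the class score w.r.t. those maps
    """
    # Channel importance: average each channel's gradient over space.
    channel_weights = gradients.mean(axis=(0, 1))            # shape (K,)
    # Weighted sum of feature maps; ReLU keeps positive evidence only.
    cam = np.maximum(activations @ channel_weights, 0.0)     # shape (H, W)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: channel 0 fires top-left, channel 1 fires bottom-right.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 4.0
acts[1, 1, 1] = 3.0

# The class score rises with channel 0 and falls with channel 1.
grads = np.zeros((2, 2, 2))
grads[..., 0] = 1.0
grads[..., 1] = -1.0

heatmap = grad_cam(acts, grads)  # highlights only the top-left cell
```

Upsampling the heatmap to the input resolution and overlaying it on the image gives the familiar visualization. A heatmap concentrated on the background rather than the object is the kind of failure that accuracy alone does not surface.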

Transfer learning with selective unfreezing

Full fine-tuning on a small dataset risks overwriting the pretrained features (catastrophic forgetting). Selectively unfreezing only the later layers preserved the learned features while adapting the model to the target domain.
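Selective unfreezing comes down to setting a `trainable` flag per layer. The helper below is a minimal sketch that works on any sequence of objects exposing `name` and `trainable`, which Keras layers do. The helper name, the number of unfrozen layers, and the choice to keep batch-normalization layers frozen are assumptions (the last is a common convention, not something the project states). The two small classes are stand-ins so the example runs without TensorFlow.

```python
from dataclasses import dataclass


def unfreeze_last(layers, n_trainable, keep_frozen=("BatchNormalization",)):
    """Freeze every layer, then unfreeze the last `n_trainable` ones.

    Layers whose class name appears in `keep_frozen` stay frozen even
    inside the unfrozen tail. Returns the names of the trainable layers.
    """
    layers = list(layers)
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        frozen_type = type(layer).__name__ in keep_frozen
        layer.trainable = index >= cutoff and not frozen_type
    return [layer.name for layer in layers if layer.trainable]


# Stand-in layer types for the demo (hypothetical, not Keras classes).
@dataclass
class Conv2D:
    name: str
    trainable: bool = True


@dataclass
class BatchNormalization:
    name: str
    trainable: bool = True


toy_layers = [
    Conv2D("conv1"),
    BatchNormalization("bn1"),
    Conv2D("conv2"),
    BatchNormalization("bn2"),
    Conv2D("conv3"),
]

# Unfreeze the last three layers; bn2 falls in that tail but stays frozen.
trainable_names = unfreeze_last(toy_layers, n_trainable=3)
```

With a real backbone the call would look like `unfreeze_last(base_model.layers, 30)`, where `base_model` might come from `tf.keras.applications.ResNet50(include_top=False)` and `30` is a placeholder to tune. In Keras the model needs to be compiled again after changing `trainable` for the change to take effect.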

Outcomes

What shipped

Accuracy improvement from 81% to 94.2%

Grad-CAM visualizations for model interpretability

Systematic failure analysis and targeted data augmentation

Reproducible training pipeline with documented experiments

Takeaways

What I learned

Interpretability tools like Grad-CAM should be part of every computer vision workflow — they reveal what metrics can't

Targeted augmentation based on failure analysis outperforms generic augmentation strategies