Pchambet/cnn-explainability-workshop

What do CNNs actually learn? Visualizing VGG16 filters, explaining predictions with Grad-CAM, testing CNIL anonymization methods, and building a one-shot face recognition system.
🧠 CNN Explainability Workshop

What do CNNs actually learn? And can we trust their decisions?


Deep neural networks are often called "black boxes": they make predictions, but why? This project opens the box. Using VGG16 (a 138-million-parameter image classifier), we investigate what a CNN actually learns, what it looks at when making a decision, and whether we can fool it with anonymization techniques.

This isn't just code that runs. It's an investigation, with quantitative evidence, critical analysis, and real-world implications for AI ethics and regulation.


🧭 The Big Picture

Imagine you have a neural network that can classify 1,000 types of objects in images. You show it a face. Three fundamental questions arise:

| Question | How we answer it | Part |
|---|---|---|
| What can it detect? | Visualize the learned filters at different depths | 1 |
| What does it look at? | Grad-CAM heatmaps + occlusion analysis | 2 |
| Can we use it for faces? | One-shot face recognition via transfer learning | 3 |
| Can we ship it? | Model export, optimization, GDPR compliance | 4 |

Each part builds on the previous one. Part 1 reveals the network's capacity, Part 2 shows how it uses that capacity, Part 3 exploits it for a new task, and Part 4 asks if this is responsible.


Part 1 — What CNN Filters Detect

The idea in simple terms

A CNN is made of stacked layers of "filters": small pattern detectors. Early layers detect simple things (edges, colors). Deeper layers combine them into complex concepts (textures, shapes, faces). But we never told the network to learn edges; it figured that out on its own, just from millions of labeled images.

How do we see what a filter has learned? We start with a random noise image and iteratively modify it to maximally activate a specific filter. The result is the visual pattern that filter responds to most strongly: its "dream", in a sense.
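This gradient-ascent loop fits in a few lines of TensorFlow. A minimal sketch: the image size, step count, and learning rate here are illustrative choices, not necessarily the notebook's exact settings.

```python
import tensorflow as tf

def visualize_filter(model, layer_name, filter_index, steps=30, lr=10.0):
    """Gradient ascent on a noise image to maximize one filter's mean activation."""
    extractor = tf.keras.Model(model.input, model.get_layer(layer_name).output)
    # Start from low-contrast gray noise
    img = tf.Variable(tf.random.uniform((1, 128, 128, 3)) * 0.25 + 0.4)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            activation = extractor(img)
            # Mean activation of the target filter; borders trimmed to avoid edge artifacts
            loss = tf.reduce_mean(activation[:, 2:-2, 2:-2, filter_index])
        grads = tape.gradient(loss, img)
        img.assign_add(lr * tf.math.l2_normalize(grads))  # normalized ascent step
    return img[0].numpy()

# Usage (downloads ImageNet weights on first run):
# vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=False)
# pattern = visualize_filter(vgg, "block3_conv3", filter_index=0)
```

Normalizing the gradient keeps the step size stable across filters whose raw gradient magnitudes differ by orders of magnitude.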

What we found

Filter visualization across 3 hierarchical levels of VGG16

Three rows, three levels of abstraction:

| Layer | Depth | What filters detect | Biological analogy |
|---|---|---|---|
| `block1_conv2` | Shallow | Edges, gradients, color transitions | Simple cells in V1 cortex (Hubel & Wiesel, 1962) |
| `block3_conv3` | Middle | Textures, repetitive patterns, grids | Complex cells combining multiple orientations |
| `block5_conv3` | Deep | High-level structures, face-like shapes | Inferotemporal cortex specialized for object recognition |

This hierarchy was not programmed; it emerged purely from training on ImageNet. The network independently rediscovered principles of biological vision that neuroscientists identified in the 1960s.

Going deeper: quantitative analysis

We don't just look at filters; we measure them:

  • Shannon entropy: how complex is each filter's output? (Higher = richer patterns)
  • Inter-filter correlation: are filters actually different from each other? (Lower = more diverse representations)
  • Spatial variance: does the filter produce structured patterns or uniform noise?

Key insight: deeper layers have higher entropy (more complex patterns) but lower correlation (filters specialize and diversify). This is why transfer learning works: the network builds a rich, non-redundant feature space.
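The first two measurements are easy to reproduce with NumPy. A sketch, with our own choices for the histogram bin count and for summarizing correlation as the mean absolute Pearson coefficient:

```python
import numpy as np

def filter_metrics(feature_maps, bins=64):
    """Diversity metrics for one layer's activations on one image.

    feature_maps: array of shape (H, W, C), one channel per filter.
    Returns (per-filter Shannon entropy in bits, mean |pairwise correlation|).
    """
    h, w, c = feature_maps.shape
    flat = feature_maps.reshape(h * w, c)

    # Shannon entropy of each filter's activation histogram
    entropies = np.empty(c)
    for i in range(c):
        counts, _ = np.histogram(flat[:, i], bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]                                 # 0 * log(0) = 0 by convention
        entropies[i] = -np.sum(p * np.log2(p))

    # Mean absolute off-diagonal Pearson correlation between filters
    corr = np.corrcoef(flat.T)                       # (C, C)
    off_diag = corr[~np.eye(c, dtype=bool)]
    return entropies, float(np.nanmean(np.abs(off_diag)))
```

`nanmean` guards against constant (zero-variance) filters, for which Pearson correlation is undefined.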


Part 2 — Where the Network Actually Looks

Grad-CAM: making decisions visible

When a neural network classifies an image, which pixels influenced that decision? Grad-CAM (Gradient-weighted Class Activation Mapping) answers this by tracing the gradients back to the last convolutional layers and producing a heatmap of "importance".

Think of it like asking the network: "Show me what convinced you."
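The core of Grad-CAM fits in one short function built on `tf.GradientTape`. A minimal sketch; the notebook may differ in details such as normalization or layer choice:

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, batch, layer_name, class_index=None):
    """Heatmap in [0, 1] showing which regions pushed up the class score.

    batch: preprocessed input of shape (1, H, W, 3).
    """
    grad_model = tf.keras.Model(
        model.input, [model.get_layer(layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(batch)
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))   # explain the top prediction
        score = preds[:, class_index]
    grads = tape.gradient(score, conv_out)           # d(score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))     # global-average-pool: one weight per channel
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)
    cam = tf.nn.relu(cam)[0]                         # keep positive influence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()
```

The heatmap has the spatial resolution of the chosen convolutional layer; it is upsampled to image size for overlay display.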

Grad-CAM heatmaps across 3 convolutional layers

The surprising result: the model predicts "jersey" (a type of clothing), not a face. Why? Because VGG16 was trained on ImageNet, a dataset of 1,000 object categories. It knows shirts, not people. The heatmap confirms this: the network focuses on the shirt and neckline, not facial features.

This reveals a critical principle: interpretability depends on the training domain. The same architecture, trained on different data, would look at completely different regions.

ROI analysis: putting numbers on attention

We divided the face into 5 anatomical zones (forehead, eyes, nose, mouth, neck) and measured Grad-CAM activation in each zone across 3 layers. This gives us a quantitative map of where the network directs its attention, not just a visual impression.
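Given a Grad-CAM heatmap, the zone scores reduce to slicing and averaging. A sketch with hypothetical horizontal-band boundaries; the notebook's actual zone coordinates are not reproduced here:

```python
import numpy as np

# Hypothetical zone boundaries, as fractions of image height (each spans full width)
ZONES = {
    "forehead": (0.00, 0.25),
    "eyes":     (0.25, 0.40),
    "nose":     (0.40, 0.60),
    "mouth":    (0.60, 0.75),
    "neck":     (0.75, 1.00),
}

def roi_activation(heatmap, zones=ZONES):
    """Mean heatmap activation inside each zone (higher = more attention)."""
    h = heatmap.shape[0]
    return {name: float(heatmap[int(top * h):int(bot * h)].mean())
            for name, (top, bot) in zones.items()}
```

Repeating this for each of the 3 layers yields the 5-zone by 3-layer attention table.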

Occlusion sensitivity: a second opinion

Grad-CAM uses gradients (what the network could change). Occlusion analysis uses something more direct: we physically block parts of the image with a gray patch and measure how much the prediction confidence drops.

Occlusion sensitivity map showing critical image regions

Where Grad-CAM says "this area contributed to the decision", occlusion says "without this area, the decision changes". When both methods agree, we have strong evidence.

| Method | What it measures | Speed | Reliability |
|---|---|---|---|
| Grad-CAM | Gradient-based importance | Fast (1 pass) | Can miss redundant features |
| Occlusion | Actual impact when removed | Slow (N² passes) | Model-agnostic ground truth |
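Occlusion analysis is a brute-force double loop over patch positions. A minimal sketch; the patch size, stride, and gray fill value are typical choices, not necessarily the notebook's:

```python
import numpy as np

def occlusion_map(predict_fn, image, class_index, patch=16, stride=16, fill=0.5):
    """Confidence drop at each patch position (large drop = important region).

    predict_fn: callable mapping a (1, H, W, C) batch to class probabilities.
    image: single image of shape (H, W, C).
    """
    h, w = image.shape[:2]
    base = predict_fn(image[None])[0][class_index]   # unoccluded confidence
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    sens = np.zeros((rows, cols), dtype=np.float32)
    for i in range(rows):
        for j in range(cols):
            occluded = image.copy()
            # Gray out one patch and re-run the model
            occluded[i*stride:i*stride+patch, j*stride:j*stride+patch] = fill
            sens[i, j] = base - predict_fn(occluded[None])[0][class_index]
    return sens
```

This is the N² cost in the table above: one forward pass per patch position, which is why occlusion is slow but model-agnostic.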

Part 2b — Challenging CNIL Anonymization Guidelines

The experiment

The French data protection authority (CNIL) recommends masking the eyes to anonymize faces. But is that enough to fool a modern CNN?

We tested 4 anonymization strategies with increasing coverage and measured the actual confidence drop on VGG16:

Comparison of 4 anonymization methods and their effect on CNN classification

The actual numbers

| Method | Predicted class | Confidence | Drop vs. original | Same class? |
|---|---|---|---|---|
| Original (no mask) | jersey | 38.1% | — | — |
| Eyes only | jersey | 8.2% | −78.4% | ✅ Yes |
| Eyes + Nose | jersey | 23.0% | −39.8% | ✅ Yes |
| Full face | web_site | 4.0% | −89.6% | ❌ No |
| Gaussian blur | perfume | 6.1% | −84.0% | ❌ No |

What do these numbers actually tell us?

Let's be intellectually honest: the results are more nuanced than a simple "eye masking fails":

  1. Eye masking does reduce confidence significantly, from 38.1% down to 8.2%, a 78% drop. That's not nothing. The mask clearly disrupts the network's ability to make a confident prediction.

  2. But the predicted class doesn't change. The model still says "jersey." It adapted: instead of using the whole face+body combination, it relied more heavily on clothing texture, skin tone, and body shape. The identity of the prediction survived the masking.

  3. Eyes + Nose is paradoxically less effective than eyes alone (23% confidence vs 8.2%). This is counterintuitive. A possible explanation: masking a larger contiguous area creates a strong contrast edge that the CNN treats as a new feature (a "rectangle on a shirt"), which accidentally reinforces the clothing classification.

  4. Only full-face masking and blur force a class change, and even then the replacement classes ("web_site", "perfume") are nonsensical, meaning the model is truly confused, not just less confident.
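The masking experiment itself is simple to replicate: black out a rectangle, re-run the classifier, compare confidences. A sketch; the box coordinates in the usage comment are hypothetical, not the notebook's actual eye-region box:

```python
import numpy as np
import tensorflow as tf

def top_prediction_after_mask(model, image, box=None, fill=0):
    """Mask box=(top, bottom, left, right) in pixels, then return
    the top class index and its softmax confidence."""
    masked = image.astype(np.float32)        # astype copies, original untouched
    if box is not None:
        t, b, l, r = box
        masked[t:b, l:r] = fill              # opaque rectangular mask
    batch = tf.keras.applications.vgg16.preprocess_input(masked[None])
    probs = np.asarray(model(batch))[0]
    idx = int(probs.argmax())
    return idx, float(probs[idx])

# Usage (downloads weights on first run; box coordinates are illustrative):
# vgg = tf.keras.applications.VGG16(weights="imagenet")
# _, base = top_prediction_after_mask(vgg, face_224)
# _, eyes = top_prediction_after_mask(vgg, face_224, box=(70, 110, 40, 185))
# print(f"relative confidence drop: {100 * (1 - eyes / base):.1f}%")
```

Running this over several boxes of increasing coverage reproduces the confidence-drop column of the table above.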

The takeaway

The CNIL guideline of masking the eyes was designed for human perception: a person can't recognize a face without seeing the eyes. But CNNs don't "see" like humans. They use statistical texture patterns across the entire image: skin color distribution, jawline contour, hair texture, clothing patterns. Masking 14% of the image (the eye region) leaves 86% of exploitable features intact.

This has real-world implications: anonymization guidelines designed for humans may not protect against algorithmic re-identification. Moreover, research suggests this problem is ethnically biased: face recognition systems perform differently across demographic groups, and anonymization effectiveness varies with skin tone and facial structure (Shrutin et al., 2019).


Part 3 — One-Shot Face Recognition

The challenge

Can we build a face recognition system with just one photo per person? Traditional deep learning needs thousands of examples. With one sample, classical training is impossible.

The solution: transfer learning + KNN

Instead of training a new network, we reuse VGG16 as a feature extractor. The deep layers of VGG16 (trained on millions of images) produce a 25,088-dimensional "fingerprint" for any input image. We then use a simple K-Nearest Neighbors classifier to match faces based on these fingerprints.

Why KNN instead of a neural network?

| Approach | Training data needed | Accuracy with 1 sample | Explainability |
|---|---|---|---|
| Neural Network | Thousands | ❌ Impossible | Low |
| KNN + Transfer Learning | 1 per person | ✅ Works | High (distance-based) |

This is an elegant demonstration of the build vs. reuse principle in ML: sometimes the smartest solution is to leverage existing knowledge rather than starting from scratch.
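The whole pipeline is a frozen VGG16 base plus scikit-learn's KNN. A sketch under our own function names; `n_neighbors=1` because there is exactly one enrollment photo per identity:

```python
import numpy as np
import tensorflow as tf
from sklearn.neighbors import KNeighborsClassifier

def build_extractor():
    """Frozen VGG16 convolutional base; output is a flattened
    7 x 7 x 512 = 25,088-dimensional embedding per 224x224 image."""
    base = tf.keras.applications.VGG16(
        weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False
    return tf.keras.Sequential([base, tf.keras.layers.Flatten()])

def one_shot_match(extractor, gallery, labels, probe):
    """gallery: (N, 224, 224, 3), one enrollment photo per person."""
    pre = tf.keras.applications.vgg16.preprocess_input
    embeddings = extractor(pre(gallery.astype(np.float32))).numpy()
    knn = KNeighborsClassifier(n_neighbors=1)   # nearest identity wins
    knn.fit(embeddings, labels)
    probe_emb = extractor(pre(probe[None].astype(np.float32))).numpy()
    return knn.predict(probe_emb)[0]
```

Because the decision is a distance to a stored embedding, the system can explain any match by pointing at the nearest enrolled photo.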


Part 4 — Production & Ethics

Model export

The feature extractor is exported as a TensorFlow SavedModel (~112 MB), ready for deployment. In production, the full 138M-parameter model would need optimization:

| Constraint | Raw VGG16 | After optimization |
|---|---|---|
| Size | 528 MB | ~130 MB (quantized) |
| Inference | ~50 ms/image | ~15 ms (TF-Lite) |
| GDPR | ⚠️ Stores biometric data | ✅ On-device processing |
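Dynamic-range quantization in TF-Lite is what typically yields the ~4x size cut in the table above. A sketch; a production recipe might instead use full int8 quantization with a calibration dataset:

```python
import tensorflow as tf

def quantize_for_mobile(model):
    """Convert a Keras model to a TF-Lite flatbuffer with dynamic-range
    quantization (weights stored as int8, roughly 4x smaller)."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()   # bytes, ready to write to a .tflite file

# Usage:
# tflite_bytes = quantize_for_mobile(extractor)
# with open("extractor.tflite", "wb") as f:
#     f.write(tflite_bytes)
```

Activations stay float at runtime; only the stored weights are quantized, which is why accuracy loss is usually negligible for this mode.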

Ethical considerations

Facial embeddings are biometric data under GDPR (Article 9). Any production system must address consent, storage, and the right to be forgotten. This project demonstrates the technical pipeline, but deploying it responsibly requires legal and ethical safeguards beyond the code.


Methodology

Every claim in this project is backed by quantitative evidence, not just visual inspection:

| Analysis | Method | Why it matters |
|---|---|---|
| Filter diversity | Shannon entropy + inter-filter correlation | Proves filters learn different features, not redundant ones |
| Attention mapping | ROI analysis (5 zones × 3 layers) | Quantifies where the network looks, not just visually |
| Occlusion sensitivity | Sliding patch with confidence δ | Model-agnostic validation of Grad-CAM findings |
| Anonymization efficacy | 4 methods, confidence measured | Experimental challenge of CNIL regulatory guidelines |

Quick Start

```bash
# Requires uv (https://github.com/astral-sh/uv)
uv sync
uv run jupyter lab
```

Open TP_Reconnaissance_Faciale.ipynb and Run All (~8 min on CPU).

Project Structure

```
├── TP_Reconnaissance_Faciale.ipynb   ← Main notebook (run this)
├── test_face.jpg                     ← Test portrait image
├── output_figures/                   ← Generated visualizations
│   ├── 01_filters.png
│   ├── 02_gradcam.png
│   ├── 03_occlusion.png
│   └── 04_cnil_mask.png
├── pyproject.toml                    ← Dependencies (uv)
└── README.md
```

Tech Stack

Python 3.11 · TensorFlow 2.20 · OpenCV · scikit-learn · Matplotlib · Package manager: uv

License

MIT
