This repository provides a dataset of synthetic visual concepts generated using zero-shot Text-to-Image (T2I) models, designed to support research in concept-based Explainable Artificial Intelligence (XAI).
Concept-based XAI methods aim to interpret deep learning models through human-understandable visual concepts (e.g., textures, object parts). However, these approaches typically rely on large, manually curated datasets, which limits scalability.
To address this, we explore the use of synthetic concept datasets generated via T2I models as a scalable alternative.
The dataset contains:
- 🏷️ Real concept images gathered from various datasets and search engines
- 🎨 Synthetic concept images generated from predefined textual prompts
- 🔁 Multiple samples per concept to enable variability analysis
Each concept is designed to approximate a human-interpretable visual feature, such as:
- textures (e.g., striped, dotted)
- object parts (e.g., wings, wheels)
- materials or patterns
The root of the repository is `concepts/`. Alongside the helper script `analysis.py`, each concept has its own dedicated folder, containing one subfolder of real images plus one subfolder per T2I model:
```
concepts/
│
├── analysis.py              # Helper script for the stats
│
├── asparagus/
│   ├── asparagus/           # Real asparagus images
│   ├── asparagus_flux/      # Asparagus concept generated by Flux 1.1
│   ├── asparagus_gpti1/     # Asparagus concept generated by GPT-Image 1
│   └── asparagus_sd35/      # Asparagus concept generated by Stable Diffusion 3.5
│
├── ...                      # Other concepts follow the same pattern
```
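This layout is easy to traverse programmatically. Below is a minimal sketch of enumerating concepts and their real/synthetic variants; `list_concept_variants` is a hypothetical helper for illustration, not part of `analysis.py`:

```python
from pathlib import Path


def list_concept_variants(root: str) -> dict[str, list[str]]:
    """Map each concept name (e.g. 'asparagus') to its image subfolders:
    one of real images plus one per T2I model (e.g. 'asparagus_flux')."""
    variants = {}
    for concept_dir in sorted(Path(root).iterdir()):
        if not concept_dir.is_dir():
            continue  # skip top-level files such as analysis.py
        variants[concept_dir.name] = sorted(
            sub.name for sub in concept_dir.iterdir() if sub.is_dir()
        )
    return variants
```

From the returned mapping you can, for example, pair each `<concept>/` folder of real images with its `<concept>_<model>/` counterparts when building evaluation splits.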
This dataset is intended for:
- Evaluating concept-based XAI methods
- Studying representation similarity between synthetic and real concepts
- Testing intra-concept consistency across generated samples
- Supporting downstream explanation tasks
- Analyzing the effect of concept removal on model explanations
The dataset supports four key analyses:
- **Concept Representation Similarity**: compare embeddings of synthetic vs. real concept images
- **Intra-Concept Similarity**: measure consistency across subsets of the same concept
- **Downstream Explanation Performance**: evaluate usefulness in explaining class predictions
- **Concept Removal Impact**: assess how removing a concept affects explanation behavior
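For illustration, the first two analyses reduce to cosine similarity over image embeddings (from any pretrained encoder). A minimal sketch, with hypothetical helper names and embeddings assumed to be precomputed:

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_pairwise_similarity(embeddings: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of embeddings --
    a simple intra-concept consistency score (higher = more consistent)."""
    n = len(embeddings)
    sims = [
        cosine_similarity(embeddings[i], embeddings[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return float(np.mean(sims))
```

The same `cosine_similarity` can compare the mean embedding of a synthetic concept folder against that of its real counterpart for the representation-similarity analysis.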
While synthetic data offers scalability, this dataset highlights some challenges:
- ❗ Potential mismatch between synthetic and real-world concepts
- 🤖 Biases introduced by the generative model
These limitations should be carefully considered when using synthetic data for interpretability.
```bash
git clone https://github.com/DataSciencePolimi/ZeroShot-T2I-Concepts.git
cd ZeroShot-T2I-Concepts
```

After downloading or cloning the repo, you can run the bundled script to analyze the dataset:

```bash
python analysis.py
```

Explore the dataset structure and integrate it into your XAI pipelines.
If you use this dataset, please cite:
```bibtex
@InProceedings{ZeroShot-T2I-Concepts,
  author    = {Astolfi, Giacomo and Bianchi, Matteo and Campi, Riccardo and De Santis, Antonio and Brambilla, Marco},
  title     = {A Framework for Evaluating Zero-Shot Image Generation in Concept-based Explainability},
  booktitle = {2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}
```

Full author list (equal contribution noted):
Giacomo Astolfi\*, Matteo Bianchi\*, Riccardo Campi\*, Antonio De Santis, Marco Brambilla
Contributions, issues, and discussions are welcome! Feel free to open a PR or start a discussion.