Merged
22 commits
c819ae0
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
8827ec6
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
60172b1
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
a195ed3
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
b1b3bdc
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
eb77558
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
1f143ff
added option to specify list of module names instead of single name
lucaeyring Jul 29, 2025
fa3df92
adjustments for backward compatibility
lucaeyring Aug 4, 2025
d603a32
adressing comments
lucaeyring Aug 4, 2025
a075024
fixing tests and unifying interface across frameworks
lucaeyring Aug 4, 2025
c6e3902
fixing tests
lucaeyring Aug 4, 2025
21fae0b
update python
lucaeyring Aug 4, 2025
75c4e70
merge upstream and integrate comments
lucaeyring Aug 5, 2025
dde828f
merge upstream and integrate comments
lucaeyring Aug 5, 2025
63c951d
update readme, and added concatenate option for extract_features
lucaeyring Aug 5, 2025
6408a28
update readme, and added concatenate option for extract_features
lucaeyring Aug 5, 2025
34b135c
update readme and concatenated saving
lucaeyring Aug 6, 2025
3783ad2
small bug fix
lucaeyring Aug 6, 2025
9c18a3a
remove print statement
lucaeyring Aug 6, 2025
a2698ef
updated setup and requirements
lucaeyring Aug 6, 2025
9fd9ad8
Update README.md
LukasMut Aug 6, 2025
7d769fc
Update README.md
LukasMut Aug 6, 2025
68 changes: 47 additions & 21 deletions README.md
@@ -16,7 +16,7 @@
<img src="https://img.shields.io/pypi/dm/thingsvision" alt="downloads">
</a>
<a href="https://www.python.org/" rel="nofollow">
<img src="https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11-blue.svg" alt="Python version" />
<img src="https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12-blue.svg" alt="Python version" />
</a>
<a href="https://github.com/ViCCo-Group/thingsvision/blob/master/LICENSE" rel="nofollow">
<img src="https://img.shields.io/pypi/l/thingsvision" alt="License" />
@@ -97,34 +97,17 @@ Neural networks come from different sources. With `thingsvision`, you can extrac
<!-- Setting up your environment -->
### :computer: Setting up your environment
#### Working locally
First, create a new `conda environment` with Python version 3.8, 3.9, 3.10, or 3.11 e.g. by using `conda`:

First, create a new environment with Python 3.10, 3.11, or 3.12, e.g., by using `conda`:
```bash
$ conda create -n thingsvision python=3.9
$ conda create -n thingsvision python=3.10
$ conda activate thingsvision
```

Then install `thingsvision` in the activated environment by running the following `pip` command in your terminal:

```bash
$ pip install --upgrade thingsvision
$ pip install git+https://github.com/openai/CLIP.git
```

If you want to extract features for [harmonized models](https://vicco-group.github.io/thingsvision/AvailableModels.html#harmonization) from the [Harmonization repo](https://github.com/serre-lab/harmonization), you have to additionally run the following `pip` command in your `thingsvision` environment (FYI: as of now, this seems to be working smoothly on Ubuntu only but not on macOS),

```bash
$ pip install git+https://github.com/serre-lab/Harmonization.git
$ pip install keras-cv-attention-models>=1.3.5
```

If you want to extract features for [DreamSim](https://dreamsim-nights.github.io/) from the [DreamSim repo](https://github.com/ssundaram21/dreamsim), you have to additionally run the following `pip` command in your `thingsvision` environment,

```bash
$ pip install dreamsim==0.1.2
```

See the [docs](https://vicco-group.github.io/thingsvision/AvailableModels.html#dreamsim) for which `DreamSim` models are available in `thingsvision`.
The package automatically installs the [Harmonization](https://github.com/serre-lab/harmonization) and [DreamSim](https://github.com/ssundaram21/dreamsim) repositories. See the documentation for available [harmonized models](https://vicco-group.github.io/thingsvision/AvailableModels.html#harmonization) and [DreamSim models](https://vicco-group.github.io/thingsvision/AvailableModels.html#dreamsim) in `thingsvision`.

#### Google Colab
Alternatively, you can use Google Colab to play around with `thingsvision` by uploading your image data to Google Drive (via directory mounting).
@@ -252,6 +235,49 @@ for batch in my_dataloader:
... # whatever post-processing you want to add to the extracted features
```

#### Multi-module feature extraction

You can extract features from several modules of a single model jointly by passing a list of `module_names`.

##### PyTorch

```python

module_names = ['visual', ...] # add more module_names here

# your custom dataset and dataloader classes come here (for example, a PyTorch data loader)
my_dataset = ...
my_dataloader = ...

with extractor.batch_extraction(module_names=module_names, output_type="tensor") as e:
    for batch in my_dataloader:
        ...  # whatever preprocessing you want to add to the batch
        feature_batch_dict = e.extract_batch(
            batch=batch,
            flatten_acts=True,  # flatten 2D feature maps from an early convolutional or attention layer
        )
        ...  # whatever post-processing you want to add to the extracted features
```
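
The post-processing step is left open above. As a minimal sketch, assuming each call to `extract_batch` yields a dictionary that maps a module name to that batch's features (as the variable name `feature_batch_dict` suggests), the per-batch results could be grouped by module as follows. The helper `collect_by_module` and the toy data are hypothetical and not part of `thingsvision`.

```python
from collections import defaultdict


def collect_by_module(batch_dicts):
    """Group per-batch outputs by module name.

    `batch_dicts` is any iterable of dicts that map a module name to the
    features of one batch (an assumed data layout, not a thingsvision type).
    """
    collected = defaultdict(list)
    for batch_dict in batch_dicts:
        for module_name, feature_batch in batch_dict.items():
            collected[module_name].append(feature_batch)
    return dict(collected)


# Toy stand-in for two batches from two modules (plain lists instead of tensors).
batches = [
    {"features.23": [[0.1, 0.2]], "classifier.3": [[1.0]]},
    {"features.23": [[0.3, 0.4]], "classifier.3": [[2.0]]},
]
per_module = collect_by_module(batches)
print(sorted(per_module))  # ['classifier.3', 'features.23']
print(len(per_module["features.23"]))  # 2
```

With real tensors, each per-module list could then be joined along the batch dimension, for example with `torch.cat(per_module[name], dim=0)`.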

##### TensorFlow / Keras

```python
module_names = ['visual', ...] # add more module_names here

# your custom dataset and dataloader classes come here (for example, TFRecords files)
my_dataset = ...
my_dataloader = ...

for batch in my_dataloader:
    ...  # whatever preprocessing you want to add to the batch
    feature_batch = extractor.extract_batch(
        batch=batch,
        module_names=module_names,
        flatten_acts=True,  # flatten 2D feature maps from an early convolutional or attention layer
    )
    ...  # whatever post-processing you want to add to the extracted features
```
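
The commit history of this change mentions a concatenate option for `extract_features`, but its interface is not shown here. The sketch below only illustrates the underlying idea with NumPy: flatten each module's activations per image and join them along the feature axis. The function `concatenate_modules` and the array shapes are assumptions for illustration, not the library's implementation.

```python
import numpy as np


def concatenate_modules(features, module_names):
    """Flatten each module's activations and join them along the feature axis."""
    flat = [
        np.asarray(features[name]).reshape(len(features[name]), -1)
        for name in module_names
    ]
    return np.concatenate(flat, axis=1)


# Hypothetical per-module features for 4 images.
features = {
    "features.23": np.zeros((4, 8, 2, 2)),  # 4D convolutional activations
    "classifier.3": np.ones((4, 16)),       # 2D activations
}
joint = concatenate_modules(features, ["features.23", "classifier.3"])
print(joint.shape)  # (4, 48)
```

Passing the module names explicitly fixes the column order of the joint matrix, independent of the dictionary's insertion order.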

#### Human alignment

*Human alignment*: If you want to align the extracted features with human object similarity according to the approach introduced in *[Improving neural network representations using human similarity judgments](https://proceedings.neurips.cc/paper_files/paper/2023/hash/9febda1c8344cc5f2d51713964864e93-Abstract-Conference.html)*, you can optionally `align` the extracted features using the following method:
1 change: 1 addition & 0 deletions requirements.txt
@@ -7,6 +7,7 @@ numpy<2
open_clip_torch==3.*
pandas
regex
safetensors<0.6
scikit-image
scikit-learn
scipy
1 change: 1 addition & 0 deletions setup.py
@@ -15,6 +15,7 @@
"open_clip_torch==3.*",
"pandas",
"regex",
"safetensors<0.6",
"scikit-image",
"scikit-learn",
"scipy",
51 changes: 51 additions & 0 deletions tests/test_features.py
@@ -44,6 +44,19 @@ def get_4D_features(self):
            flatten_acts=False,
        )
        return features

    def get_multi_features(self):
        model_name = "vgg16_bn"
        extractor, _, batches = helper.create_extractor_and_dataloader(
            model_name=model_name, pretrained=False, source="torchvision"
        )
        module_names = ["features.23", "classifier.3"]
        features = extractor.extract_features(
            batches=batches,
            module_names=module_names,
            flatten_acts=False,
        )
        return features

    def test_postprocessing(self):
        """Test different postprocessing methods (e.g., centering, normalization, compression)."""
@@ -89,6 +102,18 @@ def test_storing_4d(self):
            )

            self.check_file_exists("features", format, False)

    def test_storing_multi(self):
        features = self.get_multi_features()
        for _, feature in features.items():
            for format in set(helper.FILE_FORMATS) - set(["txt"]):
                # tests whether features can be saved in any of the formats except txt
                save_features(
                    features=feature,
                    out_path=helper.OUT_PATH,
                    file_format=format,
                )
                self.check_file_exists(f"features", format, False)

    def test_splitting_2d(self):
        n_splits = 3
@@ -129,3 +154,29 @@ def test_splitting_4d(self):
                file_format="txt",
                n_splits=n_splits,
            )

    def test_splitting_multi(self):
        n_splits = 3
        features = self.get_multi_features()
        for format in set(helper.FILE_FORMATS) - set(["txt"]):
            for _, feature in features.items():
                if format == "pt":
                    feature = torch.from_numpy(feature)
                split_features(
                    features=feature,
                    root=helper.OUT_PATH,
                    file_format=format,
                    n_splits=n_splits,
                )

            for i in range(1, n_splits):
                self.check_file_exists(f"features_{i:02d}", format, False)

        with self.assertRaises(Exception):
            for _, feature in features.items():
                split_features(
                    features=feature,
                    root=helper.OUT_PATH,
                    file_format="txt",
                    n_splits=n_splits,
                )