This repository was archived by the owner on Mar 11, 2025. It is now read-only.

Have you evaluated CLIP and StableRep++ on imagenet zero-shot classification with laion50m subsets?  #6

Description

@ytaek-oh

Dear Authors,

I appreciate your exceptional contributions to the field, particularly your work on syn-rep-learn.

In your "StableRep" paper, Figure 6 shows favorable scaling behavior under linear probing. However, these trends do not appear to hold for ImageNet zero-shot classification with the provided CLIP-based checkpoints: CLIP achieves higher accuracy than StableRep++ at the larger pretraining sample scales.

Below, I've included the code I employed to generate these results, adapted from your repositories ("StableRep" and "Scaling").
The command and its corresponding output are as follows:

stablerep.zip

# model checkpoints are automatically downloaded from dropbox

# CLIP ViT-B-16
python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion3m:CLIP_vitb16
[laion3m:CLIP_vitb16]  ImageNet zero-shot accuracy: 21.83

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion10m:CLIP_vitb16
[laion10m:CLIP_vitb16]  ImageNet zero-shot accuracy: 40.732

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion20m:CLIP_vitb16
[laion20m:CLIP_vitb16]  ImageNet zero-shot accuracy: 45.754

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion50m:CLIP_vitb16
[laion50m:CLIP_vitb16]  ImageNet zero-shot accuracy: 49.564

# StableRep-pp
python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion3m:StableRep-pp_vitb16
[laion3m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 31.71

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion10m:StableRep-pp_vitb16
[laion10m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 40.86

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion20m:StableRep-pp_vitb16
[laion20m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 43.614

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion50m:StableRep-pp_vitb16
[laion50m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 44.886
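For clarity, the evaluation in my script follows the standard CLIP zero-shot protocol: encode each image and each class-prompt text, L2-normalize both, and predict the class whose text embedding has the highest cosine similarity. A minimal self-contained sketch of that scoring step (using NumPy arrays as stand-ins for real model embeddings; the function name and toy data are my own, not from the repository):

```python
import numpy as np

def zero_shot_accuracy(image_feats, text_feats, labels):
    """Zero-shot classification accuracy.

    image_feats: (N, D) image embeddings
    text_feats:  (C, D) one text embedding per class (e.g. averaged prompts)
    labels:      (N,) ground-truth class indices
    """
    # L2-normalize so the dot product equals cosine similarity
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    # predict the class with the highest image-text similarity
    preds = (img @ txt.T).argmax(axis=1)
    return float((preds == labels).mean())

# toy example: image features lie near their class's text feature
rng = np.random.default_rng(0)
text = rng.normal(size=(3, 8))                      # 3 classes, 8-dim
labels = np.array([0, 1, 2, 0])
images = text[labels] + 0.01 * rng.normal(size=(4, 8))
print(zero_shot_accuracy(images, text, labels))     # → 1.0
```

If my script's protocol (prompt templates, normalization, or similarity temperature) differs from what you used for Figure 6, that could partly explain the gap, so please correct me if the evaluation above misrepresents your setup.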

Given these observations, I would like to understand whether there is an oversight on my part, or whether this reflects the models' actual behavior.
Could you share any insight into this discrepancy?

Best regards,
