Training computation of MVP

Hi, thanks for your inspiring work! According to the [MAE repo](https://github.com/facebookresearch/mae/issues/32#issuecomment-1030426550) and its [pre-training guidelines](https://github.com/facebookresearch/mae/blob/efb2a8062c206524e35e47d04501ed4f544c0ae8/PRETRAIN.md), MAE-style pre-training uses a large batch size, which suggests that it may require many GPUs for distributed training. I am therefore curious about the training compute for MVP. How long did it take to pre-train a ViT? Which GPUs did you use, and how many? Thanks in advance for your response.
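For context, here is a rough sketch of the batch-size arithmetic behind my question. It assumes the defaults as I read them in the MAE pre-training guidelines (effective batch size 4096, 64 images per GPU, base learning rate 1.5e-4 with linear scaling by effective batch size / 256); I do not know whether MVP uses the same settings, and the helper names are my own, not from either repo.

```python
import math


def accum_iters_needed(target_effective_batch: int, per_gpu_batch: int, num_gpus: int) -> int:
    """Gradient-accumulation steps needed to reach the target effective batch size."""
    per_step = per_gpu_batch * num_gpus  # images processed per optimizer micro-step
    return math.ceil(target_effective_batch / per_step)


def scaled_lr(base_lr: float, effective_batch: int) -> float:
    """Linear scaling rule: lr = base_lr * effective_batch / 256."""
    return base_lr * effective_batch / 256


# Assumed values taken from my reading of the MAE guidelines, not from MVP.
TARGET_EFFECTIVE_BATCH = 4096
PER_GPU_BATCH = 64
BASE_LR = 1.5e-4

for gpus in (64, 8, 1):
    accum = accum_iters_needed(TARGET_EFFECTIVE_BATCH, PER_GPU_BATCH, gpus)
    effective = PER_GPU_BATCH * gpus * accum
    print(f"gpus={gpus} accum_iter={accum} effective_batch={effective} "
          f"lr={scaled_lr(BASE_LR, effective):.2e}")
```

With fewer GPUs, gradient accumulation can keep the effective batch size, but wall-clock time should grow roughly in proportion to the number of accumulation steps, which is why the GPU count and total training time matter to me.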

Training computation of MVP #19
