Success with OPT-175B

Hello,

Thank you for sharing this great implementation with the community.

I just wanted to open this issue to share my success running the OPT-175B model on a DGX station.

<img width="1210" alt="Screenshot 2022-07-25 at 8 39 03 PM" src="https://user-images.githubusercontent.com/588431/180911205-058ea5ad-1eea-444e-ab82-81f3d3939eb6.png">

The model takes ~3 minutes to load and uses ~58% of memory on each of the first 7 GPUs and 28% on the last one.
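As a rough sanity check on those numbers, here is a minimal sketch that adds up the reported usage and compares it with the size of the weights. Two things here are assumptions and are not stated above: that each GPU has 80 GB of memory (`GPU_MEMORY_GB`) and that the weights are held in 16-bit precision (2 bytes per parameter). The helper names are hypothetical, for illustration only.

```python
# Assumption: 80 GB per GPU. The post does not state the GPU memory size.
GPU_MEMORY_GB = 80.0

# Reported in the post: ~58% on the first 7 GPUs, 28% on the last one.
usage_fractions = [0.58] * 7 + [0.28]


def total_used_gb(fractions, per_gpu_gb):
    """Sum the memory in use across GPUs, given per-GPU usage fractions."""
    return sum(f * per_gpu_gb for f in fractions)


def fp16_weights_gb(num_params):
    """Size of the weights alone, assuming 2 bytes per parameter (fp16)."""
    return num_params * 2 / 1e9


used = total_used_gb(usage_fractions, GPU_MEMORY_GB)
weights = fp16_weights_gb(175e9)
print(f"reported usage: ~{used:.0f} GB, fp16 weights: ~{weights:.0f} GB")
```

Under those assumptions the two figures land close to each other, which would be consistent with the weights accounting for most of the memory in use. With a different GPU size or precision the comparison changes, so treat this as a back-of-the-envelope estimate only.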

Please feel free to close this issue.
