@152334H 152334H commented Mar 6, 2024

Description
See #168. This is the least invasive fix I could come up with. Thanks to @aliencaocao for the idea.

Minor Revision

  • adds msamp.common.tensor.tensor.pretend_scaling_is_torch, which can be used to fix GradScaler().step().

This is a non-breaking change: behaviour is unchanged unless pretend_scaling_is_torch() is used explicitly.
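
For readers unfamiliar with the opt-in pattern, here is a minimal, torch-free sketch of how a "pretend to be another type" switch can be built. It is not the MS-AMP implementation. `PlainTensor`, `ScaledTensor`, and `pretend_is_plain` are hypothetical stand-ins, and the sketch assumes the failing check is an `isinstance` test, which the description above does not confirm.

```python
from contextlib import contextmanager


class PlainTensor:
    """Hypothetical stand-in for the type a library check expects."""


class ScaledTensor:
    """Hypothetical stand-in for a tensor type that is NOT a PlainTensor subclass."""

    # Class-wide switch; off by default, so behaviour is unchanged unless opted in.
    _pretend = False

    def _reported_class(self):
        # isinstance() falls back to obj.__class__ when type(obj) does not match,
        # so reporting PlainTensor here makes isinstance checks pass.
        return PlainTensor if ScaledTensor._pretend else ScaledTensor

    __class__ = property(_reported_class)


@contextmanager
def pretend_is_plain():
    """Temporarily make ScaledTensor instances report PlainTensor as their class."""
    previous = ScaledTensor._pretend
    ScaledTensor._pretend = True
    try:
        yield
    finally:
        # Always restore the previous state, even if the body raises.
        ScaledTensor._pretend = previous


t = ScaledTensor()
outside_before = isinstance(t, PlainTensor)
with pretend_is_plain():
    inside = isinstance(t, PlainTensor)
outside_after = isinstance(t, PlainTensor)
```

Because the switch is only active inside the `with` block and is restored in `finally`, code that never enters the block sees the original behaviour, which is what makes this kind of change non-breaking.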
