[#168 fix] add context manager to fake `ScalingTensor`/`ScalingParameter`'s `class` as `torch.Tensor` #169

152334H · 2024-03-06T05:02:54Z

Description
See #168. This is the least invasive fix I could come up with. Thanks to @aliencaocao for the idea.

Minor Revision

Adds `msamp.common.tensor.tensor.pretend_scaling_is_torch`, a context manager that can be used to fix `GradScaler().step()`.

This is a non-breaking change: behaviour is unchanged unless code explicitly enters a `with pretend_scaling_is_torch():` block.
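To illustrate the general technique of temporarily faking an object's `class`, here is a minimal, torch-free sketch. It is not the code from this PR: `Tensor`, `ScalingTensor`, and `pretend_scaling_is_tensor_sketch` are hypothetical stand-ins, and the sketch assumes the failing code path relies on an `isinstance` check. It depends on the fact that `isinstance` falls back to an object's `__class__` attribute when the real type does not match.

```python
from contextlib import contextmanager


class Tensor:
    """Hypothetical stand-in for torch.Tensor (assumption, not the real class)."""


def _reported_class(self):
    # Report the fake class only while the context manager is active.
    return Tensor if ScalingTensor._pretend else ScalingTensor


class ScalingTensor:
    """Hypothetical stand-in for a tensor wrapper that is NOT a Tensor subclass."""

    _pretend = False  # class-wide switch toggled by the context manager

    # isinstance() consults __class__ when type(obj) does not match.
    __class__ = property(_reported_class)


@contextmanager
def pretend_scaling_is_tensor_sketch():
    """Make ScalingTensor instances pass isinstance(x, Tensor) inside the block."""
    previous = ScalingTensor._pretend
    ScalingTensor._pretend = True
    try:
        yield
    finally:
        # Restore the prior state even if the body raises.
        ScalingTensor._pretend = previous


t = ScalingTensor()
outside_before = isinstance(t, Tensor)
with pretend_scaling_is_tensor_sketch():
    inside = isinstance(t, Tensor)
outside_after = isinstance(t, Tensor)
```

Because the switch is off by default and restored on exit, code that never enters the block sees no change, which is the property the non-breaking claim above depends on. `type(t)` still returns the real class in this sketch, so only checks that go through `isinstance` or `__class__` are affected.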

152334H added 2 commits March 6, 2024 04:49

add context manager to fake ScalingTensor -> torch.Tensor

fcae35d

mnist ddp/single gpu examples fixed

e1a1a06
