Mixed Precision Training #2

@tom-m-walker

Description

Hi,

Thanks for this great work.

I noticed that the model class has the option to use FP16, but it's not used by default.

Was FP32 found to be necessary to achieve good performance? If so, were there any hypotheses about which parts of the architecture require high precision?
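To make the concern concrete, here is a small numpy sketch (not from this codebase, purely illustrative) of the kind of issue FP16 can introduce: near 1.0, FP16 values are spaced about 2**-10 apart, so a sufficiently small weight update is rounded away entirely, while the same update survives in FP32.

```python
import numpy as np

# FP16 has a 10-bit mantissa, so near 1.0 the spacing between
# representable values is about 2**-10 ~= 0.00098. An update of
# 1e-4 falls well below half that spacing and rounds away.
w16 = np.float16(1.0)
w16_updated = w16 + np.float16(1e-4)
print(w16_updated == w16)   # True: the update was lost

# The same update survives in FP32 (23-bit mantissa).
w32 = np.float32(1.0)
w32_updated = w32 + np.float32(1e-4)
print(w32_updated > w32)    # True: the update was applied
```

This is one common reason projects keep master weights (or at least some layers, e.g. normalization and loss computation) in FP32 even when using FP16 elsewhere.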

Thanks
