add kunlunxin vendor op #66
Conversation
    inv_scale,
)

def cast_to_fp8(
Where is the cast_to_fp8 op used in TransformerEngine?
The fp8_utils.py module in Megatron-LM-FL requires the use of cast_to_fp8; specifically: from transformer_engine.pytorch.cpp_extensions import cast_to_fp8.
Megatron-LM-FL includes a version check for Transformer Engine (TE): for TE versions >= 2.0, QuantizedTensor is used directly, while for versions between 1.0 and 2.0, cast_to_fp8 is used instead. Since TE-FL currently targets TE v2.9, the cast_to_fp8 path should no longer need to be retained, right?
We haven't upgraded to the latest version of TE yet, so for the time being, we can only bind this cast_to_fp8 function; the upgrade to the newer version of TE is still in progress.
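The version gate described above could be sketched roughly as follows. This is a hedged illustration, not the actual Megatron-LM-FL code: the function name select_fp8_path and the returned labels are assumptions introduced here for clarity.

```python
def select_fp8_path(te_version: str) -> str:
    """Pick the FP8 cast path for a given Transformer Engine version string.

    Illustrative sketch only; the real Megatron-LM-FL check may differ.
    """
    # Compare on (major, minor) only, parsed from the version string.
    v = tuple(int(part) for part in te_version.split(".")[:2])
    if v >= (2, 0):
        # TE >= 2.0: use QuantizedTensor directly.
        return "quantized_tensor"
    if v >= (1, 0):
        # 1.0 <= TE < 2.0: fall back to
        # transformer_engine.pytorch.cpp_extensions.cast_to_fp8.
        return "cast_to_fp8"
    raise RuntimeError(f"unsupported Transformer Engine version: {te_version}")
```

Under this gate, a TE 2.9 install would take the QuantizedTensor path, which is why the question above asks whether the cast_to_fp8 binding still needs to exist.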
ext = get_ext()

try:
    import transformer_engine_klx_torch
We don't actually recommend importing transformer_engine_klx_torch directly; it's better to call the kernels or operators provided by the vendor, to avoid the extra overhead and the complexity of version management.
transformer_engine_klx_torch is currently an integral part of our kernel; its format was designed based on the conventions adopted by other vendors.
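A minimal sketch of the guarded import shown in the diff above. The fallback behaviour (a HAS_KLX_EXT flag and a None placeholder) is an assumption added here for illustration, not necessarily what the PR does.

```python
# Guarded optional import of the kunlunxin (KLX) vendor extension.
try:
    import transformer_engine_klx_torch  # vendor-provided kernel bindings
    HAS_KLX_EXT = True
except ImportError:
    # The vendor wheel is optional; degrade gracefully when it is absent
    # instead of failing at import time.
    transformer_engine_klx_torch = None
    HAS_KLX_EXT = False
```

Gating on a flag like HAS_KLX_EXT keeps the rest of the code free of repeated try/except blocks when deciding whether the vendor kernels are available.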

Description
Add kunlunxin op binding code.
Type of change
Changes
Add kunlunxin backend binding support.
Add kunlunxin op binding and registration.
Checklist: