Deliverables
Add an MXFP ball RTL implementation in the prototype lib (under the arch path).
A Pull Request (PR) containing a test written in C for this operation and a README to introduce your design.
Report the performance results in this issue.
Task Description
MXFP is a lower-precision floating-point representation designed to reduce data size and simplify subsequent computation. Using MXFP can improve throughput and hardware efficiency in bandwidth-sensitive workloads while maintaining acceptable numerical quality for many ML scenarios.
To learn about this format and its variants, start with the paper "With Shared Microexponents, A Little Shifting Goes a Long Way".
As we envisage it, an FP32 matrix will be loaded into the banks. Your custom MXFP instruction will then read the data from one bank into the ball you are to implement, and the ball will write its output to another bank.