The ref.zip downloaded from https://huggingface.co/PKU-SEC-Lab/LightMamba has some discrepancies.
Taking B_BUFFER.cpp as an example:
//* actual size
//* scale down
/** @brief Number of layers */
constexpr int L = 2;
constexpr int T_LOAD = 100; // saved seq length
constexpr int T = 1;
constexpr int TP = 1; // this TP is input parallelism
...
// read input refs
auto IN_Q = read_tensor<int64_t> (file_path + "/B_q_layer" + file_path_suffix);
auto IN_S = read_tensor<int64_t> (file_path + "/B_s_layer" + file_path_suffix);
This snippet reads its inputs from:
B_q_layer*.bin
B_s_layer*.bin
However, I had to change the saved sequence length T_LOAD from the original 512 to 100 to make the HLS simulation run.
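One way to check which sequence length the shipped reference files actually hold is to derive it from the file size. The sketch below assumes the .bin files are raw, headerless arrays of int64_t, which the read_tensor<int64_t> call suggests but does not confirm. The names tokens_in_file and elems_per_token are hypothetical, and the per-token element count has to come from the model configuration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: number of sequence positions stored in a raw tensor file.
// Assumes no header and a [tokens][elems_per_token] layout with fixed-size elements.
// Returns std::nullopt when the size is not a whole number of tokens, which would
// point to a truncated file or a wrong layout assumption.
std::optional<std::uint64_t> tokens_in_file(std::uint64_t file_bytes,
                                            std::uint64_t elems_per_token,
                                            std::uint64_t elem_bytes = sizeof(std::int64_t)) {
    const std::uint64_t bytes_per_token = elems_per_token * elem_bytes;
    if (bytes_per_token == 0 || file_bytes % bytes_per_token != 0) {
        return std::nullopt;
    }
    return file_bytes / bytes_per_token;
}
```

If this returned 100 rather than 512 for the B_q_layer and B_s_layer files, that would be consistent with the simulation only running at T_LOAD = 100.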
Another problem is the mismatching size of the input files:
before_rms1_layer0.bin = 10.5 MB
before_rms1_layer1.bin = 10.5 MB
rms1_layer0.bin = 10.5 MB
rms1_layer1.bin = 2.0 MB
As a result, I get mismatches when I run the simulation of RMSNORM_1.cpp. Could you please check whether the input files on Hugging Face at commit 0e991c6 work and, if not, provide a new working commit?
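For reference, the size check I did by hand can be sketched as below. It assumes that all four files should be the same size, which seems plausible if RMSNorm preserves the tensor shape and both layers share the same dimensions, but I have not confirmed that. The helper names size_of and mismatched_indices are my own and are not part of the LightMamba code.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Size of a file in bytes, or std::nullopt if it cannot be read.
std::optional<std::uintmax_t> size_of(const std::string& path) {
    std::error_code ec;
    const std::uintmax_t bytes = fs::file_size(path, ec);
    if (ec) return std::nullopt;
    return bytes;
}

// Indices of entries whose size differs from the first entry.
// A missing file (nullopt) never matches a present one.
std::vector<std::size_t> mismatched_indices(
        const std::vector<std::optional<std::uintmax_t>>& sizes) {
    std::vector<std::size_t> bad;
    for (std::size_t i = 1; i < sizes.size(); ++i) {
        if (sizes[i] != sizes[0]) bad.push_back(i);
    }
    return bad;
}
```

Calling size_of on before_rms1_layer0.bin, before_rms1_layer1.bin, rms1_layer0.bin and rms1_layer1.bin in that order and passing the results to mismatched_indices should flag only the last file, given the sizes listed above.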