fix(ci): tighten pre-commit hook to clippy --all-targets #150
Merged
Perf regression report (ADR-058)
| Bench | Δ point | 95% CI | new ns | base ns | verdict |
|---|---|---|---|---|---|
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_16c | +5.62% | [+5.59%, +5.64%] | 1498.7 | 1498.7 | ⚠ WARN |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_256c | -3.03% | [-3.20%, -2.85%] | 28622.2 | 28622.2 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_256c | -3.23% | [-3.34%, -3.12%] | 21758.9 | 21758.9 | 🚀 WIN |
| simd_throughput_384/normalize | -3.69% | [-3.70%, -3.67%] | 118.7 | 118.7 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/768d_256c | -3.78% | [-3.89%, -3.67%] | 21768.9 | 21768.9 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_256c | -4.88% | [-4.98%, -4.77%] | 28527.5 | 28527.5 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/128d_64c | -4.96% | [-5.02%, -4.90%] | 525.2 | 525.2 | 🚀 WIN |
| simd_batch_cosine_normalized_query/simd_batch/1024d_256c | -4.74% | [-5.10%, -4.36%] | 35982.1 | 35982.1 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_256c | -4.94% | [-5.19%, -4.68%] | 35821.4 | 35821.4 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_256c | -5.05% | [-5.34%, -4.74%] | 35941.9 | 35941.9 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_256c | -5.19% | [-5.57%, -4.79%] | 26857.1 | 26857.1 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_256c | -5.77% | [-5.86%, -5.68%] | 28008.5 | 28008.5 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/768d_16c | -6.44% | [-6.49%, -6.39%] | 959.3 | 959.3 | 🚀 WIN |
| int8_batch_cosine/int8_loop/1000 | -7.52% | [-7.70%, -7.33%] | 19483.3 | 19483.3 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/768d_16c | -7.90% | [-7.92%, -7.87%] | 641.7 | 641.7 | 🚀 WIN |
| simd_batch_cosine/simd_batch/1000 | -13.39% | [-13.46%, -13.31%] | 80376.7 | 80376.7 | 🚀 WIN |
| simd_batch_dot_product/simd_batch/1000 | -16.51% | [-16.58%, -16.46%] | 76823.8 | 76823.8 | 🚀 WIN |
All 259 measurements
| Bench | Δ point | CI-lower | CI-upper |
|---|---|---|---|
| add_bias_gelu/4096 | -0.01% | -0.03% | +0.01% |
| add_bias_gelu/896 | +0.06% | +0.02% | +0.11% |
| binary_cosine_distance/binary/1024 | -0.58% | -0.59% | -0.56% |
| binary_cosine_distance/binary/1536 | -0.29% | -0.31% | -0.28% |
| binary_cosine_distance/binary/384 | -0.41% | -0.44% | -0.39% |
| binary_cosine_distance/binary/768 | +0.28% | +0.26% | +0.30% |
| binary_cosine_distance/float32_simd/1024 | +0.28% | +0.24% | +0.33% |
| binary_cosine_distance/float32_simd/1536 | +0.06% | +0.05% | +0.08% |
| binary_cosine_distance/float32_simd/384 | +0.11% | +0.08% | +0.13% |
| binary_cosine_distance/float32_simd/768 | +0.40% | +0.38% | +0.43% |
| elementwise_mul/4096 | -1.02% | -1.06% | -0.99% |
| gelu/4096 | -0.00% | -0.02% | +0.02% |
| gelu/896 | +0.01% | -0.01% | +0.03% |
| int4_cosine_distance/float32_simd/1024 | +0.77% | +0.74% | +0.79% |
| int4_cosine_distance/float32_simd/1536 | +0.01% | +0.00% | +0.02% |
| int4_cosine_distance/float32_simd/384 | +0.50% | +0.48% | +0.52% |
| int4_cosine_distance/float32_simd/768 | +0.28% | +0.27% | +0.30% |
| int4_cosine_distance/int4/1024 | +0.25% | +0.21% | +0.28% |
| int4_cosine_distance/int4/1536 | +0.03% | +0.00% | +0.05% |
| int4_cosine_distance/int4/384 | +0.22% | +0.14% | +0.31% |
| int4_cosine_distance/int4/768 | +0.16% | +0.10% | +0.23% |
| int8_batch_cosine/float32_simd/10 | +0.08% | +0.07% | +0.08% |
| int8_batch_cosine/float32_simd/100 | -0.32% | -0.35% | -0.30% |
| int8_batch_cosine/float32_simd/1000 | -2.92% | -2.99% | -2.86% |
| int8_batch_cosine/int8_loop/10 | +0.21% | +0.19% | +0.23% |
| int8_batch_cosine/int8_loop/100 | +0.77% | +0.75% | +0.78% |
| int8_batch_cosine/int8_loop/1000 | -7.52% | -7.70% | -7.33% |
| int8_prepared_dot_product/per_call/1024 | +0.01% | -0.01% | +0.04% |
| int8_prepared_dot_product/per_call/127 | -0.00% | -0.01% | +0.01% |
| int8_prepared_dot_product/per_call/128 | +0.00% | -0.00% | +0.01% |
| int8_prepared_dot_product/per_call/129 | +0.00% | -0.01% | +0.01% |
| int8_prepared_dot_product/per_call/384 | +0.02% | +0.02% | +0.04% |
| int8_prepared_dot_product/per_call/768 | -0.01% | -0.02% | +0.00% |
| int8_prepared_dot_product/prepared/1024 | -0.26% | -0.30% | -0.22% |
| int8_prepared_dot_product/prepared/127 | +0.18% | +0.15% | +0.21% |
| int8_prepared_dot_product/prepared/128 | +0.04% | -0.01% | +0.09% |
| int8_prepared_dot_product/prepared/129 | +0.78% | +0.76% | +0.80% |
| int8_prepared_dot_product/prepared/384 | +0.37% | +0.34% | +0.40% |
| int8_prepared_dot_product/prepared/768 | +0.01% | -0.07% | +0.08% |
| int8_quantization/quantize/1024 | +0.03% | +0.02% | +0.04% |
| int8_quantization/quantize/1536 | -0.20% | -0.21% | -0.20% |
| int8_quantization/quantize/384 | +0.02% | +0.01% | +0.02% |
| int8_quantization/quantize/768 | +0.02% | +0.01% | +0.02% |
| int8_raw_dot_product/dot_product_i8/1024 | -0.92% | -0.96% | -0.88% |
| int8_raw_dot_product/dot_product_i8/127 | +0.09% | +0.07% | +0.11% |
| int8_raw_dot_product/dot_product_i8/128 | +1.78% | +1.71% | +1.85% |
| int8_raw_dot_product/dot_product_i8/129 | +0.26% | +0.21% | +0.31% |
| int8_raw_dot_product/dot_product_i8/384 | +1.13% | +1.09% | +1.17% |
| int8_raw_dot_product/dot_product_i8/768 | +1.43% | +1.38% | +1.48% |
| int8_raw_dot_product/dot_product_i8_raw/1024 | -0.08% | -0.14% | -0.02% |
| int8_raw_dot_product/dot_product_i8_raw/127 | +0.28% | +0.25% | +0.30% |
| int8_raw_dot_product/dot_product_i8_raw/128 | -0.02% | -0.04% | +0.00% |
| int8_raw_dot_product/dot_product_i8_raw/129 | +0.38% | +0.29% | +0.48% |
| int8_raw_dot_product/dot_product_i8_raw/384 | +0.06% | +0.04% | +0.08% |
| int8_raw_dot_product/dot_product_i8_raw/768 | -0.29% | -0.32% | -0.25% |
| int8_vs_float32_cosine/float32_simd/1024 | +0.06% | +0.05% | +0.08% |
| int8_vs_float32_cosine/float32_simd/1536 | +0.11% | +0.09% | +0.12% |
| int8_vs_float32_cosine/float32_simd/384 | -0.98% | -1.02% | -0.93% |
| int8_vs_float32_cosine/float32_simd/768 | -0.04% | -0.07% | -0.02% |
| int8_vs_float32_cosine/int8/1024 | -0.94% | -0.99% | -0.89% |
| int8_vs_float32_cosine/int8/1536 | -0.15% | -0.21% | -0.07% |
| int8_vs_float32_cosine/int8/384 | +1.16% | +1.09% | +1.22% |
| int8_vs_float32_cosine/int8/768 | -2.26% | -2.36% | -2.16% |
| layer_norm/4096 | -0.38% | -0.41% | -0.36% |
| layer_norm/896 | -0.06% | -0.08% | -0.04% |
| memory_size/search_1000_float32 | -0.13% | -0.18% | -0.09% |
| memory_size/search_1000_int8 | -0.79% | -0.82% | -0.77% |
| rms_norm/4096 | -0.01% | -0.06% | +0.04% |
| rms_norm/896 | -0.27% | -0.30% | -0.25% |
| silu_inplace/4096 | +0.01% | +0.00% | +0.02% |
| silu_inplace/896 | +0.01% | +0.00% | +0.03% |
| simd_batch_cosine/scalar_loop/10 | -0.02% | -0.03% | -0.01% |
| simd_batch_cosine/scalar_loop/100 | +0.07% | +0.04% | +0.10% |
| simd_batch_cosine/scalar_loop/1000 | -0.68% | -0.71% | -0.65% |
| simd_batch_cosine/simd_batch/10 | +0.02% | +0.01% | +0.03% |
| simd_batch_cosine/simd_batch/100 | +0.78% | +0.76% | +0.80% |
| simd_batch_cosine/simd_batch/1000 | -13.39% | -13.46% | -13.31% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_1000c | +0.31% | +0.27% | +0.35% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_16c | -0.54% | -0.56% | -0.53% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_256c | -5.05% | -5.34% | -4.74% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_4c | -0.05% | -0.07% | -0.03% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_64c | -0.12% | -0.13% | -0.10% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_1000c | -0.20% | -0.24% | -0.15% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_16c | -0.07% | -0.08% | -0.06% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_256c | -0.07% | -0.09% | -0.06% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_4c | -0.11% | -0.13% | -0.09% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_64c | +0.14% | +0.11% | +0.17% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_1000c | +0.76% | +0.71% | +0.81% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_16c | -0.27% | -0.30% | -0.24% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_256c | -4.88% | -4.98% | -4.77% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_4c | -0.13% | -0.14% | -0.11% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_64c | -0.32% | -0.33% | -0.30% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_1000c | +0.34% | +0.31% | +0.37% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_16c | -0.23% | -0.24% | -0.21% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_256c | -4.94% | -5.19% | -4.68% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_4c | -0.07% | -0.08% | -0.06% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_64c | -0.17% | -0.19% | -0.16% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_1000c | +0.08% | +0.05% | +0.11% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_16c | +0.03% | +0.02% | +0.05% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_256c | -0.04% | -0.07% | -0.01% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_4c | +0.23% | +0.22% | +0.24% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_64c | +0.17% | +0.16% | +0.18% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_1000c | +1.07% | +1.02% | +1.12% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_16c | -0.60% | -0.61% | -0.58% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_256c | -5.77% | -5.86% | -5.68% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_4c | -0.10% | -0.11% | -0.09% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_64c | -0.27% | -0.28% | -0.25% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_1000c | +0.48% | +0.43% | +0.53% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_16c | +0.62% | +0.61% | +0.63% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_256c | -2.98% | -3.18% | -2.78% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_4c | -0.01% | -0.02% | -0.00% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_64c | +0.04% | +0.03% | +0.05% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_1000c | -0.37% | -0.41% | -0.33% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_16c | +0.03% | +0.02% | +0.05% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_256c | +0.13% | +0.09% | +0.15% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_4c | +0.01% | -0.00% | +0.02% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_64c | +0.51% | +0.50% | +0.52% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_1000c | -0.79% | -0.83% | -0.75% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_16c | +0.57% | +0.55% | +0.58% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_256c | -3.03% | -3.20% | -2.85% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_4c | -0.05% | -0.05% | -0.04% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_64c | +0.25% | +0.23% | +0.26% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_1000c | -0.30% | -0.37% | -0.23% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_16c | +5.62% | +5.59% | +5.64% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_256c | -5.19% | -5.57% | -4.79% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_4c | -0.33% | -0.37% | -0.30% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_64c | +0.02% | -0.02% | +0.06% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_1000c | -0.32% | -0.37% | -0.26% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_16c | -0.67% | -0.80% | -0.53% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_256c | +0.30% | +0.28% | +0.32% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_4c | -0.99% | -1.12% | -0.86% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_64c | +1.87% | +1.86% | +1.88% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_1000c | -0.36% | -0.42% | -0.31% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_16c | +1.38% | +1.34% | +1.43% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_256c | -3.23% | -3.34% | -3.12% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_4c | +0.18% | +0.13% | +0.25% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_64c | +0.47% | +0.45% | +0.48% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_1000c | +0.36% | +0.32% | +0.40% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_16c | +1.08% | +1.07% | +1.09% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_256c | -4.74% | -5.10% | -4.36% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_4c | -0.04% | -0.04% | -0.03% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_64c | +0.07% | +0.06% | +0.09% |
| simd_batch_cosine_normalized_query/simd_batch/384d_1000c | -0.32% | -0.36% | -0.28% |
| simd_batch_cosine_normalized_query/simd_batch/384d_16c | +0.04% | +0.03% | +0.05% |
| simd_batch_cosine_normalized_query/simd_batch/384d_256c | +0.11% | +0.09% | +0.13% |
| simd_batch_cosine_normalized_query/simd_batch/384d_4c | +0.30% | +0.29% | +0.31% |
| simd_batch_cosine_normalized_query/simd_batch/384d_64c | +0.33% | +0.32% | +0.34% |
| simd_batch_cosine_normalized_query/simd_batch/768d_1000c | -0.60% | -0.64% | -0.56% |
| simd_batch_cosine_normalized_query/simd_batch/768d_16c | +0.76% | +0.74% | +0.77% |
| simd_batch_cosine_normalized_query/simd_batch/768d_256c | -2.51% | -2.61% | -2.41% |
| simd_batch_cosine_normalized_query/simd_batch/768d_4c | -0.03% | -0.04% | -0.03% |
| simd_batch_cosine_normalized_query/simd_batch/768d_64c | +0.39% | +0.38% | +0.40% |
| simd_batch_dot_product/scalar_loop/10 | -0.01% | -0.02% | +0.00% |
| simd_batch_dot_product/scalar_loop/100 | +0.02% | -0.05% | +0.08% |
| simd_batch_dot_product/scalar_loop/1000 | -1.24% | -1.31% | -1.17% |
| simd_batch_dot_product/simd_batch/10 | +1.36% | +1.31% | +1.42% |
| simd_batch_dot_product/simd_batch/100 | +1.45% | +1.44% | +1.47% |
| simd_batch_dot_product/simd_batch/1000 | -16.51% | -16.58% | -16.46% |
| simd_cosine_similarity/scalar/1024 | +0.03% | +0.02% | +0.05% |
| simd_cosine_similarity/scalar/1536 | +0.04% | +0.03% | +0.06% |
| simd_cosine_similarity/scalar/384 | +0.16% | +0.11% | +0.20% |
| simd_cosine_similarity/scalar/768 | +0.05% | +0.03% | +0.07% |
| simd_cosine_similarity/simd/1024 | +0.95% | +0.91% | +1.00% |
| simd_cosine_similarity/simd/1536 | +0.14% | +0.12% | +0.16% |
| simd_cosine_similarity/simd/384 | +0.54% | +0.48% | +0.60% |
| simd_cosine_similarity/simd/768 | -0.36% | -0.40% | -0.31% |
| simd_dot_product/scalar/1024 | -0.00% | -0.02% | +0.01% |
| simd_dot_product/scalar/1536 | -0.00% | -0.01% | +0.01% |
| simd_dot_product/scalar/384 | +0.07% | +0.06% | +0.09% |
| simd_dot_product/scalar/768 | -0.73% | -0.74% | -0.72% |
| simd_dot_product/simd/1024 | +1.04% | +0.97% | +1.11% |
| simd_dot_product/simd/1536 | +1.55% | +1.46% | +1.64% |
| simd_dot_product/simd/384 | -2.11% | -2.29% | -1.94% |
| simd_dot_product/simd/768 | +0.32% | +0.21% | +0.45% |
| simd_euclidean_distance/scalar/1024 | +0.03% | -0.06% | +0.09% |
| simd_euclidean_distance/scalar/1536 | -0.02% | -0.03% | -0.01% |
| simd_euclidean_distance/scalar/384 | +0.02% | -0.03% | +0.06% |
| simd_euclidean_distance/scalar/768 | -0.01% | -0.06% | +0.03% |
| simd_euclidean_distance/simd/1024 | -0.04% | -0.09% | -0.01% |
| simd_euclidean_distance/simd/1536 | +0.00% | -0.03% | +0.03% |
| simd_euclidean_distance/simd/384 | +0.34% | +0.27% | +0.39% |
| simd_euclidean_distance/simd/768 | -0.34% | -0.40% | -0.30% |
| simd_normalize/scalar/1024 | +0.18% | -0.02% | +0.38% |
| simd_normalize/scalar/1536 | +0.17% | +0.02% | +0.32% |
| simd_normalize/scalar/384 | -0.07% | -0.50% | +0.36% |
| simd_normalize/scalar/768 | +0.28% | +0.06% | +0.48% |
| simd_normalize/simd/1024 | -2.36% | -4.41% | -0.31% |
| simd_normalize/simd/1536 | -1.70% | -3.07% | -0.32% |
| simd_normalize/simd/384 | +0.35% | -1.97% | +2.62% |
| simd_normalize/simd/768 | +0.10% | -1.42% | +1.71% |
| simd_normalized_cosine_fast_path/cosine_full/1024 | +0.02% | -0.00% | +0.04% |
| simd_normalized_cosine_fast_path/cosine_full/384 | +0.02% | -0.03% | +0.07% |
| simd_normalized_cosine_fast_path/cosine_full/768 | +0.38% | +0.35% | +0.41% |
| simd_normalized_cosine_fast_path/dot_product/1024 | +1.58% | +1.48% | +1.68% |
| simd_normalized_cosine_fast_path/dot_product/384 | +1.06% | +0.87% | +1.28% |
| simd_normalized_cosine_fast_path/dot_product/768 | +3.00% | +2.91% | +3.10% |
| simd_prepared_query_normalized_cosine/dot_product_loop/1024 | -0.87% | -0.95% | -0.79% |
| simd_prepared_query_normalized_cosine/dot_product_loop/384 | +0.04% | -0.06% | +0.15% |
| simd_prepared_query_normalized_cosine/dot_product_loop/768 | -1.01% | -1.16% | -0.87% |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/1024 | +0.78% | +0.42% | +1.02% |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/384 | +0.67% | +0.64% | +0.70% |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/768 | +1.00% | +0.93% | +1.09% |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/1024 | -1.22% | -1.28% | -1.15% |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/384 | +0.06% | -0.01% | +0.13% |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/768 | -0.09% | -0.17% | -0.01% |
| simd_query_batch_dot_product/pair_loop/128d_16c | +0.48% | +0.38% | +0.58% |
| simd_query_batch_dot_product/pair_loop/128d_256c | +0.66% | +0.51% | +0.79% |
| simd_query_batch_dot_product/pair_loop/128d_4c | -0.23% | -0.31% | -0.16% |
| simd_query_batch_dot_product/pair_loop/128d_64c | -2.38% | -2.45% | -2.33% |
| simd_query_batch_dot_product/pair_loop/384d_16c | -0.01% | -0.07% | +0.04% |
| simd_query_batch_dot_product/pair_loop/384d_256c | +0.51% | +0.46% | +0.55% |
| simd_query_batch_dot_product/pair_loop/384d_4c | +1.26% | +1.09% | +1.42% |
| simd_query_batch_dot_product/pair_loop/384d_64c | +0.28% | +0.22% | +0.33% |
| simd_query_batch_dot_product/pair_loop/768d_16c | -6.44% | -6.49% | -6.39% |
| simd_query_batch_dot_product/pair_loop/768d_256c | -3.78% | -3.89% | -3.67% |
| simd_query_batch_dot_product/pair_loop/768d_4c | -0.48% | -0.65% | -0.28% |
| simd_query_batch_dot_product/pair_loop/768d_64c | -0.82% | -0.88% | -0.78% |
| simd_query_batch_dot_product/simd_batch/128d_16c | +0.23% | +0.20% | +0.27% |
| simd_query_batch_dot_product/simd_batch/128d_256c | +0.42% | +0.36% | +0.46% |
| simd_query_batch_dot_product/simd_batch/128d_4c | +0.76% | +0.71% | +0.81% |
| simd_query_batch_dot_product/simd_batch/128d_64c | -4.96% | -5.02% | -4.90% |
| simd_query_batch_dot_product/simd_batch/384d_16c | +0.52% | +0.50% | +0.54% |
| simd_query_batch_dot_product/simd_batch/384d_256c | +1.49% | +1.44% | +1.54% |
| simd_query_batch_dot_product/simd_batch/384d_4c | +0.49% | +0.44% | +0.55% |
| simd_query_batch_dot_product/simd_batch/384d_64c | +0.66% | +0.53% | +0.80% |
| simd_query_batch_dot_product/simd_batch/768d_16c | -7.90% | -7.92% | -7.87% |
| simd_query_batch_dot_product/simd_batch/768d_256c | +2.66% | +2.45% | +2.90% |
| simd_query_batch_dot_product/simd_batch/768d_4c | +0.18% | +0.16% | +0.21% |
| simd_query_batch_dot_product/simd_batch/768d_64c | -0.25% | -0.36% | -0.15% |
| simd_squared_euclidean_fast_path/euclidean_full/1024 | +0.00% | -0.03% | +0.04% |
| simd_squared_euclidean_fast_path/euclidean_full/384 | -0.04% | -0.10% | +0.01% |
| simd_squared_euclidean_fast_path/euclidean_full/768 | +0.04% | +0.03% | +0.05% |
| simd_squared_euclidean_fast_path/squared_euclidean/1024 | -0.21% | -0.23% | -0.19% |
| simd_squared_euclidean_fast_path/squared_euclidean/384 | +0.09% | +0.05% | +0.12% |
| simd_squared_euclidean_fast_path/squared_euclidean/768 | +0.08% | +0.06% | +0.10% |
| simd_throughput_384/cosine_similarity | +0.06% | -0.00% | +0.11% |
| simd_throughput_384/dot_product | -0.18% | -0.26% | -0.11% |
| simd_throughput_384/euclidean_distance | +0.06% | +0.03% | +0.10% |
| simd_throughput_384/normalize | -3.69% | -3.70% | -3.67% |
| softmax_attention/128 | +0.03% | +0.02% | +0.05% |
| softmax_attention/512 | -0.10% | -0.22% | +0.05% |
| tier_prepared_batch_sizes/int4_batch_prepared/10 | -0.07% | -0.11% | -0.03% |
| tier_prepared_batch_sizes/int4_batch_prepared/100 | +0.75% | +0.70% | +0.79% |
| tier_prepared_batch_sizes/int4_batch_prepared/1000 | +0.02% | -0.01% | +0.05% |
| tier_prepared_batch_sizes/int4_query_per_call/10 | +0.47% | +0.46% | +0.48% |
| tier_prepared_batch_sizes/int4_query_per_call/100 | +0.40% | +0.39% | +0.41% |
| tier_prepared_batch_sizes/int4_query_per_call/1000 | +0.44% | +0.43% | +0.44% |
| tier_prepared_batch_sizes/int8_batch_prepared/10 | +0.11% | +0.04% | +0.17% |
| tier_prepared_batch_sizes/int8_batch_prepared/100 | -0.84% | -0.97% | -0.73% |
| tier_prepared_batch_sizes/int8_batch_prepared/1000 | +0.59% | +0.55% | +0.63% |
| tier_prepared_batch_sizes/int8_query_per_call/10 | +0.02% | +0.01% | +0.03% |
| tier_prepared_batch_sizes/int8_query_per_call/100 | +0.03% | +0.02% | +0.04% |
| tier_prepared_batch_sizes/int8_query_per_call/1000 | +0.00% | -0.02% | +0.02% |
| tier_prepared_query/binary_query_once_1000 | +0.20% | +0.18% | +0.23% |
| tier_prepared_query/binary_query_per_call_1000 | -0.14% | -0.15% | -0.13% |
| tier_prepared_query/int4_query_once_1000 | +0.13% | +0.08% | +0.17% |
| tier_prepared_query/int4_query_per_call_1000 | -0.06% | -0.07% | -0.05% |
| tier_prepared_query/int8_query_once_1000 | -1.75% | -1.78% | -1.72% |
| tier_prepared_query/int8_query_per_call_1000 | -0.03% | -0.04% | -0.02% |
Rule: CI-lower of change ≤3.0% passes silently; (3.0%, 7.0%] warns; >7.0% fails. Override via PR label bench-allow-regression.
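
As a rough illustration of the rule above, the thresholds can be sketched as a small classifier on the lower bound of the 95% CI. This is only a sketch under stated assumptions, not the project's actual gate: the names `classify` and `gate_blocks_merge` are hypothetical, the 🚀 WIN verdict is left out because the rule does not define it, and the label is assumed to turn a FAIL into a non-blocking result.

```python
# Hypothetical sketch of the stated rule; not the real CI implementation.
WARN_THRESHOLD_PCT = 3.0   # CI-lower <= 3.0% passes silently
FAIL_THRESHOLD_PCT = 7.0   # CI-lower in (3.0%, 7.0%] warns; > 7.0% fails
OVERRIDE_LABEL = "bench-allow-regression"


def classify(ci_lower_pct: float) -> str:
    """Map the lower bound of the 95% CI (in percent) to a verdict."""
    if ci_lower_pct > FAIL_THRESHOLD_PCT:
        return "FAIL"
    if ci_lower_pct > WARN_THRESHOLD_PCT:
        return "WARN"
    return "PASS"


def gate_blocks_merge(ci_lowers_pct, pr_labels=()):
    """Return True if any bench fails and the override label is absent.

    Assumption: the label suppresses the failure for the whole report.
    """
    has_fail = any(classify(x) == "FAIL" for x in ci_lowers_pct)
    return has_fail and OVERRIDE_LABEL not in pr_labels


# CI-lower values taken from rows in this report.
print(classify(5.59))   # pair_loop_dot/1024d_16c
print(classify(13.79))  # elementwise_mul/4096 on x86_64-linux
print(classify(2.91))   # simd_normalized_cosine_fast_path/dot_product/768
```

Gating on the CI lower bound rather than the point estimate means a bench only warns or fails when the whole interval sits above the threshold, which is why the +3.00% point change with CI-lower +2.91% passes silently.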
x86_64-linux — perf regression report
❌ 1 FAIL (regression >7.0% confirmed by 95% CI)
⚠ 1 WARN (regression 3.0-7.0% confirmed)
🚀 257 confirmed improvements
| Bench | Δ point | 95% CI | new ns | base ns | verdict |
|---|---|---|---|---|---|
| elementwise_mul/4096 | +14.06% | [+13.79%, +14.44%] | 233.2 | 233.2 | ❌ FAIL |
| simd_euclidean_distance/simd/1536 | +5.49% | [+5.27%, +5.65%] | 87.5 | 87.5 | ⚠ WARN |
| simd_dot_product/simd/768 | -4.45% | [-4.71%, -4.23%] | 38.8 | 38.8 | 🚀 WIN |
| simd_euclidean_distance/simd/768 | -5.60% | [-5.95%, -5.10%] | 53.4 | 53.4 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_16c | -5.90% | [-6.07%, -5.66%] | 442.3 | 442.3 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_64c | -6.87% | [-7.06%, -6.70%] | 1757.3 | 1757.3 | 🚀 WIN |
| simd_normalize/simd/768 | -5.05% | [-7.90%, -2.63%] | 104.9 | 104.9 | 🚀 WIN |
| simd_normalized_cosine_fast_path/dot_product/384 | -8.18% | [-8.43%, -7.94%] | 25.1 | 25.1 | 🚀 WIN |
| simd_prepared_query_normalized_cosine/dot_product_loop/1024 | -8.26% | [-8.85%, -7.79%] | 67868.6 | 67868.6 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_256c | -8.77% | [-8.91%, -8.61%] | 6967.5 | 6967.5 | 🚀 WIN |
| layer_norm/896 | -8.73% | [-9.06%, -8.32%] | 155.1 | 155.1 | 🚀 WIN |
| simd_dot_product/simd/384 | -8.88% | [-9.10%, -8.64%] | 25.1 | 25.1 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_4c | -8.94% | [-9.49%, -8.62%] | 116.0 | 116.0 | 🚀 WIN |
| simd_normalize/simd/384 | -7.07% | [-9.78%, -4.06%] | 64.2 | 64.2 | 🚀 WIN |
| rms_norm/4096 | -9.64% | [-9.79%, -9.51%] | 724.0 | 724.0 | 🚀 WIN |
| simd_normalize/simd/1536 | -8.17% | [-10.27%, -6.09%] | 179.8 | 179.8 | 🚀 WIN |
| simd_euclidean_distance/simd/384 | -10.92% | [-11.10%, -10.71%] | 30.3 | 30.3 | 🚀 WIN |
| simd_throughput_384/dot_product | -11.12% | [-11.38%, -10.85%] | 24.3 | 24.3 | 🚀 WIN |
| simd_throughput_384/euclidean_distance | -11.28% | [-11.49%, -11.08%] | 30.2 | 30.2 | 🚀 WIN |
| simd_throughput_384/normalize | -11.16% | [-11.51%, -10.82%] | 95.7 | 95.7 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_1000c | -11.29% | [-11.66%, -10.99%] | 28139.8 | 28139.8 | 🚀 WIN |
| int8_vs_float32_cosine/float32_simd/1536 | -11.65% | [-11.86%, -11.35%] | 105.5 | 105.5 | 🚀 WIN |
| simd_normalize/simd/1024 | -10.13% | [-11.95%, -8.26%] | 126.4 | 126.4 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_64c | -11.87% | [-12.11%, -11.69%] | 2228.2 | 2228.2 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_1000c | -12.17% | [-12.32%, -12.05%] | 35613.8 | 35613.8 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_256c | -12.32% | [-12.58%, -12.12%] | 8843.8 | 8843.8 | 🚀 WIN |
| simd_cosine_similarity/simd/768 | -13.01% | [-13.20%, -12.85%] | 60.1 | 60.1 | 🚀 WIN |
| int8_vs_float32_cosine/float32_simd/768 | -13.03% | [-13.21%, -12.87%] | 60.0 | 60.0 | 🚀 WIN |
| int8_vs_float32_cosine/int8/768 | -14.21% | [-14.46%, -14.02%] | 25.1 | 25.1 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_4c | -14.54% | [-14.81%, -14.32%] | 143.7 | 143.7 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_16c | -15.02% | [-15.13%, -14.89%] | 549.7 | 549.7 | 🚀 WIN |
| memory_size/search_1000_float32 | -16.12% | [-16.25%, -16.00%] | 33591.6 | 33591.6 | 🚀 WIN |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/384 | -16.15% | [-16.32%, -16.01%] | 34751.7 | 34751.7 | 🚀 WIN |
| simd_batch_cosine/simd_batch/1000 | -16.39% | [-16.53%, -16.25%] | 48191.8 | 48191.8 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8/768 | -16.49% | [-16.65%, -16.35%] | 22.2 | 22.2 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/384d_4c | -16.74% | [-16.97%, -16.54%] | 118.8 | 118.8 | 🚀 WIN |
| simd_batch_cosine_normalized_query/simd_batch/384d_64c | -16.74% | [-16.97%, -16.45%] | 2055.9 | 2055.9 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_1000c | -16.73% | [-17.04%, -16.33%] | 34392.7 | 34392.7 | 🚀 WIN |
| binary_cosine_distance/float32_simd/384 | -17.04% | [-17.15%, -16.91%] | 37.3 | 37.3 | 🚀 WIN |
| simd_prepared_query_normalized_cosine/dot_product_loop/768 | -16.96% | [-17.17%, -16.83%] | 52526.2 | 52526.2 | 🚀 WIN |
| simd_batch_dot_product/simd_batch/1000 | -16.93% | [-17.21%, -16.68%] | 41305.6 | 41305.6 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/384d_16c | -17.19% | [-17.35%, -17.04%] | 434.4 | 434.4 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/384d_256c | -17.28% | [-17.40%, -17.20%] | 6994.3 | 6994.3 | 🚀 WIN |
| simd_batch_cosine/simd_batch/100 | -17.04% | [-17.41%, -16.75%] | 3580.7 | 3580.7 | 🚀 WIN |
| int8_prepared_dot_product/prepared/768 | -17.12% | [-17.45%, -16.86%] | 22.4 | 22.4 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8_raw/127 | -17.36% | [-17.49%, -17.23%] | 11.8 | 11.8 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_64c | -17.44% | [-17.54%, -17.33%] | 2107.2 | 2107.2 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/384d_64c | -17.38% | [-17.56%, -17.22%] | 1739.5 | 1739.5 | 🚀 WIN |
| simd_normalized_cosine_fast_path/cosine_full/384 | -17.19% | [-17.58%, -16.87%] | 36.7 | 36.7 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_256c | -17.47% | [-17.61%, -17.37%] | 8445.5 | 8445.5 | 🚀 WIN |
| simd_prepared_query_normalized_cosine/dot_product_loop/384 | -17.08% | [-17.62%, -16.64%] | 25661.4 | 25661.4 | 🚀 WIN |
| int8_vs_float32_cosine/float32_simd/384 | -17.67% | [-17.82%, -17.51%] | 36.4 | 36.4 | 🚀 WIN |
| simd_batch_cosine_normalized_query/simd_batch/384d_1000c | -17.44% | [-17.84%, -17.12%] | 32924.8 | 32924.8 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/384d_4c | -17.43% | [-17.88%, -17.04%] | 67.9 | 67.9 | 🚀 WIN |
| simd_batch_cosine_normalized_query/simd_batch/384d_256c | -17.80% | [-17.92%, -17.68%] | 8181.8 | 8181.8 | 🚀 WIN |
| simd_throughput_384/cosine_similarity | -17.79% | [-18.00%, -17.60%] | 36.4 | 36.4 | 🚀 WIN |
| int8_batch_cosine/float32_simd/100 | -17.62% | [-18.00%, -17.32%] | 3547.4 | 3547.4 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/128d_64c | -17.97% | [-18.07%, -17.87%] | 396.3 | 396.3 | 🚀 WIN |
| int8_batch_cosine/int8_loop/1000 | -17.83% | [-18.16%, -17.51%] | 14944.5 | 14944.5 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/384d_16c | -18.12% | [-18.20%, -18.05%] | 224.8 | 224.8 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_64c | -18.17% | [-18.26%, -18.06%] | 3400.0 | 3400.0 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8/127 | -18.18% | [-18.32%, -18.05%] | 13.4 | 13.4 | 🚀 WIN |
| int4_cosine_distance/float32_simd/384 | -18.09% | [-18.46%, -17.78%] | 37.6 | 37.6 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/768d_4c | -18.41% | [-18.51%, -18.28%] | 114.5 | 114.5 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8_raw/384 | -18.26% | [-18.60%, -17.97%] | 11.0 | 11.0 | 🚀 WIN |
| int8_prepared_dot_product/prepared/384 | -18.39% | [-18.61%, -18.11%] | 12.9 | 12.9 | 🚀 WIN |
| int8_prepared_dot_product/prepared/1024 | -18.55% | [-18.82%, -18.32%] | 28.5 | 28.5 | 🚀 WIN |
| int8_prepared_dot_product/prepared/127 | -18.53% | [-18.85%, -18.24%] | 13.3 | 13.3 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_16c | -18.66% | [-18.85%, -18.51%] | 528.4 | 528.4 | 🚀 WIN |
| int4_cosine_distance/float32_simd/768 | -18.70% | [-18.86%, -18.56%] | 56.7 | 56.7 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/128d_4c | -18.68% | [-18.95%, -18.43%] | 58.0 | 58.0 | 🚀 WIN |
| simd_batch_cosine_normalized_query/simd_batch/384d_4c | -18.81% | [-18.97%, -18.60%] | 134.2 | 134.2 | 🚀 WIN |
| int8_batch_cosine/float32_simd/10 | -18.76% | [-19.00%, -18.55%] | 324.8 | 324.8 | 🚀 WIN |
| int8_batch_cosine/float32_simd/1000 | -18.52% | [-19.03%, -18.03%] | 46874.3 | 46874.3 | 🚀 WIN |
| int8_vs_float32_cosine/float32_simd/1024 | -18.95% | [-19.05%, -18.85%] | 69.5 | 69.5 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8/384 | -18.71% | [-19.05%, -18.32%] | 12.7 | 12.7 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_256c | -18.90% | [-19.19%, -18.64%] | 14826.4 | 14826.4 | 🚀 WIN |
| tier_prepared_batch_sizes/int8_batch_prepared/100 | -19.08% | [-19.25%, -18.83%] | 1340.4 | 1340.4 | 🚀 WIN |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/384 | -19.19% | [-19.29%, -19.09%] | 24907.8 | 24907.8 | 🚀 WIN |
| simd_euclidean_distance/simd/1024 | -19.40% | [-19.51%, -19.28%] | 67.8 | 67.8 | 🚀 WIN |
| rms_norm/896 | -18.90% | [-19.53%, -18.41%] | 183.0 | 183.0 | 🚀 WIN |
| int8_batch_cosine/int8_loop/100 | -19.41% | [-19.70%, -19.14%] | 1358.7 | 1358.7 | 🚀 WIN |
| tier_prepared_batch_sizes/int8_batch_prepared/10 | -19.53% | [-19.79%, -19.22%] | 140.5 | 140.5 | 🚀 WIN |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/768 | -19.61% | [-19.81%, -19.46%] | 42984.5 | 42984.5 | 🚀 WIN |
| binary_cosine_distance/float32_simd/768 | -19.76% | [-19.87%, -19.63%] | 55.2 | 55.2 | 🚀 WIN |
| simd_batch_dot_product/simd_batch/10 | -19.65% | [-19.87%, -19.41%] | 243.2 | 243.2 | 🚀 WIN |
| softmax_attention/128 | -19.75% | [-19.87%, -19.63%] | 3848.3 | 3848.3 | 🚀 WIN |
| tier_prepared_batch_sizes/int8_batch_prepared/1000 | -19.55% | [-19.92%, -19.19%] | 13062.7 | 13062.7 | 🚀 WIN |
| memory_size/search_1000_int8 | -19.71% | [-19.99%, -19.48%] | 12575.2 | 12575.2 | 🚀 WIN |
| int4_cosine_distance/float32_simd/1536 | -20.01% | [-20.15%, -19.85%] | 95.6 | 95.6 | 🚀 WIN |
| int8_batch_cosine/int8_loop/10 | -19.79% | [-20.19%, -19.39%] | 135.6 | 135.6 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8_raw/1024 | -19.77% | [-20.20%, -19.44%] | 24.5 | 24.5 | 🚀 WIN |
| simd_batch_cosine_normalized_query/simd_batch/384d_16c | -20.01% | [-20.24%, -19.76%] | 508.2 | 508.2 | 🚀 WIN |
| tier_prepared_query/int8_query_once_1000 | -20.26% | [-20.59%, -19.84%] | 13802.8 | 13802.8 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_1000c | -20.50% | [-20.60%, -20.38%] | 58163.7 | 58163.7 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8_raw/768 | -20.39% | [-20.64%, -20.18%] | 19.1 | 19.1 | 🚀 WIN |
| int8_raw_dot_product/dot_product_i8/1024 | -20.53% | [-20.72%, -20.38%] | 27.5 | 27.5 | 🚀 WIN |
| simd_cosine_similarity/simd/1536 | -20.74% | [-20.82%, -20.68%] | 94.6 | 94.6 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/128d_16c | -20.53% | [-20.89%, -20.19%] | 97.3 | 97.3 | 🚀 WIN |
| binary_cosine_distance/float32_simd/1536 | -20.74% | [-20.89%, -20.58%] | 95.6 | 95.6 | 🚀 WIN |
| int8_vs_float32_cosine/int8/384 | -20.60% | [-20.91%, -20.33%] | 14.4 | 14.4 | 🚀 WIN |
| simd_batch_dot_product/simd_batch/100 | -20.75% | [-20.99%, -20.57%] | 2893.6 | 2893.6 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_16c | -20.81% | [-20.99%, -20.64%] | 843.0 | 843.0 | 🚀 WIN |
| binary_cosine_distance/float32_simd/1024 | -20.87% | [-21.03%, -20.74%] | 68.7 | 68.7 | 🚀 WIN |
| simd_squared_euclidean_fast_path/euclidean_full/384 | -20.96% | [-21.11%, -20.81%] | 30.7 | 30.7 | 🚀 WIN |
| softmax_attention/512 | -21.02% | [-21.13%, -20.87%] | 59702.4 | 59702.4 | 🚀 WIN |
| simd_batch_dot_product/scalar_loop/1000 | -21.02% | [-21.14%, -20.92%] | 293168.9 | 293168.9 | 🚀 WIN |
| simd_batch_dot_product/scalar_loop/100 | -21.06% | [-21.22%, -20.83%] | 28592.6 | 28592.6 | 🚀 WIN |
| simd_query_batch_dot_product/simd_batch/384d_64c | -20.75% | [-21.26%, -20.35%] | 984.3 | 984.3 | 🚀 WIN |
| simd_batch_cosine/scalar_loop/100 | -21.29% | [-21.33%, -21.26%] | 84467.8 | 84467.8 | 🚀 WIN |
| simd_batch_dot_product/scalar_loop/10 | -21.22% | [-21.39%, -21.07%] | 2847.3 | 2847.3 | 🚀 WIN |
| simd_batch_cosine/simd_batch/10 | -21.20% | [-21.40%, -21.01%] | 322.0 | 322.0 | 🚀 WIN |
| simd_batch_cosine/scalar_loop/10 | -21.35% | [-21.45%, -21.25%] | 8458.0 | 8458.0 | 🚀 WIN |
| add_bias_gelu/4096 | -21.38% | [-21.50%, -21.22%] | 1500.4 | 1500.4 | 🚀 WIN |
| simd_query_batch_dot_product/pair_loop/128d_16c | -21.42% | [-21.60%, -21.22%] | 175.2 | 175.2 | 🚀 WIN |
| simd_batch_cosine/scalar_loop/1000 | -21.45% | [-21.62%, -21.31%] | 847416.8 | 847416.8 | 🚀 WIN |
| simd_euclidean_distance/scalar/384 | -21.57% | [-21.62%, -21.51%] | 298.4 | 298.4 | 🚀 WIN |
| int8_vs_float32_cosine/int8/1536 | -21.46% | [-21.66%, -21.23%] | 40.2 | 40.2 | 🚀 WIN |
| simd_cosine_similarity/scalar/384 | -21.63% | [-21.69%, -21.59%] | 852.5 | 852.5 | 🚀 WIN |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_4c | -21.51% | [-21.73%, -21.28%] | 137.1 | 137.1 | 🚀 WIN |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_4c | -21.57% | [-21.76%, -21.40%] | 213.3 | 213.3 | 🚀 WIN |
| simd_normalized_cosine_fast_path/cosine_full/768 | -21.35% | [-21.97%, -20.84%] | 54.7 | 54.7 | 🚀 WIN |
| simd_normalize/scalar/1536 | -21.45% | [-21.98%, -20.87%] | 1389.4 | 1389.4 | 🚀 WIN |
| tier_prepared_batch_sizes/int4_batch_prepared/100 | -21.69% | [-22.02%, -21.22%] | 10306.3 | 10306.3 | 🚀 WIN |
int8_raw_dot_product/dot_product_i8_raw/128 |
-21.65% | [-22.02%, -21.31%] | 5.3 | 5.3 | 🚀 WIN |
tier_prepared_query/int4_query_per_call_1000 |
-21.90% | [-22.02%, -21.76%] | 1899501.5 | 1899501.5 | 🚀 WIN |
simd_normalized_cosine_fast_path/cosine_full/1024 |
-21.82% | [-22.07%, -21.59%] | 67.3 | 67.3 | 🚀 WIN |
tier_prepared_batch_sizes/int4_query_per_call/1000 |
-21.93% | [-22.07%, -21.77%] | 1889947.0 | 1889947.0 | 🚀 WIN |
tier_prepared_batch_sizes/int4_query_per_call/10 |
-21.90% | [-22.07%, -21.67%] | 18924.3 | 18924.3 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/384d_64c |
-21.99% | [-22.09%, -21.90%] | 1970.6 | 1970.6 | 🚀 WIN |
simd_query_batch_dot_product/simd_batch/384d_256c |
-21.88% | [-22.09%, -21.71%] | 3942.2 | 3942.2 | 🚀 WIN |
simd_normalize/scalar/384 |
-21.83% | [-22.10%, -21.54%] | 347.8 | 347.8 | 🚀 WIN |
int8_vs_float32_cosine/int8/1024 |
-21.94% | [-22.10%, -21.76%] | 30.8 | 30.8 | 🚀 WIN |
int8_raw_dot_product/dot_product_i8_raw/129 |
-21.85% | [-22.14%, -21.62%] | 5.7 | 5.7 | 🚀 WIN |
simd_dot_product/simd/1536 |
-21.94% | [-22.15%, -21.78%] | 75.7 | 75.7 | 🚀 WIN |
gelu/896 |
-21.97% | [-22.18%, -21.78%] | 310.8 | 310.8 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/384d_1000c |
-22.00% | [-22.21%, -21.82%] | 31766.3 | 31766.3 | 🚀 WIN |
simd_euclidean_distance/scalar/768 |
-22.05% | [-22.21%, -21.89%] | 612.9 | 612.9 | 🚀 WIN |
simd_euclidean_distance/scalar/1536 |
-22.19% | [-22.22%, -22.16%] | 1240.6 | 1240.6 | 🚀 WIN |
simd_query_batch_dot_product/simd_batch/768d_256c |
-21.81% | [-22.24%, -21.43%] | 7702.0 | 7702.0 | 🚀 WIN |
tier_prepared_batch_sizes/int4_query_per_call/100 |
-22.01% | [-22.25%, -21.82%] | 188882.7 | 188882.7 | 🚀 WIN |
simd_cosine_similarity/scalar/768 |
-22.14% | [-22.28%, -22.04%] | 1794.7 | 1794.7 | 🚀 WIN |
int8_quantization/quantize/768 |
-22.02% | [-22.31%, -21.65%] | 3625.3 | 3625.3 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_256c |
-22.13% | [-22.32%, -21.90%] | 17184.8 | 17184.8 | 🚀 WIN |
simd_dot_product/scalar/384 |
-22.22% | [-22.32%, -22.10%] | 287.9 | 287.9 | 🚀 WIN |
int8_raw_dot_product/dot_product_i8/129 |
-21.93% | [-22.35%, -21.60%] | 7.4 | 7.4 | 🚀 WIN |
tier_prepared_batch_sizes/int4_batch_prepared/10 |
-21.92% | [-22.36%, -21.57%] | 1034.9 | 1034.9 | 🚀 WIN |
simd_dot_product/scalar/1024 |
-22.34% | [-22.37%, -22.32%] | 811.2 | 811.2 | 🚀 WIN |
simd_cosine_similarity/scalar/1024 |
-22.25% | [-22.44%, -22.13%] | 2422.7 | 2422.7 | 🚀 WIN |
int4_cosine_distance/int4/384 |
-22.13% | [-22.44%, -21.87%] | 106.1 | 106.1 | 🚀 WIN |
binary_cosine_distance/binary/1536 |
-22.15% | [-22.44%, -21.77%] | 127.1 | 127.1 | 🚀 WIN |
int4_cosine_distance/int4/768 |
-22.24% | [-22.44%, -22.05%] | 196.3 | 196.3 | 🚀 WIN |
tier_prepared_query/int8_query_per_call_1000 |
-22.25% | [-22.45%, -22.02%] | 1824839.6 | 1824839.6 | 🚀 WIN |
simd_dot_product/scalar/1536 |
-22.42% | [-22.45%, -22.38%] | 1230.0 | 1230.0 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/384d_256c |
-22.20% | [-22.46%, -21.98%] | 7895.7 | 7895.7 | 🚀 WIN |
tier_prepared_batch_sizes/int4_batch_prepared/1000 |
-22.12% | [-22.47%, -21.87%] | 102783.2 | 102783.2 | 🚀 WIN |
simd_dot_product/scalar/768 |
-22.44% | [-22.47%, -22.39%] | 601.7 | 601.7 | 🚀 WIN |
simd_query_batch_dot_product/pair_loop/128d_256c |
-21.91% | [-22.49%, -21.47%] | 2578.5 | 2578.5 | 🚀 WIN |
int8_prepared_dot_product/per_call/1024 |
-22.32% | [-22.52%, -22.10%] | 4834.0 | 4834.0 | 🚀 WIN |
simd_query_batch_dot_product/simd_batch/128d_256c |
-22.31% | [-22.53%, -22.17%] | 1616.3 | 1616.3 | 🚀 WIN |
simd_squared_euclidean_fast_path/euclidean_full/768 |
-22.37% | [-22.53%, -22.20%] | 43.9 | 43.9 | 🚀 WIN |
int4_cosine_distance/int4/1536 |
-22.29% | [-22.53%, -22.08%] | 376.8 | 376.8 | 🚀 WIN |
add_bias_gelu/896 |
-22.10% | [-22.54%, -21.69%] | 326.1 | 326.1 | 🚀 WIN |
tier_prepared_batch_sizes/int8_query_per_call/100 |
-22.30% | [-22.55%, -22.09%] | 181725.0 | 181725.0 | 🚀 WIN |
int8_prepared_dot_product/per_call/127 |
-22.27% | [-22.57%, -21.92%] | 605.3 | 605.3 | 🚀 WIN |
simd_normalize/scalar/1024 |
-22.38% | [-22.57%, -22.22%] | 916.0 | 916.0 | 🚀 WIN |
simd_normalize/scalar/768 |
-22.28% | [-22.60%, -22.02%] | 689.9 | 689.9 | 🚀 WIN |
int8_prepared_dot_product/per_call/384 |
-22.51% | [-22.60%, -22.41%] | 1809.4 | 1809.4 | 🚀 WIN |
int8_prepared_dot_product/per_call/768 |
-22.47% | [-22.63%, -22.35%] | 3620.1 | 3620.1 | 🚀 WIN |
simd_cosine_similarity/scalar/1536 |
-22.46% | [-22.68%, -22.29%] | 3678.8 | 3678.8 | 🚀 WIN |
binary_cosine_distance/binary/1024 |
-22.55% | [-22.69%, -22.43%] | 87.3 | 87.3 | 🚀 WIN |
simd_dot_product/simd/1024 |
-22.58% | [-22.72%, -22.46%] | 51.0 | 51.0 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_4c |
-22.64% | [-22.77%, -22.48%] | 258.9 | 258.9 | 🚀 WIN |
simd_euclidean_distance/scalar/1024 |
-22.51% | [-22.83%, -22.26%] | 821.8 | 821.8 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_64c |
-22.68% | [-22.85%, -22.52%] | 4069.4 | 4069.4 | 🚀 WIN |
simd_query_batch_dot_product/simd_batch/768d_16c |
-22.66% | [-22.85%, -22.48%] | 463.2 | 463.2 | 🚀 WIN |
tier_prepared_query/binary_query_once_1000 |
-22.65% | [-22.89%, -22.44%] | 37511.6 | 37511.6 | 🚀 WIN |
silu_inplace/896 |
-22.66% | [-22.90%, -22.45%] | 2327.9 | 2327.9 | 🚀 WIN |
int4_cosine_distance/int4/1024 |
-22.57% | [-22.91%, -22.29%] | 256.1 | 256.1 | 🚀 WIN |
int8_prepared_dot_product/per_call/128 |
-22.63% | [-22.91%, -22.40%] | 609.4 | 609.4 | 🚀 WIN |
int8_quantization/quantize/384 |
-22.62% | [-22.92%, -22.37%] | 1799.3 | 1799.3 | 🚀 WIN |
tier_prepared_batch_sizes/int8_query_per_call/1000 |
-22.37% | [-22.94%, -21.90%] | 1821339.8 | 1821339.8 | 🚀 WIN |
int8_quantization/quantize/1536 |
-22.79% | [-23.01%, -22.63%] | 7215.9 | 7215.9 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/384d_16c |
-22.86% | [-23.02%, -22.71%] | 493.4 | 493.4 | 🚀 WIN |
int8_quantization/quantize/1024 |
-22.69% | [-23.04%, -22.42%] | 4802.8 | 4802.8 | 🚀 WIN |
int8_prepared_dot_product/per_call/129 |
-22.74% | [-23.06%, -22.48%] | 610.9 | 610.9 | 🚀 WIN |
gelu/4096 |
-22.58% | [-23.07%, -22.15%] | 1422.7 | 1422.7 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/1024d_4c |
-22.92% | [-23.07%, -22.74%] | 255.8 | 255.8 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_1000c |
-21.68% | [-23.10%, -19.84%] | 67785.3 | 67785.3 | 🚀 WIN |
simd_squared_euclidean_fast_path/squared_euclidean/1024 |
-22.97% | [-23.10%, -22.83%] | 51.7 | 51.7 | 🚀 WIN |
simd_query_batch_dot_product/pair_loop/128d_64c |
-22.89% | [-23.12%, -22.65%] | 645.6 | 645.6 | 🚀 WIN |
tier_prepared_batch_sizes/int8_query_per_call/10 |
-22.73% | [-23.20%, -22.28%] | 18262.1 | 18262.1 | 🚀 WIN |
tier_prepared_query/binary_query_per_call_1000 |
-22.97% | [-23.21%, -22.71%] | 690536.4 | 690536.4 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_16c |
-22.86% | [-23.25%, -22.37%] | 1021.9 | 1021.9 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/1024d_256c |
-23.09% | [-23.31%, -22.87%] | 16819.0 | 16819.0 | 🚀 WIN |
simd_query_batch_dot_product/simd_batch/768d_64c |
-23.36% | [-23.45%, -23.25%] | 1879.3 | 1879.3 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/1024d_64c |
-23.35% | [-23.51%, -23.25%] | 3993.6 | 3993.6 | 🚀 WIN |
simd_squared_euclidean_fast_path/squared_euclidean/768 |
-23.33% | [-23.53%, -23.15%] | 39.9 | 39.9 | 🚀 WIN |
tier_prepared_query/int4_query_once_1000 |
-22.83% | [-23.54%, -22.32%] | 102550.5 | 102550.5 | 🚀 WIN |
simd_cosine_similarity/simd/1024 |
-23.42% | [-23.56%, -23.29%] | 69.6 | 69.6 | 🚀 WIN |
binary_cosine_distance/binary/768 |
-23.12% | [-23.64%, -22.69%] | 67.6 | 67.6 | 🚀 WIN |
simd_squared_euclidean_fast_path/squared_euclidean/384 |
-22.61% | [-23.66%, -21.72%] | 26.7 | 26.7 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/1024d_16c |
-23.54% | [-23.71%, -23.42%] | 1002.1 | 1002.1 | 🚀 WIN |
simd_squared_euclidean_fast_path/euclidean_full/1024 |
-23.56% | [-23.73%, -23.44%] | 54.9 | 54.9 | 🚀 WIN |
silu_inplace/4096 |
-23.25% | [-23.95%, -22.62%] | 10678.7 | 10678.7 | 🚀 WIN |
binary_cosine_distance/binary/384 |
-23.19% | [-24.08%, -22.55%] | 38.2 | 38.2 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/768d_4c |
-23.94% | [-24.23%, -23.71%] | 208.2 | 208.2 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/768d_4c |
-24.27% | [-24.40%, -24.16%] | 204.9 | 204.9 | 🚀 WIN |
simd_query_batch_dot_product/simd_batch/128d_4c |
-23.96% | [-24.44%, -23.50%] | 35.8 | 35.8 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/1024d_1000c |
-24.31% | [-24.45%, -24.13%] | 64800.0 | 64800.0 | 🚀 WIN |
simd_normalized_cosine_fast_path/dot_product/768 |
-24.26% | [-24.49%, -24.07%] | 38.8 | 38.8 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/1024d_256c |
-24.24% | [-24.59%, -23.84%] | 16936.3 | 16936.3 | 🚀 WIN |
simd_prepared_query_normalized_cosine/prepared_full_cosine/768 |
-24.45% | [-24.64%, -24.32%] | 52172.0 | 52172.0 | 🚀 WIN |
simd_cosine_similarity/simd/384 |
-24.50% | [-24.66%, -24.35%] | 33.2 | 33.2 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/384d_4c |
-24.38% | [-24.74%, -24.07%] | 129.8 | 129.8 | 🚀 WIN |
int8_prepared_dot_product/prepared/129 |
-24.67% | [-24.95%, -24.30%] | 7.2 | 7.2 | 🚀 WIN |
simd_normalized_cosine_fast_path/dot_product/1024 |
-24.08% | [-24.98%, -23.33%] | 50.9 | 50.9 | 🚀 WIN |
int4_cosine_distance/float32_simd/1024 |
-24.87% | [-25.17%, -24.63%] | 65.1 | 65.1 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/1024d_4c |
-25.26% | [-25.36%, -25.13%] | 261.4 | 261.4 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/768d_16c |
-25.31% | [-25.49%, -25.10%] | 808.1 | 808.1 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_dot/768d_4c |
-25.41% | [-25.57%, -25.28%] | 178.1 | 178.1 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/1024d_1000c |
-25.50% | [-25.69%, -25.34%] | 66551.0 | 66551.0 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/768d_1000c |
-25.55% | [-25.71%, -25.37%] | 52011.4 | 52011.4 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/768d_256c |
-25.68% | [-25.87%, -25.55%] | 13139.0 | 13139.0 | 🚀 WIN |
int8_raw_dot_product/dot_product_i8/128 |
-25.70% | [-25.91%, -25.50%] | 6.7 | 6.7 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/1024d_16c |
-25.72% | [-25.94%, -25.50%] | 1020.3 | 1020.3 | 🚀 WIN |
simd_prepared_query_normalized_cosine/prepared_full_cosine/1024 |
-25.70% | [-25.97%, -25.50%] | 67019.1 | 67019.1 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/768d_64c |
-25.75% | [-25.97%, -25.59%] | 3224.0 | 3224.0 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/1024d_16c |
-25.89% | [-26.04%, -25.78%] | 1008.7 | 1008.7 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/768d_1000c |
-25.87% | [-26.12%, -25.59%] | 52465.5 | 52465.5 | 🚀 WIN |
int8_prepared_dot_product/prepared/128 |
-25.86% | [-26.20%, -25.59%] | 6.7 | 6.7 | 🚀 WIN |
simd_prepared_query_normalized_cosine/prepared_meta_unit/1024 |
-26.03% | [-26.22%, -25.85%] | 57162.0 | 57162.0 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/768d_4c |
-26.12% | [-26.23%, -26.00%] | 205.7 | 205.7 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_cosine/768d_64c |
-25.04% | [-26.33%, -23.29%] | 3268.3 | 3268.3 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/768d_16c |
-26.21% | [-26.38%, -26.04%] | 801.1 | 801.1 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/768d_256c |
-26.28% | [-26.48%, -26.09%] | 12871.5 | 12871.5 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/768d_16c |
-26.60% | [-26.74%, -26.43%] | 788.4 | 788.4 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/768d_256c |
-26.22% | [-26.76%, -25.82%] | 13114.3 | 13114.3 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/768d_256c |
-26.59% | [-26.78%, -26.39%] | 12778.3 | 12778.3 | 🚀 WIN |
simd_query_batch_dot_product/pair_loop/768d_4c |
-26.63% | [-26.80%, -26.48%] | 175.3 | 175.3 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/1024d_1000c |
-26.16% | [-26.88%, -25.28%] | 65526.8 | 65526.8 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/768d_1000c |
-26.89% | [-26.97%, -26.81%] | 50561.3 | 50561.3 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/1024d_4c |
-26.61% | [-26.97%, -26.29%] | 255.5 | 255.5 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/768d_16c |
-26.85% | [-26.97%, -26.71%] | 785.7 | 785.7 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/768d_1000c |
-26.87% | [-27.03%, -26.72%] | 50674.8 | 50674.8 | 🚀 WIN |
simd_batch_cosine_normalized_query/simd_batch/768d_64c |
-26.91% | [-27.08%, -26.78%] | 3135.7 | 3135.7 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/1024d_256c |
-25.52% | [-27.11%, -24.22%] | 17226.5 | 17226.5 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/pair_loop/1024d_64c |
-26.49% | [-27.21%, -25.95%] | 4037.9 | 4037.9 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/768d_4c |
-27.18% | [-27.38%, -27.00%] | 201.0 | 201.0 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/768d_64c |
-27.39% | [-27.78%, -27.08%] | 3140.9 | 3140.9 | 🚀 WIN |
simd_batch_cosine_non_normalized_query/simd_batch/1024d_64c |
-27.20% | [-27.85%, -26.69%] | 3976.4 | 3976.4 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_dot/768d_1000c |
-28.03% | [-28.24%, -27.80%] | 44534.9 | 44534.9 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_dot/768d_256c |
-28.35% | [-28.41%, -28.29%] | 11044.4 | 11044.4 | 🚀 WIN |
simd_query_batch_dot_product/pair_loop/768d_256c |
-28.26% | [-28.41%, -28.15%] | 11004.1 | 11004.1 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_dot/768d_16c |
-28.80% | [-28.87%, -28.74%] | 677.7 | 677.7 | 🚀 WIN |
simd_batch_cosine_normalized_query/pair_loop_dot/768d_64c |
-28.86% | [-29.06%, -28.72%] | 2695.5 | 2695.5 | 🚀 WIN |
simd_query_batch_dot_product/pair_loop/768d_16c |
-28.89% | [-29.07%, -28.75%] | 676.5 | 676.5 | 🚀 WIN |
simd_query_batch_dot_product/pair_loop/768d_64c |
-29.52% | [-29.59%, -29.46%] | 2697.0 | 2697.0 | 🚀 WIN |
layer_norm/4096 |
-31.21% | [-31.54%, -30.96%] | 632.4 | 632.4 | 🚀 WIN |
All 259 measurements
| Bench | Δ point | CI-lower | CI-upper |
|---|---|---|---|
| add_bias_gelu/4096 | -21.38% | -21.50% | -21.22% |
| add_bias_gelu/896 | -22.10% | -22.54% | -21.69% |
| binary_cosine_distance/binary/1024 | -22.55% | -22.69% | -22.43% |
| binary_cosine_distance/binary/1536 | -22.15% | -22.44% | -21.77% |
| binary_cosine_distance/binary/384 | -23.19% | -24.08% | -22.55% |
| binary_cosine_distance/binary/768 | -23.12% | -23.64% | -22.69% |
| binary_cosine_distance/float32_simd/1024 | -20.87% | -21.03% | -20.74% |
| binary_cosine_distance/float32_simd/1536 | -20.74% | -20.89% | -20.58% |
| binary_cosine_distance/float32_simd/384 | -17.04% | -17.15% | -16.91% |
| binary_cosine_distance/float32_simd/768 | -19.76% | -19.87% | -19.63% |
| elementwise_mul/4096 | +14.06% | +13.79% | +14.44% |
| gelu/4096 | -22.58% | -23.07% | -22.15% |
| gelu/896 | -21.97% | -22.18% | -21.78% |
| int4_cosine_distance/float32_simd/1024 | -24.87% | -25.17% | -24.63% |
| int4_cosine_distance/float32_simd/1536 | -20.01% | -20.15% | -19.85% |
| int4_cosine_distance/float32_simd/384 | -18.09% | -18.46% | -17.78% |
| int4_cosine_distance/float32_simd/768 | -18.70% | -18.86% | -18.56% |
| int4_cosine_distance/int4/1024 | -22.57% | -22.91% | -22.29% |
| int4_cosine_distance/int4/1536 | -22.29% | -22.53% | -22.08% |
| int4_cosine_distance/int4/384 | -22.13% | -22.44% | -21.87% |
| int4_cosine_distance/int4/768 | -22.24% | -22.44% | -22.05% |
| int8_batch_cosine/float32_simd/10 | -18.76% | -19.00% | -18.55% |
| int8_batch_cosine/float32_simd/100 | -17.62% | -18.00% | -17.32% |
| int8_batch_cosine/float32_simd/1000 | -18.52% | -19.03% | -18.03% |
| int8_batch_cosine/int8_loop/10 | -19.79% | -20.19% | -19.39% |
| int8_batch_cosine/int8_loop/100 | -19.41% | -19.70% | -19.14% |
| int8_batch_cosine/int8_loop/1000 | -17.83% | -18.16% | -17.51% |
| int8_prepared_dot_product/per_call/1024 | -22.32% | -22.52% | -22.10% |
| int8_prepared_dot_product/per_call/127 | -22.27% | -22.57% | -21.92% |
| int8_prepared_dot_product/per_call/128 | -22.63% | -22.91% | -22.40% |
| int8_prepared_dot_product/per_call/129 | -22.74% | -23.06% | -22.48% |
| int8_prepared_dot_product/per_call/384 | -22.51% | -22.60% | -22.41% |
| int8_prepared_dot_product/per_call/768 | -22.47% | -22.63% | -22.35% |
| int8_prepared_dot_product/prepared/1024 | -18.55% | -18.82% | -18.32% |
| int8_prepared_dot_product/prepared/127 | -18.53% | -18.85% | -18.24% |
| int8_prepared_dot_product/prepared/128 | -25.86% | -26.20% | -25.59% |
| int8_prepared_dot_product/prepared/129 | -24.67% | -24.95% | -24.30% |
| int8_prepared_dot_product/prepared/384 | -18.39% | -18.61% | -18.11% |
| int8_prepared_dot_product/prepared/768 | -17.12% | -17.45% | -16.86% |
| int8_quantization/quantize/1024 | -22.69% | -23.04% | -22.42% |
| int8_quantization/quantize/1536 | -22.79% | -23.01% | -22.63% |
| int8_quantization/quantize/384 | -22.62% | -22.92% | -22.37% |
| int8_quantization/quantize/768 | -22.02% | -22.31% | -21.65% |
| int8_raw_dot_product/dot_product_i8/1024 | -20.53% | -20.72% | -20.38% |
| int8_raw_dot_product/dot_product_i8/127 | -18.18% | -18.32% | -18.05% |
| int8_raw_dot_product/dot_product_i8/128 | -25.70% | -25.91% | -25.50% |
| int8_raw_dot_product/dot_product_i8/129 | -21.93% | -22.35% | -21.60% |
| int8_raw_dot_product/dot_product_i8/384 | -18.71% | -19.05% | -18.32% |
| int8_raw_dot_product/dot_product_i8/768 | -16.49% | -16.65% | -16.35% |
| int8_raw_dot_product/dot_product_i8_raw/1024 | -19.77% | -20.20% | -19.44% |
| int8_raw_dot_product/dot_product_i8_raw/127 | -17.36% | -17.49% | -17.23% |
| int8_raw_dot_product/dot_product_i8_raw/128 | -21.65% | -22.02% | -21.31% |
| int8_raw_dot_product/dot_product_i8_raw/129 | -21.85% | -22.14% | -21.62% |
| int8_raw_dot_product/dot_product_i8_raw/384 | -18.26% | -18.60% | -17.97% |
| int8_raw_dot_product/dot_product_i8_raw/768 | -20.39% | -20.64% | -20.18% |
| int8_vs_float32_cosine/float32_simd/1024 | -18.95% | -19.05% | -18.85% |
| int8_vs_float32_cosine/float32_simd/1536 | -11.65% | -11.86% | -11.35% |
| int8_vs_float32_cosine/float32_simd/384 | -17.67% | -17.82% | -17.51% |
| int8_vs_float32_cosine/float32_simd/768 | -13.03% | -13.21% | -12.87% |
| int8_vs_float32_cosine/int8/1024 | -21.94% | -22.10% | -21.76% |
| int8_vs_float32_cosine/int8/1536 | -21.46% | -21.66% | -21.23% |
| int8_vs_float32_cosine/int8/384 | -20.60% | -20.91% | -20.33% |
| int8_vs_float32_cosine/int8/768 | -14.21% | -14.46% | -14.02% |
| layer_norm/4096 | -31.21% | -31.54% | -30.96% |
| layer_norm/896 | -8.73% | -9.06% | -8.32% |
| memory_size/search_1000_float32 | -16.12% | -16.25% | -16.00% |
| memory_size/search_1000_int8 | -19.71% | -19.99% | -19.48% |
| rms_norm/4096 | -9.64% | -9.79% | -9.51% |
| rms_norm/896 | -18.90% | -19.53% | -18.41% |
| silu_inplace/4096 | -23.25% | -23.95% | -22.62% |
| silu_inplace/896 | -22.66% | -22.90% | -22.45% |
| simd_batch_cosine/scalar_loop/10 | -21.35% | -21.45% | -21.25% |
| simd_batch_cosine/scalar_loop/100 | -21.29% | -21.33% | -21.26% |
| simd_batch_cosine/scalar_loop/1000 | -21.45% | -21.62% | -21.31% |
| simd_batch_cosine/simd_batch/10 | -21.20% | -21.40% | -21.01% |
| simd_batch_cosine/simd_batch/100 | -17.04% | -17.41% | -16.75% |
| simd_batch_cosine/simd_batch/1000 | -16.39% | -16.53% | -16.25% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_1000c | -25.50% | -25.69% | -25.34% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_16c | -25.72% | -25.94% | -25.50% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_256c | -25.52% | -27.11% | -24.22% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_4c | -25.26% | -25.36% | -25.13% |
| simd_batch_cosine_non_normalized_query/pair_loop/1024d_64c | -26.49% | -27.21% | -25.95% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_1000c | -16.73% | -17.04% | -16.33% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_16c | -18.66% | -18.85% | -18.51% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_256c | -17.47% | -17.61% | -17.37% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_4c | -21.51% | -21.73% | -21.28% |
| simd_batch_cosine_non_normalized_query/pair_loop/384d_64c | -17.44% | -17.54% | -17.33% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_1000c | -25.87% | -26.12% | -25.59% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_16c | -26.21% | -26.38% | -26.04% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_256c | -26.22% | -26.76% | -25.82% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_4c | -26.12% | -26.23% | -26.00% |
| simd_batch_cosine_non_normalized_query/pair_loop/768d_64c | -25.75% | -25.97% | -25.59% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_1000c | -26.16% | -26.88% | -25.28% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_16c | -25.89% | -26.04% | -25.78% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_256c | -24.24% | -24.59% | -23.84% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_4c | -26.61% | -26.97% | -26.29% |
| simd_batch_cosine_non_normalized_query/simd_batch/1024d_64c | -27.20% | -27.85% | -26.69% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_1000c | -22.00% | -22.21% | -21.82% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_16c | -22.86% | -23.02% | -22.71% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_256c | -22.20% | -22.46% | -21.98% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_4c | -24.38% | -24.74% | -24.07% |
| simd_batch_cosine_non_normalized_query/simd_batch/384d_64c | -21.99% | -22.09% | -21.90% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_1000c | -26.87% | -27.03% | -26.72% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_16c | -26.85% | -26.97% | -26.71% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_256c | -26.28% | -26.48% | -26.09% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_4c | -27.18% | -27.38% | -27.00% |
| simd_batch_cosine_non_normalized_query/simd_batch/768d_64c | -27.39% | -27.78% | -27.08% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_1000c | -21.68% | -23.10% | -19.84% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_16c | -22.86% | -23.25% | -22.37% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_256c | -22.13% | -22.32% | -21.90% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_4c | -22.64% | -22.77% | -22.48% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/1024d_64c | -22.68% | -22.85% | -22.52% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_1000c | -12.17% | -12.32% | -12.05% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_16c | -15.02% | -15.13% | -14.89% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_256c | -12.32% | -12.58% | -12.12% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_4c | -14.54% | -14.81% | -14.32% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/384d_64c | -11.87% | -12.11% | -11.69% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_1000c | -25.55% | -25.71% | -25.37% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_16c | -25.31% | -25.49% | -25.10% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_256c | -25.68% | -25.87% | -25.55% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_4c | -23.94% | -24.23% | -23.71% |
| simd_batch_cosine_normalized_query/pair_loop_cosine/768d_64c | -25.04% | -26.33% | -23.29% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_1000c | -20.50% | -20.60% | -20.38% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_16c | -20.81% | -20.99% | -20.64% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_256c | -18.90% | -19.19% | -18.64% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_4c | -21.57% | -21.76% | -21.40% |
| simd_batch_cosine_normalized_query/pair_loop_dot/1024d_64c | -18.17% | -18.26% | -18.06% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_1000c | -11.29% | -11.66% | -10.99% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_16c | -5.90% | -6.07% | -5.66% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_256c | -8.77% | -8.91% | -8.61% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_4c | -8.94% | -9.49% | -8.62% |
| simd_batch_cosine_normalized_query/pair_loop_dot/384d_64c | -6.87% | -7.06% | -6.70% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_1000c | -28.03% | -28.24% | -27.80% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_16c | -28.80% | -28.87% | -28.74% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_256c | -28.35% | -28.41% | -28.29% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_4c | -25.41% | -25.57% | -25.28% |
| simd_batch_cosine_normalized_query/pair_loop_dot/768d_64c | -28.86% | -29.06% | -28.72% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_1000c | -24.31% | -24.45% | -24.13% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_16c | -23.54% | -23.71% | -23.42% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_256c | -23.09% | -23.31% | -22.87% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_4c | -22.92% | -23.07% | -22.74% |
| simd_batch_cosine_normalized_query/simd_batch/1024d_64c | -23.35% | -23.51% | -23.25% |
| simd_batch_cosine_normalized_query/simd_batch/384d_1000c | -17.44% | -17.84% | -17.12% |
| simd_batch_cosine_normalized_query/simd_batch/384d_16c | -20.01% | -20.24% | -19.76% |
| simd_batch_cosine_normalized_query/simd_batch/384d_256c | -17.80% | -17.92% | -17.68% |
| simd_batch_cosine_normalized_query/simd_batch/384d_4c | -18.81% | -18.97% | -18.60% |
| simd_batch_cosine_normalized_query/simd_batch/384d_64c | -16.74% | -16.97% | -16.45% |
| simd_batch_cosine_normalized_query/simd_batch/768d_1000c | -26.89% | -26.97% | -26.81% |
| simd_batch_cosine_normalized_query/simd_batch/768d_16c | -26.60% | -26.74% | -26.43% |
| simd_batch_cosine_normalized_query/simd_batch/768d_256c | -26.59% | -26.78% | -26.39% |
| simd_batch_cosine_normalized_query/simd_batch/768d_4c | -24.27% | -24.40% | -24.16% |
| simd_batch_cosine_normalized_query/simd_batch/768d_64c | -26.91% | -27.08% | -26.78% |
| simd_batch_dot_product/scalar_loop/10 | -21.22% | -21.39% | -21.07% |
| simd_batch_dot_product/scalar_loop/100 | -21.06% | -21.22% | -20.83% |
| simd_batch_dot_product/scalar_loop/1000 | -21.02% | -21.14% | -20.92% |
| simd_batch_dot_product/simd_batch/10 | -19.65% | -19.87% | -19.41% |
| simd_batch_dot_product/simd_batch/100 | -20.75% | -20.99% | -20.57% |
| simd_batch_dot_product/simd_batch/1000 | -16.93% | -17.21% | -16.68% |
| simd_cosine_similarity/scalar/1024 | -22.25% | -22.44% | -22.13% |
| simd_cosine_similarity/scalar/1536 | -22.46% | -22.68% | -22.29% |
| simd_cosine_similarity/scalar/384 | -21.63% | -21.69% | -21.59% |
| simd_cosine_similarity/scalar/768 | -22.14% | -22.28% | -22.04% |
| simd_cosine_similarity/simd/1024 | -23.42% | -23.56% | -23.29% |
| simd_cosine_similarity/simd/1536 | -20.74% | -20.82% | -20.68% |
| simd_cosine_similarity/simd/384 | -24.50% | -24.66% | -24.35% |
| simd_cosine_similarity/simd/768 | -13.01% | -13.20% | -12.85% |
| simd_dot_product/scalar/1024 | -22.34% | -22.37% | -22.32% |
| simd_dot_product/scalar/1536 | -22.42% | -22.45% | -22.38% |
| simd_dot_product/scalar/384 | -22.22% | -22.32% | -22.10% |
| simd_dot_product/scalar/768 | -22.44% | -22.47% | -22.39% |
| simd_dot_product/simd/1024 | -22.58% | -22.72% | -22.46% |
| simd_dot_product/simd/1536 | -21.94% | -22.15% | -21.78% |
| simd_dot_product/simd/384 | -8.88% | -9.10% | -8.64% |
| simd_dot_product/simd/768 | -4.45% | -4.71% | -4.23% |
| simd_euclidean_distance/scalar/1024 | -22.51% | -22.83% | -22.26% |
| simd_euclidean_distance/scalar/1536 | -22.19% | -22.22% | -22.16% |
| simd_euclidean_distance/scalar/384 | -21.57% | -21.62% | -21.51% |
| simd_euclidean_distance/scalar/768 | -22.05% | -22.21% | -21.89% |
| simd_euclidean_distance/simd/1024 | -19.40% | -19.51% | -19.28% |
| simd_euclidean_distance/simd/1536 | +5.49% | +5.27% | +5.65% |
| simd_euclidean_distance/simd/384 | -10.92% | -11.10% | -10.71% |
| simd_euclidean_distance/simd/768 | -5.60% | -5.95% | -5.10% |
| simd_normalize/scalar/1024 | -22.38% | -22.57% | -22.22% |
| simd_normalize/scalar/1536 | -21.45% | -21.98% | -20.87% |
| simd_normalize/scalar/384 | -21.83% | -22.10% | -21.54% |
| simd_normalize/scalar/768 | -22.28% | -22.60% | -22.02% |
| simd_normalize/simd/1024 | -10.13% | -11.95% | -8.26% |
| simd_normalize/simd/1536 | -8.17% | -10.27% | -6.09% |
| simd_normalize/simd/384 | -7.07% | -9.78% | -4.06% |
| simd_normalize/simd/768 | -5.05% | -7.90% | -2.63% |
| simd_normalized_cosine_fast_path/cosine_full/1024 | -21.82% | -22.07% | -21.59% |
| simd_normalized_cosine_fast_path/cosine_full/384 | -17.19% | -17.58% | -16.87% |
| simd_normalized_cosine_fast_path/cosine_full/768 | -21.35% | -21.97% | -20.84% |
| simd_normalized_cosine_fast_path/dot_product/1024 | -24.08% | -24.98% | -23.33% |
| simd_normalized_cosine_fast_path/dot_product/384 | -8.18% | -8.43% | -7.94% |
| simd_normalized_cosine_fast_path/dot_product/768 | -24.26% | -24.49% | -24.07% |
| simd_prepared_query_normalized_cosine/dot_product_loop/1024 | -8.26% | -8.85% | -7.79% |
| simd_prepared_query_normalized_cosine/dot_product_loop/384 | -17.08% | -17.62% | -16.64% |
| simd_prepared_query_normalized_cosine/dot_product_loop/768 | -16.96% | -17.17% | -16.83% |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/1024 | -25.70% | -25.97% | -25.50% |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/384 | -16.15% | -16.32% | -16.01% |
| simd_prepared_query_normalized_cosine/prepared_full_cosine/768 | -24.45% | -24.64% | -24.32% |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/1024 | -26.03% | -26.22% | -25.85% |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/384 | -19.19% | -19.29% | -19.09% |
| simd_prepared_query_normalized_cosine/prepared_meta_unit/768 | -19.61% | -19.81% | -19.46% |
| simd_query_batch_dot_product/pair_loop/128d_16c | -21.42% | -21.60% | -21.22% |
| simd_query_batch_dot_product/pair_loop/128d_256c | -21.91% | -22.49% | -21.47% |
| simd_query_batch_dot_product/pair_loop/128d_4c | -18.68% | -18.95% | -18.43% |
| simd_query_batch_dot_product/pair_loop/128d_64c | -22.89% | -23.12% | -22.65% |
| simd_query_batch_dot_product/pair_loop/384d_16c | -17.19% | -17.35% | -17.04% |
| simd_query_batch_dot_product/pair_loop/384d_256c | -17.28% | -17.40% | -17.20% |
| simd_query_batch_dot_product/pair_loop/384d_4c | -16.74% | -16.97% | -16.54% |
| simd_query_batch_dot_product/pair_loop/384d_64c | -17.38% | -17.56% | -17.22% |
| simd_query_batch_dot_product/pair_loop/768d_16c | -28.89% | -29.07% | -28.75% |
| simd_query_batch_dot_product/pair_loop/768d_256c | -28.26% | -28.41% | -28.15% |
| simd_query_batch_dot_product/pair_loop/768d_4c | -26.63% | -26.80% | -26.48% |
| simd_query_batch_dot_product/pair_loop/768d_64c | -29.52% | -29.59% | -29.46% |
| simd_query_batch_dot_product/simd_batch/128d_16c | -20.53% | -20.89% | -20.19% |
| simd_query_batch_dot_product/simd_batch/128d_256c | -22.31% | -22.53% | -22.17% |
| simd_query_batch_dot_product/simd_batch/128d_4c | -23.96% | -24.44% | -23.50% |
| simd_query_batch_dot_product/simd_batch/128d_64c | -17.97% | -18.07% | -17.87% |
| simd_query_batch_dot_product/simd_batch/384d_16c | -18.12% | -18.20% | -18.05% |
| simd_query_batch_dot_product/simd_batch/384d_256c | -21.88% | -22.09% | -21.71% |
| simd_query_batch_dot_product/simd_batch/384d_4c | -17.43% | -17.88% | -17.04% |
| simd_query_batch_dot_product/simd_batch/384d_64c | -20.75% | -21.26% | -20.35% |
| simd_query_batch_dot_product/simd_batch/768d_16c | -22.66% | -22.85% | -22.48% |
| simd_query_batch_dot_product/simd_batch/768d_256c | -21.81% | -22.24% | -21.43% |
| simd_query_batch_dot_product/simd_batch/768d_4c | -18.41% | -18.51% | -18.28% |
| simd_query_batch_dot_product/simd_batch/768d_64c | -23.36% | -23.45% | -23.25% |
| simd_squared_euclidean_fast_path/euclidean_full/1024 | -23.56% | -23.73% | -23.44% |
| simd_squared_euclidean_fast_path/euclidean_full/384 | -20.96% | -21.11% | -20.81% |
| simd_squared_euclidean_fast_path/euclidean_full/768 | -22.37% | -22.53% | -22.20% |
simd_squared_euclidean_fast_path/squared_euclidean/1024 |
-22.97% | -23.10% | -22.83% |
simd_squared_euclidean_fast_path/squared_euclidean/384 |
-22.61% | -23.66% | -21.72% |
simd_squared_euclidean_fast_path/squared_euclidean/768 |
-23.33% | -23.53% | -23.15% |
simd_throughput_384/cosine_similarity |
-17.79% | -18.00% | -17.60% |
simd_throughput_384/dot_product |
-11.12% | -11.38% | -10.85% |
simd_throughput_384/euclidean_distance |
-11.28% | -11.49% | -11.08% |
simd_throughput_384/normalize |
-11.16% | -11.51% | -10.82% |
softmax_attention/128 |
-19.75% | -19.87% | -19.63% |
softmax_attention/512 |
-21.02% | -21.13% | -20.87% |
tier_prepared_batch_sizes/int4_batch_prepared/10 |
-21.92% | -22.36% | -21.57% |
tier_prepared_batch_sizes/int4_batch_prepared/100 |
-21.69% | -22.02% | -21.22% |
tier_prepared_batch_sizes/int4_batch_prepared/1000 |
-22.12% | -22.47% | -21.87% |
tier_prepared_batch_sizes/int4_query_per_call/10 |
-21.90% | -22.07% | -21.67% |
tier_prepared_batch_sizes/int4_query_per_call/100 |
-22.01% | -22.25% | -21.82% |
tier_prepared_batch_sizes/int4_query_per_call/1000 |
-21.93% | -22.07% | -21.77% |
tier_prepared_batch_sizes/int8_batch_prepared/10 |
-19.53% | -19.79% | -19.22% |
tier_prepared_batch_sizes/int8_batch_prepared/100 |
-19.08% | -19.25% | -18.83% |
tier_prepared_batch_sizes/int8_batch_prepared/1000 |
-19.55% | -19.92% | -19.19% |
tier_prepared_batch_sizes/int8_query_per_call/10 |
-22.73% | -23.20% | -22.28% |
tier_prepared_batch_sizes/int8_query_per_call/100 |
-22.30% | -22.55% | -22.09% |
tier_prepared_batch_sizes/int8_query_per_call/1000 |
-22.37% | -22.94% | -21.90% |
tier_prepared_query/binary_query_once_1000 |
-22.65% | -22.89% | -22.44% |
tier_prepared_query/binary_query_per_call_1000 |
-22.97% | -23.21% | -22.71% |
tier_prepared_query/int4_query_once_1000 |
-22.83% | -23.54% | -22.32% |
tier_prepared_query/int4_query_per_call_1000 |
-21.90% | -22.02% | -21.76% |
tier_prepared_query/int8_query_once_1000 |
-20.26% | -20.59% | -19.84% |
tier_prepared_query/int8_query_per_call_1000 |
-22.25% | -22.45% | -22.02% |
Rule: CI-lower of change ≤3.0% passes silently; (3.0%, 7.0%] warns; >7.0% fails. Override via PR label bench-allow-regression.
Gate is in advisory mode (Rollout step 3, ADR-058 §Rollout). Failures do not block merge for the first 7 days.
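The thresholds and the advisory-mode behaviour described above can be sketched as a small decision function. This is only an illustration of the stated rule, not the gate's actual implementation: the names `Verdict`, `classify`, and `blocks_merge` are hypothetical, and the sketch assumes the rule is applied to the lower bound of the 95% CI expressed in percent. The report also emits a WIN verdict for improvements; its threshold is not stated here, so it is left out.

```rust
/// Hypothetical verdict type; the real gate's types are not shown in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Pass,
    Warn,
    Fail,
}

// Thresholds taken from the rule text: <=3.0% passes, (3.0%, 7.0%] warns, >7.0% fails.
const WARN_ABOVE_PCT: f64 = 3.0;
const FAIL_ABOVE_PCT: f64 = 7.0;

/// Classify one benchmark from the lower bound of its change CI, in percent.
/// Positive values mean the new build is slower.
fn classify(ci_lower_pct: f64) -> Verdict {
    if ci_lower_pct > FAIL_ABOVE_PCT {
        Verdict::Fail
    } else if ci_lower_pct > WARN_ABOVE_PCT {
        Verdict::Warn
    } else {
        Verdict::Pass
    }
}

/// Assumed merge policy: only a Fail can block, and neither advisory mode
/// nor the `bench-allow-regression` label lets it do so.
fn blocks_merge(verdict: Verdict, advisory_mode: bool, has_override_label: bool) -> bool {
    verdict == Verdict::Fail && !advisory_mode && !has_override_label
}

fn main() {
    // 5.59 is the CI lower bound of the WARN row in the report; 7.5 is made up.
    for ci_lower in [-22.57, 5.59, 7.5] {
        let verdict = classify(ci_lower);
        println!(
            "{:+.2}% -> {:?} (blocks merge while advisory: {})",
            ci_lower,
            verdict,
            blocks_merge(verdict, true, false)
        );
    }
}
```

Under this reading, the `pair_loop_dot/1024d_16c` row with CI `[+5.59%, +5.64%]` lands in the warn band, which matches the verdict the report prints for it.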
Criterion panics with --baseline when a bench group has no prior data (e.g., newly added groups). --save-baseline saves new data AND compares against existing data if present, without panicking on missing groups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Criterion --quick uses fewer samples: enough to detect the direction and magnitude of a change for a PR gate, but not enough for tight CIs. Full runs are for local bench-compare before submitting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Add `--all-targets` to the pre-commit clippy invocation in `.githooks/pre-commit`.

This completes #87: PR #144 cleaned the code but left the hook unchanged.

Test plan
- `cargo clippy --workspace --all-targets -- -D warnings` runs clean

Closes #87
🤖 Generated with Claude Code