Adds a JIT backend for POWER8 and later Power ISA CPUs. Assembly instructions were restricted to those available in Power ISA v2.06 in order to facilitate adding support for POWER7, but currently only RandomX V1 is supported on those chips due to their lack of AES instructions. Support has been added for both little-endian and big-endian CPUs, but only little-endian has been tested. Fixes tevador#132
|
Benchmarks on a Raptor Computing Systems Talos II with dual POWER9 CPUs:
|
The vector permutation is unnecessary on little-endian systems when using `stvx`.
RandomX v1 also uses AES in the scratchpad hash/fill step, so you can use the existing soft AES code for the RandomX v2 loop. It shouldn't be that hard compared to the full JIT implementation that you've already done. |
This only saves one or two instructions, but there are no drawbacks to how this optimization is implemented so there's no reason not to do it.
|
FYI, the ppc64le build is failing: https://github.com/tevador/RandomX/actions/runs/24008647458/job/70025217444 |
This only saves one or two instructions in a very cold path in the code, but there are no drawbacks to implementing this optimization so there's no reason not to do it.
This optimization can save one or two instructions for some immediates.
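The thread does not say which immediates benefit. As an illustration only, one common source of such savings on 64-bit Power is skipping the `ori`/`oris` steps of the usual `lis`/`ori`/`sldi`/`oris`/`ori` sequence when the corresponding 16-bit chunk is zero. The sketch below counts instructions for that naive scheme; the function names are hypothetical and this is not the code from the PR.

```cpp
#include <cstdint>

// Hypothetical helper: instructions needed to load a sign-extended
// 32-bit immediate with li / lis / ori.
int countLoadImm32(int32_t v) {
    if (v >= -32768 && v <= 32767) {
        return 1;                     // li rD, v
    }
    return (v & 0xFFFF) ? 2 : 1;      // lis rD, hi  [+ ori rD, rD, lo]
}

// Hypothetical helper: instructions needed for an arbitrary 64-bit
// immediate using the naive "load upper half, shift, OR in lower half" scheme.
int countLoadImm64(int64_t v) {
    if (v >= INT32_MIN && v <= INT32_MAX) {
        return countLoadImm32(static_cast<int32_t>(v));
    }
    int n = countLoadImm32(static_cast<int32_t>(v >> 32)); // upper 32 bits
    n += 1;                                                // sldi rD, rD, 32
    if ((v >> 16) & 0xFFFF) n += 1;                        // oris rD, rD, bits 16..31
    if (v & 0xFFFF) n += 1;                                // ori  rD, rD, bits 0..15
    return n;
}
```

Under this scheme a full 64-bit constant needs five instructions, while one whose low 32 bits are zero needs only three, which is the kind of one-or-two-instruction saving described.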
From the CI log:
We could also avoid the dependency entirely by Which option would you prefer? FWIW, in the future I plan to use more of those definitions (full list is here) to detect the system's ISA version in order to patch in more-optimized code for the newer architectures, so my personal preference is to just use the Linux kernel header so there's no possibility of copy/paste issues. That said, I'll understand completely if you want to avoid a dependency on kernel headers just for a handful of constant definitions (which AIUI should never change between kernel versions). |
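For reference, here is a minimal sketch of what ISA detection from those definitions could look like, assuming the definitions in question are the `PPC_FEATURE*` hardware-capability bits from the Linux UAPI header `<asm/cputable.h>`. The constants are copied locally here (the "no kernel header" variant); the namespace, enum, struct, and function names are hypothetical.

```cpp
#include <cstdint>

// Bit values as defined in <asm/cputable.h>; copied here on the assumption
// that they are stable across kernel versions.
namespace ppc_hwcap {
constexpr uint64_t HAS_ALTIVEC = 0x10000000; // reported in AT_HWCAP
constexpr uint64_t HAS_VSX     = 0x00000080; // reported in AT_HWCAP
constexpr uint64_t ARCH_2_07   = 0x80000000; // reported in AT_HWCAP2 (POWER8)
constexpr uint64_t VEC_CRYPTO  = 0x02000000; // reported in AT_HWCAP2 (vcipher etc.)
constexpr uint64_t ARCH_3_00   = 0x00800000; // reported in AT_HWCAP2 (POWER9)
}

enum class PowerIsa { Unsupported, V2_06, V2_07, V3_00 };

struct PowerFeatures {
    PowerIsa isa;
    bool hasVectorAes;
};

// Classify a CPU from its AT_HWCAP / AT_HWCAP2 words.
// Assumption: VSX is the minimum requirement, and VSX implies ISA v2.06.
PowerFeatures classifyPower(uint64_t hwcap, uint64_t hwcap2) {
    PowerFeatures f{PowerIsa::Unsupported, false};
    if (!(hwcap & ppc_hwcap::HAS_VSX)) {
        return f;
    }
    f.isa = PowerIsa::V2_06;
    if (hwcap2 & ppc_hwcap::ARCH_2_07) f.isa = PowerIsa::V2_07;
    if (hwcap2 & ppc_hwcap::ARCH_3_00) f.isa = PowerIsa::V3_00;
    f.hasVectorAes = (hwcap2 & ppc_hwcap::VEC_CRYPTO) != 0;
    return f;
}
```

On Linux the two inputs would typically come from `getauxval(AT_HWCAP)` and `getauxval(AT_HWCAP2)` in `<sys/auxv.h>`. Keeping the classification a pure function of those two words makes it testable on any host.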
We already query the CPU feature support in cpu.cpp, so there's no need to do it again.
This is the same split as in Debian: the ppc64el port is only supported on POWER8 and later, so POWER7 and earlier can only run Debian ppc64 (big-endian 64-bit PowerPC). Because of this, we set the default little-endian architecture to POWER8. And since the RandomX JIT backend for PPC64 requires VSX, which is only supported by POWER7 and later, the lowest we can set the default big-endian architecture to is POWER7.