Skip to content

Conversation

@valadaptive
Copy link
Contributor

I'm not sure where the core_arch modules originally came from. Raph's comment says he copied them from Pulp, which doesn't seem to document how or where they were generated.

This PR replaces them with new autogenerated versions, created by parsing the stdarch crate. I've added it as a submodule under fearless_simd_gen.

I've chosen to make core_arch generation a separate subcommand in the fearless_simd_gen binary; I expect it to be run much less often, and want to allow people to regenerate the rest of the code without having to clone the (relatively heavy) stdarch repo.

I'm not sure how much utility core_arch itself provides right now. The main reason for doing this is that I've embarked on a bit of a yak-shaving expedition:

  • In order to support AVX-512, we need a separate abstraction for mask types (Mask types for AVX-512 #179).
  • In order to add a separate abstraction for mask types, we need an operation that converts them to and from integer types.
  • In order to convert mask types to integer types, we need 64-bit integer types.
  • In order to support 64-bit integer types, we need to make sure we don't accidentally use any intrinsics that are too new (Make sure we're not using any unsupported intrinsics in the various x86 levels #124). Many 64-bit operations are only implemented in AVX-512, even though their naming scheme is identical to operations supported in far older instruction sets.
  • In order to avoid accidentally using unsupported intrinsics, we need a way to import only the supported ones...
  • And to do that, we need to automatically assemble a list of which intrinsics require which CPU features, hence this PR.

@valadaptive valadaptive requested a review from LaurenzV February 4, 2026 13:44
@valadaptive
Copy link
Contributor Author

Apologies for the churn; this should be actually ready for review now. The NEON intrinsics in stdarch are a mess to sort through.

@LaurenzV
Copy link
Collaborator

LaurenzV commented Feb 5, 2026

Perhaps we should discuss (either Zulip or next office hours) what to do with the core_arch module? It's a lot to review. 😅 So it would be good to clarify that beforehand.

Copy link
Contributor

@Shnatsel Shnatsel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of testing improvements have been merged into main recently. Merging main into this branch and seeing CI pass would make me much more confident in the correctness of this PR.

Comment on lines +91 to +94
/// Generate the constructor for this architecture.
///
/// x86 and aarch64 require runtime feature detection, so the constructor is unsafe.
/// wasm32 has static feature detection, so the constructor is safe.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aarch64 is guaranteed to have NEON and currently fearless_simd doesn't have any other Aarch64 levels other than baseline NEON. So this could actually be safe for now; making it unsafe is just for future-proofing.

@LaurenzV
Copy link
Collaborator

I don’t think any of the core_arch stuff is tested in CI though, no?

@Shnatsel
Copy link
Contributor

Only through high-level operations, I believe. So only a small subset.

@LaurenzV
Copy link
Collaborator

Aren't we using the intrinsics directly? I don't think we are running anything from core_arch in the tests. 🤔 But I might be wrong.

@Shnatsel
Copy link
Contributor

Ah, nevermind then, I must have been mistaken

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants