Rollup of 7 pull requests #152632
This moves all LLVM intrinsic handling out of the regular call path for cg_gcc and makes it easier to hook into this code for future cg_llvm changes.
…acro_transparency`
It's described as a "backwards compatibility hack to keep the diff small". Removing it requires only a modest amount of churn, and the resulting code is clearer without the invisible derefs.
…ics-generation Simplify intrinsics generation
Regenerate intrinsics
Move LTO to OngoingCodegen::join
This will make it easier to move all this code to link_binary in the future.
Follow-up to rust-lang#147810. Part of rust-lang/compiler-team#908.
This is the conceptual opposite of the rust-cold calling convention and
is particularly useful in combination with the new `explicit_tail_calls`
feature.
For relatively tight loops implemented with tail calling (`become`), each
function using the regular calling convention is still responsible for
restoring the initial values of the preserved registers. So it is not
unusual to end up in a situation where each step in the tail-call loop
spills and reloads registers, along the lines of:
    foo:
        push r12
        ; do things
        pop r12
        jmp next_step
This adds up quickly, especially when most of the clobberable registers
are already used to pass arguments or for other purposes.
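To make the shape of such a tail-call loop concrete, here is a minimal sketch in stable Rust. Everything in it (the `Step` type, `TABLE`, `dispatch`, the `op_*` handlers and `run`) is hypothetical and only illustrates the pattern; it is not code from this PR. On nightly, the calls in tail position would presumably be written with `become`, and the step functions would be the candidates for the new calling convention. Neither is shown here because both are unstable.

```rust
// Hypothetical tail-call "loop": every step finishes by handing control to
// the next step. With the regular calling convention, each step that touches
// a callee-saved register has to save and restore it around its body.
type Step = fn(&[u8], usize, i64) -> i64;

// Opcode -> handler table (assumed encoding: 0 = halt, 1 = inc, 2 = double).
const TABLE: [Step; 3] = [op_halt, op_inc, op_double];

// Look up the handler for the opcode at `pc` and call it in tail position.
// Unknown opcodes or running off the end of `code` stop the loop.
fn dispatch(code: &[u8], pc: usize, acc: i64) -> i64 {
    match code.get(pc) {
        Some(&op) if (op as usize) < TABLE.len() => TABLE[op as usize](code, pc + 1, acc),
        _ => acc,
    }
}

fn op_halt(_code: &[u8], _pc: usize, acc: i64) -> i64 {
    acc
}

fn op_inc(code: &[u8], pc: usize, acc: i64) -> i64 {
    // On nightly this call would be a candidate for `become`.
    dispatch(code, pc, acc + 1)
}

fn op_double(code: &[u8], pc: usize, acc: i64) -> i64 {
    dispatch(code, pc, acc * 2)
}

// Run a program starting with an accumulator of 0.
fn run(code: &[u8]) -> i64 {
    dispatch(code, 0, 0)
}
```

Calling `run(&[1, 1, 2, 0])` walks through inc, inc, double, halt. Each handler is small, so any register save and restore it performs is a noticeable fraction of its work, which is the overhead described above.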
I was thinking of making the name of this ABI a little less LLVM-derived
and more like a conceptual inverse of `rust-cold`, but could not come up
with a great name (`rust-cold` is itself not a great name: cold in what
context? from which perspective? is it supposed to mean that the
function is rarely called?)
Fix segfault related to __builtin_unreachable with inline asm
…bilee
add `simd_splat` intrinsic
Add `simd_splat` which lowers to the LLVM canonical splat sequence.
```llvm
%v0 = insertelement <N x elem> poison, elem %x, i32 0
%v1 = shufflevector <N x elem> %v0, <N x elem> poison, <N x i32> zeroinitializer
```
Right now we try to fake it using one of
```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```
or (in `stdarch`)
```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```
Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:
- rust-lang#60637
- rust-lang#137407
- rust-lang#122623
- rust-lang#97804
---
As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.
Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.
Currently this just adds the intrinsic; it is not actually used anywhere yet.
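As a rough model of the value a splat produces, here is a sketch that uses a plain array as a stand-in for a SIMD vector. The `Vector` type and the `splat_model` function are hypothetical names chosen for illustration. They are not the intrinsic added here and not part of rustc or `stdarch`, and the sketch says nothing about how any backend lowers it.

```rust
/// Hypothetical stand-in for a SIMD vector: `N` lanes of type `T`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector<T, const N: usize>([T; N]);

/// Model of a splat: copy the scalar `x` into every lane.
/// Unlike the intrinsic, the lane count is available here as a const
/// generic, which is what makes this array-based version easy to write.
fn splat_model<T: Copy, const N: usize>(x: T) -> Vector<T, N> {
    Vector([x; N])
}
```

For example, `splat_model::<u32, 8>(7)` yields a `Vector` whose eight lanes all hold `7`.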
…ochenkov
abi: add a rust-preserve-none calling convention
…uwer
Rollup of 7 pull requests
Successful merges:
- #152622 (Update GCC subtree)
- #145024 (Optimize indexing slices and strs with inclusive ranges)
- #151365 (UnsafePinned: implement opsem effects of UnsafeUnpin)
- #152381 (Do not require `'static` for obtaining reflection information.)
- #143575 (Remove named lifetimes in some `PartialOrd` & `PartialEq` `impl`s)
- #152404 (tests: adapt align-offset.rs for InstCombine improvements in LLVM 23)
- #152582 (rustc_query_impl: Use `ControlFlow` in `visit_waiters` instead of nested options)
💔 Test for 60f234f failed: CI. Failed job:
@bors retry
📌 Perf builds for each rolled up PR:
previous master: a33907a7a5
In the case of a perf regression, run the following command for each PR you suspect might be the cause:
What is this?
This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing a33907a (parent) -> 7bee525 (this PR)
Test differences
713 test diffs (Stage 1, Stage 2)
Additionally, 703 doctest diffs were found. These are ignored, as they are noisy.
Test dashboard
Run
    cargo run --manifest-path src/ci/citool/Cargo.toml -- \
        test-dashboard 7bee525095c0872e87c038c412c781b9bbb3f5dc --output-dir test-dashboard
And then open
Job duration changes
How to interpret the job duration changes?
Job durations can vary a lot, based on the actual runner instance
Finished benchmarking commit (7bee525): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Our benchmarks found a performance regression caused by this PR.
Next Steps:
@rustbot label: +perf-regression
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary -0.7%, secondary -6.4%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
Results (primary 7.8%, secondary 2.6%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size
Results (primary 0.0%, secondary 0.2%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 483.617s -> 481.339s (-0.47%)
…jhpratt
Add information to spurious `oneshot::send_before_recv_timeout` test
This test regularly spuriously fails in CI, such as rust-lang#152632 (comment).
We can just remove the assertion, but I'd like to understand why, so I'm adding more information to the assert.
…jhpratt
Remove timing assertion from `oneshot::send_before_recv_timeout`
This test regularly spuriously fails in CI, such as rust-lang#152632 (comment).
We can just remove the assertion, but I'd like to understand why, so I'm adding more information to the assert.
Successful merges:
- #152622 (Update GCC subtree)
- #145024 (Optimize indexing slices and strs with inclusive ranges)
- #151365 (UnsafePinned: implement opsem effects of UnsafeUnpin)
- #152381 (Do not require `'static` for obtaining reflection information.)
- #143575 (Remove named lifetimes in some `PartialOrd` & `PartialEq` `impl`s)
- #152404 (tests: adapt align-offset.rs for InstCombine improvements in LLVM 23)
- #152582 (rustc_query_impl: Use `ControlFlow` in `visit_waiters` instead of nested options)

r? @ghost