Bring in upstream update (for main branch) #6
Open
peledins-zimperium wants to merge 10000 commits into main from
…lvm#180326) In the MLIR C API headers, clang-tidy’s `modernize-use-using` check reports a large number of type definitions that use `typedef`. In my IDE, this even causes the `typedef` code to be shown as struck through. However, in this case it is clearly not possible to replace them with `using`. This PR suppresses the `modernize-use-using` check for the code inside `extern "C"` blocks.
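For illustration, one way to silence the check over a region is clang-tidy's `NOLINTBEGIN`/`NOLINTEND` comments. This is only a sketch and not necessarily the mechanism this PR uses; `SketchHandle`, `SketchStatus` and `sketchHandleIsNull` are hypothetical names, not part of the MLIR C API.

```cpp
#include <cstdint>

extern "C" {
// The typedefs below must stay valid C, so `using` is not an option here.
// NOLINTBEGIN(modernize-use-using)
typedef struct SketchHandle {
  void *ptr;
} SketchHandle;
typedef int32_t SketchStatus;
// NOLINTEND(modernize-use-using)
}

// C++-side helper (hypothetical) that consumes the C-compatible type.
inline bool sketchHandleIsNull(SketchHandle handle) {
  return handle.ptr == nullptr;
}
```

The suppression is scoped to the `extern "C"` region, so the check keeps firing for C++-only code elsewhere.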
…egator (llvm#178597) Replace instances of -1ULL, -2ULL, and -3ULL with std::numeric_limits in Bolt DataAggregator Trace constants to address C4146 compiler warning. Changes: - BR_ONLY: -1ULL → std::numeric_limits<uint64_t>::max() - FT_ONLY: -1ULL → std::numeric_limits<uint64_t>::max() - FT_EXTERNAL_ORIGIN: -2ULL → std::numeric_limits<uint64_t>::max() - 1 - FT_EXTERNAL_RETURN: -3ULL → std::numeric_limits<uint64_t>::max() - 2 Fixes part of llvm#147439
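A minimal sketch of the replacement described above. The constant names come from the commit message; the enclosing `trace_sketch` namespace is hypothetical and only keeps the example self-contained.

```cpp
#include <cstdint>
#include <limits>

namespace trace_sketch {
// Sentinels spelled without applying unary minus to an unsigned literal,
// which is what triggers MSVC warning C4146.
constexpr uint64_t BR_ONLY = std::numeric_limits<uint64_t>::max();
constexpr uint64_t FT_ONLY = std::numeric_limits<uint64_t>::max();
constexpr uint64_t FT_EXTERNAL_ORIGIN = std::numeric_limits<uint64_t>::max() - 1;
constexpr uint64_t FT_EXTERNAL_RETURN = std::numeric_limits<uint64_t>::max() - 2;

// The bit patterns are the same ones that -1ULL, -2ULL and -3ULL produce.
static_assert(BR_ONLY == 0xFFFFFFFFFFFFFFFFULL, "same value as -1ULL");
static_assert(FT_EXTERNAL_ORIGIN == 0xFFFFFFFFFFFFFFFEULL, "same value as -2ULL");
static_assert(FT_EXTERNAL_RETURN == 0xFFFFFFFFFFFFFFFDULL, "same value as -3ULL");
} // namespace trace_sketch
```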
The `Zvabd` is for `RISC-V Integer Vector Absolute Difference` and it provides 5 instructions: * `vabs.v`: Vector Signed Integer Absolute. * `vabd.vv`: Vector Signed Integer Absolute Difference. * `vabdu.vv`: Vector Unsigned Integer Absolute Difference. * `vwabda.vv`: Vector Signed Integer Absolute Difference And Accumulate. * `vwabdau.vv`: Vector Unsigned Integer Absolute Difference And Accumulate. Doc: https://github.com/riscv/integer-vector-absolute-difference Reviewers: topperc, lukel97, preames, tclin914, asb, kito-cheng, mshockwave Pull Request: llvm#180139
We directly lower `ISD::ABDS`/`ISD::ABDU` to `Zvabd` instructions. Note that we only support SEW=8/16 for `vabd.vv`/`vabdu.vv`. Reviewers: mshockwave, lukel97, topperc, preames, tclin914, 4vtomat Reviewed By: lukel97, topperc Pull Request: llvm#180141
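For reference, a scalar sketch of what `ISD::ABDS`/`ISD::ABDU` compute per element at SEW=8, assuming the usual definition `sub(max(a, b), min(a, b))` with the result truncated to the element width. `abds8` and `abdu8` are hypothetical helper names, and this models the node semantics only, not the instruction encoding.

```cpp
#include <algorithm>
#include <cstdint>

// Signed absolute difference of two 8-bit elements. The operands are widened
// to int first so the subtraction cannot overflow; the result is then
// truncated back to 8 bits.
uint8_t abds8(int8_t a, int8_t b) {
  int wide = std::max<int>(a, b) - std::min<int>(a, b);
  return static_cast<uint8_t>(wide);
}

// Unsigned absolute difference of two 8-bit elements.
uint8_t abdu8(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(std::max(a, b) - std::min(a, b));
}
```

Note that the signed variant can produce a value that only fits when read as unsigned, e.g. the difference between -128 and 127.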
…llvm#178909) Currently, Clang only checks arrays and structures for size at a top-level view; that is, it does not consider whether they will fit in the address space when applying the address space attribute. This can lead to situations where a variable is declared in an address space but its type is too large to fit in that address space, leading to potentially invalid modules. This patch proposes a fix for this by checking the size of the type against the maximum size that can be addressed in the given address space when applying the address space attribute. This does not currently handle instantiations of dependent variables, as the attributes are not re-processed at that time. This is planned for further investigation and a follow-up patch. --------- Signed-off-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com> Co-authored-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com>
We add pseudos/patterns for `vabs.v` instruction and handle the lowering in `RISCVTargetLowering::lowerABS`. Reviewers: topperc, 4vtomat, mshockwave, preames, lukel97, tclin914 Reviewed By: mshockwave Pull Request: llvm#180142
Exactly match the s_wait_event instruction. For some reason we already had this instruction used through llvm.amdgcn.s.wait.event.export.ready, but that hardcodes a specific value. This should really be a bitmask that can combine multiple wait types. gfx11 -> gfx12 broke compatibility in a weird way, by inverting the interpretation of the bit but also shifting the used bit by 1. Simplify the selection of the old intrinsic by just using the magic number 2, which should satisfy both cases.
When `Zvabd` exists, `llvm.abs` is lowered to `vabs.v` so the cost is 1. Reviewers: mshockwave, topperc, lukel97, skachkov-sc, preames Reviewed By: topperc Pull Request: llvm#180146
Currently vector splice intrinsics are costed through getShuffleCost when the offset is fixed. When the offset is variable though we can't use a shuffle mask so it currently returns invalid. This implements the cost in RISCVTTIImpl::getIntrinsicInstrCost as the cost of a slideup and a slidedown, which matches the codegen. It also implements the type based cost whenever the offset argument isn't available. It may be possible to reduce the cost in future when one of the vector operands is known to be poison, in which case we only generate a single slideup or slidedown.
Previously this would just print hex values. Print names for the recognized values, matching the sp3 syntax.
…g) (llvm#175976) This PR adds CIR lowering support for predicated SVE `svdup` builtins on AArch64. The corresponding ACLE intrinsics are documented at: https://developer.arm.com/architectures/instruction-sets/intrinsics

This change focuses on the zeroing-predicated variants (suffix `_z`, e.g. `svdup_n_f32_z`), which lower to the LLVM SVE `dup` intrinsic with a `zeroinitializer` passthrough operand.

IMPLEMENTATION NOTES
--------------------
* The CIR type converter is extended to support `BuiltinType::SveBool`, which is lowered to `cir.vector<[16] x i1>`, matching current Clang behaviour and ensuring compatibility with existing LLVM SVE lowering.
* Added logic that converts `cir.vector<[16] x i1>` according to the underlying element type. This is done by calling `@llvm.aarch64.sve.convert.from.svbool`.

TEST NOTES
----------
Compared to the unpredicated `svdup` tests (llvm#174433), the new tests perform more explicit checks to verify:
* Correct argument usage
* Correct return value + type

This helped validate differences between the default Clang lowering and the CIR-based lowering. Once all `svdup` variants are implemented, the tests will be unified.

EXAMPLE LOWERING
----------------
The following example illustrates that CIR lowering produces equivalent LLVM IR to the default Clang path.

Input:
```c
svint8_t test_svdup_n_s8(svbool_t pg, int8_t op) {
  return svdup_n_s8_z(pg, op);
}
```

OUTPUT 1 (default):
```llvm
define dso_local <vscale x 16 x i8> @test(<vscale x 16 x i1> %pg, i8 noundef %op) #0 {
entry:
  %pg.addr = alloca <vscale x 16 x i1>, align 2
  %op.addr = alloca i8, align 1
  store <vscale x 16 x i1> %pg, ptr %pg.addr, align 2
  store i8 %op, ptr %op.addr, align 1
  %0 = load <vscale x 16 x i1>, ptr %pg.addr, align 2
  %1 = load i8, ptr %op.addr, align 1
  %2 = call <vscale x 16 x i8> @llvm.aarch64.sve.dup.nxv16i8(<vscale x 16 x i8> zeroinitializer, <vscale x 16 x i1> %0, i8 %1)
  ret <vscale x 16 x i8> %2
}
```

OUTPUT 2 (via `-fclangir`):
```llvm
; Function Attrs: noinline
define dso_local <vscale x 16 x i8> @test(<vscale x 16 x i1> %0, i8 %1) #0 {
  %3 = alloca <vscale x 16 x i1>, i64 1, align 2
  %4 = alloca i8, i64 1, align 1
  %5 = alloca <vscale x 16 x i8>, i64 1, align 16
  store <vscale x 16 x i1> %0, ptr %3, align 2
  store i8 %1, ptr %4, align 1
  %6 = load <vscale x 16 x i1>, ptr %3, align 2
  %7 = load i8, ptr %4, align 1
  %8 = call <vscale x 16 x i8> @llvm.aarch64.sve.dup.nxv16i8(<vscale x 16 x i8> zeroinitializer, <vscale x 16 x i1> %6, i8 %7)
  store <vscale x 16 x i8> %8, ptr %5, align 16
  %9 = load <vscale x 16 x i8>, ptr %5, align 16
  ret <vscale x 16 x i8> %9
}
```
…ext(min/max(x, y)) fold (llvm#180164) If only one of the operands is one-use, the total number of fpexts stays the same, but the min/max is performed on a narrowed type. Additionally, the fpext may fold with a following fptrunc.
The partial check lines while claiming UTC output here were highly confusing. Regenerate the check lines. While here, use a newer version and rename blocks to avoid anon block conflicts.
…vm#177343) Load monitor operations make more sense as atomic operations, as non-atomic operations cannot be used for inter-thread communication w/o additional synchronization. The previous built-in made it work because one could just override the CPol bits, but that bypasses the memory model and forces the user to learn about ISA bits encoding. Making load monitor an atomic operation has a couple of advantages. First, the memory model foundation for it is stronger. We just lean on the existing rules for atomic operations. Second, the CPol bits are abstracted away from the user, which avoids leaking ISA details into the API. This patch also adds supporting memory model and intrinsics documentation to AMDGPUUsage. Solves SWDEV-516398.
llvm#179968) Fold `min/max(fpext x, C)` to `fpext(min/max(x, fptrunc C))` in cases where the truncation of the constant is lossless. This helps eliminate fpext/fptrunc pairs around min/max and addresses the regression from llvm#177988. Proof: https://alive2.llvm.org/ce/z/y_Bcdd
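The fold can be sketched on scalars, using `float` -> `double` as the fpext and `std::fmin` as a stand-in for the min operation (an assumption; the actual patch works on LLVM min/max intrinsics). `truncatesLosslessly`, `minWide` and `minNarrow` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True when the double constant survives a round trip through float unchanged.
// Values outside the finite float range (and NaN) are conservatively rejected,
// which also avoids an out-of-range double -> float conversion.
bool truncatesLosslessly(double c) {
  if (!(std::fabs(c) <= std::numeric_limits<float>::max()))
    return false;
  return static_cast<double>(static_cast<float>(c)) == c;
}

// Before the fold: min(fpext x, C), computed in double.
double minWide(float x, double c) {
  return std::fmin(static_cast<double>(x), c);
}

// After the fold: fpext(min(x, fptrunc C)), computed in float.
double minNarrow(float x, double c) {
  assert(truncatesLosslessly(c) && "fold only applies to lossless constants");
  return static_cast<double>(std::fmin(x, static_cast<float>(c)));
}
```

A constant such as `0.1` is rejected because its nearest `float` differs from the `double` value, so narrowing it would change the comparison.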
Should use `nnan` flag only.
…transform pass. (llvm#178134) This PR covers the `mlir::vector::populateFlattenVectorTransferPatterns` as a transform pass.
Ran my python script from llvm#97043 over the repo again and there were 2 duplicate test-cases that have been introduced since I last did this. Also one of the WASM classes had a duplicate method which I just removed.
…llvm#180278) Extract `ArraySectionAnalyzer` from `OptimizedBufferization` into a standalone analysis utility so it can be reused by other passes (e.g., `ScheduleOrderedAssignments`). Also extracts the logic to detect if a designate is using the indices of an elemental operation in storage order. This will be used in WHERE construct optimization in the next patch.
For fixed-length masks we need to AND the result of the whilewr/rw with `ptrue vl*` (which is at least one more instruction).
This case could be turned into powr or pown, so track which case ends up preferred.
Add handling for `STT_TLS` (thread-local storage) symbols in the ELF symbol parsing code. Previously, TLS symbols like `errno` from glibc were not recognized because `STT_TLS` was not handled in the symbol type switch statement. This treats TLS symbols as data symbols (`eSymbolTypeData`), similar to `STT_OBJECT`. The actual TLS address resolution is already implemented in `DynamicLoaderPOSIXDYLD::GetThreadLocalData()` which uses the DWARF `DW_OP_form_tls_address` opcode to calculate thread-local addresses.
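A minimal sketch of the kind of switch involved, not the actual LLDB code. The numeric symbol type codes are the ones from the ELF specification; `SymbolKind`, `classify` and the `elf_sketch` namespace are hypothetical, with `SymbolKind::Data` standing in for `eSymbolTypeData`.

```cpp
#include <cstdint>

namespace elf_sketch {
// ELF symbol type codes, stored in the low four bits of st_info.
constexpr uint8_t kSttObject = 1; // STT_OBJECT
constexpr uint8_t kSttFunc = 2;   // STT_FUNC
constexpr uint8_t kSttTls = 6;    // STT_TLS

enum class SymbolKind { Unknown, Code, Data };

// Classifies a symbol from its st_info byte.
SymbolKind classify(uint8_t stInfo) {
  switch (stInfo & 0x0f) {
  case kSttFunc:
    return SymbolKind::Code;
  case kSttObject:
  case kSttTls: // thread-local symbols are treated like ordinary data symbols
    return SymbolKind::Data;
  default:
    return SymbolKind::Unknown;
  }
}
} // namespace elf_sketch
```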
Replace the magic value `std::numeric_limits<unsigned>::max()` with a named constant `ImpossibleRepairCost` to improve readability.
…ex (llvm#179699) The definition for V_INDIRECT_REG_READ_GPR_IDX_B32_V*'s SSrc_b32 operand allows immediates, but the expansion logic handles only register cases now. This can result in expansion failures when e.g. llvm.amdgcn.wave.reduce.umin.i32 is folded into a constant and then used as an insertelement idx.
If the scalar integer selection sources are freely transferable to the FPU, then splat to create an allbits select condition and create a vector select instead
…180300) The dependency actually appears to be unused. Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
…lvm#180722) No test because I'm not sure how to reproduce this, but this patch fixes `CodeGen/ptrauth-qualifier-function.c`. For function pointer types and function reference types, we use `Pointer`s these days, so we _can_ return them.
…use a conversion (llvm#179453) We're currently unwrapping `less<T>` even if the `key_type` isn't `T`. This causes the removal of an implicit conversion to `const T&` if the types mismatch. Making `less<T>` transparent in that case changes overload resolution and makes it fail potentially. Fixes llvm#179319
…ss space size (llvm#179625) When a global variable has a size that exceeds the size of the address space it resides in, the verifier should fail as the variable can neither be materialized nor fully accessed. This patch adds a check to the verifier to enforce it. --------- Signed-off-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com> Co-authored-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com>
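The idea of the check can be sketched as follows. This is not the verifier code: `fitsInAddressSpace` is a hypothetical helper, and it assumes the address space can hold at most 2^addressBits bytes, with a size of exactly 2^addressBits still accepted. The bound the verifier actually derives may differ.

```cpp
#include <cstdint>

// Returns true when an object of sizeInBytes can be fully addressed with
// addressBits-wide addresses (assumed capacity: 2^addressBits bytes).
bool fitsInAddressSpace(uint64_t sizeInBytes, unsigned addressBits) {
  // Every uint64_t size fits in a 64-bit (or wider) space; this also avoids
  // an undefined shift by 64 or more below.
  if (addressBits >= 64)
    return true;
  return sizeInBytes <= (uint64_t{1} << addressBits);
}
```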
…d/vpmaddubsw/pmulhrsw nodes (llvm#180728) Missing demanded elts handling
This is consistent with all other SPARC test directories.
…de sections (llvm#180411) When merging `.bss` into a code section (e.g., `/MERGE:.bss=.text`), the INT3 gap-filling loop in `writeSections()` would write past the output buffer. This happens because `.bss` chunks have `hasData=false`, so they contribute to `VirtualSize` but not `SizeOfRawData`. The loop was using chunk RVAs without checking if they exceeded the raw data region. This caused a crash on Windows with `/FILEALIGN:1` (access violation 0xC0000005). The tight alignment leaves no slack in the mapped buffer, so the overflow immediately hits unmapped memory. The fix bounds all memset operations to `rawSize` and exits early when encountering chunks beyond the raw data boundary. Fixes llvm#180406
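A simplified sketch of the bounded gap fill, assuming chunks are sorted by offset and the buffer holds exactly `rawSize` bytes. `ChunkSketch` and `fillInt3Gaps` are hypothetical names, not the lld code; a `.bss`-like chunk is modelled simply as a chunk whose offset lies beyond the raw data.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct ChunkSketch {
  size_t offset; // start of the chunk, relative to the section start
  size_t size;   // size of the chunk in bytes
};

// Fills the gaps between consecutive chunks with INT3 (0xCC) without ever
// writing past the end of the raw data buffer.
void fillInt3Gaps(std::vector<uint8_t> &raw,
                  const std::vector<ChunkSketch> &chunks) {
  const size_t rawSize = raw.size();
  size_t prevEnd = 0;
  for (const ChunkSketch &c : chunks) {
    if (prevEnd >= rawSize)
      break; // everything from here on lies beyond the raw data
    const size_t gapEnd = std::min(c.offset, rawSize); // clamp to rawSize
    if (gapEnd > prevEnd)
      std::memset(raw.data() + prevEnd, 0xCC, gapEnd - prevEnd);
    prevEnd = c.offset + c.size;
  }
  if (prevEnd < rawSize) // pad the tail after the last chunk
    std::memset(raw.data() + prevEnd, 0xCC, rawSize - prevEnd);
}
```

With an 8-byte buffer and a third chunk starting at offset 16, the last fill is clamped to bytes 6 and 7 instead of running up to offset 16.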
llvm#180548) When making a region IsolatedFromAbove, replace uses in any region within the parent region, not just the immediate parent region.
…fullfp16 cases (llvm#180567) Noticed while working on some upcoming generic shuffle handling
…s. (llvm#180718) Conservatively treat unstable pointers as SCEVCouldNotCompute in getPtrToAddrExpr, and return SCEVUnknown when constructing from IR. This surfaced as part of the discussion in llvm#178861. PR: llvm#180718
Model `zext i1 %x to iN` as `select i1 %x, iN 1, iN 0` when there are other select instructions that can be combined with it into a bundle. Fixes llvm#178403 Reviewers: hiraditya, RKSimon Pull Request: llvm#180635
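The modelling relies on a zero-extended boolean being the same value as a select between 1 and 0. A scalar sketch with a 32-bit result type (`zextBool` and `selectBool` are hypothetical names):

```cpp
#include <cstdint>

// Zero-extension of a 1-bit value to 32 bits.
uint32_t zextBool(bool x) { return static_cast<uint32_t>(x); }

// The equivalent select form: pick 1 when x is true, 0 otherwise.
uint32_t selectBool(bool x) { return x ? 1u : 0u; }
```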
… and mla_mls_merge.ll. NFC
…0689) urem x, n: result < n (remainder is always less than divisor) urem x, n: result <= x (remainder is at most the dividend) udiv x, n: result <= x (quotient is at most the dividend) https://alive2.llvm.org/ce/z/ezzsjQ
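The three bounds can be checked exhaustively for small inputs. A sketch assuming 8-bit operand values held in 32-bit integers; `boundsHold` and `boundsHoldForAll8BitInputs` are hypothetical helpers.

```cpp
#include <cstdint>

// Checks the three range facts for one pair of operands.
bool boundsHold(uint32_t x, uint32_t n) {
  if (n == 0)
    return true; // division by zero is undefined, so nothing to check
  const uint32_t rem = x % n;
  const uint32_t quot = x / n;
  return rem < n && rem <= x && quot <= x;
}

// Exhaustive check over every 8-bit dividend and non-zero 8-bit divisor.
bool boundsHoldForAll8BitInputs() {
  for (uint32_t x = 0; x < 256; ++x)
    for (uint32_t n = 1; n < 256; ++n)
      if (!boundsHold(x, n))
        return false;
  return true;
}
```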
…ddwd/vpmaddubsw/vpmulhrsw vector width reduction (llvm#180738)
This reverts commit 70aebae to fix buildbots https://lab.llvm.org/buildbot/#/builders/85/builds/18614
…l_gather (llvm#180243) - fixing incorrect assertion and related function name - MPI_comm_split is not pure - simplifying/standardizing permutation in all_gather --------- Co-authored-by: Rolf Morel <rolfmorel@gmail.com>
Comment on lines +92 to +97
      - name: Download source code
        uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
        with:
          ref: ${{ matrix.ref }}
          repository: ${{ matrix.repo }}
      - name: Configure
Check warning (Code scanning / CodeQL)
Checkout of untrusted code in trusted context (Medium)
jalopezg-git approved these changes on Feb 11, 2026
jalopezg-git left a comment:
LGTM (as in bringing main up to date; this is a fast-forward). Note, however, that no builds happen directly from this branch.
This PR updates the `main` branch to its current upstream state. This should have been done with the 'Sync fork' button but, for whatever reason, it doesn't work. PRs should be avoided for this purpose in the future.