Skip to content

Conversation

@CyberShadow
Copy link
Contributor

LLM disclosure: this entire PR is AI-generated. It might be OK, or it might be slop. The fix looks sensible on the surface, and it passes the test suite, however, so does most slop in general. The bug report was open for a couple years now and no one was picking it up, so off the chance that this is actually OK, it seems worth a shot.


When template functions containing DMD-style inline assembly are instantiated in multiple modules and linked with LTO, duplicate symbol errors can occur because the asm labels have identical names.

This fix assigns a unique ID to each function containing inline asm, based on a deterministic hash of the template instantiation module name. The ID is appended to label names, ensuring uniqueness across compilation units.

Fixes: #4294

🤖 Generated with Claude Code

@CyberShadow
Copy link
Contributor Author

Here's a self-contained reproducer script which demonstrates the issue:

#!/bin/bash
#
# LDC Issue #4294 Reproducer
# https://github.com/ldc-developers/ldc/issues/4294
#
# Reproduces the duplicate symbol error with inline assembly labels in
# template functions when using LTO on i686.
#
# SETUP (run once):
#
#   # 1. Install i686 cross-compiler
#   sudo dpkg --add-architecture i386
#   sudo apt-get update
#   sudo apt-get install -y gcc-i686-linux-gnu libc6-dev-i386
#
#   # 2. Build i686 D runtime with LTO
#   mkdir -p /tmp/ldc-i686-runtime && cd /tmp/ldc-i686-runtime
#   CC=i686-linux-gnu-gcc /path/to/ldc-build-runtime \
#       --dFlags="-mtriple=i686-linux-gnu" \
#       --dFlags="-flto=full" \
#       BUILD_SHARED_LIBS=OFF
#
# USAGE:
#   ./reproducer-4294.sh [path-to-ldc2]
#
# EXPECTED RESULTS:
#   Without fix: FAIL - "symbol '..._L1' is already defined"
#   With fix:    PASS - LTO linking succeeds
#

set -e

LDC="${1:-ldc2}"
RUNTIME_DIR="${LDC_I686_RUNTIME:-/tmp/ldc-i686-runtime/ldc-build-runtime.tmp}"
TMPDIR=$(mktemp -d)
trap "rm -rf $TMPDIR" EXIT

echo "LDC Issue #4294 Reproducer"
echo "=========================="
echo "Compiler: $LDC"
echo "i686 Runtime: $RUNTIME_DIR"
echo

if [ ! -d "$RUNTIME_DIR/lib" ]; then
    echo "ERROR: i686 runtime not found at $RUNTIME_DIR"
    echo "Please build it first (see script header for instructions)"
    exit 1
fi

# Minimal reproducer from the original issue - uses std.bigint
# which triggers inline asm in std.internal.math.biguintx86
cat > "$TMPDIR/main.d" << 'EOF'
import std;

void main() {
    ulong i;
    BigInt() * i;
}
EOF

COMMON_FLAGS="-mtriple=i686-linux-gnu -flto=full -L-L$RUNTIME_DIR/lib -static"

echo "Compiling and linking with LTO (i686, static)..."
set +e
OUTPUT=$("$LDC" $COMMON_FLAGS "$TMPDIR/main.d" -of="$TMPDIR/test" -gcc=i686-linux-gnu-gcc 2>&1)
RESULT=$?
set -e

echo "$OUTPUT"

if [ $RESULT -eq 0 ]; then
    echo
    echo "RESULT: PASS - LTO linking succeeded."
    echo "The fix IS applied."
else
    if echo "$OUTPUT" | grep -q "already defined"; then
        echo
        echo "RESULT: FAIL - Duplicate symbol error (expected without fix)."
        echo "The fix is NOT applied."
    else
        echo
        echo "RESULT: FAIL - Linking failed for another reason."
    fi
    exit 1
fi

Test suite addition is lower level since the problem manifests in rather specific conditions (i686 + LTO + static linking).

@CyberShadow CyberShadow marked this pull request as draft December 20, 2025 15:00
@CyberShadow CyberShadow force-pushed the fix-asm-labels-lto-4294 branch from f17dd38 to 8318aaa Compare December 20, 2025 15:04
@CyberShadow CyberShadow marked this pull request as ready for review December 20, 2025 16:49
@kinke
Copy link
Member

kinke commented Dec 22, 2025

So the issue is restricted to 32-bit x86, not seen for x86_64? (Sorry, on the phone and not a PC at the moment, can't test anything myself.) If so, this seems like an LLVM issue; the LDC win32 build isn't LTO'd either (as only exception), because of weird undefined symbols...

@CyberShadow CyberShadow force-pushed the fix-asm-labels-lto-4294 branch from 8318aaa to d6bd3d1 Compare December 22, 2025 10:47
@CyberShadow
Copy link
Contributor Author

So the issue is restricted to 32-bit x86, not seen for x86_64?

Ah, actually it is reproducible on x86_64 too - Phobos doesn't use x86_64 asm in a way that triggers the bug, but it is possible to write a test which demonstrates actual linking failure. Updated the test to do so rather than explicitly check label formatting.

@CyberShadow CyberShadow force-pushed the fix-asm-labels-lto-4294 branch from d6bd3d1 to e1be882 Compare December 22, 2025 16:42
@CyberShadow CyberShadow marked this pull request as draft December 22, 2025 18:07
@CyberShadow
Copy link
Contributor Author

Nice, CI found more bugs, let me have a look

When template functions containing DMD-style inline assembly are
instantiated in multiple modules and linked with LTO, duplicate
symbol errors can occur because the asm labels have identical names.

This fix assigns a unique ID to each function containing inline asm,
based on a deterministic hash of the template instantiation module
name. The ID is appended to label names, ensuring uniqueness across
compilation units.

Fixes: ldc-developers#4294

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@CyberShadow CyberShadow force-pushed the fix-asm-labels-lto-4294 branch from e1be882 to c75e376 Compare December 22, 2025 18:09
@CyberShadow CyberShadow marked this pull request as ready for review December 22, 2025 19:38
@CyberShadow
Copy link
Contributor Author

CyberShadow commented Dec 22, 2025

Nice, CI found more bugs, let me have a look

OK, looks like to fix those we need to stop using module-level asm and move to embedding x86 asm in LLVM IR functions. This would be the more general / comprehensive fix for this bug too, and seems to be what clang does for C/C++, but it's a much bigger lift. For now restricted the test to Linux-only, where it passes.

@CyberShadow
Copy link
Contributor Author

OK, looks like to fix those we need to stop using module-level asm and move to embedding x86 asm in LLVM IR functions. This would be the more general / comprehensive fix for this bug too, and seems to be what clang does for C/C++, but it's a much bigger lift.

Well, here it is: #5041

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

multibyteMulAdd "is already defined" with -flto=full on i686

3 participants