Emit naked functions as LLVM IR with inline asm instead of module asm #5041
CyberShadow wants to merge 7 commits into ldc-developers:master
This fixes issue ldc-developers#4294 where LTO linking fails with "symbol already defined" errors for naked template functions.

The root cause was that module-level assembly from multiple compilation units gets concatenated during LTO before COMDAT deduplication can occur.

The fix emits naked functions as proper LLVM IR functions with:
- The 'naked' attribute (suppresses prologue/epilogue generation)
- LinkOnceODRLinkage for template instantiations
- COMDAT groups for proper symbol deduplication during LTO
- Inline asm containing the function body
- OptimizeNone and NoInline attributes to prevent LLVM from cloning the function during optimization passes (which would duplicate labels)

Labels in the inline asm use printLabelName() for consistency with label references generated by the asm parser, ensuring labels are properly quoted to match the format used in jump instructions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
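The failure mode described in this commit message can be illustrated with a toy model. This is only a sketch, not LDC or LLVM code: `Unit`, `mergeModuleAsm`, and `mergeComdat` are hypothetical names, and a compilation unit is reduced to a list of the symbols it defines. It assumes, as the message states, that module-level asm is concatenated verbatim while COMDAT-grouped definitions are deduplicated by name.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one compilation unit: each entry is a symbol
// (function name or asm label) that the unit defines.
struct Unit {
    std::vector<std::string> symbols;
};

// Models module-level asm: the text of every unit is concatenated as-is,
// so a symbol defined in two units ends up defined twice.
// Returns the set of symbols that are defined more than once.
std::set<std::string> mergeModuleAsm(const std::vector<Unit>& units) {
    std::map<std::string, int> count;
    for (const Unit& u : units)
        for (const std::string& s : u.symbols)
            ++count[s];

    std::set<std::string> duplicates;
    for (const auto& entry : count)
        if (entry.second > 1)
            duplicates.insert(entry.first);
    return duplicates;
}

// Models COMDAT-style merging: definitions are keyed by name and only one
// copy per key is kept. Returns the surviving symbols.
std::set<std::string> mergeComdat(const std::vector<Unit>& units) {
    std::set<std::string> kept;
    for (const Unit& u : units)
        for (const std::string& s : u.symbols)
            kept.insert(s); // a second definition with the same key is dropped
    return kept;
}
```

In this model, two units that both instantiate the same naked template function report duplicates under `mergeModuleAsm`, while `mergeComdat` keeps exactly one copy of each symbol.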
8240d63 to 9ca884a
…odule scope

Use gIR->saveInsertPoint() instead of gIR->ir->saveIP() because the latter goes through IRBuilderHelper::operator->() which asserts that there's a valid insert block. At module scope (e.g., when compiling naked functions in phobos), there may not be an existing insert point.

The RAII InsertPointGuard handles both null and non-null insert points correctly, saving and restoring the builder state automatically.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
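The save-and-restore behaviour described in this commit message follows the usual RAII guard pattern. The sketch below is a standalone illustration, not LDC's InsertPointGuard: `ToyBuilder` and `ToyInsertPointGuard` are hypothetical names, and the insert point is modelled as a plain pointer where nullptr means "no insert point yet".

```cpp
#include <string>

// Hypothetical stand-in for an IR builder: the insert point is just a
// pointer to a block name, and nullptr means "no insert point yet".
struct ToyBuilder {
    const std::string* insertBlock = nullptr;
};

// RAII guard: remembers the builder's insert point on construction and
// restores it on destruction, whether or not it was null.
class ToyInsertPointGuard {
public:
    explicit ToyInsertPointGuard(ToyBuilder& b)
        : builder(b), saved(b.insertBlock) {}

    ~ToyInsertPointGuard() { builder.insertBlock = saved; }

    // Copying a guard would restore the same state twice, so forbid it.
    ToyInsertPointGuard(const ToyInsertPointGuard&) = delete;
    ToyInsertPointGuard& operator=(const ToyInsertPointGuard&) = delete;

private:
    ToyBuilder& builder;
    const std::string* saved;
};
```

Because the guard reads the stored pointer directly and never dereferences it, the null case needs no special handling, which is the property the commit message relies on at module scope.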
7095b8e to 76f45e7
The test was using -mtriple=x86_64-linux-gnu for the runtime test, which fails on macOS because it tries to link a Linux binary.

Split the test requirements:
- FileCheck verification: uses explicit mtriple for reproducible output
- Runtime verification: uses native platform, requires host_X86

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use IR-DAG instead of IR-LABEL for template function check since template functions may be emitted after their callers in the IR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2d7ee8c to e54363f
- Use getIRMangledName() instead of mangleExact() to get proper Windows calling convention decoration (e.g., \01 prefix for vectorcall)
- Use DtoLinkage() for proper linkage determination for all function types
- Fix linkage for functions already declared by DtoDeclareFunction()
- Remove redundant COMDAT setup code

This fixes undefined symbol errors when using naked template functions across Windows DLL boundaries. The issue was that DtoDeclareFunction() creates functions with ExternalLinkage, and DtoDefineNakedFunction() wasn't correcting the linkage or using the proper IR mangle name.

Add test that cross-compiles for Windows and verifies naked template functions work correctly with DLL linking.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
e54363f to 9dc63f8
Use setLinkage() and setVisibility() from tollvm.h instead of manually handling linkage and DLL storage class.

This is the idiomatic pattern in LDC that correctly handles:
- Lambdas (internal linkage, no dllexport)
- Templates (weak_odr linkage with COMDAT)
- Exported functions (dllexport on Windows)
- Regular functions (external linkage)

The setVisibility() function uses hasExportedLinkage() which returns false for internal/private linkage, preventing the invalid combination of local linkage with DLL storage class.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
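The rule in the last paragraph of this commit message (never combine local linkage with a DLL storage class) can be sketched as a small decision function. This is a simplified, hypothetical model: `ToyLinkage`, `ToyDllStorage`, `toyHasExportedLinkage`, and `toyDllStorageFor` are illustrative names and do not reproduce the signatures in tollvm.h or LLVM.

```cpp
// Hypothetical, simplified enums; the real LLVM and LDC types have more cases.
enum class ToyLinkage { External, WeakODR, Internal, Private };
enum class ToyDllStorage { Default, DllExport };

// Local (internal/private) symbols are not visible outside their module,
// so they are never considered exported.
bool toyHasExportedLinkage(ToyLinkage linkage) {
    return linkage != ToyLinkage::Internal && linkage != ToyLinkage::Private;
}

// Picks a DLL storage class: dllexport is applied only when the symbol is
// marked for export AND its linkage allows it to be exported.
ToyDllStorage toyDllStorageFor(ToyLinkage linkage, bool markedExport) {
    if (markedExport && toyHasExportedLinkage(linkage))
        return ToyDllStorage::DllExport;
    return ToyDllStorage::Default;
}
```

Under this model a lambda with internal linkage stays at the default storage class even when export is requested, which matches the "Lambdas (internal linkage, no dllexport)" case listed above.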
- Use version(D_InlineAsm_X86_64) and version(D_InlineAsm_X86) blocks to handle both 64-bit and 32-bit x86 platforms
- Use ECX for first argument on Windows x64 ABI (not EDI like SysV)
- Use [ESP+4] for first argument on 32-bit cdecl calling convention
- Provide fallback return values for non-x86 platforms

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
e59f03e to f475f9e
It's green 🫠
circle CI fails, though I'm not entirely sure why.
It's failing on master too, so I wasn't counting that one...
Claude's analysis
Root Cause Analysis
The test memoryerror_null_read is failing on CircleCI's Ubuntu 24.04 machine executor. The test expects to catch a null pointer dereference via a custom SIGSEGV handler and print an assert message, but instead the program crashes with a raw segmentation fault.
Key Findings
Likely Causes on Ubuntu 24.04
Recommended Fixes
Thx for the work, the approach makes sense to me! - I think that ideally we'd totally get rid of
diff --git a/gen/asmstmt.cpp b/gen/asmstmt.cpp
index 12d364d012..94233aee0a 100644
--- a/gen/asmstmt.cpp
+++ b/gen/asmstmt.cpp
@@ -784,21 +784,3 @@ void CompoundAsmStatement_toIR(CompoundAsmStatement *stmt, IRState *p) {
p->ir->SetInsertPoint(bb);
}
}
-
-//////////////////////////////////////////////////////////////////////////////
-
-void AsmStatement_toNakedIR(InlineAsmStatement *stmt, IRState *irs) {
- IF_LOG Logger::println("InlineAsmStatement::toNakedIR(): %s",
- stmt->loc.toChars());
- LOG_SCOPE;
-
- // is there code?
- if (!stmt->asmcode) {
- return;
- }
- AsmCode *code = static_cast<AsmCode *>(stmt->asmcode);
-
- // build asm stmt
- replace_func_name(irs, code->insnTemplate);
- irs->nakedAsm << "\t" << code->insnTemplate << std::endl;
-}
diff --git a/gen/functions.cpp b/gen/functions.cpp
index 6879170671..2e1b98cfd1 100644
--- a/gen/functions.cpp
+++ b/gen/functions.cpp
@@ -1127,12 +1127,6 @@ void DtoDefineFunction(FuncDeclaration *fd, bool linkageAvailableExternally) {
gIR->funcGenStates.pop_back();
};
- // if this function is naked, we take over right away! no standard processing!
- if (fd->isNaked()) {
- DtoDefineNakedFunction(fd);
- return;
- }
-
SCOPE_EXIT {
if (irFunc->isDynamicCompiled()) {
defineDynamicCompiledFunction(gIR, irFunc);
@@ -1210,18 +1204,13 @@ void DtoDefineFunction(FuncDeclaration *fd, bool linkageAvailableExternally) {
gIR->ir->setFastMathFlags(irFunc->FMF);
gIR->DBuilder.EmitFuncStart(fd);
- // @naked: emit body and return, no prologue/epilogue
- if (func->hasFnAttribute(llvm::Attribute::Naked)) {
- Statement_toIR(fd->fbody, gIR);
- const bool wasDummy = eraseDummyAfterReturnBB(gIR->scopebb());
- if (!wasDummy && !gIR->scopereturned()) {
- // this is what clang does to prevent LLVM complaining about
- // non-terminated function
- gIR->ir->CreateUnreachable();
- }
- return;
+ if (fd->isNaked()) { // DMD-style `asm { naked; }`
+ func->addFnAttr(llvm::Attribute::Naked);
}
+ // @naked: emit body and return, no prologue/epilogue
+ const bool isNaked = func->hasFnAttribute(llvm::Attribute::Naked);
+
// create alloca point
// this gets erased when the function is complete, so alignment etc does not
// matter at all
@@ -1234,12 +1223,12 @@ void DtoDefineFunction(FuncDeclaration *fd, bool linkageAvailableExternally) {
emitInstrumentationFnEnter(fd);
if (global.params.trace && fd->emitInstrumentation && !fd->isCMain() &&
- !fd->isNaked()) {
+ !isNaked) {
emitDMDStyleFunctionTrace(*gIR, fd, funcGen);
}
// give the 'this' parameter (an lvalue) storage and debug info
- if (irFty.arg_this) {
+ if (!isNaked && irFty.arg_this) {
LLValue *thisvar = irFunc->thisArg;
assert(thisvar);
@@ -1261,7 +1250,7 @@ void DtoDefineFunction(FuncDeclaration *fd, bool linkageAvailableExternally) {
}
// define all explicit parameters
- if (fd->parameters)
+ if (!isNaked && fd->parameters)
defineParameters(irFty, *fd->parameters);
// Initialize PGO state for this function
@@ -1327,7 +1316,10 @@ void DtoDefineFunction(FuncDeclaration *fd, bool linkageAvailableExternally) {
// pass the previous block into this block
gIR->DBuilder.EmitStopPoint(fd->endloc);
- if (func->getReturnType() == LLType::getVoidTy(gIR->context())) {
+
+ if (isNaked) {
+ gIR->ir->CreateUnreachable();
+ } else if (func->getReturnType() == LLType::getVoidTy(gIR->context())) {
gIR->ir->CreateRetVoid();
} else if (isAnyMainFunction(fd)) {
gIR->ir->CreateRet(LLConstant::getNullValue(func->getReturnType()));
diff --git a/gen/functions.h b/gen/functions.h
index 5dcfa7032d..609da95620 100644
--- a/gen/functions.h
+++ b/gen/functions.h
@@ -40,7 +40,6 @@ void DtoResolveFunction(FuncDeclaration *fdecl);
void DtoDeclareFunction(FuncDeclaration *fdecl);
void DtoDefineFunction(FuncDeclaration *fd, bool linkageAvailableExternally = false);
-void DtoDefineNakedFunction(FuncDeclaration *fd);
void emitABIReturnAsmStmt(IRAsmBlock *asmblock, Loc loc,
FuncDeclaration *fdecl);
diff --git a/gen/irstate.h b/gen/irstate.h
index 8004a0c83f..721db8dc3f 100644
--- a/gen/irstate.h
+++ b/gen/irstate.h
@@ -215,7 +215,6 @@ public:
// for inline asm
IRAsmBlock *asmBlock = nullptr;
- std::ostringstream nakedAsm;
// Globals to pin in the llvm.used array to make sure they are not
// eliminated.
diff --git a/gen/naked.cpp b/gen/naked.cpp
index c82c338010..5a458c0470 100644
--- a/gen/naked.cpp
+++ b/gen/naked.cpp
@@ -30,237 +30,6 @@
using namespace dmd;
-////////////////////////////////////////////////////////////////////////////////
-// FIXME: Integrate these functions
-void AsmStatement_toNakedIR(InlineAsmStatement *stmt, IRState *irs);
-
-////////////////////////////////////////////////////////////////////////////////
-
-class ToNakedIRVisitor : public Visitor {
- IRState *irs;
-
-public:
- explicit ToNakedIRVisitor(IRState *irs) : irs(irs) {}
-
- //////////////////////////////////////////////////////////////////////////
-
- // Import all functions from class Visitor
- using Visitor::visit;
-
- //////////////////////////////////////////////////////////////////////////
-
- void visit(Statement *stmt) override {
- error(stmt->loc, "Statement not allowed in naked function");
- }
-
- //////////////////////////////////////////////////////////////////////////
-
- void visit(InlineAsmStatement *stmt) override {
- AsmStatement_toNakedIR(stmt, irs);
- }
-
- //////////////////////////////////////////////////////////////////////////
-
- void visit(CompoundStatement *stmt) override {
- IF_LOG Logger::println("CompoundStatement::toNakedIR(): %s",
- stmt->loc.toChars());
- LOG_SCOPE;
-
- if (stmt->statements) {
- for (auto s : *stmt->statements) {
- if (s) {
- s->accept(this);
- }
- }
- }
- }
-
- //////////////////////////////////////////////////////////////////////////
-
- void visit(ExpStatement *stmt) override {
- IF_LOG Logger::println("ExpStatement::toNakedIR(): %s",
- stmt->loc.toChars());
- LOG_SCOPE;
-
- // This happens only if there is a ; at the end:
- // asm { naked; ... };
- // Is this a legal AST?
- if (!stmt->exp) {
- return;
- }
-
- // only expstmt supported in declarations
- if (!stmt->exp || stmt->exp->op != EXP::declaration) {
- visit(static_cast<Statement *>(stmt));
- return;
- }
-
- DeclarationExp *d = static_cast<DeclarationExp *>(stmt->exp);
- VarDeclaration *vd = d->declaration->isVarDeclaration();
- FuncDeclaration *fd = d->declaration->isFuncDeclaration();
- EnumDeclaration *ed = d->declaration->isEnumDeclaration();
-
- // and only static variable/function declaration
- // no locals or nested stuffies!
- if (!vd && !fd && !ed) {
- visit(static_cast<Statement *>(stmt));
- return;
- }
- if (vd && !(vd->storage_class & (STCstatic | STCmanifest))) {
- error(vd->loc, "non-static variable `%s` not allowed in naked function",
- vd->toChars());
- return;
- }
- if (fd && !fd->isStatic()) {
- error(fd->loc,
- "non-static nested function `%s` not allowed in naked function",
- fd->toChars());
- return;
- }
- // enum decls should always be safe
-
- // make sure the symbols gets processed
- // TODO: codegen() here is likely incorrect
- Declaration_codegen(d->declaration, irs);
- }
-
- //////////////////////////////////////////////////////////////////////////
-
- void visit(LabelStatement *stmt) override {
- IF_LOG Logger::println("LabelStatement::toNakedIR(): %s",
- stmt->loc.toChars());
- LOG_SCOPE;
-
- // Use printLabelName to match how label references are generated in asm-x86.h.
- // This ensures label definitions match the quoted format used in jump instructions.
- printLabelName(irs->nakedAsm, mangleExact(irs->func()->decl),
- stmt->ident->toChars());
- irs->nakedAsm << ":\n";
-
- if (stmt->statement) {
- stmt->statement->accept(this);
- }
- }
-};
-
-////////////////////////////////////////////////////////////////////////////////
-
-void DtoDefineNakedFunction(FuncDeclaration *fd) {
- IF_LOG Logger::println("DtoDefineNakedFunction(%s)", mangleExact(fd));
- LOG_SCOPE;
-
- // Get the proper IR mangle name (includes Windows calling convention decoration)
- TypeFunction *tf = fd->type->isTypeFunction();
- const std::string irMangle = getIRMangledName(fd, tf ? tf->linkage : LINK::d);
-
- // Get or create the LLVM function first, before visiting the body.
- // The visitor may call Declaration_codegen which needs an IR insert point.
- llvm::Module &module = gIR->module;
- llvm::Function *func = module.getFunction(irMangle);
-
- if (!func) {
- // Create function type using the existing infrastructure
- llvm::FunctionType *funcType = DtoFunctionType(fd);
-
- // Create the function with ExternalLinkage initially.
- // setLinkage() below will set the correct linkage.
- func = llvm::Function::Create(funcType, llvm::GlobalValue::ExternalLinkage,
- irMangle, &module);
- } else if (!func->empty()) {
- // Function already has a body - this can happen if the function was
- // already defined (e.g., template instantiation in another module).
- // Don't add another body.
- return;
- } else if (func->hasFnAttribute(llvm::Attribute::Naked)) {
- // Function already has naked attribute - it was already processed
- return;
- }
-
- // Set linkage and visibility using the standard infrastructure.
- // This correctly handles:
- // - Lambdas (internal linkage, no dllexport)
- // - Templates (weak_odr linkage with COMDAT)
- // - Exported functions (dllexport on Windows)
- // - Regular functions (external linkage)
- setLinkage(DtoLinkage(fd), func);
- setVisibility(fd, func);
-
- // Set naked attribute - this tells LLVM not to generate prologue/epilogue
- func->addFnAttr(llvm::Attribute::Naked);
-
- // Prevent optimizations that might clone or modify the function.
- // The inline asm contains labels that would conflict if duplicated.
- func->addFnAttr(llvm::Attribute::OptimizeNone);
- func->addFnAttr(llvm::Attribute::NoInline);
-
- // Set other common attributes
- func->addFnAttr(llvm::Attribute::NoUnwind);
-
- // Create entry basic block and set insert point before visiting body.
- // The visitor's ExpStatement::visit may call Declaration_codegen for
- // static symbols, which may need an active IR insert point.
- llvm::BasicBlock *entryBB =
- llvm::BasicBlock::Create(gIR->context(), "entry", func);
-
- // Save current insert point and switch to new function.
- // Use gIR->setInsertPoint() instead of gIR->ir->SetInsertPoint() because
- // the latter goes through IRBuilderHelper::operator->() which asserts that
- // there's a valid insert block. At module scope, there may not be one yet.
- // gIR->setInsertPoint() accesses the builder directly and also returns an
- // RAII guard that restores the previous state when it goes out of scope.
- const auto savedInsertPoint = gIR->setInsertPoint(entryBB);
-
- // Clear the nakedAsm stream and collect the function body
- std::ostringstream &asmstr = gIR->nakedAsm;
- asmstr.str("");
-
- // Use the visitor to collect asm statements into nakedAsm
- ToNakedIRVisitor visitor(gIR);
- fd->fbody->accept(&visitor);
-
- if (global.errors) {
- fatal();
- }
-
- // Get the collected asm string and escape $ characters for LLVM inline asm.
- // In LLVM inline asm, $N refers to operand N, so literal $ must be escaped as $$.
- std::string asmBody;
- {
- std::string raw = asmstr.str();
- asmBody.reserve(raw.size() * 2); // Worst case: all $ characters
- for (char c : raw) {
- if (c == '$') {
- asmBody += "$$";
- } else {
- asmBody += c;
- }
- }
- }
- asmstr.str(""); // Clear for potential reuse
-
- // Create inline asm - the entire function body is a single asm block
- // No constraints needed since naked functions handle everything in asm
- llvm::FunctionType *asmFuncType =
- llvm::FunctionType::get(llvm::Type::getVoidTy(gIR->context()), false);
-
- llvm::InlineAsm *inlineAsm = llvm::InlineAsm::get(
- asmFuncType,
- asmBody,
- "", // No constraints
- true, // Has side effects
- false, // Not align stack
- llvm::InlineAsm::AD_ATT // AT&T syntax
- );
-
- gIR->ir->CreateCall(inlineAsm);
-
- // Naked functions don't return normally through LLVM IR
- gIR->ir->CreateUnreachable();
-
- // The savedInsertPoint RAII guard automatically restores the insert point
- // when it goes out of scope.
-}
-
////////////////////////////////////////////////////////////////////////////////
void emitABIReturnAsmStmt(IRAsmBlock *asmblock, Loc loc,
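The `$` escaping step in the DtoDefineNakedFunction body removed by the diff above is a small, self-contained transformation. The sketch below isolates it as a standalone helper. The name `escapeDollars` is hypothetical and the helper is not part of LDC. It relies only on the rule stated in the removed comments: in LLVM inline asm `$N` refers to operand N, so a literal `$` must be written as `$$`.

```cpp
#include <string>

// Doubles every '$' so that it is treated as a literal character rather
// than the start of an operand reference such as $0.
std::string escapeDollars(const std::string& raw) {
    std::string out;
    out.reserve(raw.size() * 2); // worst case: every character is '$'
    for (char c : raw) {
        if (c == '$') {
            out += "$$";
        } else {
            out += c;
        }
    }
    return out;
}
```

For instance, an AT&T-syntax immediate such as `movl $1, %eax` becomes `movl $$1, %eax` after escaping.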
This fixes segfault-handler tests for LDC Linux CI, apparently caused by an insufficient alternate-stack size. Claude was able to diagnose the problem: ldc-developers/ldc#5041 (comment)
…22473) This fixes segfault-handler tests for LDC Linux CI, apparently caused by an insufficient alternate-stack size. Claude was able to diagnose the problem: ldc-developers/ldc#5041 (comment)
…lang/dmd!22473) This fixes segfault-handler tests for LDC Linux CI, apparently caused by an insufficient alternate-stack size. Claude was able to diagnose the problem: #5041 (comment)
LLM disclosure: this entire PR is AI-generated. It might be OK, or it might be slop. The fix looks sensible on the surface and it passes the test suite, but so does most slop. Caveat emptor.
This is the more comprehensive alternative to #5036. Instead of hacking the front-end to track and deduplicate label names in assembly so that they don't collide in LTO, this changes LDC's handling of naked asm functions to use a completely different approach - LLVM IR with inline asm blocks. I understand this is what C++ does, and is how clang avoids the problem we were seeing.
LLM output follows:
Summary
This PR fixes issue #4294 where LTO linking fails with "symbol already defined" errors for naked template functions. The fix changes how naked functions are emitted: instead of generating module-level assembly, we now emit proper LLVM IR functions with inline assembly.
Problem
When compiling code that uses naked template functions (like std.internal.math.biguintx86) with LTO (-flto=full or -flto=thin), linking fails with errors like:
Root Cause
The previous implementation emitted naked functions as module-level assembly using gIR->module.appendModuleInlineAsm(). This approach has a fundamental problem with LTO:
Why This Affects Specific Code
- biguintx86.d only exists for x86 (32-bit). x86_64 uses biguintnoasm.d (pure D)
- …L1:, L_unrolled:) trigger the duplicate symbol errors
Solution
Emit naked functions as LLVM IR functions with inline assembly instead of module-level assembly:
Key Changes
- naked attribute
- OptimizeNone and NoInline attributes prevent LLVM from cloning the function (which would duplicate labels)
- …printLabelName() to match the quoted format used in jump instructions
Testing
New Tests Added
- tests/linking/asm_labels_lto.d
- tests/codegen/naked_asm_output.d
- tests/codegen/naked_asm_corner_cases.d
Test Results
- …-O3 -release (includes biguintx86.d)
Verified Properties
- .LfuncName_labelName:
- "axG" with comdat group
- naked noinline nounwind optnone
Files Changed
- gen/naked.cpp: …DtoDefineNakedFunction to emit LLVM IR instead of module asm
- tests/linking/asm_labels_lto.d
- tests/codegen/naked_asm_output.d
- tests/codegen/naked_asm_corner_cases.d
Technical Details
Before (Module Assembly)
After (LLVM IR with Inline Asm)
Why OptimizeNone and NoInline?
During optimization passes (especially at -O1 and above), LLVM may clone or duplicate functions. For naked functions with assembly labels, this would create duplicate label definitions. These attributes prevent such transformations.
Why unreachable terminator?
Naked functions handle their own return via assembly (ret instruction). The unreachable terminator tells LLVM not to generate any return sequence - the inline asm handles it.
Compatibility
Limitations
Related Issues
- multibyteMulAdd "is already defined" with -flto=full on i686 #4294
Checklist