Skip to content

[lld/COFF] Fix -start-lib / -end-lib more after reviews.llvm.org/D116434 #124294

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 24, 2025

Conversation

nico
Copy link
Contributor

@nico nico commented Jan 24, 2025

This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before all undefineds (ELF and MachO do this, see #67445 and things linked from there), it's possible that:

  1. We add an obj file a.obj
  2. a.obj contains an undefined that's in b.obj, causing b.obj to be added
  3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors.

For normal archives, ArchiveFile::addMember() has a seen check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports do.)

Includes a test that caused duplicate symbol diagnostics before this code change.

This is a follow-up to llvm#120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before
all undefineds (ELF and MachO do this, see llvm#67445 and things linked
from there), it's possible that:

1. We add an obj file a.obj
2. a.obj contains an undefined that's in b.obj, causing b.obj to be
   added
3. b.obj contains an undefined that's in a part of a.obj that's not
   yet in the symbol table, causing a recursive load of a.obj,
   which adds the symbols in there twice, leading to duplicate symbol
   errors.

For normal archives, `ArchiveFile::addMember()` has a `seen` check
to prevent this. For start-lib lazy objects, we can just check if
the archive is still lazy at the recursive call.

This bug is similar to issue llvm#59162.

(Eventually, we'll probably want to do what the MachO and ELF ports
do.)

Includes a test that caused duplicate symbol diagnostics before this
code change.
@llvmbot
Copy link
Member

llvmbot commented Jan 24, 2025

@llvm/pr-subscribers-lld-coff

Author: Nico Weber (nico)

Changes

This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before all undefineds (ELF and MachO do this, see #67445 and things linked from there), it's possible that:

  1. We add an obj file a.obj
  2. a.obj contains an undefined that's in b.obj, causing b.obj to be added
  3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors.

For normal archives, ArchiveFile::addMember() has a seen check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports do.)

Includes a test that caused duplicate symbol diagnostics before this code change.


Full diff: https://github.com/llvm/llvm-project/pull/124294.diff

2 Files Affected:

  • (modified) lld/COFF/SymbolTable.cpp (+2)
  • (modified) lld/test/COFF/start-lib.ll (+69)
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 32ea4a5b2e1fc3..4ade347875fde3 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -56,6 +56,8 @@ static void forceLazy(Symbol *s) {
   }
   case Symbol::Kind::LazyObjectKind: {
     InputFile *file = cast<LazyObject>(s)->file;
+    if (!file->lazy)
+      return;
     file->lazy = false;
     file->symtab.ctx.driver.addFile(file);
     break;
diff --git a/lld/test/COFF/start-lib.ll b/lld/test/COFF/start-lib.ll
index a46147f21ccbb3..134cdc2a6e1df6 100644
--- a/lld/test/COFF/start-lib.ll
+++ b/lld/test/COFF/start-lib.ll
@@ -173,3 +173,72 @@ target triple = "x86_64-pc-windows-msvc"
 define void @baz() {
   ret void
 }
+
+
+; Check cycles between symbols in two /start-lib files.
+; If the links succeed and does not emit duplicate symbol diagnostics,
+; that's enough.
+
+; RUN: llc -filetype=obj %t.dir/main3.ll -o %t-main3.obj
+; RUN: llc -filetype=obj %t.dir/cycle1.ll -o %t-cycle1.obj
+; RUN: llc -filetype=obj %t.dir/cycle2.ll -o %t-cycle2.obj
+; RUN: opt -thinlto-bc %t.dir/main3.ll -o %t-main3.bc
+; RUN: opt -thinlto-bc %t.dir/cycle1.ll -o %t-cycle1.bc
+; RUN: opt -thinlto-bc %t.dir/cycle2.ll -o %t-cycle2.bc
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj %t-cycle1.obj %t-cycle2.obj
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib %t-main3.obj
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc %t-cycle1.bc %t-cycle2.bc
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib %t-main3.bc
+
+#--- main3.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo1()
+
+define void @main() {
+  call void () @foo1()
+  ret void
+}
+
+#--- cycle1.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @bar()
+
+define void @foo1() {
+  ; cycle1.ll pulls in cycle2.ll for bar(), and cycle2.ll then pulls in
+  ; cycle1.ll again for foo2().
+  call void () @bar()
+  ret void
+}
+
+define void @foo2() {
+  ret void
+}
+
+
+#--- cycle2.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo2()
+
+define void @bar() {
+  call void () @foo2()
+  ret void
+}

@llvmbot
Copy link
Member

llvmbot commented Jan 24, 2025

@llvm/pr-subscribers-lld

Author: Nico Weber (nico)

Changes

This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before all undefineds (ELF and MachO do this, see #67445 and things linked from there), it's possible that:

  1. We add an obj file a.obj
  2. a.obj contains an undefined that's in b.obj, causing b.obj to be added
  3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors.

For normal archives, ArchiveFile::addMember() has a seen check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports do.)

Includes a test that caused duplicate symbol diagnostics before this code change.


Full diff: https://github.com/llvm/llvm-project/pull/124294.diff

2 Files Affected:

  • (modified) lld/COFF/SymbolTable.cpp (+2)
  • (modified) lld/test/COFF/start-lib.ll (+69)
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 32ea4a5b2e1fc3..4ade347875fde3 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -56,6 +56,8 @@ static void forceLazy(Symbol *s) {
   }
   case Symbol::Kind::LazyObjectKind: {
     InputFile *file = cast<LazyObject>(s)->file;
+    if (!file->lazy)
+      return;
     file->lazy = false;
     file->symtab.ctx.driver.addFile(file);
     break;
diff --git a/lld/test/COFF/start-lib.ll b/lld/test/COFF/start-lib.ll
index a46147f21ccbb3..134cdc2a6e1df6 100644
--- a/lld/test/COFF/start-lib.ll
+++ b/lld/test/COFF/start-lib.ll
@@ -173,3 +173,72 @@ target triple = "x86_64-pc-windows-msvc"
 define void @baz() {
   ret void
 }
+
+
+; Check cycles between symbols in two /start-lib files.
+; If the links succeed and does not emit duplicate symbol diagnostics,
+; that's enough.
+
+; RUN: llc -filetype=obj %t.dir/main3.ll -o %t-main3.obj
+; RUN: llc -filetype=obj %t.dir/cycle1.ll -o %t-cycle1.obj
+; RUN: llc -filetype=obj %t.dir/cycle2.ll -o %t-cycle2.obj
+; RUN: opt -thinlto-bc %t.dir/main3.ll -o %t-main3.bc
+; RUN: opt -thinlto-bc %t.dir/cycle1.ll -o %t-cycle1.bc
+; RUN: opt -thinlto-bc %t.dir/cycle2.ll -o %t-cycle2.bc
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj %t-cycle1.obj %t-cycle2.obj
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib %t-main3.obj
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc %t-cycle1.bc %t-cycle2.bc
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib %t-main3.bc
+
+#--- main3.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo1()
+
+define void @main() {
+  call void () @foo1()
+  ret void
+}
+
+#--- cycle1.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @bar()
+
+define void @foo1() {
+  ; cycle1.ll pulls in cycle2.ll for bar(), and cycle2.ll then pulls in
+  ; cycle1.ll again for foo2().
+  call void () @bar()
+  ret void
+}
+
+define void @foo2() {
+  ret void
+}
+
+
+#--- cycle2.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo2()
+
+define void @bar() {
+  call void () @foo2()
+  ret void
+}

@llvmbot
Copy link
Member

llvmbot commented Jan 24, 2025

@llvm/pr-subscribers-platform-windows

Author: Nico Weber (nico)

Changes

This is a follow-up to #120452 in a way.

Since lld/COFF does not yet insert all defined in an obj file before all undefineds (ELF and MachO do this, see #67445 and things linked from there), it's possible that:

  1. We add an obj file a.obj
  2. a.obj contains an undefined that's in b.obj, causing b.obj to be added
  3. b.obj contains an undefined that's in a part of a.obj that's not yet in the symbol table, causing a recursive load of a.obj, which adds the symbols in there twice, leading to duplicate symbol errors.

For normal archives, ArchiveFile::addMember() has a seen check to prevent this. For start-lib lazy objects, we can just check if the archive is still lazy at the recursive call.

This bug is similar to issue #59162.

(Eventually, we'll probably want to do what the MachO and ELF ports do.)

Includes a test that caused duplicate symbol diagnostics before this code change.


Full diff: https://github.com/llvm/llvm-project/pull/124294.diff

2 Files Affected:

  • (modified) lld/COFF/SymbolTable.cpp (+2)
  • (modified) lld/test/COFF/start-lib.ll (+69)
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 32ea4a5b2e1fc3..4ade347875fde3 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -56,6 +56,8 @@ static void forceLazy(Symbol *s) {
   }
   case Symbol::Kind::LazyObjectKind: {
     InputFile *file = cast<LazyObject>(s)->file;
+    if (!file->lazy)
+      return;
     file->lazy = false;
     file->symtab.ctx.driver.addFile(file);
     break;
diff --git a/lld/test/COFF/start-lib.ll b/lld/test/COFF/start-lib.ll
index a46147f21ccbb3..134cdc2a6e1df6 100644
--- a/lld/test/COFF/start-lib.ll
+++ b/lld/test/COFF/start-lib.ll
@@ -173,3 +173,72 @@ target triple = "x86_64-pc-windows-msvc"
 define void @baz() {
   ret void
 }
+
+
+; Check cycles between symbols in two /start-lib files.
+; If the links succeed and does not emit duplicate symbol diagnostics,
+; that's enough.
+
+; RUN: llc -filetype=obj %t.dir/main3.ll -o %t-main3.obj
+; RUN: llc -filetype=obj %t.dir/cycle1.ll -o %t-cycle1.obj
+; RUN: llc -filetype=obj %t.dir/cycle2.ll -o %t-cycle2.obj
+; RUN: opt -thinlto-bc %t.dir/main3.ll -o %t-main3.bc
+; RUN: opt -thinlto-bc %t.dir/cycle1.ll -o %t-cycle1.bc
+; RUN: opt -thinlto-bc %t.dir/cycle2.ll -o %t-cycle2.bc
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj %t-cycle1.obj %t-cycle2.obj
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.obj /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.obj %t-cycle2.obj /end-lib %t-main3.obj
+
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc %t-cycle1.bc %t-cycle2.bc
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     %t-main3.bc /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib
+; RUN: lld-link -out:%t3.exe -entry:main \
+; RUN:     /start-lib %t-cycle1.bc %t-cycle2.bc /end-lib %t-main3.bc
+
+#--- main3.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo1()
+
+define void @main() {
+  call void () @foo1()
+  ret void
+}
+
+#--- cycle1.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @bar()
+
+define void @foo1() {
+  ; cycle1.ll pulls in cycle2.ll for bar(), and cycle2.ll then pulls in
+  ; cycle1.ll again for foo2().
+  call void () @bar()
+  ret void
+}
+
+define void @foo2() {
+  ret void
+}
+
+
+#--- cycle2.ll
+
+target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc"
+
+declare void @foo2()
+
+define void @bar() {
+  call void () @foo2()
+  ret void
+}

@nico nico merged commit d9b8120 into llvm:main Jan 24, 2025
6 of 7 checks passed
@nico nico deleted the lld-coff-start-lib-2 branch January 24, 2025 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants