-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[lld][ELF] Merge equivalent symbols found during ICF #139493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld Author: Pranav Kant (pranavk) ChangesFixes scenario reported in #134342 (comment) Full diff: https://github.com/llvm/llvm-project/pull/139493.diff 2 Files Affected:
diff --git a/lld/ELF/ICF.cpp b/lld/ELF/ICF.cpp
index 1882116d4d5f0..797d0968da6c2 100644
--- a/lld/ELF/ICF.cpp
+++ b/lld/ELF/ICF.cpp
@@ -334,26 +334,27 @@ bool ICF<ELFT>::equalsConstant(const InputSection *a, const InputSection *b) {
: constantEq(a, ra.relas, b, rb.relas);
}
-template <class RelTy>
-static SmallVector<Symbol *> getReloc(const InputSection *sec,
+template <class ELFT, class RelTy>
+static SmallVector<std::pair<Symbol *, uint64_t>> getReloc(const InputSection *sec,
Relocs<RelTy> relocs) {
- SmallVector<Symbol *> syms;
+ SmallVector<std::pair<Symbol *, uint64_t>> syms;
for (auto ri = relocs.begin(), re = relocs.end(); ri != re; ++ri) {
Symbol &sym = sec->file->getRelocTargetSym(*ri);
- syms.push_back(&sym);
+ uint64_t addend = getAddend<ELFT>(*ri);
+ syms.push_back(std::make_pair(&sym, addend));
}
return syms;
}
template <class ELFT>
-static SmallVector<Symbol *> getRelocTargetSyms(const InputSection *sec) {
+static SmallVector<std::pair<Symbol *, uint64_t>> getRelocTargets(const InputSection *sec) {
const RelsOrRelas<ELFT> rel = sec->template relsOrRelas<ELFT>();
if (rel.areRelocsCrel())
- return getReloc(sec, rel.crels);
+ return getReloc<ELFT>(sec, rel.crels);
if (rel.areRelocsRel())
- return getReloc(sec, rel.rels);
+ return getReloc<ELFT>(sec, rel.rels);
- return getReloc(sec, rel.relas);
+ return getReloc<ELFT>(sec, rel.relas);
}
// Compare two lists of relocations. Returns true if all pairs of
@@ -568,19 +569,19 @@ template <class ELFT> void ICF<ELFT>::run() {
if (end - begin == 1)
return;
print() << "selected section " << sections[begin];
- SmallVector<Symbol *> syms = getRelocTargetSyms<ELFT>(sections[begin]);
+ SmallVector<std::pair<Symbol *, uint64_t>> syms = getRelocTargets<ELFT>(sections[begin]);
for (size_t i = begin + 1; i < end; ++i) {
print() << " removing identical section " << sections[i];
sections[begin]->replace(sections[i]);
- SmallVector<Symbol *> replacedSyms =
- getRelocTargetSyms<ELFT>(sections[i]);
+ SmallVector<std::pair<Symbol *, uint64_t>> replacedSyms =
+ getRelocTargets<ELFT>(sections[i]);
assert(syms.size() == replacedSyms.size() &&
"Should have same number of syms!");
for (size_t i = 0; i < syms.size(); i++) {
- if (syms[i] == replacedSyms[i] || !syms[i]->isGlobal() ||
- !replacedSyms[i]->isGlobal())
+ if (syms[i].first == replacedSyms[i].first || !syms[i].first->isGlobal() ||
+ !replacedSyms[i].first->isGlobal() || syms[i].second != replacedSyms[i].second)
continue;
- symbolEquivalence.unionSets(syms[i], replacedSyms[i]);
+ symbolEquivalence.unionSets(syms[i].first, replacedSyms[i].first);
}
// At this point we know sections merged are fully identical and hence
diff --git a/lld/test/ELF/icf-addend.s b/lld/test/ELF/icf-addend.s
new file mode 100644
index 0000000000000..0023bbf2ff421
--- /dev/null
+++ b/lld/test/ELF/icf-addend.s
@@ -0,0 +1,33 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
+# RUN: ld.lld %t.o -o /dev/null --icf=all --print-icf-sections | FileCheck %s
+
+# Check that ICF doesn't fold different symbols when functions referencing them
+# can be folded because they are pointing to the same address.
+
+# CHECK: selected section {{.*}}:(.text.f1)
+# CHECK: removing identical section {{.*}}:(.text.f2)
+# CHECK-NOT: redirecting 'y' in symtab to 'x'
+# CHECK-NOT: redirecting 'y' to 'x'
+
+.globl x, y
+
+.section .rodata,"a",@progbits
+x:
+.long 11
+y:
+.long 12
+
+.section .text.f1,"ax",@progbits
+f1:
+movq x+4(%rip), %rax
+
+.section .text.f2,"ax",@progbits
+f2:
+movq y(%rip), %rax
+
+.section .text._start,"ax",@progbits
+.globl _start
+_start:
+call f1
+call f2
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
As per previous discussion, the symbol folding needs to be done in all circumstances where section folding considers the symbols as equivalent -- at least for the special relocation types. So (at least for those kinds of relocs) I think we must forbid the fold of sections with relocations having different offsets in the first place. (Rather than just avoid folding the symbols, as this patch does). |
This reverts commit fd3fecf.
lld/ELF/ICF.cpp
Outdated
// We should not merge sections in the first place if their symbols | ||
// cannot be merged together. | ||
bool canMergeSymbols = addA == addB; | ||
if (da->value + addA == db->value + addB && canMergeSymbols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can simplify this condition too but i think this is more readable in its current form.
Original lld commit was reverted. So I have reapplied that commit in this PR and committed changes to fix the problem in a different commit. You can just look at latest commit to see those changes to make review simple. This commit basically stops section folding when addends are different which would automatically stop symbol folding. Yes, it means that we would be folding less sections now. I am hoping we wouldn't see too much of this pattern. So it's not going to be noticeable regression. One way to fix this would be to only stop section folding when the relocation is non-trivial but as we discussed in #136641, that's most likely going to bring target-specific logic in which we are not sure we want to do. We can continue building consensus on that in that PR while we let this in? |
I have added separate commits for addressing concern in #136641. I am okay splitting it across multiple PRs but I feel everything is so closely related, perhaps it should be just one PR. |
Since the "merge symbol" PR has been reverted. This title is no longer accurate. You need to include the original description and append description about addends, potentially simplify the concatenated description. |
@MaskRay done. |
@@ -286,9 +313,14 @@ bool ICF<ELFT>::constantEq(const InputSection *secA, Relocs<RelTy> ra, | |||
// Relocations referring to InputSections are constant-equal if their | |||
// section offsets are equal. | |||
if (isa<InputSection>(da->section)) { | |||
if (da->value + addA == db->value + addB) | |||
if (da->value + addA == db->value + addB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was not in #134342 . I have tried to understand why the logic is so complex.
Looks like https://reviews.llvm.org/D34094 , while supporting MergeInputSection, also introduced the untested condition for InputSection
if (isa<InputSection>(DA->Section))
return DA->Value + AddA == DB->Value + AddB;
I feel that we should switch to the safer da->value == db->value && addA == addB
, which likely doesn't lose anything (I believe it's difficult for clang emitted code to utilize the relaxed condition.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think da->value + addA == db->value + addB
is more readable. I can read it as if the address resolves to the same place, sections are considered same. It also takes care of section folding for more cases. Changing it to da->value == db->value && addA == addB
would exclude some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing it to
da->value == db->value && addA == addB
would exclude some cases.
The excluded cases are likely negligible. This condition is for InputSection (non merge string), where symbols defined not at offset 0 in -ffunction-sections -fdata-sections code are extremely rare. I don't know a case LLVM codegen would do this.
// getting merged into each other (done later in ICF). We do this as | ||
// post-ICF passes cannot handle duplicates when iterating over local | ||
// symbols. There are also assertions that prevent this. | ||
if ((!da->isGlobal() || !db->isGlobal()) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No test seems to cover this block . Actually I don't understand why we need this check (#134342 (comment)). If we merge two sections, symbols defined relative to them, even if global, can be merged as well, regardless whether there are GOT/TLS relocations.
If we delete the isTrivialRelocation
check from #136641 , we can get rid of the OffsetGetter complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add the test for this.
Regarding why we need this, the issue was brought to the attention first here in this comment: #131630 (comment)
Quoting part of James' comment:
I also added a new check that ensures local symbols aren't merged with global symbols or vice-versa, otherwise assert(b->isLocal() && "should have been caught in initializeSymbols()"); is triggered by lld/test/ELF/icf-ineligible.s. (Instead, that test now fails because the sections referring to a local symbol aren't merged with those referring to the equivalent global symbol, as the test expects, but maybe that's OK.)
So we are not merging sections (and that automatically implies not merging their symbols) when there are local symbols involved. Note that we are only avoiding merging of such sections when it's a non-trivial relocation as that's when it will cause the original 'GOT merging' problem that motivated this series of changes.
Regarding deleting isTrivialRelocation()
, I am currently using it in two places -- here and previous block. Both usages try to merge sections when only trivial relocations are involved as our original 'GOT merging' problem only occurs with non-trivial relocation.
Basically, the invariant we are trying to maintain here is:
When there's a non-trivial relocation, we shouldn't merge the sections unless their symbols can be merged
We can try to satisfy this invariant broadly by removing !isTrivialRelocation()
check but that will fold less sections. I believe the current code satisfies the above invariant as tightly as possible to maximize section folding without causing 'GOT merging' problem.
Let'me rephrase the problem. When we have (#131630 (comment))
This aggressive section merging, without accounting for symbol identity, causes correctness issues in the AArch64 MachineOutliner case (#129122). To address this, we propose merging symbols S1 and S2 (by redirecting their To resolve this, the proposed solution in this pull request is to:
The proposed changes add considerable complexity, and I'm still uncertain about incorporating them into ICF.cpp. That said, thank you for creating the test cases! If I were designing an ABI, I would simply disallow GOT code sequences that span multiple sections. |
I did try to see if there was any support internally for restricting this in the ABI. A summary of the outcome:
So I don't think the ABI is going to mandate that the outliner be changed. However I think there's sufficient recommendation that there is justification to do it. At the moment I expect GOT relocations to local symbols to be restricted to non-default code-models or extensions like memory tagging/PAuthABI. Is there a cheaper way of detecting when this case has occurred? One alternative might be to detect and error out with a message to turn off icf, or the outliner if the binary is affected? I can't immediately think of one though. |
This reapplies #134342 which was reverted in fd3fecf
Original description:
Fixes a correctness issue for AArch64 when ADRP and LDR instructions are outlined in separate sections and sections are fed to ICF for deduplication.
See test case (based on #129122) for details. All rodata.* sections are folded into a single section with ICF. This leads to all f2_* function sections getting folded into one (as their relocation target symbols g* belong to .rodata.g* sections that have already been folded into one). Since relocations still refer original g* symbols, we end up creating duplicate GOT entry for all such symbols. This PR addresses that by tracking such folded symbols and create one GOT entry for all such symbols.
Fixes #129122
In addition to the original fix, this reland also fixes following issues: