Skip to content

[lld][ICF] Prevent merging two sections when they point to non-globals #136641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 61 additions & 0 deletions lld/ELF/ICF.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@
#include "InputFiles.h"
#include "LinkerScript.h"
#include "OutputSections.h"
#include "Relocations.h"
#include "SymbolTable.h"
#include "Symbols.h"
#include "SyntheticSections.h"
Expand Down Expand Up @@ -356,6 +357,52 @@ static SmallVector<Symbol *> getRelocTargetSyms(const InputSection *sec) {
return getReloc(sec, rel.relas);
}

// A non-trivial relocation should ideally not be split by Machine outliner
// but it's not illegal to split it in which case ICF shouldn't fold outlined
// functions containing these relocations.
static bool isTrivialRelocationType(uint16_t emachine, RelType type) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be worth stating what our current definition of non-trivial is?

I think we're currently looking at relocations that are GOT entry generating, and are part of a sequence of relocated instructions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. GOT entry-generating && part of sequence (eg: ADRP/LDR) && valid to be split apart by machine outliner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the wording a bit to make it more. clear

if (emachine == EM_AARCH64) {
switch (type) {
case R_AARCH64_GOT_LD_PREL19:
case R_AARCH64_LD64_GOTOFF_LO15:
case R_AARCH64_ADR_GOT_PAGE:
case R_AARCH64_LD64_GOT_LO12_NC:
case R_AARCH64_LD64_GOTPAGE_LO15:
case R_AARCH64_TLSIE_MOVW_GOTTPREL_G1:
case R_AARCH64_TLSIE_MOVW_GOTTPREL_G0_NC:
case R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21:
case R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC:
case R_AARCH64_TLSIE_LD_GOTTPREL_PREL19:
case R_AARCH64_TLSDESC_LD_PREL19:
case R_AARCH64_TLSDESC_ADR_PREL21:
case R_AARCH64_TLSDESC_ADR_PAGE21:
case R_AARCH64_TLSDESC_LD64_LO12:
case R_AARCH64_TLSDESC_ADD_LO12:
case R_AARCH64_TLSDESC_OFF_G1:
case R_AARCH64_TLSDESC_OFF_G0_NC:
case R_AARCH64_AUTH_MOVW_GOTOFF_G0:
case R_AARCH64_AUTH_MOVW_GOTOFF_G0_NC:
case R_AARCH64_AUTH_MOVW_GOTOFF_G1:
case R_AARCH64_AUTH_MOVW_GOTOFF_G1_NC:
case R_AARCH64_AUTH_MOVW_GOTOFF_G2:
case R_AARCH64_AUTH_MOVW_GOTOFF_G2_NC:
case R_AARCH64_AUTH_MOVW_GOTOFF_G3:
case R_AARCH64_AUTH_GOT_LD_PREL19:
case R_AARCH64_AUTH_LD64_GOTOFF_LO15:
case R_AARCH64_AUTH_ADR_GOT_PAGE:
case R_AARCH64_AUTH_LD64_GOT_LO12_NC:
case R_AARCH64_AUTH_LD64_GOTPAGE_LO15:
case R_AARCH64_AUTH_GOT_ADD_LO12_NC:
case R_AARCH64_AUTH_GOT_ADR_PREL_LO21:
case R_AARCH64_AUTH_TLSDESC_ADR_PAGE21:
case R_AARCH64_AUTH_TLSDESC_LD64_LO12:
case R_AARCH64_AUTH_TLSDESC_ADD_LO12:
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are the most likely candidates but there's a few more possibilities. We need to consider all relocations in https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst that had GDAT, GTLSDESC or GTPREL in their operation.

The group relocations 300 - 306 construct a constant from a series of MOVW and MOVK instructions. To the best of my knowledge no compiler uses these yet. They would be most usefil in a position-independent large code-model (.got > 4 GiB from the code), but no-one has yet needed anything that big.

TLS (these all use sequences of instructions and a GOT entry)
539 - 543 For the Initial Exec TLS relocations (GTPREL)
560 - 566 For the TLS Descriptor relocations (GTLSDESC)

The AUTH equivalent "Relocations for PAuth ABI extension" can be generated by clang when generating code for the PAuth ABI. These would be:
581 - 597

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In (#131630 (comment)) you mentioned TLSDESC should never be split up. So TLS relocations from 539..543 and 560..566 shouldn't be needed, right?

I can go ahead and add 300..306 AND 581-597 for PAuth ABI.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect it to break in GNU ld as that assumes that the sequence is emitted exactly. I've recently written that into the ABI in ARM-software/abi-aa#311 (not merged yet).

Putting it here, would be more a philosophical "this is a non trivial relocation" rather than something that would make a practical difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added all of them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the definition of "non-trivial" here should be "all relocation types which have a side-effect based on the symbol's identity".

And I think we should be able to avoid adding new target-specific code for this by using the existing TargetInfo::getRelExpr virtual function, and then switching on the returned RelExpr enum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunately not completely target-neutral. See

// The following is abstract relocation types used for only one target.
//
// Even though RelExpr is intended to be a target-neutral representation
// of a relocation type, there are some relocations whose semantics are
// unique to a target. Such relocation are marked with RE_<TARGET_NAME>.
RE_AARCH64_GOT_PAGE_PC,

So we will still end up with target specific stuff if we use this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking more into it, perhaps something is possible if I factor out some of functionality in Relocations.cpp (functionality that check if GOT or TLS needs to be handled). That would likely be a bigger diff than I was hoping but would do the job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed this concern in a separate commit (see the last commit) in #139493

}
}
return true;
}

// Compare two lists of relocations. Returns true if all pairs of
// relocations point to the same section in terms of ICF.
template <class ELFT>
Expand All @@ -375,6 +422,20 @@ bool ICF<ELFT>::variableEq(const InputSection *secA, Relocs<RelTy> ra,
auto *da = cast<Defined>(&sa);
auto *db = cast<Defined>(&sb);

// Merging sections here also means that we would mark corresponding
// relocation target symbols as equivalent, done later in ICF during section
// folding. To preserve correctness for such symbol equivalence (see
// GH#129122 for details), we also have to disable section merging here:
// 1. We don't merge local symbols into global symbols, or vice-versa. There
// are post-icf passes that assert on this behavior.
// 2. We also don't merge two local symbols together. There are post-icf
// passes that expect to see no duplicates when iterating over local
// symbols.
if ((da->isGlobal() != db->isGlobal()) &&
!isTrivialRelocationType(ctx.arg.emachine,
rai->getType(ctx.arg.isMips64EL)))
return false;

// We already dealt with absolute and non-InputSection symbols in
// constantEq, and for InputSections we have already checked everything
// except the equivalence class.
Expand Down
5 changes: 2 additions & 3 deletions lld/test/ELF/icf-ineligible.s
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,8 @@

# CHECK: selected section {{.*}}:(.text.f1)
# CHECK: removing identical section {{.*}}:(.text.f2)
# CHECK: removing identical section {{.*}}:(.text.f3)
# CHECK: selected section {{.*}}:(.text.f4)
# CHECK: removing identical section {{.*}}:(.text.f5)
# CHECK: redirecting 'd_alias' in symtab to 'd'
# CHECK: redirecting 'd_alias' to 'd'

.globl d, d_alias, fu, f1, f2, f3, f4, f5

Expand Down
Loading