-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[llvm-objdump][ELF]Fix crash when reading strings from .dynstr #125679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-binary-utilities Author: Cabbaken (cabbaken) ChangesThis change introduces a check for the strtab offset to prevent llvm-objdump from crashing when processing malformed ELF files. Full diff: https://github.com/llvm/llvm-project/pull/125679.diff 2 Files Affected:
diff --git a/llvm/test/tools/llvm-objdump/ELF/dynamic-section.test b/llvm/test/tools/llvm-objdump/ELF/dynamic-section.test
index 5205c5a3876d5f..33ca7e5b40c262 100644
--- a/llvm/test/tools/llvm-objdump/ELF/dynamic-section.test
+++ b/llvm/test/tools/llvm-objdump/ELF/dynamic-section.test
@@ -438,6 +438,9 @@ ProgramHeaders:
# RUN: yaml2obj --docnum=4 %s -o %t4
# RUN: llvm-objdump -p %t4 | FileCheck %s --strict-whitespace --check-prefix=INDENT
+# RUN: yaml2obj --docnum=5 %s -o %t5
+# RUN: llvm-objdump -p %t5 | FileCheck %s --strict-whitespace --check-prefix=INDENT
+
# INDENT: {{^}}Dynamic Section:
# INDENT: {{^}} NEEDED 0x
@@ -470,3 +473,34 @@ Sections:
Value: 0x1
- Tag: DT_NULL
Value: 0x0
+
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_EXEC
+ Machine: EM_X86_64
+Sections:
+ - Name: .dynstr
+ Type: SHT_STRTAB
+ Address: 0x1000
+ Size: 0x10
+ Content: "004400550066007700"
+ - Name: .dynamic
+ Type: SHT_DYNAMIC
+ Entries:
+ - Tag: DT_NEEDED
+ Value: 0x1245657656
+ - Tag: DT_STRTAB
+ Value: 0x1000
+ - Tag: DT_NULL
+ Value: 0x0
+ProgramHeaders:
+ - Type: PT_LOAD
+ VAddr: 0x1000
+ FirstSec: .dynstr
+ LastSec: .dynamic
+ - Type: PT_DYNAMIC
+ VAddr: 0x101D
+ FirstSec: .dynamic
+ LastSec: .dynamic
\ No newline at end of file
diff --git a/llvm/tools/llvm-objdump/ELFDump.cpp b/llvm/tools/llvm-objdump/ELFDump.cpp
index e9e5b059f1786e..949bf1c6faacb8 100644
--- a/llvm/tools/llvm-objdump/ELFDump.cpp
+++ b/llvm/tools/llvm-objdump/ELFDump.cpp
@@ -14,8 +14,11 @@
#include "ELFDump.h"
#include "llvm-objdump.h"
+#include "llvm/BinaryFormat/ELF.h"
#include "llvm/Demangle/Demangle.h"
+#include "llvm/Object/ELF.h"
#include "llvm/Object/ELFObjectFile.h"
+#include "llvm/Object/ELFTypes.h"
#include "llvm/Support/Format.h"
#include "llvm/Support/raw_ostream.h"
@@ -221,6 +224,20 @@ template <class ELFT> void ELFDumper<ELFT>::printDynamicSection() {
std::string TagFmt = " %-" + std::to_string(MaxLen) + "s ";
outs() << "\nDynamic Section:\n";
+ auto StringTableSize = (typename ELFT::Xword)0;
+ for (const auto &Sec : cantFail(Elf.sections())) {
+ if (Sec.sh_type == ELF::SHT_STRTAB)
+ StringTableSize =
+ StringTableSize < Sec.sh_size ? Sec.sh_size : StringTableSize;
+ }
+ for (const typename ELFT::Dyn &Dyn : DynamicEntries) {
+ if (Dyn.d_tag == ELF::DT_STRSZ) {
+ StringTableSize =
+ StringTableSize < Dyn.getVal() ? Dyn.getVal() : StringTableSize;
+ break;
+ }
+ }
+
for (const typename ELFT::Dyn &Dyn : DynamicEntries) {
if (Dyn.d_tag == ELF::DT_NULL)
continue;
@@ -235,6 +252,12 @@ template <class ELFT> void ELFDumper<ELFT>::printDynamicSection() {
Expected<StringRef> StrTabOrErr = getDynamicStrTab(Elf);
if (StrTabOrErr) {
const char *Data = StrTabOrErr->data();
+ if (Dyn.getVal() > StringTableSize) {
+ reportWarning("Invalid string table offset", Obj.getFileName());
+ outs() << format(TagFmt.c_str(), Str.c_str())
+ << format(Fmt, (uint64_t)Dyn.getVal());
+ continue;
+ }
outs() << format(TagFmt.c_str(), Str.c_str()) << Data + Dyn.getVal()
<< "\n";
continue;
|
I'm not sure why the tests failed. It doesn't seem related to this patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general llvm-readobj's error handling is better than llvm-objdump. You could look at its code for inspiration. The relevant patch might be https://reviews.llvm.org/D77216
Instead of a general title "Enhancing llvm-objdump stability and error handling", a specific one like "Fix ... when reading strings from .dynstr" or the like would be better and help future readers that find your patch.
…6612) This change introduces a check for the strtab offset to prevent llvm-objdump from crashing when processing malformed ELF files. Additionally, it modifies how llvm-objdump handles and outputs malformed ELF files with invalid string offsets. More info: https://discourse.llvm.org/t/should-llvm-objdump-objdump-display-actual-corrupted-values-in-malformed-elf-files/84391 Signed-off-by: cabbaken <cabbaken@outlook.com>
It's not uncommon for tests to fail for reasons unrelated to your PR. Typically this is because there's a bug at the point of main that you branched off of. Looking at the failures suggests that this is the case here. You can ignore them. |
Got it, thanks for your kindly response~ |
The default behaviour for merging is to use the PR title and description, not the commit message (although the initial default of the PR title and description is taken from the commit message). If you need to update these, you should edit the PR, not the commit message. Also, please avoid force pushing after updating, as it makes it harder to see what has changed between versions of the PR. Finally, please add "Fixes: #86612" to the PR description. In that way, when the PR is merged, the issue should auto-close, I believe. |
I've made the changes about the title and description of PR and will follow these guidelines in future work. |
@@ -438,6 +438,9 @@ ProgramHeaders: | |||
# RUN: yaml2obj --docnum=4 %s -o %t4 | |||
# RUN: llvm-objdump -p %t4 | FileCheck %s --strict-whitespace --check-prefix=INDENT | |||
|
|||
# RUN: yaml2obj --docnum=5 %s -o %t5 | |||
# RUN: llvm-objdump -p %t5 | FileCheck %s --strict-whitespace --check-prefix=INDENT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2>&1 | FileCheck ...
and use a different check prefix.
Test both the warning invalid string table offset
and the output. The output describes how llvm-objdump handles input input, and is useful as a change detector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I will fix it soon. This is my first time using the llvm-test-suite, and I'm sorry for being unfamiliar with it.
llvm/tools/llvm-objdump/ELFDump.cpp
Outdated
StringTableSize < Sec.sh_size ? Sec.sh_size : StringTableSize; | ||
} | ||
for (const typename ELFT::Dyn &Dyn : DynamicEntries) { | ||
if (Dyn.d_tag == ELF::DT_STRSZ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The section header table is optional in loadable units (executables, shared objects). Dynamic loaders only read dynamic tags, not the section header table. The dynamic tags are source of truths for the dynamic symbol table size. You can keep just the DT_STRSZ code and ignore SHT_STRTAB
.
(
We could emit a warning when the two values mismatch, but llvm-readobj doesn't do it. For the purpose of fixing just this crash it may not be necessary to add the warning. Emitting a warning requires quite a few tests to make for an acceptable patch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I check SHT_STRTAB
because, in getDynamicStrTab(Elf)
, I noticed that both dynamic entries and sections are checked. Therefore, I believe checking the sections is necessary.
The test(tools/llvm-objdump/ELF/private-headers.test
) will fail without this check.
@MaskRay
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delayed comments - I had them pending, but forgot to submit!
Type: SHT_DYNAMIC | ||
Entries: | ||
- Tag: DT_NEEDED | ||
Value: 0x1245657656 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps worth having a tag that has a value exactly matching the size, to test the edge case behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the edge of the value may depend on the the OS or the hardware.
Since this value is the offset of the .dynstr(we need this value to be exceed max virtual address), it may have different edge value in different architecture and OS. So we just need a value that big enough to test if llvm-objdump could process it correctly.
Please let me know if this interpretation is off or if there are additional considerations to be aware of, and thanks for your review~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check used is Dyn.getVal() >= StringTableSize
, so the edge I'm referring to is the StringTableSize
, i.e. an address at exactly StringTableSize
produces a warning, whereas one that is one before this value does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, if the Dyn.getVal()
just greater than StringTableSize
but less than valid virtual memory size, the program won't crash. And this actually happens quite often(when you see some meaningless char in the output of llvm-objdump -x
). It was be specifically described here.
So the value of Dyn.getVal()
should be large enough to make the program crash, I think.
Please let me know if I misunderstood something. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't be reading memory outside the file image, or you could end up trying to read memory belonging e.g. to another a process entirely (depending on how memory safety works on your particular OS). A crash is just one of possible behaviours. For example, it's not impossible you'd print something that appears meaningful but is actually completely bogus.
As a valid virtual address will depend on the host OS, that isn't something that is worth targeting. Instead, the thing you should be targeting is where the behaviour changes - namely when the address points at or past the end of the string table. In that way, you can show you get the warning when the value reaches the bad point, but not when it is just before. We don't need to trigger a crash, showing that we don't read bogus data is good enough, since it's the same problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand now!
I will fix it in next commit.
llvm/tools/llvm-objdump/ELFDump.cpp
Outdated
for (const typename ELFT::Dyn &Dyn : DynamicEntries) { | ||
if (Dyn.d_tag == ELF::DT_STRSZ) { | ||
StringTableSize = | ||
StringTableSize < Dyn.getVal() ? Dyn.getVal() : StringTableSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it doesn't exist already somewhere, I think you may want a check that shows that the DT_STRSZ value makes sense, i.e. doesn't point past the end of the data. This could be a separate PR, but if you're concerned about invalid DT_NEEDED values, it makes equal sense to be concerned about invalid DT_STRSZ. NB: this could be checked already when you get the dynamic string table later - I haven't looked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I will check it in the next PR.
Co-authored-by: James Henderson <James.Henderson@sony.com>
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Type: SHT_DYNAMIC | ||
Entries: | ||
- Tag: DT_NEEDED | ||
Value: 0x1245657656 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't be reading memory outside the file image, or you could end up trying to read memory belonging e.g. to another a process entirely (depending on how memory safety works on your particular OS). A crash is just one of possible behaviours. For example, it's not impossible you'd print something that appears meaningful but is actually completely bogus.
As a valid virtual address will depend on the host OS, that isn't something that is worth targeting. Instead, the thing you should be targeting is where the behaviour changes - namely when the address points at or past the end of the string table. In that way, you can show you get the warning when the value reaches the bad point, but not when it is just before. We don't need to trigger a crash, showing that we don't read bogus data is good enough, since it's the same problem.
llvm/tools/llvm-objdump/ELFDump.cpp
Outdated
Expected<const typename ELFT::Shdr *> StringTableSecOrError = | ||
getSection<ELFT>(cantFail(Elf.sections()), Sec.sh_link); | ||
if (!StringTableSecOrError) { | ||
consumeError(StringTableSecOrError.takeError()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably shouldn't be throwing away this error and instead should report it and bail out of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I will fix it in next commit.
llvm/tools/llvm-objdump/ELFDump.cpp
Outdated
StringTableSize = StringTableSize < (*StringTableSecOrError)->sh_size | ||
? (*StringTableSecOrError)->sh_size | ||
: StringTableSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ternary isn't necessary. The dynamic section will only reference one string table, so you'll only have one size by this point.
In fact, once you find the dynamic section and have read the string size from it, you can bail out of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I’ll fix it.
But if I apply the condition (if (Sec.sh_type == ELF::SHT_DYNSYM || Sec.sh_type == ELF::SHT_DYNAMIC)
), this might be necessary.
I’ll implement this first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This point is probably moot, if we switch to simply relying on the size returned by getDynamicStrTab
(after fixing getDynamicStrTab
as I've described, because this code should disappear.
llvm/tools/llvm-objdump/ELFDump.cpp
Outdated
for (const typename ELFT::Dyn &Dyn : DynamicEntries) { | ||
if (Dyn.d_tag == ELF::DT_STRSZ) { | ||
StringTableSize = | ||
StringTableSize < Dyn.getVal() ? Dyn.getVal() : StringTableSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we should be picking the larger of the two sizes, assuming a section one is found. It would be safer to pick the smaller one. That being said, if this matches llvm-readobj, I'm okay with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm-readobj
use ELF::DT_STRSZ
to check string table size here: llvm/tools/llvm-readobj/ELFDumper.cpp:2007: parseDynamicTable()
.
And I think the ternary here is necessary, because, the StringTableSize
may be 0
after the loop of the sections(the sections may be stripped), it will finally be 0
without this ternary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I think the point is moot now, since this code should disappear once we've fixed getDynamicStrTab
.
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
FYI, it'll be next week before I come back to this. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
@@ -63,14 +63,22 @@ static Expected<StringRef> getDynamicStrTab(const ELFFile<ELFT> &Elf) { | |||
if (!DynamicEntriesOrError) | |||
return DynamicEntriesOrError.takeError(); | |||
|
|||
typename ELFT::Xword StringTableSize{0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm style prefers = 0
to {0}
. and = nullptr
below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the constructor is marked as
explicit
. Try this:typename ELFT::Xword StringTableSize{0};
This worked locally for me with a Visual Studio build at least.
#125679 (comment)
I followed this. Do you have any suggestions on which style is the correct one to use here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typename ELFT::Xword StringTableSize = 0;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typename ELFT::Xword StringTableSize = 0;
I got error with this.
No viable conversion from 'int' to 'typename ELFType<llvm::endianness::little, false>::Xword' (aka 'packed_endian_specific_integral<unsigned long, (llvm::endianness)1, 1>')
I have discussed this with @jh7370 here, I initially use
typename ELFT::Xword StringTableSize = (typename ELFT::Xword)0;
which is unnecessarily redundant, then I modified to
auto StringTableSize = (typename ELFT::Xword)0;
but @jh7370 recommended
typename ELFT::Xword StringTableSize{0};
More info: llvm#125679 (comment) Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
The dynamic string table used by the dynamic section is referenced by the sh_link field of that section, so we should use that directly, rather than going via the dynamic symbol table. More info: #125679 (comment) Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
…) (#127975) The dynamic string table used by the dynamic section is referenced by the sh_link field of that section, so we should use that directly, rather than going via the dynamic symbol table. More info: llvm/llvm-project#125679 (comment) Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. |
❤️❤️❤️❤️ |
This change introduces a check for the strtab offset to prevent llvm-objdump from crashing when processing malformed ELF files.
It provide a minimal reproduce test for #86612 (comment).
Additionally, it modifies how llvm-objdump handles and outputs malformed ELF files with invalid string offsets.(More info: https://discourse.llvm.org/t/should-llvm-objdump-objdump-display-actual-corrupted-values-in-malformed-elf-files/84391)
Fixes: #86612