-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[llvm-readobj] Support --string-table for COFF #141552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-binary-utilities Author: Haohai Wen (HaohaiWen) ChangesFull diff: https://github.com/llvm/llvm-project/pull/141552.diff 5 Files Affected:
diff --git a/llvm/include/llvm/Object/COFF.h b/llvm/include/llvm/Object/COFF.h
index 3d0738c409049..bf13c0d01cfd9 100644
--- a/llvm/include/llvm/Object/COFF.h
+++ b/llvm/include/llvm/Object/COFF.h
@@ -939,6 +939,10 @@ class COFFObjectFile : public ObjectFile {
return uintptr_t(0);
}
+ StringRef getStringTable() const {
+ return StringRef(StringTable, StringTableSize);
+ }
+
uint16_t getMachine() const {
if (COFFHeader) {
if (CHPEMetadata) {
diff --git a/llvm/test/tools/llvm-readobj/COFF/string-table.test b/llvm/test/tools/llvm-readobj/COFF/string-table.test
new file mode 100644
index 0000000000000..6165594b5914f
--- /dev/null
+++ b/llvm/test/tools/llvm-readobj/COFF/string-table.test
@@ -0,0 +1,60 @@
+# RUN: yaml2obj %s -o %t.obj
+# RUN: llvm-readobj --string-table %t.obj | FileCheck %s
+
+# CHECK: StringTable {
+# CHECK: Length: 49
+# CHECK: [ 4] .debug_str
+# CHECK: [ f] ?x@@3HA_test_test
+# CHECK: [ 21] _main_test_test
+# CHECK: }
+
+--- !COFF
+header:
+ Machine: IMAGE_FILE_MACHINE_AMD64
+ Characteristics: []
+sections:
+ - Name: .text
+ Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
+ Alignment: 16
+ SectionData: 508D0500000000C70424000000005AC3
+ - Name: .debug_str
+ Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
+ Alignment: 1
+ SectionData: ''
+symbols:
+ - Name: .text
+ Value: 0
+ SectionNumber: 1
+ SimpleType: IMAGE_SYM_TYPE_NULL
+ ComplexType: IMAGE_SYM_DTYPE_NULL
+ StorageClass: IMAGE_SYM_CLASS_STATIC
+ SectionDefinition:
+ Length: 16
+ NumberOfRelocations: 1
+ NumberOfLinenumbers: 0
+ CheckSum: 0
+ Number: 1
+ - Name: .debug_str
+ Value: 0
+ SectionNumber: 2
+ SimpleType: IMAGE_SYM_TYPE_NULL
+ ComplexType: IMAGE_SYM_DTYPE_NULL
+ StorageClass: IMAGE_SYM_CLASS_STATIC
+ SectionDefinition:
+ Length: 0
+ NumberOfRelocations: 0
+ NumberOfLinenumbers: 0
+ CheckSum: 0
+ Number: 2
+ - Name: '?x@@3HA_test_test'
+ Value: 0
+ SectionNumber: 1
+ SimpleType: IMAGE_SYM_TYPE_NULL
+ ComplexType: IMAGE_SYM_DTYPE_NULL
+ StorageClass: IMAGE_SYM_CLASS_EXTERNAL
+ - Name: _main_test_test
+ Value: 0
+ SectionNumber: 1
+ SimpleType: IMAGE_SYM_TYPE_NULL
+ ComplexType: IMAGE_SYM_DTYPE_FUNCTION
+ StorageClass: IMAGE_SYM_CLASS_EXTERNAL
diff --git a/llvm/tools/llvm-readobj/COFFDumper.cpp b/llvm/tools/llvm-readobj/COFFDumper.cpp
index 5d1d2a34d6442..dce8e60bda1ef 100644
--- a/llvm/tools/llvm-readobj/COFFDumper.cpp
+++ b/llvm/tools/llvm-readobj/COFFDumper.cpp
@@ -108,6 +108,7 @@ class COFFDumper : public ObjDumper {
void printStackMap() const override;
void printAddrsig() override;
void printCGProfile() override;
+ void printStringTable() override;
private:
StringRef getSymbolName(uint32_t Index);
@@ -2219,6 +2220,17 @@ void COFFDumper::printCGProfile() {
}
}
+void COFFDumper::printStringTable() {
+ DictScope DS(W, "StringTable");
+ StringRef StrTable = Obj->getStringTable();
+ uint32_t StrTabSize = StrTable.size();
+ W.printNumber("Length", StrTabSize);
+ // Print strings from the fifth byte, since the first four bytes contain the
+ // length (in bytes) of the string table (including the length field).
+ if (StrTabSize > 4)
+ printAsStringList(StrTable, 4);
+}
+
StringRef COFFDumper::getSymbolName(uint32_t Index) {
Expected<COFFSymbolRef> Sym = Obj->getSymbol(Index);
if (!Sym)
diff --git a/llvm/tools/llvm-readobj/ObjDumper.h b/llvm/tools/llvm-readobj/ObjDumper.h
index cd744e3bbfb71..e4c37318e8a4d 100644
--- a/llvm/tools/llvm-readobj/ObjDumper.h
+++ b/llvm/tools/llvm-readobj/ObjDumper.h
@@ -157,7 +157,7 @@ class ObjDumper {
llvm::codeview::GlobalTypeTableBuilder &GlobalCVTypes,
bool GHash) {}
- // Only implemented for XCOFF.
+ // Only implemented for XCOFF/COFF.
virtual void printStringTable() {}
virtual void printAuxiliaryHeader() {}
virtual void printExceptionSection() {}
diff --git a/llvm/tools/llvm-readobj/Opts.td b/llvm/tools/llvm-readobj/Opts.td
index 7d574d875d22e..f95461aaca1a7 100644
--- a/llvm/tools/llvm-readobj/Opts.td
+++ b/llvm/tools/llvm-readobj/Opts.td
@@ -44,7 +44,7 @@ defm sort_symbols : Eq<"sort-symbols", "Specify the keys to sort the symbols bef
def stack_sizes : FF<"stack-sizes", "Display contents of all stack sizes sections. This option has no effect for GNU style output">;
def stackmap : FF<"stackmap", "Display contents of stackmap section">;
defm string_dump : Eq<"string-dump", "Display the specified section(s) as a list of strings">, MetaVarName<"<name or index>">;
-def string_table : FF<"string-table", "Display the string table (only for XCOFF now)">;
+def string_table : FF<"string-table", "Display the string table (only for XCOFF/COFF now)">;
def symbols : FF<"symbols", "Display the symbol table. Also display the dynamic symbol table when using GNU output style for ELF">;
def unwind : FF<"unwind", "Display unwind information">;
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking very good overall, thanks! The PR and commit subject talks about --symbol-table
though when this actually is --string-table
.
@@ -0,0 +1,60 @@ | |||
# RUN: yaml2obj %s -o %t.obj | |||
# RUN: llvm-readobj --string-table %t.obj | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For purposes of testing edge cases, it would be good to check a file with no string table too - i.e. an executable. (That would make sure that we're not doing anything undefined with null pointers etc, in an ubsan build of llvm-readobj.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your PR title says --symbol-table
but you're modifying the --string-table
option. Please update.
Otherwise, generally looks reasonable, but needs a bit more testing.
@@ -157,7 +157,7 @@ class ObjDumper { | |||
llvm::codeview::GlobalTypeTableBuilder &GlobalCVTypes, | |||
bool GHash) {} | |||
|
|||
// Only implemented for XCOFF. | |||
// Only implemented for XCOFF/COFF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was previously intended to refer to all the following functions, since e.g. printAuxiliaryHeader
is XCOFF only still. Please adjust accordingly (you probably need to move it and/or add another comment.
# CHECK: [ 21] _main_test_test | ||
# CHECK: } | ||
|
||
--- !COFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you try simplifying this YAML a bit, by removing unnecessary fields/sections/symbols, please? In particular, you only need a couple of sections, just to populate the string table.
@@ -0,0 +1,60 @@ | |||
# RUN: yaml2obj %s -o %t.obj |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test doesn't cover the following significant cases:
- Empty string tables (i.e. value 0 in the length field).
- Malformed string tables with too small size (i.e. size less than 4)
- Malformed string tables with length pointing past the end of the file.
Please add these, as they help ensure a robust llvm-readobj that doesn't crash when it gets something unexpected (or worse, start reading memory outside what it should). The malformed cases might require improvements to yaml2obj to allow you to create a string table with arbitrary values in the length field. This should be a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that making yaml2obj
produce malformed string tables is out of scope here; as the yaml representation doesn't contain the string table explicitly at all, but it is created implicitly from strings needed for symbols and sections, it would be a significant change/extension to the yaml format just for testing those invalid cases.
So for testing those cases, practically, I think bundling a small binary file is the most pragmatic solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave you to judge that for the COFF side (although I think the general consensus is that prebuilt binaries should be avoided), but in general, it's precisely these sorts of cases that have led to incremental improvements on the ELF side of yaml2obj. In this particular situation, I'd add a new top-level YAML entry for the string table, which, if implemented, would override the default-generated one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty string tables (i.e. value 0 in the length field).
Malformed string tables with too small size (i.e. size less than 4)
Malformed string tables with length pointing past the end of the file.
Updated test for empty stringtab.
StringTableSize has been set to at least 4 and there's assertion to make sure stringtable end must be '\0'.
llvm-project/llvm/lib/Object/COFFObjectFile.cpp
Lines 442 to 450 in 6da8f3b
// Treat table sizes < 4 as empty because contrary to the PECOFF spec, some | |
// tools like cvtres write a size of 0 for an empty table instead of 4. | |
if (StringTableSize < 4) | |
StringTableSize = 4; | |
// Check that the string table is null terminated if has any in it. | |
if (StringTableSize > 4 && StringTable[StringTableSize - 1] != 0) | |
return createStringError(object_error::parse_failed, | |
"string table missing null terminator"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StringTableSize has been set to at least 4 and there's assertion to make sure stringtable end must be '\0'.
That might well be what the code currently does, but the test needs to show that it does this. Otherwise, somebody might change how the code works and accidentally break existing behaviour, because it is untested. For example, somebody might decide the printing function should read the data directly, rather than rely on the existing library methods, but then forget to add these sort of conditions.
Anyway, that code looks suspect: what happens if the file ends fewer than 4 bytes after the end of the string table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string table llvm-readobj get from COFFObjectFile is StringRef.
I think it's COFFObjectFile's responsibility to make sure this StringRef is readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're missing my point completely. I'm asking you to verify that the outward behaviour of llvm-readobj is correct through tests, not just reading the code. Putting it another way, it would not be impossible for somebody to decide that the COFFObjectFile interface is no longer the appropriate way to implement llvm-readobj for COFF and to implement something themselves that reads and prints COFF object information. In such a case, the tests should work for both the old and new implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned this should be a new feature for yaml2obj in separate PR and that would benefit COFF as well as XCOFF.
Any more comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM - this now uses preexisting input binaries for testing the other cases; this looks good enough for this feature IMO.
Extending yaml2obj to be able to synthesize the string table would be a nice addition, but I wouldn't block this feature on doing that, as we have decent coverage with those existing binaries.
Thanks |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/20997 Here is the relevant piece of the build log for the reference
|
No description provided.