Skip to content

[llvm-readobj] Support --string-table for COFF #141552

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 3, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions llvm/include/llvm/Object/COFF.h
Original file line number Diff line number Diff line change
Expand Up @@ -939,6 +939,10 @@ class COFFObjectFile : public ObjectFile {
return uintptr_t(0);
}

StringRef getStringTable() const {
return StringRef(StringTable, StringTableSize);
}

uint16_t getMachine() const {
if (COFFHeader) {
if (CHPEMetadata) {
Expand Down
37 changes: 37 additions & 0 deletions llvm/test/tools/llvm-readobj/COFF/string-table.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
# RUN: yaml2obj %s -o %t.obj
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't cover the following significant cases:

  1. Empty string tables (i.e. value 0 in the length field).
  2. Malformed string tables with too small size (i.e. size less than 4)
  3. Malformed string tables with length pointing past the end of the file.

Please add these, as they help ensure a robust llvm-readobj that doesn't crash when it gets something unexpected (or worse, start reading memory outside what it should). The malformed cases might require improvements to yaml2obj to allow you to create a string table with arbitrary values in the length field. This should be a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that making yaml2obj produce malformed string tables is out of scope here; as the yaml representation doesn't contain the string table explicitly at all, but it is created implicitly from strings needed for symbols and sections, it would be a significant change/extension to the yaml format just for testing those invalid cases.

So for testing those cases, practically, I think bundling a small binary file is the most pragmatic solution.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave you to judge that for the COFF side (although I think the general consensus is that prebuilt binaries should be avoided), but in general, it's precisely these sorts of cases that have led to incremental improvements on the ELF side of yaml2obj. In this particular situation, I'd add a new top-level YAML entry for the string table, which, if implemented, would override the default-generated one.

Copy link
Contributor Author

@HaohaiWen HaohaiWen May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty string tables (i.e. value 0 in the length field).
Malformed string tables with too small size (i.e. size less than 4)
Malformed string tables with length pointing past the end of the file.

Updated test for empty stringtab.
StringTableSize has been set to at least 4 and there's assertion to make sure stringtable end must be '\0'.

// Treat table sizes < 4 as empty because contrary to the PECOFF spec, some
// tools like cvtres write a size of 0 for an empty table instead of 4.
if (StringTableSize < 4)
StringTableSize = 4;
// Check that the string table is null terminated if has any in it.
if (StringTableSize > 4 && StringTable[StringTableSize - 1] != 0)
return createStringError(object_error::parse_failed,
"string table missing null terminator");

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringTableSize has been set to at least 4 and there's assertion to make sure stringtable end must be '\0'.

That might well be what the code currently does, but the test needs to show that it does this. Otherwise, somebody might change how the code works and accidentally break existing behaviour, because it is untested. For example, somebody might decide the printing function should read the data directly, rather than rely on the existing library methods, but then forget to add these sort of conditions.

Anyway, that code looks suspect: what happens if the file ends fewer than 4 bytes after the end of the string table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string table llvm-readobj get from COFFObjectFile is StringRef.
I think it's COFFObjectFile's responsibility to make sure this StringRef is readable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing my point completely. I'm asking you to verify that the outward behaviour of llvm-readobj is correct through tests, not just reading the code. Putting it another way, it would not be impossible for somebody to decide that the COFFObjectFile interface is no longer the appropriate way to implement llvm-readobj for COFF and to implement something themselves that reads and prints COFF object information. In such a case, the tests should work for both the old and new implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mentioned this should be a new feature for yaml2obj in separate PR and that would benefit COFF as well as XCOFF.

# RUN: llvm-readobj --string-table %t.obj \
# RUN: %p/Inputs/coff-load-config-x64.dll \
# RUN: %p/Inputs/zero-string-table.obj.coff-i386 | FileCheck %s

# CHECK-LABEL: File: {{.*}}string-table.test.tmp.obj
# CHECK: StringTable {
# CHECK-NEXT: Length: 31
# CHECK-NEXT: [ 4] .debug_str
# CHECK-NEXT: [ f] _main_test_test
# CHECK-NEXT: }

# CHECK-LABEL: File: {{.*}}coff-load-config-x64.dll
# CHECK: StringTable {
# CHECK-NEXT: Length: 0
# CHECK-NEXT: }

# CHECK-LABEL: File: {{.*}}zero-string-table.obj.coff-i386
# CHECK: StringTable {
# CHECK-NEXT: Length: 4
# CHECK-NEXT: }

--- !COFF
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you try simplifying this YAML a bit, by removing unnecessary fields/sections/symbols, please? In particular, you only need a couple of sections, just to populate the string table.

header:
Machine: IMAGE_FILE_MACHINE_AMD64
sections:
- Name: .text
Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
- Name: .debug_str
Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
symbols:
- Name: _main_test_test
Value: 0
SectionNumber: 1
SimpleType: IMAGE_SYM_TYPE_NULL
ComplexType: IMAGE_SYM_DTYPE_FUNCTION
StorageClass: IMAGE_SYM_CLASS_EXTERNAL
12 changes: 12 additions & 0 deletions llvm/tools/llvm-readobj/COFFDumper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,7 @@ class COFFDumper : public ObjDumper {
void printStackMap() const override;
void printAddrsig() override;
void printCGProfile() override;
void printStringTable() override;

private:
StringRef getSymbolName(uint32_t Index);
Expand Down Expand Up @@ -2219,6 +2220,17 @@ void COFFDumper::printCGProfile() {
}
}

void COFFDumper::printStringTable() {
DictScope DS(W, "StringTable");
StringRef StrTable = Obj->getStringTable();
uint32_t StrTabSize = StrTable.size();
W.printNumber("Length", StrTabSize);
// Print strings from the fifth byte, since the first four bytes contain the
// length (in bytes) of the string table (including the length field).
if (StrTabSize > 4)
printAsStringList(StrTable, 4);
}

StringRef COFFDumper::getSymbolName(uint32_t Index) {
Expected<COFFSymbolRef> Sym = Obj->getSymbol(Index);
if (!Sym)
Expand Down
4 changes: 3 additions & 1 deletion llvm/tools/llvm-readobj/ObjDumper.h
Original file line number Diff line number Diff line change
Expand Up @@ -157,8 +157,10 @@ class ObjDumper {
llvm::codeview::GlobalTypeTableBuilder &GlobalCVTypes,
bool GHash) {}

// Only implemented for XCOFF.
// Only implemented for XCOFF/COFF.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was previously intended to refer to all the following functions, since e.g. printAuxiliaryHeader is XCOFF only still. Please adjust accordingly (you probably need to move it and/or add another comment.

virtual void printStringTable() {}

// Only implemented for XCOFF.
virtual void printAuxiliaryHeader() {}
virtual void printExceptionSection() {}
virtual void printLoaderSection(bool PrintHeader, bool PrintSymbols,
Expand Down
2 changes: 1 addition & 1 deletion llvm/tools/llvm-readobj/Opts.td
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ defm sort_symbols : Eq<"sort-symbols", "Specify the keys to sort the symbols bef
def stack_sizes : FF<"stack-sizes", "Display contents of all stack sizes sections. This option has no effect for GNU style output">;
def stackmap : FF<"stackmap", "Display contents of stackmap section">;
defm string_dump : Eq<"string-dump", "Display the specified section(s) as a list of strings">, MetaVarName<"<name or index>">;
def string_table : FF<"string-table", "Display the string table (only for XCOFF now)">;
def string_table : FF<"string-table", "Display the string table (only for XCOFF/COFF now)">;
def symbols : FF<"symbols", "Display the symbol table. Also display the dynamic symbol table when using GNU output style for ELF">;
def unwind : FF<"unwind", "Display unwind information">;

Expand Down