Discussion: ObjWriter in C# #77178

Closed
@filipnavara

Let me start with a bit of background. Last year I wrote a library for manipulating MachO object files in C# called Melanzana. A few weeks ago, I started making some changes to the ObjWriter code and found its performance to be quite suboptimal. As an exercise, I wrote a prototype of an ObjWriter replacement in C# for MachO files based on my library. I later extended it to emit DWARF debugging information, and to emit ELF files as well through the LibObjectFile library. Obviously, the code is not production ready and it is not on par with the current ObjWriter library, but I wanted to gauge whether there would be interest in carrying this experiment forward.

What works?

  • Producing MachO object files for osx-x64 and osx-arm64 targets
  • Producing ELF files for linux-x64 and linux-arm64 targets
  • Producing COFF files for win-x64 and win-arm64 targets
  • DWARF debugging information for types, methods, variables, and line numbers
  • CodeView debugging information for types, methods, variables, and line numbers

What doesn't work?

  • No shared symbol (COMDAT) support for ELF
  • No ARM32 and X86 support (NativeAOT doesn't properly support them anyway)

The obvious advantage of the approach is that the object writing libraries (whether LibObjectFile or Melanzana) are closer to the data model that the ILCompiler emits. That makes them more efficient at producing the raw section data, relocations, and symbol tables. The LLVM-based ObjWriter has high overhead (15%-30% of the whole compilation process in my tests). Some of that overhead can be reduced (e.g., switching sections is expensive), but doing so usually comes at the cost of writing at least some part of the code in C#.
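To make the data-model point concrete, here is a minimal sketch, in Python for brevity, of a writer that keeps one byte buffer per section, so that emitting into a different section is just a dictionary lookup rather than a state change in a streaming assembler. Every name here (`SectionBuffer`, `ObjectBuilder`, `Relocation`, the `"branch"` kind label) is hypothetical and is not the API of ILCompiler, ObjWriter, LibObjectFile, or Melanzana.

```python
from dataclasses import dataclass, field


@dataclass
class Relocation:
    offset: int   # position of the patched field (node-relative on input)
    symbol: str   # name of the target symbol
    kind: str     # hypothetical relocation kind label


@dataclass
class SectionBuffer:
    name: str
    data: bytearray = field(default_factory=bytearray)
    relocations: list = field(default_factory=list)
    symbols: dict = field(default_factory=dict)

    def define_symbol(self, symbol):
        # A symbol is simply the current write position in this section.
        self.symbols[symbol] = len(self.data)

    def emit(self, payload, relocs=()):
        # Append raw bytes and rebase the node-relative relocation
        # offsets so they become section-relative.
        base = len(self.data)
        self.data += payload
        for r in relocs:
            self.relocations.append(Relocation(base + r.offset, r.symbol, r.kind))


class ObjectBuilder:
    def __init__(self):
        self.sections = {}

    def section(self, name):
        # Looking up a section never flushes or switches any global state.
        if name not in self.sections:
            self.sections[name] = SectionBuffer(name)
        return self.sections[name]


builder = ObjectBuilder()
text = builder.section("__text")
text.emit(b"\x90")                                   # 1 byte of padding
text.define_symbol("Foo")                            # Foo starts at offset 1
text.emit(b"\xe8\x00\x00\x00\x00", [Relocation(1, "Bar", "branch")])
builder.section("__data").emit(b"\x00" * 8)          # interleaved write elsewhere
text.define_symbol("Baz")                            # Baz starts at offset 6
text.emit(b"\xc3")
```

Because each section accumulates independently, interleaving writes to `__text` and `__data` costs nothing extra, and the final file layout can be decided once, after all nodes have been emitted.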

The disadvantage is that this is a lot of code and two external library references. Essentially, it's trading one dependency for another. While the MachO and ELF formats were part of the initial experiment, the COFF one (all Windows targets) was not.

There's also a middle way. Parts of the experiment can be reused to feed the data more efficiently into the current ObjWriter. For example, the unwinding sections (__eh_frame, __compact_unwind) could be produced entirely in managed code. That would avoid a lot of the section-switching overhead with minimal impact on portability (it can be enabled only for specific targets).

The experiment also serves as a good testbed to quickly evaluate some space savings. Some little things I noticed:

  • Non-primary LSDA frames contain a relative pointer to the main LSDA. This can easily be generated without symbols and relocations.
  • Currently, the MachO output produces __eh_frame data with 8-byte PC-relative pointers. The linker can handle 4-byte ones as well, so those can be used instead (as is already done for ELF).
  • The ELF output produces PLT relocations even for references within the same section. This seems to inhibit some optimizations, and it's not recommended.
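
The first two points can be illustrated with a small sketch (Python for brevity; `encode_pcrel` is a hypothetical helper, not part of any of the libraries mentioned). When a field and its target live in the same section, the delta between them is known at write time, so it can be stored directly with no symbol or relocation. The same delta fits in 4 bytes instead of 8 whenever it is within the signed 32-bit range. The `DW_EH_PE_*` values are the standard pointer-encoding constants used by `.eh_frame`.

```python
import struct

# Standard .eh_frame pointer-encoding constants.
DW_EH_PE_sdata4 = 0x0B   # signed 4-byte value
DW_EH_PE_sdata8 = 0x0C   # signed 8-byte value
DW_EH_PE_pcrel = 0x10    # value is relative to the field's own address


def encode_pcrel(field_offset, target_offset, encoding):
    """Encode target as a signed delta from the field's own position.

    Both offsets are assumed to be in the same section, so the delta is
    a constant and no relocation is required.
    """
    if not encoding & DW_EH_PE_pcrel:
        raise ValueError("this sketch only handles PC-relative encodings")
    delta = target_offset - field_offset
    formats = {DW_EH_PE_sdata4: "<i", DW_EH_PE_sdata8: "<q"}
    # struct.pack raises struct.error if the delta does not fit.
    return struct.pack(formats[encoding & 0x0F], delta)


# A field at offset 0x40 pointing back to data at offset 0x10.
wide = encode_pcrel(0x40, 0x10, DW_EH_PE_pcrel | DW_EH_PE_sdata8)
narrow = encode_pcrel(0x40, 0x10, DW_EH_PE_pcrel | DW_EH_PE_sdata4)

print(len(wide), len(narrow))  # 8 bytes versus 4 bytes for the same delta
```

Pointers that cross sections (for example from `__eh_frame` into the code section) still need a relocation. In that case only the field width changes, not the need for the linker to resolve the target.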
