Crash when there are more than ten CODE resources when using a custom segmap #195

Closed
ryandesign opened this issue Oct 26, 2022 · 1 comment · Fixed by #196

Comments

@ryandesign
Contributor

While trying to understand how and why to use a custom segmap (#192), I'm encountering a crash whenever the custom segmap results in more than ten CODE resources in the application.

Almost minimal reproduction example: https://gist.github.com/ryandesign/1ad7e2e2f6cf967548928750d872e033

(It includes RetroConsole, which is not necessary to demonstrate the problem, but having more non-empty segments makes the problem easier to see when looking at the resource sizes.)

The default segmap results in nine CODE resources and the app works fine:

[Images: out0 CODE, out0 RELA]

When I use a custom segmap that attempts to reproduce this default, I also get nine CODE resources (in a slightly different order) and the app works fine:

[Images: out1 CODE, out1 RELA]

When I add another segment (with nothing in it) the app still works fine:

[Images: out2 CODE, out2 RELA]

When I add yet another segment (with nothing in it), the CODE resource sizes show that the resources contain the wrong data, and the app crashes immediately on launch:

[Images: out3 CODE, out3 RELA]

For this last one, we also get Invalid ref errors when linking, like:

    Invalid ref from .code2:c to .code1(puts)+139052
    needsJT: true
    from addr: 139052, exceptionInfoStart: 79788

Full build output is in the above gist URL.

@ryandesign
Contributor Author

ryandesign commented Oct 27, 2022

There seem to be two different places where ID numbers are assigned, and once there are ten segments the two algorithms don't match up.

First, SegmentMap::SegmentMap in Elf2Mac/SegmentMap.cc assigns IDs to each segment. These ID numbers appear to be what we want the CODE/RELA resource IDs eventually to be. The segments are not pushed onto the segments vector in ID order but rather in the order in which the filters have to be applied. Then, SegmentMap::CreateLdScript in Elf2Mac/LdScript.cc iterates over each segment, calling SegmentInfo::CreateLdScript in Elf2Mac/LdScript.cc to write the linker script for each segment. The segment ID number is inserted into the linker script as a string in several places:

    out << "\t.code" << id << " : {\n";
        out << "\t\t__EH_FRAME_BEGIN__" << id << " = .;\n";
        out << boost::replace_all_copy<string>(R"ld(
        . = ALIGN(0x4);
        FILL(0);
        . += 32;
        LONG(__EH_FRAME_BEGIN__@N@ - .);
)ld", "@N@", boost::lexical_cast<string>(id));

After the linker has run, its output is parsed in Object::Object in Elf2Mac/Object.cc, and each code section encountered in the ELF file is pushed onto the codeSections vector, and at the end, all codeSections are sorted by name:

    std::sort(codeSections.begin(), codeSections.end(),
              [](shared_ptr<Section> a, shared_ptr<Section> b) { return a->name < b->name; });

I suspect this is where things go awry, since the sort order will be in lexical (alphabetical) order rather than in numerical order:

    .code1
    .code10
    .code2
    .code3
    ...

Finally, Object::MultiSegmentApp in Elf2Mac/Object.cc iterates over the sorted codeSections and assigns a codeID to each one in order starting with 1, but since the sorted order is wrong once we have ten or more segments, the assigned codeIDs here are wrong too. .code1 will get ID 1, .code10 will get ID 2, .code2 will get ID 3, etc.
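
The mismatch can be reproduced in isolation. The following is a minimal sketch, not code from Elf2Mac: `assignIdsLexically` is a hypothetical helper that sorts plain section-name strings with the default string comparison and then numbers them from 1, which is the behavior described above.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Elf2Mac): sort section names with the
// default lexical string comparison, then assign IDs 1, 2, 3, ... in that order.
std::vector<std::pair<std::string, int>> assignIdsLexically(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end());

    std::vector<std::pair<std::string, int>> result;
    int codeID = 1;
    for (const auto& name : names)
        result.emplace_back(name, codeID++);
    return result;
}
```

Given the names `.code1` through `.code10`, the second entry of the result is `.code10` paired with ID 2, and `.code2` ends up with ID 3.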

The simplest solution is probably to pad the IDs with a sufficient number of leading zeroes when inserting them into the linker script such that all of the IDs have the same width and will then sort correctly in a lexical sort:

    .code00001
    .code00002
    .code00003
    ...
    .code00010

lexical_cast doesn't let you add leading zeroes, but there are other solutions using std::ostringstream, std::setw, and std::setfill.
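
A minimal sketch of the padding approach using only the standard library. `paddedId` is a hypothetical helper name, the width of 5 is an assumption taken from the example names above, and IDs are assumed to be non-negative.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a non-negative ID with leading zeroes so that
// all IDs have the same width and a lexical sort matches numeric order.
std::string paddedId(int id, int width = 5)
{
    std::ostringstream out;
    out << std::setw(width) << std::setfill('0') << id;
    return out.str();
}
```

The result could then be inserted into the linker script wherever `id` is currently written, for example `out << "\t.code" << paddedId(id) << " : {\n";`. Every place that emits the ID, including the `@N@` replacement, would need the same formatting so the names stay consistent.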

Another solution could be to change the sort comparison function to one that performs natural order sorting.
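
A natural-order comparison might look like the sketch below. `naturalLess` is a hypothetical function, not an existing Elf2Mac or Boost API; it compares runs of digits by numeric value and everything else character by character.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical comparator: returns true if a sorts before b in natural order.
// Digit runs are compared by value (leading zeroes ignored), other characters
// are compared one at a time.
bool naturalLess(const std::string& a, const std::string& b)
{
    auto isDigit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };

    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (isDigit(a[i]) && isDigit(b[j])) {
            // Skip leading zeroes, then measure each digit run.
            while (i < a.size() && a[i] == '0') ++i;
            while (j < b.size() && b[j] == '0') ++j;
            const std::size_t startA = i, startB = j;
            while (i < a.size() && isDigit(a[i])) ++i;
            while (j < b.size() && isDigit(b[j])) ++j;
            const std::size_t lenA = i - startA, lenB = j - startB;

            // A longer run of significant digits is the larger number.
            if (lenA != lenB) return lenA < lenB;
            const int cmp = a.compare(startA, lenA, b, startB, lenB);
            if (cmp != 0) return cmp < 0;
        } else {
            if (a[i] != b[j]) return a[i] < b[j];
            ++i;
            ++j;
        }
    }
    // Whichever string still has characters left sorts later.
    return (a.size() - i) < (b.size() - j);
}
```

In the existing `std::sort` call, the lambda body would then become something like `return naturalLess(a->name, b->name);`.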

A third possibility would be not to assume that the segments are in any particular order and to extract the correct segment number by converting the string representation of the ID in the ELF file back into an integer and using that as the ID.
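
The third option could be sketched as follows. `codeIdFromSectionName` is a hypothetical helper; it assumes section names have the form `.code` followed only by decimal digits, as in the linker script excerpt above, and it requires C++17 for `std::optional`.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: recover the numeric ID from a section name such as
// ".code10". Returns std::nullopt if the name does not have the assumed form.
std::optional<int> codeIdFromSectionName(const std::string& name)
{
    const std::string prefix = ".code";
    if (name.compare(0, prefix.size(), prefix) != 0)
        return std::nullopt;

    const std::string digits = name.substr(prefix.size());
    // Reject empty suffixes and anything long enough to overflow an int.
    if (digits.empty() || digits.size() > 9)
        return std::nullopt;

    int id = 0;
    for (char c : digits) {
        if (c < '0' || c > '9')
            return std::nullopt;
        id = id * 10 + (c - '0');
    }
    return id;
}
```

With this approach the order of the sections no longer matters, and zero-padded names such as `.code00010` would parse to the same ID as `.code10`.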

ryandesign added a commit to ryandesign/Retro68 that referenced this issue Oct 27, 2022