@Jepson2k (Collaborator) commented Feb 6, 2024

One sentence summary of this PR (This should go in the CHANGELOG!)
Added an ESP unpacker to support unpacking Espressif binaries.

Link to Related Issue(s)
#410

Please describe the changes in your request.
Added esp.py to ofrak_core/ofrak/core, which contains multiple resource views for ESP binaries as well as identification and unpacking. It uses esptool.py for most of the work.

Anyone you think should look at this, specifically?
@rbs-jacob

There are multiple "TODO:"s located in the code, a few of which are notes for things I'm unsure about, and the rest are questions that I couldn't find answers to in the contribution guidelines.

@rbs-jacob (Member) left a comment

I did not review all of this in great detail, but there are some substantial changes needed before I'd be comfortable merging this in.

Comment on lines 47 to 62
def determine_chip(f: _TemporaryFileWrapper) -> str:
    """
    Determines the chip type based on the firmware image.
    :param f: A temporary file object containing the firmware image
    :return: The chip name as a string, defaults to 'esp8266' if not determined
    """
    extended_header = f.read(16)
    if extended_header[-1] not in [0, 1]:
        return "esp8266"

    chip_id = int.from_bytes(extended_header[4:5], "little")
    for rom in [n for n in ROM_LIST if n.CHIP_NAME != "ESP8266"]:
        if chip_id == rom.IMAGE_CHIP_ID:
            return rom.CHIP_NAME.lower()
    return "esp8266"

Member

For type-checking purposes, this should probably return an enum instead of a string.
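
As a rough sketch of what that could look like (the `EspChip` name, its members, and the `chip_from_name` helper are all hypothetical and not part of this PR or of esptool), the string values stay available through `.value` while callers get a checkable type:

```python
from enum import Enum


class EspChip(Enum):
    """Hypothetical enum of chip types; the member list is illustrative only."""

    ESP8266 = "esp8266"
    ESP32 = "esp32"
    ESP32S2 = "esp32s2"


def chip_from_name(name: str) -> EspChip:
    """Map a chip name string to an EspChip.

    Falls back to ESP8266 for unknown names, mirroring the default
    in the snippet under review.
    """
    try:
        return EspChip(name.lower())
    except ValueError:
        return EspChip.ESP8266
```

With something like this, `determine_chip` could be annotated as returning `EspChip`, and a misspelled chip name at a call site would be flagged by the type checker rather than at runtime.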

Member

Also, why use a temporary file wrapper instead of indexing into the data itself? All of the data is already stored in memory.
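
A minimal sketch of the in-memory version, assuming the extended header starts at a known `offset` within the image bytes (the `read_chip_id` name and the `offset` parameter are hypothetical; the byte positions are copied from the snippet above, not checked against the image format):

```python
from typing import Optional

EXTENDED_HEADER_SIZE = 16  # same length the snippet passes to f.read()


def read_chip_id(data: bytes, offset: int = 0) -> Optional[int]:
    """Return the chip ID byte from the extended header, or None.

    None means the slice is too short or its last byte is not 0 or 1,
    which is the case where the snippet falls back to "esp8266".
    """
    extended_header = data[offset : offset + EXTENDED_HEADER_SIZE]
    if len(extended_header) < EXTENDED_HEADER_SIZE:
        return None
    if extended_header[-1] not in (0, 1):
        return None
    # Same single-byte slice as the original code.
    return int.from_bytes(extended_header[4:5], "little")
```

Slicing `bytes` this way needs no file handle or seek position, and a short read shows up as an explicit length check instead of an `IndexError`.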

Member

Philosophically, there are a lot of child types here that we don't need to define.

To make an analogy: the OFRAK tar unpacker unpacks files, not tar blocks. We want to do the equivalent here and unpack only what will actually be semantically useful for further layers of unpacking or analysis. We don't necessarily need tagged children for each part of the file type, especially if those parts are metadata.

Unfortunately, this guiding philosophy isn't documented anywhere, so there was no way you could have known this beforehand. Also, if any of this is based on the ELF unpacker, note that that one deviates from this philosophy a bit.
