machofile

machofile is a module to parse Mach-O binary files

Inspired by Ero Carrera's pefile, this module aims to provide similar capabilities for Mach-O binaries. The knowledge of the file format, the basic structures, and the constants are taken from the reference material and documentation listed below.

machofile is self-contained. The module has no dependencies; it is endianness independent; and it works on macOS, Windows, and Linux.
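
To illustrate what endianness independence involves, here is a minimal sketch, not machofile's actual implementation. It reads the first four bytes of a thin (non-fat) Mach-O buffer, infers the word size and byte order from the magic value, and then unpacks the fixed header fields with that byte order. The names `detect_layout` and `read_header` are hypothetical and exist only for this example.

```python
import struct

# The first four bytes of a thin Mach-O file, read as a big-endian integer.
# The value tells us both the word size and the byte order used by the file.
MAGICS = {
    0xFEEDFACE: ("32-bit", ">"),  # bytes FE ED FA CE: big-endian file
    0xCEFAEDFE: ("32-bit", "<"),  # bytes CE FA ED FE: little-endian file
    0xFEEDFACF: ("64-bit", ">"),  # bytes FE ED FA CF: big-endian file
    0xCFFAEDFE: ("64-bit", "<"),  # bytes CF FA ED FE: little-endian file
}

HEADER_FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
                 "ncmds", "sizeofcmds", "flags")


def detect_layout(data):
    """Return (word_size, struct_byte_order) for a thin Mach-O buffer."""
    if len(data) < 4:
        raise ValueError("buffer too short to contain a Mach-O magic")
    (raw,) = struct.unpack(">I", data[:4])
    try:
        return MAGICS[raw]
    except KeyError:
        raise ValueError(f"not a thin Mach-O file: magic {raw:#010x}") from None


def read_header(data):
    """Unpack the seven 32-bit header fields using the detected byte order."""
    word_size, order = detect_layout(data)
    values = struct.unpack_from(order + "7I", data, 0)
    return word_size, dict(zip(HEADER_FIELDS, values))
```

Because the byte order is taken from the file rather than from the host, the same code behaves identically on any platform Python runs on.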

While there are other Mach-O parsing modules out there, the motivations behind developing this one are:

  • first and foremost, it was a great way for me to dive deep into the Mach-O format and its structures
  • to provide a simple way to parse Mach-O files for analysis
  • to avoid depending on external modules (e.g. lief, macholib, macho, etc.): everything is extracted directly from the file, in pure Python.

This is still the very first alpha version (2023.11.04), so please let me know if you try it or find bugs, but also be gentle ;) The code will be optimized and more features will be added in the near future.

Current Features:

  • Parse Mach-O Header
  • Parse Load Commands
  • Parse File Segments
  • Parse Dylib Commands
  • Parse Dylib List

Note: as of now, this has only been tested against x86 and x86_64 Mach-O samples.
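
For readers new to the format, the sketch below shows roughly what parsing the load commands involves. It is an independent illustration under stated assumptions, not the code machofile uses: the function name `walk_load_commands` and its parameters are hypothetical, the input is a thin Mach-O buffer whose byte order is already known, and only a handful of command IDs are mapped to names. Each load command starts with two 32-bit fields, `cmd` and `cmdsize`, and the next command begins `cmdsize` bytes later.

```python
import struct

# A small subset of load command IDs from the Mach-O format.
LC_NAMES = {
    0x1: "LC_SEGMENT",
    0x2: "LC_SYMTAB",
    0x5: "LC_UNIXTHREAD",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x19: "LC_SEGMENT_64",
    0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE",
}


def walk_load_commands(data, order="<", is_64=False):
    """Return a list of {'cmd', 'cmdsize'} dicts for a thin Mach-O buffer."""
    # The 64-bit header has one extra reserved 32-bit field.
    header_size = 32 if is_64 else 28
    # ncmds and sizeofcmds sit at byte offsets 16 and 20 of the header.
    ncmds, sizeofcmds = struct.unpack_from(order + "2I", data, 16)
    offset = header_size
    end = min(header_size + sizeofcmds, len(data))
    table = []
    for _ in range(ncmds):
        if offset + 8 > end:
            raise ValueError("load command runs past the declared area")
        cmd, cmdsize = struct.unpack_from(order + "2I", data, offset)
        if cmdsize < 8:
            raise ValueError("cmdsize smaller than the command header")
        # Unknown command IDs are reported as hex strings.
        table.append({"cmd": LC_NAMES.get(cmd, hex(cmd)), "cmdsize": cmdsize})
        offset += cmdsize
    return table
```

The bounds checks matter for malware analysis in particular, where `ncmds` and `cmdsize` cannot be trusted.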

Next features to be implemented:

  • extract Entry Point
  • Parse Code Signature information
  • Embedded strings
  • File Attributes
  • data entropy calculation
  • flag for suspicious libraries
  • Packer detection
  • Hashes: dylib hash, import hash, export hash, ...
  • prettify output to console
  • add output option to yaml and json
  • add options to parse only specific structures

Credits

These are the people I would like to thank for being the inspiration that led me to write this module:

Usage and example

You can either use it from the command line or import it as a module in your Python code and call each function individually to parse only the structures you are interested in.

Module version

It expects to be supplied with either a file path or a data buffer to parse.

import machofile
macho = machofile.MachO(file_path='/path/to/machobinary')
macho = machofile.MachO('/path/to/machobinary')

The above two lines are equivalent and would load the Mach-O file and parse it. If the data buffer is already available, it can be supplied directly with:

import machofile
macho = machofile.MachO(data=bytes_variable)

You will then need to invoke the parse() method to start the parsing process, and can then call each function individually to parse only the structures you are interested in.

macho.parse()
dylib_cmd_list, dylib_lst = macho.get_dylib_commands()
...
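
As a small follow-up to the snippet above, the helper below turns the dylib list into plain strings. It is a sketch under one assumption: that the second value returned by `get_dylib_commands()` is a list of library names as `bytes`, which is what the example output further down suggests. The helper name `dylib_names` is hypothetical and not part of machofile.

```python
def dylib_names(macho):
    """Return the dylib names of a parsed object as plain strings.

    Assumes get_dylib_commands() returns (command_table, name_list) and
    that the names are bytes; anything else is passed through str().
    """
    _commands, names = macho.get_dylib_commands()
    return [
        name.decode("utf-8", errors="replace") if isinstance(name, bytes) else str(name)
        for name in names
    ]


# Intended usage, once machofile is importable:
#   import machofile
#   macho = machofile.MachO(file_path='/path/to/machobinary')
#   macho.parse()
#   print(dylib_names(macho))
```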

Command Line version

From the CLI, at the moment it just retrieves all the parsed structures. In the future there will be flags to get just one specific structure or a list of them.

% python3 machofile-cli.py -h
usage: machofile-cli.py [-h] -f FILE [-a] [-i] [-hd] [-l] [-sg] [-d]

Parse Mach-O file structures.

options:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  Path to the file to be parsed
  -a, --all             Print all info about the file
  -i, --info            Print general info about the file
  -hd, --header         Print Mach-O header info
  -l, --load_cmd_t      Print Load Command Table and Command list
  -sg, --segments       Print File Segments info
  -d, --dylib           Print Dylib Command Table and Dylib list

Example output:

% python3 machofile-cli.py -a -f b4f68a58658ceceb368520dafc35b270272ac27b8890d5b3ff0b968170471e2b 

[General File Info]
        Filename:    b4f68a58658ceceb368520dafc35b270272ac27b8890d5b3ff0b968170471e2b
        Filesize:    54240
        Filetype:    Mach-O i386 executable
        Flags:       <NOUNDEFS|DYLDLINK|TWOLEVEL>
        MD5:         20ffe440e4f557b9e03855b5da2b3c9c
        SHA1:        1bf61ecad8568a774f9fba726a254a9603d09f33
        SHA256:      b4f68a58658ceceb368520dafc35b270272ac27b8890d5b3ff0b968170471e2b

[Mac-O Header]
        magic:       MH_MAGIC (32-bit)
        cputype:     Intel i386
        cpusubtype:  x86_ALL, x86_64_H, x86_64_LIB64
        filetype:    MH_EXECUTE
        ncmds:       13
        sizeofcmds:  1180
        flags:       MH_NOUNDEFS, MH_DYLDLINK, MH_TWOLEVEL

[Load Cmd table]
        {'cmd': 'LC_SEGMENT', 'cmdsize': 56}
        {'cmd': 'LC_SEGMENT', 'cmdsize': 192}
        {'cmd': 'LC_SEGMENT', 'cmdsize': 328}
        {'cmd': 'LC_SEGMENT', 'cmdsize': 192}
        {'cmd': 'LC_SEGMENT', 'cmdsize': 56}
        {'cmd': 'LC_SYMTAB', 'cmdsize': 24}
        {'cmd': 'LC_DYSYMTAB', 'cmdsize': 80}
        {'cmd': 'LC_LOAD_DYLINKER', 'cmdsize': 28}
        {'cmd': 'LC_UUID', 'cmdsize': 24}
        {'cmd': 'LC_UNIXTHREAD', 'cmdsize': 80}
        {'cmd': 'LC_LOAD_DYLIB', 'cmdsize': 52}
        {'cmd': 'LC_LOAD_DYLIB', 'cmdsize': 52}
        {'cmd': 'LC_CODE_SIGNATURE', 'cmdsize': 16}

[Load Commands]
        LC_CODE_SIGNATURE
        LC_DYSYMTAB
        LC_LOAD_DYLIB
        LC_LOAD_DYLINKER
        LC_SYMTAB
        LC_UNIXTHREAD
        LC_UUID

[File Segments]
        SEGNAME    VADDR VSIZE OFFSET SIZE  MAX_VM_PROTECTION INITIAL_VM_PROTECTION NSECTS FLAGS
        ----------------------------------------------------------------------------------------
        __PAGEZERO 0     4096  0      0     0                 0                     0      0    
        __TEXT     4096  28672 0      28672 7                 5                     2      0    
        __DATA     32768 4096  28672  4096  7                 3                     4      0    
        __IMPORT   36864 4096  32768  4096  7                 7                     2      0    
        __LINKEDIT 40960 20480 36864  17376 7                 1                     0      0    

[Dylib Commands]
        DYLIB_NAME_OFFSET DYLIB_TIMESTAMP DYLIB_CURRENT_VERSION DYLIB_COMPAT_VERSION DYLIB_NAME                   
        ----------------------------------------------------------------------------------------------------------
        24                2               65536                 65536                b'/usr/lib/libgcc_s.1.dylib' 
        24                2               7274759               65536                b'/usr/lib/libSystem.B.dylib'

[Dylib Names]
        b'/usr/lib/libgcc_s.1.dylib'
        b'/usr/lib/libSystem.B.dylib'
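
Several values in the output above are printed as raw integers. The sketch below shows how two of them can be decoded by hand; it is an illustration and not part of machofile, and the function names `decode_dylib_version` and `decode_vm_prot` are hypothetical. Dylib versions pack major, minor, and patch numbers into 16, 8, and 8 bits, and VM protections are a bitmask of read (1), write (2), and execute (4).

```python
def decode_dylib_version(value):
    """Decode a packed dylib version (major.minor.patch as 16.8.8 bits)."""
    major = value >> 16
    minor = (value >> 8) & 0xFF
    patch = value & 0xFF
    return f"{major}.{minor}.{patch}"


# Protection bits used in the segment's maxprot and initprot fields.
VM_PROT_BITS = (("r", 0x1), ("w", 0x2), ("x", 0x4))


def decode_vm_prot(value):
    """Render a VM protection bitmask in the familiar rwx notation."""
    return "".join(ch if value & bit else "-" for ch, bit in VM_PROT_BITS)


print(decode_dylib_version(7274759))  # libSystem.B.dylib current version
print(decode_vm_prot(5))              # __TEXT initial protection
```

With the sample above, this gives `111.1.7` for the libSystem current version and `r-x` for the initial protection of `__TEXT`.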

Reference/Documentation links: