Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mach-O file format support #617

Merged
merged 38 commits into from Nov 21, 2017
Merged

Mach-O file format support #617

merged 38 commits into from Nov 21, 2017

Conversation

ghost
Copy link

@ghost ghost commented Mar 6, 2017

New module adds support for Mach-O files (binary format for OS X, iOS, watchOS and tvOS). Module can load basic header (architecture information, file type...), segments, sections and entry point information. Module also supports fat Mach-O files (archive for Mach-O files for different architectures). Tests are included. Documentation is not included, but I can write it when module is eventually accepted.

@plusvic
Copy link
Member

plusvic commented Mar 8, 2017

Test cases for this module are failing when compiled for 32bit with -m32.

Check this: https://travis-ci.org/VirusTotal/yara/jobs/208363567. Could you take a look at what's happening?

@ghost
Copy link
Author

ghost commented Mar 10, 2017

I managed to fix 32-bit build error. I will need help with rest of the build issues as I do not have neither Windows nor OS X machine.

@ghost
Copy link
Author

ghost commented Mar 24, 2017

with friend's help I managed to fix rest of the build issues on Windows and OS X. Module should be ready for review.

if (is_undefined(module, "nfat_arch"))
return_integer(UNDEFINED);

for (uint64_t i = 0; i < nfat; ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to use a 64bit iterator here, the fat counter is ut32, also is better to use i++, because preincrement doesn't gives any benefit here

int64_t type_arg = integer_argument(1);
int64_t subtype_arg = integer_argument(2);

uint64_t nfat = get_integer(module, "nfat_arch");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nfat is uint32_t, there's no need to use 64bit integer here

size_t offset = yr_be32toh(archs[i].offset);
if (size < offset)
{
st->offsets = (size_t*)yr_realloc(st->offsets,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont do realloc over the same pointer, use a temporary intermediate one and check if it's returning null. to avoid memory leaks

set_integer(yr_be32toh(header->nfat_arch), object, "nfat_arch");

fat_arch_t* archs = (fat_arch_t*)(header + 1);
for (size_t i = 0; i < yr_be32toh(header->nfat_arch); ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont preincrement

set_integer(yr_be32toh(header->nfat_arch), object, "nfat_arch");

fat_arch_t* archs = (fat_arch_t*)(header + 1);
for (size_t i = 0; i < yr_be32toh(header->nfat_arch); ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't call yr_be32toh on every iteration, this field is not suposed to change during the loop


// All data in Mach-O fat binary header(s) is in big-endian byte order.

fat_header_t* header = (fat_header_t*)data;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use const

YR_OBJECT* object,
YR_SCAN_CONTEXT* context)
{
uint8_t* magic = (uint8_t*)data;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use const

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i cant see any boundary checks, without trying this pr, it feels like this code will crash when parsing truncated files


// Get length of string with limit (some strings are terminated by size limit).

size_t macho_limited_strlen(char* string, size_t limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's already in libc and it's called strnlen

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but as long as it's from ISO C90, microsoft didn't had time yet to include it in their toolchains, so this strnlen will not work for mingw/msvc builds. But libc version will be more optimal, so it will be good to use it if available

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input string is const

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with strnlen - seems OK according to build tests.


// Get length of string with limit (some strings are terminated by size limit).

size_t macho_limited_strlen(char* string, size_t limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function should be static, it's not used outside this .c file


// Convert virtual address to file offset. Segments have to be already loaded.

int macho_rva_to_offset(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdbool is available on all compilers, maybe it will be good to use bool and true / false here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this. It is not used anywhere in the project. Maybe I would prefer consistency here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a good code practice, but consistency across the project makes sense. this change should be proposed across the whole project, so it fits better in a separate PR.

for (uint64_t i = 0; i < segment_count; ++i)
{
uint64_t start = get_integer(object, "segments[%i].vmaddr", i);
uint64_t end = start + get_integer(object, "segments[%i].vmsize", i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

end can be < start because get_integer returns a signed integer.

)
{
uint64_t segment_count = get_integer(object, "number_of_segments");
for (uint64_t i = 0; i < segment_count; ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont preincrement unless strictly necessary

YR_OBJECT* object
)
{
uint64_t segment_count = get_integer(object, "number_of_segments");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_integer returns a signed integer, casting it blindly to unsigned can result on some unexpected behaviour

Copy link
Author

@ghost ghost Jun 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this problem even with exact-wide types (has to be exactly wide and in 2's complement)? If so, how to go around this? Via pointers - *(uint64_t*)&value? Because I need it to be unsigned (e.g. offset in file bigger that 2^32 - unlikely but possible).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you tested this code with any mach0 kernel or module which use 64bit addresses to see if it behaves correctly? probably an & -1LL would remove that undefined behaviour

int64_t type = get_integer(module, "fat_arch[%i].cputype", i);
if (type == type_arg)
{
uint64_t file_offset = get_integer(module, "fat_arch[%i].offset", i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implicit sign cast


foreach_memory_block(iterator, block)
{
void* block_data = block->fetch_data(block);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetch_data should return the amount of data left

YR_MEMORY_BLOCK_ITERATOR* iterator = context->iterator;

// Prepare storage.
FAT_DATA* st = (FAT_DATA*)yr_malloc(sizeof(FAT_DATA));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check if malloc returns null

void* file_data = (uint8_t*)block_data + offset;
YR_OBJECT* object = get_object(module_object, "file[%i]",
st->file_count);
macho_parse_file(file_data, object, context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macho_parse_file knows the file_data pointer, but doesn't knows the size of it

@trufae
Copy link
Contributor

trufae commented Jul 26, 2017

Looks good to merge for me! cc @plusvic

@radare
Copy link

radare commented Jul 26, 2017

looks good to me for merging!

@trufae
Copy link
Contributor

trufae commented Aug 11, 2017

@plusvic ? there are tons of prs that lgtm and are green for a long time, anything missing to merge?

@plusvic
Copy link
Member

plusvic commented Aug 11, 2017 via email

@radare
Copy link

radare commented Aug 11, 2017 via email

@plusvic
Copy link
Member

plusvic commented Aug 11, 2017 via email

@plusvic plusvic modified the milestones: v3.7.0, v3.8.0 Oct 31, 2017
@trufae
Copy link
Contributor

trufae commented Nov 2, 2017

looks green for travis, but there are still comments from my review that havent been fixed. and cant be merged because of conflicts, pls review. PRs shouldnt take ages to be merged :((

Copy link
Member

@plusvic plusvic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some comments. My biggest concern with this PR so far is that the parsing is done assuming the correctness of the parsed file, no efforts are made in order to make sure that it won't crash with malformed files.


uint32_t* magic = (uint32_t*)block_data;
// Parse classic Mach-O file.
if (macho_is_file_block(magic))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macho_is_file_block dereferences a pointer to uint32_t without checking if the buffer size is at least sizeof(uint32_t)

\
uint64_t num_sg = 0; \
uint8_t *command = (uint8_t*)(header + 1); \
for (unsigned i = 0; i < yr_##bo##32toh(header->ncmds); i++) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sanity checks for preventing out-of-bounds reads. If header->ncmds is altered or the file is truncated after the header this will segfault.

size_t offset = yr_be##bits##toh(archs[i].offset); \
if (size < offset) \
{ \
st->offsets = (size_t*)yr_realloc(st->offsets, \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yr_realloc could return NULL.

\
/* Get specific Mach-O file data. */ \
void* file_data = (uint8_t*)data + offset; \
macho_parse_file(file_data, size, get_object(object, "file[%i]", i), \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blindly assume that offset and data are correct and will lay inside the buffer, but these values come from the file header and should be sanitized.

}

// Parse Mach-O fat binary.
if (macho_is_fat_file_block(magic))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem here, we don't know if the buffer is at least sizeof(uint32_t).

set_integer(yr_##bo##32toh(sg->flags), \
object, "segments[%i].flags", i); \
\
for (unsigned j = 0; j < yr_##bo##32toh(sg->nsects); ++j) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sanity checks for preventing out-of-bounds reads. If sg->nsects is altered or the file is truncated after the header this will segfault.


declare_function("file_index", "i", "i", file_index_type);
declare_function("file_index", "ii", "i", file_index_subtype);
declare_function("ep_for_arch", "i", "i", entry_point_for_arch_type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call this entry_point_for_arch for consistency with entry_point.

}

// Parse previously out of block files.
for (size_t i = 0; i < st->count; i++)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not actually necessary. You can safely assume that a Mach-O file will come in a single block, no matter how big it is. Multiple blocks are mainly for scanning process memory, where each block represents memory region.

@ghost
Copy link
Author

ghost commented Nov 6, 2017

I was working on other task at work and it is middle of midterm exams season too, I will take look at it asap - probably end of this week or next one.

@ghost
Copy link
Author

ghost commented Nov 9, 2017

Most problems should be fixed.

There are two problems remaining:

  1. int64_t to uint64_t cast mentioned by trufae. I believe it should be OK but I am not sure. In C++ I would just use reinterpret_cast, no idea what is C alternative for this.

  2. Making macho module optional. In this case I am not sure if Travis and AppVeyor will work after doing this so this should be probably done right after PR merge and not before.

@trufae
Copy link
Contributor

trufae commented Nov 21, 2017

i think its ok to merge as it is, other modules got merged with much less review and those tiny issues only affect when working with kernel mach0s and can be fixed after that with proper tests

@plusvic plusvic merged commit 0c57f0b into VirusTotal:master Nov 21, 2017
@plusvic
Copy link
Member

plusvic commented Nov 21, 2017

Coverity has found 8 issues with this PR:

Hi,

Please find the latest report on new defect(s) introduced to plusvic/yara found with Coverity Scan.

8 new defect(s) introduced to plusvic/yara found with Coverity Scan.


New defect(s) Reported-by: Coverity Scan
Showing 8 of 8 defect(s)


** CID 179966:  Program hangs  (NEGATIVE_RETURNS)
/libyara/modules/macho.c: 124 in macho_offset_to_rva()


________________________________________________________________________________________________________
*** CID 179966:  Program hangs  (NEGATIVE_RETURNS)
/libyara/modules/macho.c: 124 in macho_offset_to_rva()
118     int macho_offset_to_rva(
119         uint64_t offset,
120         uint64_t* result,
121         YR_OBJECT* object)
122     {
123       uint64_t segment_count = get_integer(object, "number_of_segments");
>>>     CID 179966:  Program hangs  (NEGATIVE_RETURNS)
>>>     Using unsigned variable "segment_count" in a loop exit condition.
124       for (uint64_t i = 0; i < segment_count; i++)
125       {
126         uint64_t start = get_integer(object, "segments[%i].fileoff", i);
127         uint64_t end = start + get_integer(object, "segments[%i].filesize", i);
128
129         if (offset >= start && offset < end)

** CID 179965:  Program hangs  (NEGATIVE_RETURNS)
/libyara/modules/macho.c: 100 in macho_rva_to_offset()


________________________________________________________________________________________________________
*** CID 179965:  Program hangs  (NEGATIVE_RETURNS)
/libyara/modules/macho.c: 100 in macho_rva_to_offset()
94     int macho_rva_to_offset(
95         uint64_t address,
96         uint64_t* result,
97         YR_OBJECT* object)
98     {
99       uint64_t segment_count = get_integer(object, "number_of_segments");
>>>     CID 179965:  Program hangs  (NEGATIVE_RETURNS)
>>>     Using unsigned variable "segment_count" in a loop exit condition.
100       for (uint64_t i = 0; i < segment_count; i++)
101       {
102         uint64_t start = get_integer(object, "segments[%i].vmaddr", i);
103         uint64_t end = start + get_integer(object, "segments[%i].vmsize", i);
104
105         if (address >= start && address < end)

** CID 179964:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 334 in macho_handle_segment_64_be()


________________________________________________________________________________________________________
*** CID 179964:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 334 in macho_handle_segment_64_be()
328       }                                                                            \
329     }                                                                              \
330
331     MACHO_HANDLE_SEGMENT(32,le)
332     MACHO_HANDLE_SEGMENT(64,le)
333     MACHO_HANDLE_SEGMENT(32,be)
>>>     CID 179964:  Insecure data handling  (TAINTED_SCALAR)
>>>     Using tainted variable "_yr_bswap32(sg->nsects)" as a loop boundary.
334     MACHO_HANDLE_SEGMENT(64,be)
335
336
337     // Parse Mach-O file with specific bit-width and byte order.
338
339     #define MACHO_PARSE_FILE(bits,bo)                                              \

** CID 179963:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 492 in macho_parse_fat_file_32()


________________________________________________________________________________________________________
*** CID 179963:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 492 in macho_parse_fat_file_32()
486         /* Get specific Mach-O file data. */                                       \
487         macho_parse_file(data + offset, file_size,                                 \
488                          get_object(object, "file[%i]", i), context);              \
489       }                                                                            \
490     }                                                                              \
491
>>>     CID 179963:  Insecure data handling  (TAINTED_SCALAR)
>>>     Using tainted variable "count" as a loop boundary.
492     MACHO_PARSE_FAT_FILE(32)
493     MACHO_PARSE_FAT_FILE(64)
494
495
496     // Parse Mach-O fat file.
497

** CID 179962:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 333 in macho_handle_segment_32_be()


________________________________________________________________________________________________________
*** CID 179962:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 333 in macho_handle_segment_32_be()
327         }                                                                          \
328       }                                                                            \
329     }                                                                              \
330
331     MACHO_HANDLE_SEGMENT(32,le)
332     MACHO_HANDLE_SEGMENT(64,le)
>>>     CID 179962:  Insecure data handling  (TAINTED_SCALAR)
>>>     Using tainted variable "_yr_bswap32(sg->nsects)" as a loop boundary.
333     MACHO_HANDLE_SEGMENT(32,be)
334     MACHO_HANDLE_SEGMENT(64,be)
335
336
337     // Parse Mach-O file with specific bit-width and byte order.
338

** CID 179961:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 493 in macho_parse_fat_file_64()


________________________________________________________________________________________________________
*** CID 179961:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 493 in macho_parse_fat_file_64()
487         macho_parse_file(data + offset, file_size,                                 \
488                          get_object(object, "file[%i]", i), context);              \
489       }                                                                            \
490     }                                                                              \
491
492     MACHO_PARSE_FAT_FILE(32)
>>>     CID 179961:  Insecure data handling  (TAINTED_SCALAR)
>>>     Using tainted variable "count" as a loop boundary.
493     MACHO_PARSE_FAT_FILE(64)
494
495
496     // Parse Mach-O fat file.
497
498     void macho_parse_fat_file(

** CID 179960:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 405 in macho_parse_file_64_be()


________________________________________________________________________________________________________
*** CID 179960:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 405 in macho_parse_file_64_be()
399       set_integer(seg_count, object, "number_of_segments");                        \
400     }                                                                              \
401
402     MACHO_PARSE_FILE(32,le)
403     MACHO_PARSE_FILE(64,le)
404     MACHO_PARSE_FILE(32,be)
>>>     CID 179960:  Insecure data handling  (TAINTED_SCALAR)
>>>     Using tainted variable "_yr_bswap32(header->ncmds)" as a loop boundary.
405     MACHO_PARSE_FILE(64,be)
406
407
408     // Parse Mach-O file.
409
410     void macho_parse_file(

** CID 179959:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 404 in macho_parse_file_32_be()


________________________________________________________________________________________________________
*** CID 179959:  Insecure data handling  (TAINTED_SCALAR)
/libyara/modules/macho.c: 404 in macho_parse_file_32_be()
398                                                                                    \
399       set_integer(seg_count, object, "number_of_segments");                        \
400     }                                                                              \
401
402     MACHO_PARSE_FILE(32,le)
403     MACHO_PARSE_FILE(64,le)
>>>     CID 179959:  Insecure data handling  (TAINTED_SCALAR)
>>>     Using tainted variable "_yr_bswap32(header->ncmds)" as a loop boundary.
404     MACHO_PARSE_FILE(32,be)
405     MACHO_PARSE_FILE(64,be)
406
407
408     // Parse Mach-O file.
409

@wxsBSD
Copy link
Collaborator

wxsBSD commented Nov 21, 2017

Along with the coverity findings mentioned above, this PR unintentionally reverts some changes to hex_grammar.y. I may be misreading this but I think it's worth a closer look to make sure those changes are intentional.

@plusvic
Copy link
Member

plusvic commented Nov 21, 2017

I'll review the changes in hex_grammar.y.

Besides the issues found by Coverity, ASAN found a heap overflow, it may be related, maybe not, I didn't look into the details. I can share the poc file.

INFO: Seed: 4052154712
INFO: Loaded 1 modules   (6559 inline 8-bit counters): 6559 [0x99a4c8, 0x99be67), 
INFO: Loaded 1 PC tables (6559 PCs): 6559 [0x707390,0x720d80), 
INFO:       10 files found in /tmp/input
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 62032 bytes
INFO: seed corpus: files: 10 min: 20b max: 62032b total: 165977b rss: 45Mb
=================================================================
==10==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000005220 at pc 0x00000056d612 bp 0x7fff25eb7d20 sp 0x7fff25eb7d18
READ of size 4 at 0x603000005220 thread T0
SCARINESS: 17 (4-byte-read-heap-buffer-overflow)
    #0 0x56d611 in macho_parse_file_32_le /src/yara/libyara/modules/macho.c:402:1
    #1 0x5762e8 in macho__load /src/yara/libyara/modules/macho.c:1150:7
    #2 0x52af68 in yr_modules_load /src/yara/libyara/modules.c:175:16
    #3 0x5946e2 in yr_execute_code /src/yara/libyara/exec.c:865:18
    #4 0x543eba in yr_rules_scan_mem_blocks /src/yara/libyara/rules.c:472:3
    #5 0x54509e in yr_rules_scan_mem /src/yara/libyara/rules.c:586:10
    #6 0x51aa28 in LLVMFuzzerTestOneInput /src/yara/tests/oss-fuzz/macho_fuzzer.cc:77:3
    #7 0x5dc3c0 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:512:13
    #8 0x5daa76 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*) /src/libfuzzer/FuzzerLoop.cpp:440:3
    #9 0x5df4b2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:715:7
    #10 0x5dfbe8 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:731:3
    #11 0x5c6c3d in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:743:6
    #12 0x5b9e78 in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #13 0x7f92f1bec82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #14 0x41c638 in _start (/out/macho_fuzzer+0x41c638)

0x603000005220 is located 4 bytes to the right of 28-byte region [0x603000005200,0x60300000521c)
allocated by thread T0 here:
    #0 0x516628 in operator new[](unsigned long) /src/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:95
    #1 0x5dc1a8 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:503:23
    #2 0x5daa76 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*) /src/libfuzzer/FuzzerLoop.cpp:440:3
    #3 0x5df4b2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:715:7
    #4 0x5dfbe8 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:731:3
    #5 0x5c6c3d in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:743:6
    #6 0x5b9e78 in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #7 0x7f92f1bec82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: heap-buffer-overflow /src/yara/libyara/modules/macho.c:402:1 in macho_parse_file_32_le
Shadow bytes around the buggy address:
  0x0c067fff89f0: fa fa fd fd fd fa fa fa fd fd fd fd fa fa fd fd
  0x0c067fff8a00: fd fd fa fa fd fd fd fa fa fa fd fd fd fa fa fa
  0x0c067fff8a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
  0x0c067fff8a20: fa fa fd fd fd fa fa fa fd fd fd fd fa fa fd fd
  0x0c067fff8a30: fd fa fa fa 00 00 04 fa fa fa 00 00 00 04 fa fa
=>0x0c067fff8a40: 00 00 00 04[fa]fa 00 00 00 fa fa fa 00 00 00 fa
  0x0c067fff8a50: fa fa 00 00 00 fa fa fa 00 00 02 fa fa fa 00 00
  0x0c067fff8a60: 01 fa fa fa 00 00 01 fa fa fa 00 00 01 fa fa fa
  0x0c067fff8a70: 00 00 01 fa fa fa 00 00 03 fa fa fa 00 00 00 04
  0x0c067fff8a80: fa fa 00 00 05 fa fa fa 00 00 07 fa fa fa 00 00
  0x0c067fff8a90: 02 fa fa fa 00 00 01 fa fa fa 00 00 04 fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==10==ABORTING
MS: 0 ; base unit: 0000000000000000000000000000000000000000
0xce,0xfa,0xed,0xfe,0x7,0x0,0x0,0x0,0x3,0x0,0x0,0x0,0x6,0x0,0x0,0x0,0x12,0x0,0x0,0x0,0xfc,0xa,0x0,0x0,0x85,0x0,0x10,0x0,
\xce\xfa\xed\xfe\x07\x00\x00\x00\x03\x00\x00\x00\x06\x00\x00\x00\x12\x00\x00\x00\xfc\x0a\x00\x00\x85\x00\x10\x00
artifact_prefix='./'; Test unit written to ./crash-083a09b7a6b6bc21c1be120b5ae8e3d667cf9db4
Base64: zvrt/gcAAAADAAAABgAAABIAAAD8CgAAhQAQAA==
```


@plusvic
Copy link
Member

plusvic commented Nov 22, 2017

In 300374f I've excluded the macho from the build by default. You need to use ./configure --enable-macho for building the module into YARA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants