Skip to content

Fix bug where ASan shadow region was overlapping static data #15806

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 17, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 44 additions & 17 deletions emcc.py
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,11 @@ def base64_encode(b):
return b64.decode('ascii')


def align_to_wasm_page_boundary(address):
page_size = webassembly.WASM_PAGE_SIZE
return ((address + (page_size - 1)) // page_size) * page_size


@unique
class OFormat(Enum):
# Output a relocatable object file. We use this
Expand Down Expand Up @@ -2179,6 +2184,17 @@ def check_memory_setting(setting):
'emscripten_builtin_free',
]

if ('leak' in sanitize or 'address' in sanitize) and not settings.ALLOW_MEMORY_GROWTH:
# Increase the minimum memory requirements to account for extra memory
# that the sanitizers might need (in addition to the shadow memory
# requirements handled below).
# These values are designed be an over-estimate of the actual requirements and
# are based on experimentation with different tests/programs under asan and
# lsan.
settings.INITIAL_MEMORY += 50 * 1024 * 1024
if settings.USE_PTHREADS:
settings.INITIAL_MEMORY += 50 * 1024 * 1024

if settings.USE_OFFSET_CONVERTER and settings.WASM2JS:
exit_with_error('wasm2js is not compatible with USE_OFFSET_CONVERTER (see #14630)')

Expand Down Expand Up @@ -2235,27 +2251,38 @@ def check_memory_setting(setting):
if settings.GLOBAL_BASE != -1:
exit_with_error("ASan does not support custom GLOBAL_BASE")

max_mem = settings.INITIAL_MEMORY
# Increase the TOTAL_MEMORY and shift GLOBAL_BASE to account for
# the ASan shadow region which starts at address zero.
# The shadow region is 1/8th the size of the total memory and is
# itself part of the total memory.
# We use the following variables in this calculation:
# - user_mem : memory usable/visible by the user program.
# - shadow_size : memory used by asan for shadow memory.
# - total_mem : the sum of the above. this is the size of the wasm memory (and must be aligned to WASM_PAGE_SIZE)
user_mem = settings.INITIAL_MEMORY
if settings.ALLOW_MEMORY_GROWTH:
max_mem = settings.MAXIMUM_MEMORY
user_mem = settings.MAXIMUM_MEMORY

shadow_size = max_mem // 8
settings.GLOBAL_BASE = shadow_size
# Given the know value of user memory size we can work backwards
# to find the total memory and the shadow size based on the fact
# that the user memory is 7/8ths of the total memory.
# (i.e. user_mem == total_mem * 7 / 8
total_mem = user_mem * 8 / 7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly worried about rounding errors here, since I assume ASan rounds up, that is, isn't the size of shadow memory potentially slightly more than user memory / 7?

I would compute this is

shadow_mem = ..however asan rounds..
total_mem = user_mem + shadow_mem

Or does ASan actually compute it as a fraction of total memory, not user memory? In that case your code lgtm

Copy link
Collaborator Author

@sbc100 sbc100 Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASan literally does a shift to from a pointer to its shadow byte:

#define MEM_TO_SHADOW(mem) ((mem) >> SHADOW_SCALE)

The problem with trying to start by calculating the size of the shadow region is that you can't do that without also knowing the size of the shadow region.. since the bigger the shadow region the more shadow region you need because it shift the reset of the memory upwards as it grows.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion that I've added to the code I think means that if these calculation are wrong we would know right away.. at least if we ever under-estimate the size of the shadow region which is the dangerous direction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait if SHADOW_SCALE=3 then total memory is 9/8 times user memory, isn't it, and not 8/7?

Copy link
Collaborator Author

@sbc100 sbc100 Dec 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because SHADOW_SCALE=3 the size of the shadow memory is 1/8th size of the total memory. But shadow memory is also part of the total memory and because shadow memory leaves at address zero that leave 7/8ths of the total memory left, for what we are calling user memory.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what its worth the calculation of __global_base in emcc.py is puts its exactly at the address calculated for max_shadow_address in the native code.. which seems like fairly good evidence that the calculation here are correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. sgtm


sanitizer_mem = (shadow_size + webassembly.WASM_PAGE_SIZE) & ~webassembly.WASM_PAGE_SIZE
# sanitizers do at least 9 page allocs of a single page during startup.
sanitizer_mem += webassembly.WASM_PAGE_SIZE * 9
# we also allocate at least 11 "regions". Each region is kRegionSize (2 << 20) but
# MmapAlignedOrDieOnFatalError adds another 2 << 20 for alignment.
sanitizer_mem += (1 << 21) * 11
# When running in the threaded mode asan needs to allocate an array of kMaxNumberOfThreads
# (1 << 22) pointers. See compiler-rt/lib/asan/asan_thread.cpp.
if settings.USE_PTHREADS:
sanitizer_mem += (1 << 22) * 4
# But we might need to re-align to wasm page size
total_mem = int(align_to_wasm_page_boundary(total_mem))

# The shadow size is 1/8th the resulting rounded up size
shadow_size = total_mem // 8

# Increase the size of the initial memory according to how much memory
# we think the sanitizers will use.
settings.INITIAL_MEMORY += sanitizer_mem
# We start our global data after the shadow memory.
# We don't need to worry about alignment here. wasm-ld will take care of that.
settings.GLOBAL_BASE = shadow_size

if not settings.ALLOW_MEMORY_GROWTH:
settings.INITIAL_MEMORY = total_mem
else:
settings.INITIAL_MEMORY += align_to_wasm_page_boundary(shadow_size)

if settings.SAFE_HEAP:
# SAFE_HEAP instruments ASan's shadow memory accesses.
Expand Down
12 changes: 12 additions & 0 deletions system/lib/compiler-rt/lib/asan/asan_emscripten.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,8 @@

#if SANITIZER_EMSCRIPTEN
#include <emscripten.h>
#include <emscripten/heap.h>
#include <cassert>
#include <cstddef>
#include <pthread.h>
#define __ATTRP_C11_THREAD ((void*)(uptr)-1)
Expand All @@ -18,6 +20,16 @@ void InitializeShadowMemory() {
// Poison the shadow memory of the shadow area at the start of the address
// space. This helps catching null pointer dereference.
FastPoisonShadow(kLowShadowBeg, kLowShadowEnd - kLowShadowBeg, 0xff);

// Assert that the shadow region is large enough. We don't want to start
// running into the static data region which starts right after the shadow
// region.
uptr max_address =
(__builtin_wasm_memory_size(0) * uint64_t(WASM_PAGE_SIZE)) - 1;
uptr max_shadow_address = MEM_TO_SHADOW(max_address);
// TODO(sbc): In the growable memory case we should really be checking this
// every time we grow.
assert(max_shadow_address <= kLowShadowEnd && "shadow region is too small");
}

void AsanCheckDynamicRTPrereqs() {}
Expand Down
34 changes: 20 additions & 14 deletions tests/test_other.py
Original file line number Diff line number Diff line change
Expand Up @@ -9312,12 +9312,13 @@ def test_malloc_none(self):
self.assertContained('undefined symbol: malloc', stderr)

@parameterized({
'c': ['c'],
'cpp': ['cpp'],
'c': ['c', []],
'cpp': ['cpp', []],
'growth': ['cpp', ['-sALLOW_MEMORY_GROWTH']],
})
def test_lsan_leaks(self, ext):
def test_lsan_leaks(self, ext, args):
self.do_smart_test(test_file('other/test_lsan_leaks.' + ext),
emcc_args=['-fsanitize=leak', '-s', 'ALLOW_MEMORY_GROWTH'],
emcc_args=['-fsanitize=leak'] + args,
assert_returncode=NON_ZERO, literals=[
'Direct leak of 2048 byte(s) in 1 object(s) allocated from',
'Direct leak of 1337 byte(s) in 1 object(s) allocated from',
Expand All @@ -9342,7 +9343,7 @@ def test_lsan_leaks(self, ext):
})
def test_lsan_stack_trace(self, ext, regexes):
self.do_smart_test(test_file('other/test_lsan_leaks.' + ext),
emcc_args=['-fsanitize=leak', '-s', 'ALLOW_MEMORY_GROWTH', '-gsource-map'],
emcc_args=['-fsanitize=leak', '-gsource-map'],
assert_returncode=NON_ZERO, literals=[
'Direct leak of 2048 byte(s) in 1 object(s) allocated from',
'Direct leak of 1337 byte(s) in 1 object(s) allocated from',
Expand All @@ -9355,39 +9356,46 @@ def test_lsan_stack_trace(self, ext, regexes):
})
def test_lsan_no_leak(self, ext):
self.do_smart_test(test_file('other/test_lsan_no_leak.' + ext),
emcc_args=['-fsanitize=leak', '-s', 'ALLOW_MEMORY_GROWTH', '-s', 'ASSERTIONS=0'],
emcc_args=['-fsanitize=leak', '-s', 'ASSERTIONS=0'],
regexes=[r'^\s*$'])

def test_lsan_no_stack_trace(self):
self.do_smart_test(test_file('other/test_lsan_leaks.c'),
emcc_args=['-fsanitize=leak', '-s', 'ALLOW_MEMORY_GROWTH', '-DDISABLE_CONTEXT'],
emcc_args=['-fsanitize=leak', '-DDISABLE_CONTEXT'],
assert_returncode=NON_ZERO, literals=[
'Direct leak of 3427 byte(s) in 3 object(s) allocated from:',
'SUMMARY: LeakSanitizer: 3427 byte(s) leaked in 3 allocation(s).',
])

def test_asan_null_deref(self):
self.do_smart_test(test_file('other/test_asan_null_deref.c'),
emcc_args=['-fsanitize=address', '-sALLOW_MEMORY_GROWTH=1'],
emcc_args=['-fsanitize=address'],
assert_returncode=NON_ZERO, literals=[
'AddressSanitizer: null-pointer-dereference on address',
])

def test_asan_memory_growth(self):
self.do_smart_test(test_file('other/test_asan_null_deref.c'),
emcc_args=['-fsanitize=address', '-sALLOW_MEMORY_GROWTH'],
assert_returncode=NON_ZERO, literals=[
'AddressSanitizer: null-pointer-dereference on address',
])

def test_asan_no_stack_trace(self):
self.do_smart_test(test_file('other/test_lsan_leaks.c'),
emcc_args=['-fsanitize=address', '-sALLOW_MEMORY_GROWTH=1', '-DDISABLE_CONTEXT', '-s', 'EXIT_RUNTIME'],
emcc_args=['-fsanitize=address', '-DDISABLE_CONTEXT', '-s', 'EXIT_RUNTIME'],
assert_returncode=NON_ZERO, literals=[
'Direct leak of 3427 byte(s) in 3 object(s) allocated from:',
'SUMMARY: AddressSanitizer: 3427 byte(s) leaked in 3 allocation(s).',
])

def test_asan_pthread_stubs(self):
self.do_smart_test(test_file('other/test_asan_pthread_stubs.c'), emcc_args=['-fsanitize=address', '-sALLOW_MEMORY_GROWTH'])
self.do_smart_test(test_file('other/test_asan_pthread_stubs.c'), emcc_args=['-fsanitize=address'])

def test_asan_strncpy(self):
# Regression test for asan false positives in strncpy:
# https://github.com/emscripten-core/emscripten/issues/14618
self.do_smart_test(test_file('other/test_asan_strncpy.c'), emcc_args=['-fsanitize=address', '-sALLOW_MEMORY_GROWTH'])
self.do_smart_test(test_file('other/test_asan_strncpy.c'), emcc_args=['-fsanitize=address'])

@node_pthreads
def test_proxy_to_pthread_stack(self):
Expand Down Expand Up @@ -9491,7 +9499,7 @@ def test_mmap_and_munmap_anonymous(self):
self.do_other_test('test_mmap_and_munmap_anonymous.cpp', emcc_args=['-s', 'NO_FILESYSTEM'])

def test_mmap_and_munmap_anonymous_asan(self):
self.do_other_test('test_mmap_and_munmap_anonymous.cpp', emcc_args=['-s', 'NO_FILESYSTEM', '-fsanitize=address', '-s', 'ALLOW_MEMORY_GROWTH'])
self.do_other_test('test_mmap_and_munmap_anonymous.cpp', emcc_args=['-s', 'NO_FILESYSTEM', '-fsanitize=address'])

def test_mmap_memorygrowth(self):
self.do_other_test('test_mmap_memorygrowth.cpp', ['-s', 'ALLOW_MEMORY_GROWTH'])
Expand Down Expand Up @@ -11070,7 +11078,6 @@ def test_pthread_lsan_no_leak(self):
self.set_setting('USE_PTHREADS')
self.set_setting('PROXY_TO_PTHREAD')
self.set_setting('EXIT_RUNTIME')
self.set_setting('INITIAL_MEMORY', '256MB')
self.emcc_args += ['-gsource-map']
self.do_run_in_out_file_test(test_file('pthread/test_pthread_lsan_no_leak.cpp'), emcc_args=['-fsanitize=leak'])
self.do_run_in_out_file_test(test_file('pthread/test_pthread_lsan_no_leak.cpp'), emcc_args=['-fsanitize=address'])
Expand All @@ -11080,7 +11087,6 @@ def test_pthread_lsan_leak(self):
self.set_setting('USE_PTHREADS')
self.set_setting('PROXY_TO_PTHREAD')
self.set_setting('EXIT_RUNTIME')
self.set_setting('INITIAL_MEMORY', '256MB')
self.add_pre_run("Module['LSAN_OPTIONS'] = 'exitcode=0'")
self.emcc_args += ['-gsource-map']
expected = [
Expand Down