Skip to content

Commit

Permalink
Reland (2) of Enable allocator shim for Android (crrev.com/1875043003)
Browse files Browse the repository at this point in the history
Reason for reland:
The CL was re-reverted by crrev.com/1884223002 because it broke
the internal orderfile bot.
The reason of the breakage was the following:
 - cygprofile.cc instruments all function calls with a
   __cyg_profile_func_enter preamble.
 - __cyg_profile_func_enter uses __thread. __thread under the hoods
   invokes calloc(), on every thread, to initialize the TLS.
 - The shim layer provides its own implementation of calloc().
 - At this point, calloc() gets instrumented as well and it re-enters
   __cyg_profile_func_enter causing an infinite loop.
The key of the problem here is that __thread silently causes calls to
calloc() in a way that is out of control of cygprofile.cc.

The solution proposed by this CL is the following:
 - Don't use __thread, use explicit POSIX functions for TLS (also there
   doesn't seem to be any precendent of using __thread in the codebase).
 - Use a global variable to prevent re-entrancy of the
   __cyg_profile_func_enter in the global (once per process) TLS slot
   initializer.
 - Re-entrancy is gone.

Original issue's description:
> Enable allocator shim for Android
>
> This is a follow-up to crrev.com/1719433002, which introduced the
> shim for Android, and enables it by default by setting
> use_experimental_allocator_shim=true for Android.
>
> Build/Perf sheriffs heads up
> ----------------------------
> If you see any build error or crash related with __wrap_malloc,
> __wrap_free, __real_malloc, __real_free, etc this CL is to blame.
>
> Performance considerations
> ------------------------
> Binary size diff (GN, arm, static, official build): 24k
>
> I did a mixture of local and trybots run to estimate the perf impact
> of this change. Didn't get any conclusive data, everything I tried
> seems in the same ballpark, below noise levels. More in details:
>
> cc_perftests.PrepareTiles on a Nexus 4.
> Rationale of the choice: in a previous CL (crbug.com/593344), this
> benchmark revealed the presence of two mfences in the malloc path.
> Results: https://goo.gl/8VC3Jp in the same ballpark.
>
> page-cycler on Nexus 9 via trybots:
> Results: http://goo.gl/J3i50a seems to suggest that this CL improves
> both warm and cold times in most cases. I doubt it, more likely it's
> noise.
>
> All the other perf trybots failed. The perf waterfall seems to be in a
> bad state in these days.
>
> BUG=550886,598075
> TEST=base_unittests --gtest_filter=AllocatorShimTest.*
> TBR=thakis@chromium.org
>
> Committed: https://crrev.com/ebb95496c73dc0d5ce83968ac619921f154305f7
> Cr-Commit-Position: refs/heads/master@{#386386}

BUG=550886,598075, 602744
TBR=thakis@chromium.org
TEST=gn gen out/Debug --args='is_debug=true target_os="android" use_order_profiling=true target_cpu="arm" is_clang=false';
     ninja -C out/Debug/ cygprofile_unittests;
     adb push out/Debug/cygprofile_unittests /data/local/tmp/cygprofile_unittests_debug;
     adb shell /data/local/tmp/cygprofile_unittests_debug

Review URL: https://codereview.chromium.org/1883093005

Cr-Commit-Position: refs/heads/master@{#387594}
  • Loading branch information
primiano authored and Commit bot committed Apr 15, 2016
1 parent 7422e74 commit f7a321f
Show file tree
Hide file tree
Showing 5 changed files with 53 additions and 18 deletions.
2 changes: 1 addition & 1 deletion base/allocator/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,7 +97,7 @@ a project aimed to unify the symbol definition and allocator routing logic in
a central place.

- Full documentation: [Allocator shim design doc][url-allocator-shim].
- Current state: Available and enabled by default on Linux and CrOS.
- Current state: Available and enabled by default on Linux, CrOS and Android.
- Tracking bug: [https://crbug.com/550886][crbug.com/550886].
- Build-time flag: `use_experimental_allocator_shim`.

Expand Down
6 changes: 5 additions & 1 deletion base/allocator/allocator_shim_internals.h
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,12 @@

#include <sys/cdefs.h> // for __THROW

#ifndef __THROW /* Not a glibc system */
#ifndef __THROW // Not a glibc system
#ifdef _NOEXCEPT // LLVM libc++ uses noexcept instead
#define __THROW _NOEXCEPT
#else
#define __THROW
#endif // !_NOEXCEPT
#endif

// Shim layer symbols need to be ALWAYS exported, regardless of component build.
Expand Down
2 changes: 1 addition & 1 deletion build/common.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -2274,7 +2274,7 @@
'use_allocator%': 'none',
'use_sanitizer_options%': 1,
}],
['OS=="linux" and asan==0 and msan==0 and lsan==0 and tsan==0 and build_for_tool==""', {
['(OS=="linux" or OS=="android") and asan==0 and msan==0 and lsan==0 and tsan==0 and build_for_tool==""', {
'use_experimental_allocator_shim%': 1,
}],
['OS=="linux" and asan==0 and msan==0 and lsan==0 and tsan==0', {
Expand Down
2 changes: 1 addition & 1 deletion build/config/allocator.gni
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ if (is_android || current_cpu == "mipsel" || is_mac || is_ios || is_asan ||
_default_allocator = "tcmalloc"
}

if (is_linux && !is_asan && !is_lsan && !is_tsan && !is_msan) {
if ((is_linux || is_android) && !is_asan && !is_lsan && !is_tsan && !is_msan) {
_default_use_experimental_allocator_shim = true
} else {
_default_use_experimental_allocator_shim = false
Expand Down
59 changes: 45 additions & 14 deletions tools/cygprofile/cygprofile.cc
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,27 @@ const char kLogFilenameFormat[] = "%scyglog.%d.%d-%d";
ThreadLog* const kMagicBeingConstructed = reinterpret_cast<ThreadLog*>(1);

// Per-thread pointer to the current log object.
static __thread ThreadLog* g_tls_log = NULL;
pthread_key_t g_tls_slot;

// Used to initialize the tls slot, once per the entire process.
pthread_once_t g_tls_slot_initializer_once = PTHREAD_ONCE_INIT;

// This variable is to prevent re-entrancy in the __cyg_profile_func_enter()
// while the TLS slot itself is being initialized. Volatile here is required
// to avoid compiler optimizations as this need to be read in a re-entrant way.
// This variable is written by one thread only, which is the first thread that
// happens to run the TLSSlotInitializer(). In practice this will happen very
// early in the startup process, as soon as the first instrumented function is
// called.
volatile bool g_tls_slot_being_initialized = false;

// Initializes the global TLS slot. This is invoked only once per process.
static void TLSSlotInitializer()
{
g_tls_slot_being_initialized = true;
PCHECK(0 == pthread_key_create(&g_tls_slot, NULL));
g_tls_slot_being_initialized = false;
}

// Returns light-weight process ID. On Linux, this is a system-wide unique
// thread id.
Expand Down Expand Up @@ -151,20 +171,23 @@ class Thread {
public:
Thread(const base::Closure& thread_callback)
: thread_callback_(thread_callback) {
CHECK_EQ(0, pthread_create(&handle_, NULL, &Thread::EntryPoint, this));
PCHECK(0 == pthread_create(&handle_, NULL, &Thread::EntryPoint, this));
}

~Thread() {
CHECK_EQ(0, pthread_join(handle_, NULL));
PCHECK(0 == pthread_join(handle_, NULL));
}

private:
static void* EntryPoint(void* data) {
// Disable logging on this thread. Although this routine is not instrumented
// (cygprofile.gyp provides that), the called routines are and thus will
// call instrumentation.
CHECK(g_tls_log == NULL); // Must be 0 as this is a new thread.
g_tls_log = kMagicBeingConstructed;
pthread_once(&g_tls_slot_initializer_once, TLSSlotInitializer);
ThreadLog* thread_log = reinterpret_cast<ThreadLog*>(
pthread_getspecific(g_tls_slot));
CHECK(thread_log == NULL); // Must be 0 as this is a new thread.
PCHECK(0 == pthread_setspecific(g_tls_slot, kMagicBeingConstructed));

Thread* const instance = reinterpret_cast<Thread*>(data);
instance->thread_callback_.Run();
Expand Down Expand Up @@ -199,7 +222,7 @@ ThreadLog::ThreadLog(const FlushCallback& flush_callback)
}

ThreadLog::~ThreadLog() {
g_tls_log = NULL;
PCHECK(0 == pthread_setspecific(g_tls_slot, NULL));
}

void ThreadLog::AddEntry(void* address) {
Expand Down Expand Up @@ -358,16 +381,24 @@ void __cyg_profile_func_exit(void* this_fn, void* call_site)
__attribute__((no_instrument_function));

void __cyg_profile_func_enter(void* this_fn, void* callee_unused) {
if (g_tls_log == NULL) {
g_tls_log = kMagicBeingConstructed;
ThreadLog* new_log = new ThreadLog();
CHECK(new_log);
g_logs_manager.Pointer()->AddLog(base::WrapUnique(new_log));
g_tls_log = new_log;
// Avoid re-entrancy while initializing the TLS slot (once per process).
if (g_tls_slot_being_initialized)
return;

pthread_once(&g_tls_slot_initializer_once, TLSSlotInitializer);
ThreadLog* thread_log = reinterpret_cast<ThreadLog*>(
pthread_getspecific(g_tls_slot));

if (thread_log == NULL) {
PCHECK(0 == pthread_setspecific(g_tls_slot, kMagicBeingConstructed));
thread_log = new ThreadLog();
CHECK(thread_log);
g_logs_manager.Pointer()->AddLog(base::WrapUnique(thread_log));
PCHECK(0 == pthread_setspecific(g_tls_slot, thread_log));
}

if (g_tls_log != kMagicBeingConstructed)
g_tls_log->AddEntry(this_fn);
if (thread_log != kMagicBeingConstructed)
thread_log->AddEntry(this_fn);
}

void __cyg_profile_func_exit(void* this_fn, void* call_site) {}
Expand Down

0 comments on commit f7a321f

Please sign in to comment.