Skip to content

[win/asan] AllocateMemoryForTrampoline within 2 GB of the module's base address #108822

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 18, 2024

Conversation

zmodem
Copy link
Collaborator

@zmodem zmodem commented Sep 16, 2024

Since we may copy code (see CopyInstructions) to the trampoline which could reference data inside the original module, we really want the trampoline to be within 2 GB of not just the original function, but within anything that function may have rip-relative accesses to, i.e. within 2 GB of that function's whole module.

This fixes interception failures like the following scenario:

  1. Intercept CreateProcess in kernel32.dll, allocating a trampoline region right after
  2. Start intercepting memcpy in the main executable, which is loaded at a lower address than kernel32.dll, but still within 2 GB of the trampoline region so we keep using it.
  3. Try to copy instructions from memcpy to the trampoline. Turns out one instruction references data that is more than 2GB away from the trampoline, so it can't be relocated.
  4. The process exits due to a CHECK failure

(Full story at https://crbug.com/341936875#comment45 and following.)

@llvmbot
Copy link
Member

llvmbot commented Sep 16, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-platform-windows

Author: Hans (zmodem)

Changes

Since we may copy code (see CopyInstructions) to the trampoline which could reference data inside the original module, we really want the trampoline to be within 2 GB of not just the original function, but within anything that function may have rip-relative accesses to, i.e. within 2 GB of that function's whole module.

This fixes interception failures like the following scenario:

  1. Intercept CreateProcess in kernel32.dll, allocating a trampoline region right after
  2. Start intercepting memcpy in the main executable, which is loaded at a lower address than kernel32.dll, but still within 2 GB of the trampoline region so we keep using it.
  3. Try to copy instructions from memcpy to the trampoline. Turns out one instruction references data that is more than 2GB away from the trampoline, so it can't be relocated.
  4. The process exits due to a CHECK failure

(Full story at https://crbug.com/341936875#comment45 and following.)


Full diff: https://github.com/llvm/llvm-project/pull/108822.diff

1 Files Affected:

  • (modified) compiler-rt/lib/interception/interception_win.cpp (+24-1)
diff --git a/compiler-rt/lib/interception/interception_win.cpp b/compiler-rt/lib/interception/interception_win.cpp
index a638e66eccee58..b7979b271f468f 100644
--- a/compiler-rt/lib/interception/interception_win.cpp
+++ b/compiler-rt/lib/interception/interception_win.cpp
@@ -130,6 +130,7 @@
 #include "sanitizer_common/sanitizer_platform.h"
 #define WIN32_LEAN_AND_MEAN
 #include <windows.h>
+#include <psapi.h>
 
 namespace __interception {
 
@@ -385,7 +386,29 @@ void TestOnlyReleaseTrampolineRegions() {
   }
 }
 
-static uptr AllocateMemoryForTrampoline(uptr image_address, size_t size) {
+static uptr AllocateMemoryForTrampoline(uptr func_address, size_t size) {
+  uptr image_address = func_address;
+
+#if SANITIZER_WINDOWS64
+  // Since we may copy code to the trampoline which could reference data
+  // inside the original module, we really want the trampoline to be within
+  // 2 GB of not just the original function, but within 2 GB of that function's
+  // whole module. Since the allocated trampoline's address is always greater
+  // than image_address, we achieve this by setting image_address to the base
+  // address of the module, if we can find it (which is not the case if
+  // func_address is in mmap'ed memory for example).
+  HMODULE module;
+  if (::GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
+                           GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
+                           (LPCWSTR)func_address, &module)) {
+    MODULEINFO module_info;
+    if (::GetModuleInformation(::GetCurrentProcess(), module,
+                                &module_info, sizeof(module_info))) {
+      image_address = (uptr)module_info.lpBaseOfDll;
+    }
+  }
+#endif
+
   // Find a region within 2G with enough space to allocate |size| bytes.
   TrampolineMemoryRegion *region = nullptr;
   for (size_t bucket = 0; bucket < kMaxTrampolineRegion; ++bucket) {

Copy link

github-actions bot commented Sep 16, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff cca54e347ac34912cdfb9983533c61836db135e0 32b02e0cfce2e6432b237e92e378f9cbc8a6ff7b --extensions cpp -- compiler-rt/lib/interception/interception_win.cpp
View the diff from clang-format here.
diff --git a/compiler-rt/lib/interception/interception_win.cpp b/compiler-rt/lib/interception/interception_win.cpp
index a0ff124a89..98dfbc84f4 100644
--- a/compiler-rt/lib/interception/interception_win.cpp
+++ b/compiler-rt/lib/interception/interception_win.cpp
@@ -129,8 +129,8 @@
 #if SANITIZER_WINDOWS
 #include "sanitizer_common/sanitizer_platform.h"
 #define WIN32_LEAN_AND_MEAN
-#include <windows.h>
-#include <psapi.h>
+#  include <psapi.h>
+#  include <windows.h>
 
 namespace __interception {
 
@@ -389,7 +389,7 @@ void TestOnlyReleaseTrampolineRegions() {
 static uptr AllocateMemoryForTrampoline(uptr func_address, size_t size) {
   uptr image_address = func_address;
 
-#if SANITIZER_WINDOWS64
+#  if SANITIZER_WINDOWS64
   // Allocate memory after the module (DLL or EXE file), but within 2GB
   // of the start of the module so that any address within the module can be
   // referenced with PC-relative operands.
@@ -400,15 +400,15 @@ static uptr AllocateMemoryForTrampoline(uptr func_address, size_t size) {
   // use func_address as is.
   HMODULE module;
   if (::GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
-                           GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
+                               GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
                            (LPCWSTR)func_address, &module)) {
     MODULEINFO module_info;
-    if (::GetModuleInformation(::GetCurrentProcess(), module,
-                                &module_info, sizeof(module_info))) {
+    if (::GetModuleInformation(::GetCurrentProcess(), module, &module_info,
+                               sizeof(module_info))) {
       image_address = (uptr)module_info.lpBaseOfDll;
     }
   }
-#endif
+#  endif
 
   // Find a region within 2G with enough space to allocate |size| bytes.
   TrampolineMemoryRegion *region = nullptr;

@zmodem
Copy link
Collaborator Author

zmodem commented Sep 16, 2024

@bergeret too if you're still interested in this kind of thing

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find!

@aganea
Copy link
Member

aganea commented Sep 16, 2024

Stefan from @MolecularMatters might be interested in this too, I think Live++ does something similar to find free regions within the module reach.

@MolecularMatters
Copy link

Stefan from @MolecularMatters might be interested in this too, I think Live++ does something similar to find free regions within the module reach.

Live++ already knows the base address of the module (which is different to asan I think?) and scans the surrounding +-2GB region using NtQueryVirtualMemory.

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! New comment looks good.

@zmodem zmodem merged commit 3d2925b into llvm:main Sep 18, 2024
6 of 7 checks passed
tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
…se address (llvm#108822)

Since we may copy code (see CopyInstructions) to the trampoline which
could reference data inside the original module, we really want the
trampoline to be within 2 GB of not just the original function, but
within anything that function may have rip-relative accesses to, i.e.
within 2 GB of that function's whole module.

This fixes interception failures like the following scenario:

1. Intercept `CreateProcess` in kernel32.dll, allocating a trampoline
region right after
2. Start intercepting `memcpy` in the main executable, which is loaded
at a lower address than kernel32.dll, but still within 2 GB of the
trampoline region so we keep using it.
3. Try to copy instructions from `memcpy` to the trampoline. Turns out
one instruction references data that is more than 2GB away from the
trampoline, so it can't be relocated.
4. The process exits due to a CHECK failure

(Full story at https://crbug.com/341936875#comment45 and following.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants