Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libc++ asan annotations for short string inline storage is incompatible with apple blocks runtime #96099

Open
nico opened this issue Jun 19, 2024 · 7 comments
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@nico
Copy link
Contributor

nico commented Jun 19, 2024

Consider this program:

% cat repro.mm
#include <string>

[[gnu::noinline]] void use_string(const std::string&) {}

typedef void(^B)();
[[gnu::noinline]] void use_block(B&& b) {}

int main() {
  std::string request_id("123");
  use_block(^() {
    use_string(request_id);
  });
}

I don't have a completely standalone repro yet, but based on my current understanding, it's possible to make a repro that uses just the llvm-project repo. For my current repro, I'm using Chromium's toolchain like so:

% third_party/llvm-build/Release+Asserts/bin/clang repro.mm -isystem third_party/libc++/src/include/ -isystem buildtools/third_party/libc++/ -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_NONE  -nostdinc++ -fobjc-arc -fno-exceptions -fno-rtti -fblocks -fsanitize=address out/gn/obj/buildtools/third_party/libc++/libc++/*.o out/gn/obj/buildtools/third_party/libc++abi/libc++abi/*.o -I. -isysroot $(xcrun -show-sdk-path)

This triggers an asan report from the blocks runtime:

% MallocNanoZone=0 ./a.out
=================================================================
==8812==ERROR: AddressSanitizer: container-overflow on address 0x00016bd6f3e4 at pc 0x0001049b74dc bp 0x00016bd6f2d0 sp 0x00016bd6ea80
READ of size 56 at 0x00016bd6f3e4 thread T0
    #0 0x1049b74d8  (libclang_rt.asan_osx_dynamic.dylib:arm64+0x4f4d8)
    #1 0x194932b48 in _Block_copy+0x58 (libsystem_blocks.dylib:arm64+0x1b48)
    #2 0x630f80010409287c  (<unknown module>)
    #3 0x1948ae0dc  (<unknown module>)
    #4 0x86397ffffffffffc  (<unknown module>)

Here's why:

When creating a block closure, clang will copy C++ objects into the closure using their copy constructor, see clang/lib/CGBlocks.cpp (so far so good).

clang will create calls to objc_retainBlock under various conditions, see clang/lib/CodeGen/CGObjC.cpp. One such condition is passing a block to a function taking an rvalue reference when ARC is enabled (but there are other conditions).

objc_retainBlock just calls _Block_copy (https://github.com/opensource-apple/objc4/blob/cd5e62a5597ea7a31dccef089317abb3a661c154/runtime/NSObject.mm#L221>

If the block contains C++ objects, bit 25 in the block descriptor is set. This bit is called BLOCK_HAS_COPY_DISPOSE in both clang and BlocksRuntime. CGBlocks.cpp generates a copy helper (GenerateCopyHelperFunction) that the block points to. For C++ objects in the closure, it again calls copy ctors.

_Block_copy is defined in compiler-rt/lib/BlocksRuntime/runtime.c. It's part of the system dyld closure which ships in macOS. It does:

...
    // Its a stack block.  Make a copy.
    if (!isGC) {
        struct Block_layout *result = malloc(aBlock->descriptor->size);
        if (!result) return (void *)0;
        memmove(result, aBlock, aBlock->descriptor->size); // bitcopy first
        // reset refcount
        result->flags &= ~(BLOCK_REFCOUNT_MASK);    // XXX not needed
        result->flags |= BLOCK_NEEDS_FREE | 1;
        result->isa = _NSConcreteMallocBlock;
        if (result->flags & BLOCK_HAS_COPY_DISPOSE) {
            //printf("calling block copy helper %p(%p, %p)...\n", aBlock->descriptor->copy, result, aBlock);
            (*aBlock->descriptor->copy)(result, aBlock); // do fixup
        }
        return result;
    }
...

That is, it first makes a copy of the block's closure variables using memmove() and then calls the copy helper to fix things up.

1a96179 enabled asan instrumentation for the short string inline storage. In the repro, the string is set to "123", meaning some of its short string storage isn't initialized. That makes that memmove() in the block runtime fail.

Given that this call is in a macOS system library, fixing this will probably be involved.

(This of course only happens if _LIBCPP_INSTRUMENTED_WITH_ASAN is enabled.)

clang/docs/Block-ABI-Apple.rst describes the whole blocks ABI fairly well.

@github-actions github-actions bot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jun 19, 2024
@nico
Copy link
Contributor Author

nico commented Jun 19, 2024

Triggering this is as easy as:

  • using blocks (which almost all Objective-C code does nowadays)
  • having a block that captures an std::string
  • having the compiler call objc_retainBlock on such a block
    • e.g. .when using ARC (which almost all Objective-C code does nowadays) and passing a block to a template taking an rvalue reference
  • using asan

This is likely all true in most larger Objective-C++ code bases.

@pinskia
Copy link

pinskia commented Jun 19, 2024

Won't this happen with any code which uses asan annotations for their memory in a structure that gets copied for the apple blocks? If so shouldn't this be fixed in the block runtime instead?

@nico
Copy link
Contributor Author

nico commented Jun 19, 2024

Won't this happen with any code which uses asan annotations for their memory in a structure that gets copied for the apple blocks?

Yes, but you'd only use asan annotations for things with inline storage. And even then, it seems to be very rare in practice. I've only seen this happen with std::string so far.

If so shouldn't this be fixed in the block runtime instead?

That seems like the right fix to me too, but:

  • given that that ships as a system library on macOS rolling out that fix will take a while
  • it's tricky to fix too

So we'll need something else in the meantime as well.

@AdvenamTacet
Copy link
Member

It seems to be Apple-specific issue and I agree that it should be fixed in the block logic. However, if it's impossible to do it in time, we can temporarily turn off short string annotations for macOS only.

But why it's using memmove instead of std::basic_string constructor?

There are a few other possible solutions for that, but using the constructor is probably the best one.
Temporarily, we can turn off short string annotations for affected systems, unless someone has a better idea.

I don't have macOS, so unfortunately I cannot test anything myself.

@nico
Copy link
Contributor Author

nico commented Jun 20, 2024

But why it's using memmove instead of std::basic_string constructor?

A block closure is a block of bytes. It doesn't know about the types of data therein. See clang/docs/Block-ABI-Apple.rst

AdvenamTacet pushed a commit to trail-of-forks/llvm-project that referenced this issue Jun 21, 2024
This commit disables short string AddressSanitizer annotations on Apple platforms as a temporary solution to the problem reported in issue llvm#96099.

For more information on Apple's block implementation, please refer to [`clang/docs/Block-ABI-Apple.rst`](/clang/docs/Block-ABI-Apple.rst). The core issue lies in the fact that blocks are unaware of their content, causing AddressSanitizer errors when blocks are moved using `memmove`.

I believe - and I'm not alone - that the issue should ideally be addressed within the block moving logic. However, if a timely resolution is not feasible, this temporary fix can be used. Before merging, we should ensure that a more permanent solution cannot be implemented in time and that this change effectively resolves the issue.
@AdvenamTacet
Copy link
Member

AdvenamTacet commented Jun 21, 2024

Thank you for pointing me into the explanation.
I'm not really a fan of turning off short string annotations, but in case it's the only option I opened a new PR: #96269

Other solutions I see right now are:

  • turn off instrumentation for moving memory in _Block_copy - that may not work based on blocks implementation, which I don't know.
  • unpoison memory before memmove - that should resolve all issues.
  • and the best solution after [compiler-rt][ASan] Add function copying annotations #91702 merge is to use functions from it, I'm trying to finish that PR asap, but it may be reasonable to use unpoisoning solution until then, or just turn off SSO-case annotations on affected platforms for time being.

@ldionne ldionne added this to the LLVM 19.X Release milestone Jul 9, 2024
ldionne pushed a commit that referenced this issue Jul 19, 2024
This commit disables short string AddressSanitizer annotations on Apple
platforms as a temporary solution to the problem reported in #96099.

For more information on Apple's block implementation, please refer to
clang/docs/Block-ABI-Apple.rst [1]. The core issue lies in the fact
that blocks are unaware of their content, causing AddressSanitizer
errors when blocks are moved using `memmove`.

I believe - and I'm not alone - that the issue should ideally be
addressed within the block moving logic. However, this patch provides
a temporary fix until a proper resolution exists in the Blocks runtime.

[1]: https://github.com/llvm/llvm-project/blob/main/clang/docs/Block-ABI-Apple.rst
@ldionne
Copy link
Member

ldionne commented Jul 23, 2024

As a matter of process, removing the 19.x release milestone since this isn't really actionable from the LLVM side IMO. This should be fixed in the Blocks runtime AFAIU.

@ldionne ldionne removed this from the LLVM 19.X Release milestone Jul 23, 2024
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this issue Jul 23, 2024
This commit disables short string AddressSanitizer annotations on Apple
platforms as a temporary solution to the problem reported in llvm#96099.

For more information on Apple's block implementation, please refer to
clang/docs/Block-ABI-Apple.rst [1]. The core issue lies in the fact
that blocks are unaware of their content, causing AddressSanitizer
errors when blocks are moved using `memmove`.

I believe - and I'm not alone - that the issue should ideally be
addressed within the block moving logic. However, this patch provides
a temporary fix until a proper resolution exists in the Blocks runtime.

[1]: https://github.com/llvm/llvm-project/blob/main/clang/docs/Block-ABI-Apple.rst
yuxuanchen1997 pushed a commit that referenced this issue Jul 25, 2024
Summary:
This commit disables short string AddressSanitizer annotations on Apple
platforms as a temporary solution to the problem reported in #96099.

For more information on Apple's block implementation, please refer to
clang/docs/Block-ABI-Apple.rst [1]. The core issue lies in the fact
that blocks are unaware of their content, causing AddressSanitizer
errors when blocks are moved using `memmove`.

I believe - and I'm not alone - that the issue should ideally be
addressed within the block moving logic. However, this patch provides
a temporary fix until a proper resolution exists in the Blocks runtime.

[1]: https://github.com/llvm/llvm-project/blob/main/clang/docs/Block-ABI-Apple.rst

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251313
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

4 participants