-
Notifications
You must be signed in to change notification settings - Fork 0
use nanobind ndarray to make use of bfloat16 weights possible (without coping the data) #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use nanobind ndarray to make use of bfloat16 weights possible (without coping the data) #5
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
ac06f9b
to
a50d3b0
Compare
First iteration, can compile and run with float16 and bfloat16. AFAIU we have now just one object which gets deleted at the end |
a50d3b0
to
8519a35
Compare
How do we add a test for bfloat16 (currently we creat numby objects). Maybe we create the dlpack object directly. Or simply import torch, but this is then a further dependency. Will upstream accept? EDIT: has been clarified by now |
8519a35
to
96aa072
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to make sure ownership of the tensor is handled.
c34f357
to
f49e81e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to verify this change works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looking into this a little more, the DLPack does not define semantics for transfer of ownership, so for now this ndarray function will have to assume the lifetime of the data is tied to the python interpreter. nanobinds should ensure that condition.
42c3cad
to
893acad
Compare
893acad
to
e0ad43b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. @devtbi can you give your opinion?
up to the caller to ensure that the buffer meets the characteristics | ||
implied by the shape. | ||
|
||
The backing buffer and any user objects will be retained for the lifetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this statement is true. NDArray is designed as a view of the memory. Ownership is not transferred AFAIK. Can you change the wording to reflect that?
And I believe this is also true for the original buffer constructor. AFAIK for numpy at least we can't transfer ownership of the memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding numpy: We transfer a memory view which 'Create a new memoryview object which references the given object.'
self.dlpack_capsule = array.__dlpack__() | ||
|
||
def __del__(self): | ||
print("BACKING MEMORY DELETED") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print("BACKING MEMORY DELETED") | |
print("DLPACK MEMORY DELETED") |
we actually don't own the memory just the metadata about the tensor (dlpack).
e0ad43b
to
35d0d56
Compare
can't find anything to complain about since my last review... looks good :) |
Fixes llvm#123300 What is seen ``` clang-repl> int x = 42; clang-repl> auto capture = [&]() { return x * 2; }; In file included from <<< inputs >>>:1: input_line_4:1:17: error: non-local lambda expression cannot have a capture-default 1 | auto capture = [&]() { return x * 2; }; | ^ zsh: segmentation fault clang-repl --Xcc="-v" (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8) * frame #0: 0x0000000107b4f8b8 libclang-cpp.19.1.dylib`clang::IncrementalParser::CleanUpPTU(clang::PartialTranslationUnit&) + 988 frame #1: 0x0000000107b4f1b4 libclang-cpp.19.1.dylib`clang::IncrementalParser::ParseOrWrapTopLevelDecl() + 416 frame #2: 0x0000000107b4fb94 libclang-cpp.19.1.dylib`clang::IncrementalParser::Parse(llvm::StringRef) + 612 frame #3: 0x0000000107b52fec libclang-cpp.19.1.dylib`clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) + 180 frame #4: 0x0000000100003498 clang-repl`main + 3560 frame #5: 0x000000018d39a0e0 dyld`start + 2360 ``` Though the error is justified, we shouldn't be interested in exiting through a segfault in such cases. The issue is that empty named decls weren't being taken care of resulting into this assert https://github.com/llvm/llvm-project/blob/c1a229252617ed58f943bf3f4698bd8204ee0f04/clang/include/clang/AST/DeclarationName.h#L503 Can also be seen when the example is attempted through xeus-cpp-lite. 
# Symptom We have seen SIGSEGV like this: ``` * thread #1, name = 'lldb-server', stop reason = SIGSEGV frame #0: 0x00007f39e529c993 libc.so.6`__pthread_kill_internal(signo=11, threadid=<unavailable>) at pthread_kill.c:46:37 ... * frame #5: 0x000056027c94fe48 lldb-server`lldb_private::process_linux::GetPtraceScope() + 72 frame #6: 0x000056027c92f94f lldb-server`lldb_private::process_linux::NativeProcessLinux::Attach(int) + 1087 ... ``` See [full stack trace](https://pastebin.com/X0d6QhYj). This happens on Linux where LLDB doesn't have access to `/proc/sys/kernel/yama/ptrace_scope`. A similar error (an unchecked `Error`) can be reproduced by running the newly added unit test without the fix. See the "Test" section below. # Root cause `GetPtraceScope()` ([code](https://github.com/llvm/llvm-project/blob/328f40f408c218f25695ea42c844e43bef38660b/lldb/source/Plugins/Process/Linux/Procfs.cpp#L77)) has the following `if` statement: ``` llvm::Expected<int> lldb_private::process_linux::GetPtraceScope() { ErrorOr<std::unique_ptr<MemoryBuffer>> ptrace_scope_file = getProcFile("sys/kernel/yama/ptrace_scope"); if (!*ptrace_scope_file) return errorCodeToError(ptrace_scope_file.getError()); ... } ``` The intention of the `if` statement is to check whether the `ptrace_scope_file` is an `Error` or not, and return the error if it is. However, the `operator*` of `ErrorOr` returns the value that is stored (which is a `std::unique_ptr<MemoryBuffer>`), so what the `if` condition actually do is to check if the unique pointer is non-null. Note that the method `ErrorOr::getStorage()` ([called by](https://github.com/llvm/llvm-project/blob/328f40f408c218f25695ea42c844e43bef38660b/llvm/include/llvm/Support/ErrorOr.h#L162-L164) `ErrorOr::operator *`) **does** assert on whether or not `HasError` has been set (see [ErrorOr.h](https://github.com/llvm/llvm-project/blob/328f40f408c218f25695ea42c844e43bef38660b/llvm/include/llvm/Support/ErrorOr.h#L235-L243)). However, it seems this wasn't executed, probably because the LLDB was a release build. # Fix The fix is simply remove the `*` in the said `if` statement.
coping the data)