Description
@sgugger is trying to get offload/nvme to work on Arch Linux and has issues building the async_io extension.
First, I discovered that Arch Linux (I don't know this system) doesn't ship separate dev packages; it packs everything into one package.
So the relevant pacman package is libaio, which he installed. I checked that this package, https://archlinux.org/packages/core/x86_64/libaio/, contains libaio.h and libaio.so.1.
What else is missing? He is still getting:
[WARNING] async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
On Ubuntu:
$ dpkg-query -L libaio-dev
/usr/include/libaio.h
/usr/lib/x86_64-linux-gnu/libaio.a
/usr/lib/x86_64-linux-gnu/libaio.so
(I trimmed the output)
On Arch Linux (package manually downloaded from https://archlinux.org/packages/core/x86_64/libaio/):
$ tar xvf libaio-0.3.112-2-x86_64.pkg.tar
usr/include/libaio.h
usr/lib/libaio.so.1.0.1
(again trimmed to the essentials)
So the latter is missing libaio.a. Does async_io want the static object and refuse to use the shared object? Is that the problem?
You can see how the Arch Linux package was built here:
https://github.com/archlinux/svntogit-packages/blob/packages/libaio/trunk/PKGBUILD
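To compare the two systems without unpacking packages by hand, something like the following sketch could list which libaio pieces are actually present. The function name `find_libaio_artifacts` and the default search directories are my own assumptions for illustration; this is not how DeepSpeed performs its check.

```python
import glob
import os

# Hypothetical default search roots; adjust for the distro at hand.
DEFAULT_ROOTS = (
    "/usr/include",
    "/usr/lib",
    "/usr/lib64",
    "/usr/lib/x86_64-linux-gnu",
)


def find_libaio_artifacts(roots=DEFAULT_ROOTS):
    """Return a dict listing libaio files found under the given directories."""
    found = {"header": [], "static": [], "dev_symlink": [], "runtime": []}
    exact_names = (
        ("libaio.h", "header"),        # needed to compile
        ("libaio.a", "static"),        # static archive
        ("libaio.so", "dev_symlink"),  # unversioned name the linker resolves for -laio
    )
    for root in roots:
        for name, key in exact_names:
            path = os.path.join(root, name)
            if os.path.exists(path):
                found[key].append(path)
        # Versioned shared objects such as libaio.so.1 or libaio.so.1.0.1
        found["runtime"].extend(sorted(glob.glob(os.path.join(root, "libaio.so.*"))))
    return found


if __name__ == "__main__":
    for kind, paths in find_libaio_artifacts().items():
        print(f"{kind}: {paths or 'not found'}")
```

Running this on both Ubuntu and Arch Linux would show directly whether the difference is only the missing libaio.a or something else.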
In any case I think it'd be useful to expose the build command and the error and not just have:
self = <deepspeed.ops.op_builder.async_io.AsyncIOBuilder object at 0x7fc52033a790>, verbose = True

    def jit_load(self, verbose=True):
        if not self.is_compatible():
            raise RuntimeError(
>               f"Unable to JIT load the {self.name} op due to it not being compatible due to hardware/software issue."
            )
E       RuntimeError: Unable to JIT load the async_io op due to it not being compatible due to hardware/software issue.
As you can see, on a non-Ubuntu system this is much more difficult to debug, and one needs the actual compiler error.
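As a rough illustration of what "expose the build command and the error" could look like, here is a minimal sketch that runs a command and reports the exact command line together with its output. The helper names `run_and_report` and `check_links_against`, and the idea of probing with an empty C program linked via `-laio`, are assumptions of mine and not DeepSpeed's actual compatibility check.

```python
import os
import shlex
import subprocess
import tempfile


def run_and_report(cmd):
    """Run cmd (a list of args); return (ok, report) showing the command and its output."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # The executable itself (e.g. the compiler) is not installed.
        return False, f"command: {shlex.join(cmd)}\nerror: {exc}"
    report = (
        f"command: {shlex.join(cmd)}\n"
        f"exit code: {proc.returncode}\n"
        f"stdout:\n{proc.stdout}\n"
        f"stderr:\n{proc.stderr}"
    )
    return proc.returncode == 0, report


def check_links_against(lib, compiler="cc"):
    """Hypothetical probe: compile and link an empty main() with -l<lib>."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        out = os.path.join(tmp, "probe")
        return run_and_report([compiler, src, "-o", out, f"-l{lib}"])


if __name__ == "__main__":
    ok, report = check_links_against("aio")
    print("OK" if ok else "FAILED")
    print(report)
```

With output like this attached to the RuntimeError, the user would see the failing command and the compiler or linker message instead of only "hardware/software issue".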