
[WIP] Use the selective libtorch build #1314

Open · wants to merge 1 commit into main
Conversation

@makslevental (Collaborator) commented Aug 30, 2022

This build path reduces the number of compiled source files from ~1800 to ~1400. The selective build trims unused ops at link time, so most of the compile-time savings come not from fewer kernels but from fewer auxiliary files and slightly leaner dependencies (e.g. FBGEMM is no longer necessary). lightweight_dispatch_ops.yaml is composed of the ops that torch-mlir aims to support (i.e. those that have tablegen definitions) plus the bare minimum necessary to keep pytorch package imports happy (e.g. all of the ops in _meta_registrations.py).

Note there's actually a much leaner build available here by:

  1. setting BUILD_LITE_INTERPRETER=1
  2. running build_libtorch.py instead of setup.py to actually build just libtorch (no python bindings)

This path only compiles ~850 source files.
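Concretely, the two steps above might look like the following (a sketch, not what this PR runs; `BUILD_LITE_INTERPRETER` and `build_libtorch.py` come from the notes above, while `SELECTED_OP_LIST` and the yaml path are assumptions about how the op list would be passed in):

```shell
#!/bin/sh
# Sketch of the leaner build path, run from a PyTorch source checkout.
# Assumption: SELECTED_OP_LIST is how the selective op list gets passed in;
# the yaml path below is a placeholder.
export BUILD_LITE_INTERPRETER=1
export SELECTED_OP_LIST="$PWD/lightweight_dispatch_ops.yaml"
echo "BUILD_LITE_INTERPRETER=$BUILD_LITE_INTERPRETER"
echo "SELECTED_OP_LIST=$SELECTED_OP_LIST"
# Build just libtorch (no Python bindings); commented out here because the
# actual build takes a long time:
# python tools/build_libtorch.py
```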

Tests should pass...

- prim::Print
- prim::tolist
- prim::abs.Scalar
# torch-mlir ops end
@makslevental (Collaborator, Author) Aug 30, 2022

Below here are ops that are imported somewhere in torch python source files (and thus need to be linked/registered/etc.). There is probably some overlap (I have not looked closely), but it's immaterial since this list is ultimately de-duplicated anyway.

@silvasean (Contributor)

I'd like to avoid having yet another file that needs to be updated as we add ops. Are the build-time savings from this massive? I think once we have the right caching, the difference between 1800 and 1400 won't matter much.

@makslevental (Collaborator, Author)

> I'd like to avoid having yet another file that needs to be updated as we add ops. Is the build time savings massive from this?

Yeah, I agree with this (dubious value at the cost of another file to maintain). My thinking was that the needed torch-mlir ops could be procedurally generated during this build step, e.g. using grep (that is in fact how I pulled them out).
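For illustration, that grep-based extraction could be sketched like so (hypothetical: the regex and the sample snippet are assumptions, and a tiny inline file stands in for torch-mlir's generated tablegen definitions):

```shell
#!/bin/sh
# Stand-in for torch-mlir's generated tablegen op definitions.
cat > /tmp/sample_ops.td <<'EOF'
def Torch_AtenAddTensorOp : Torch_Op<"aten.add.Tensor"> // "aten::add.Tensor"
def Torch_AtenMulTensorOp : Torch_Op<"aten.mul.Tensor"> // "aten::mul.Tensor"
def Torch_AtenAddTensorOp2 : Torch_Op<"aten.add.Tensor"> // "aten::add.Tensor"
EOF
# Pull out the aten:: op names and de-duplicate them, as the yaml list
# ultimately does anyway. Prints:
#   aten::add.Tensor
#   aten::mul.Tensor
grep -ho 'aten::[A-Za-z0-9_.]*' /tmp/sample_ops.td | sort -u
```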

> I think once we have the right caching the difference between 1800 and 1400 won't matter much.

Indeed it isn't massive, and in actuality, on these GHA instances, for whatever reason, that ~1400 turns out to be closer to ~1600.

On the other hand, the real benefit I see here is a lightweight/minimal libtorch.so that can be statically linked against by _jit_ir_importer.so; my understanding from conversations with @powderluv is that this is a wishlist item which would enable smoother packaging for environments that can't stomach the weight of stock libtorch.so.

@powderluv (Collaborator)

I think we can still statically link with the 1600 files (and the linker will auto-prune). But that is a nice-to-have, so that we don't link against anything called libtorch.so.

@makslevental (Collaborator, Author)

> I think we can still statically link with the 1600 files (and the linker will auto prune).

It won't. Dispatcher registration code is generated, and therefore kernels are not pruned (take a look at any of the build/aten/src/ATen/Register*.cpp files). You only get pruning if you take this path, because exactly that step (op registration codegen) is skipped.

@powderluv (Collaborator)

I think the pruning was an additional benefit. The more important thing is whether we can have all symbols resolved locally, so that we don't have to reach out and link against PyTorch's libtorch.so and can support a range of PyTorch versions (which we will still have to validate).
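One way to check that all symbols did resolve locally would be to inspect the shared object's dynamic dependencies (a sketch; /bin/ls stands in here for _jit_ir_importer.so, and availability of `ldd` on the build machine is an assumption):

```shell
#!/bin/sh
# Check that a binary carries no dynamic dependency on libtorch.so.
# /bin/ls is a stand-in; point this at _jit_ir_importer.so for the real check.
deps=$(ldd /bin/ls 2>/dev/null || true)
case "$deps" in
  *libtorch.so*) echo "still links libtorch.so dynamically" ;;
  *)             echo "no dynamic libtorch dependency" ;;
esac
```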

qedawkins pushed commits to nod-ai/torch-mlir that referenced this pull request (Oct 3, 2022).