forked from llvm/torch-mlir
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This also restructures the docs to a "roadmap" directory, to preserve previous roadmaps / allow retrospective "grading" of how we did.
- Loading branch information
Showing
2 changed files
with
174 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
# Roadmap as of beginning of 2021Q3 | ||
|
||
## Project status overview | ||
|
||
- TorchScript compilation: Significant work has gone into the TorchScript | ||
compilation workstream. Basic multi-layer perceptrons execute end-to-end, and | ||
significant strides have been taken towards ResNet and quantized programs. | ||
Additionally, a full TorchScript'able machine translation model (IDs to IDs; | ||
including beam search) has been identified as representative of the kind of | ||
challenging programs that the TorchScript ahead-of-time compilation flow will | ||
enable. | ||
|
||
- `acap_dispatch`: Discussions with stakeholders in the npcomp and PyTorch | ||
community have shifted the `acap_dispatch` workstream to upstream discussions | ||
(see [bug](https://github.com/pytorch/xla/issues/2854)). Work within npcomp on | ||
`acap_dispatch` is temporarily on hold. | ||
|
||
- RefBackend: The RefBackend workstream is temporarily on hold as well. The | ||
needs of the TorchScript compilation path are too complex (lists, dicts, error | ||
handling, runtime ABI) and the engineering resources too limited to | ||
meaningfully bring up an alternative backend. The decision going forward is to | ||
single-source on IREE as our needs become more complex. This is somewhat | ||
unfortunate, as the goal of the RefBackend was to somewhat defray the backend | ||
story and prevent single-sourcing on what at the time (~2020Q1) was perceived | ||
as a large external dependency. Somewhat mitigating this situation though is | ||
that in the intervening year, IREE has become significantly "leaner and | ||
meaner", and while still nontrivial, it has found a much more tightly scoped | ||
role that leans much more heavily on upstream infrastructure. In fact, | ||
inclusion of IREE in the LLVM project in some form now seems possible, which | ||
will make this dependency very natural. | ||
|
||
## Non-technical project status overview | ||
|
||
Community contributions have somewhat petered out due to the shifting focus of | ||
the project. This was somewhat expected as the early aspirations of the project | ||
met with the reality of available resourcing, ecosystem constraints, and more | ||
fine-grained understanding of stakeholder needs. We have brought on 1 new full | ||
time engineer to work on the project though. | ||
|
||
## Roadmap overview | ||
|
||
The project has converged on the TorchScript workstream as the primary effort: | ||
|
||
- TorchScript compilation: The goal of this project is to build the frontend of | ||
a truly next-generation ahead-of-time machine learning compiler. | ||
- Why this project is cool: This system is designed from day 1 to support | ||
features such as dynamic shapes, control flow, mutable variables, | ||
program-internal state, and non-Tensor types (scalars, lists, dicts) in a | ||
principled fashion. These features are essential for empowering an | ||
industry-level shift in the set of machine learning programs that are | ||
feasible to deploy with minimal effort across many devices (when combined | ||
with a backend using the advanced compilation techniques being developed | ||
elsewhere in the MLIR ecosystem). | ||
|
||
## TorchScript compilation | ||
|
||
The TorchScript compiler represents the bulk of core compiler effort in the | ||
npcomp project. | ||
[TorchScript](https://pytorch.org/docs/stable/jit_language_reference.html) is a | ||
restricted (more static) subset of Python, but even TorchScript is quite dynamic | ||
when compared to the needs of lower-levels of the compilation stack, especially | ||
systems like Linalg. The overarching theme of this project is building out | ||
compiler components that bridge that gap. As we do so, the recurring tradeoffs | ||
are: | ||
|
||
- user experience: we want a fairly unrestricted programming model -- that's | ||
what users like about PyTorch, and what enables users to deploy without | ||
significant modifications of their code. | ||
- feasibility of the compiler: we want a smart compiler that is feasible to | ||
implement (for our own sanity :) ) | ||
- excellent generated code quality: this is of course dependent on the backend | ||
which is paired with the frontend we are building, but there are a number of | ||
transformations that make sense before we reach the backend which strongly | ||
affect the quality of code generated from a backend. | ||
|
||
To give a concrete example, consider the problem of inferring the shapes of | ||
tensors at various points in the program. The more precision we have on the | ||
shapes, the better code can be emitted by a backend. But in general, users need | ||
to provide at least some information about their program to help the compiler | ||
understand what shapes are at different points in the program. The smarter our | ||
compiler algorithms are, the less information the user needs to provide. Thus, | ||
all 3 facets are interlinked and there is no single right answer -- we need to | ||
balance them for a workable system. | ||
|
||
To accomplish this goal, we are guided by a *model curriculum*, which consists | ||
of programs of escalating complexity, from a simple elementwise operation all | ||
the way to a full-blown end-to-end speech recognition program. Our development | ||
process consists of setting incremental objectives to build out new layers of | ||
the compiler to a satisfactory level on easier programs in the curriculum, and | ||
backfilling complexity as needed to extend to the harder programs. Ideally, this | ||
backfilling does not require deep conceptual changes to components, but is | ||
simply an application of extension points anticipated in the original design. | ||
The trick to making that happen is evaluating designs on enough programs from | ||
the curriculum to ensure that a solution is likely to generalize and satisfy our | ||
objectives, without getting bogged down in theoretical details. | ||
|
||
### 2021Q3 | ||
|
||
- Theme: Scale up the programs we can run end-to-end. | ||
- End-to-end execution of ResNet. | ||
- Significant strides towards end-to-end execution of the identified | ||
end-to-end machine translation model. | ||
- End-to-end execution of simple programs with lists. | ||
- End-to-end execution of simple stateful programs. | ||
- Significant strides towards end-to-end execution of two major "classes of | ||
models". Tentatively: transformer, LSTM. | ||
- Theme: Start feeling production-ey | ||
- For the simplest programs at least, get them running on IREE with | ||
performance competitive with other frontends | ||
- Stretch: Extend this result to ResNet. | ||
- Write initial "user manual" (and any supporting tools, packaging) for how to | ||
use the new frontend (+ backend integration points) to deploy something on | ||
IREE. | ||
- Redesign frontend API's as needed to be palatable to document. | ||
|
||
|
||
### 2021Q4 | ||
|
||
- Theme: Compiler becomes generally functional for a large class of programs | ||
- End-to-end execution of end-to-end MT (machine translation) program. | ||
- End-to-end execution of the two major "classes of models" added to the | ||
curriculum in Q3. | ||
- End-to-end execution of quantized model. | ||
- Identify/build TorchScript'able ASR (automatic speech recognition) program. | ||
- Significant strides towards end-to-end execution of ASR. | ||
- Bringing up new programs should be fairly quick and mechanical. | ||
- Theme: Pathfind next phase after initial compiler bringup. | ||
- Begin talks with potential users / applications to identify a useful | ||
"real" capstone project. | ||
- Goal: Demonstrate viability of the tools. | ||
- Goal: Start rallying support / interest more broadly. | ||
- Begin looking at training use cases. | ||
- Begin looking at building "anti-framework" numerical Python compiler layered | ||
on our TorchScript compiler. |