executorch

A unified ML software stack within the PyTorch platform for edge devices. It defines new compiler entry points as well as a state-of-the-art runtime.

https://fburl.com/executorch

Why ExecuTorch?

Compared to the legacy Lite Interpreter, there are some major benefits:

  • Performance wins compared to Lite Interpreter
  • Long term alignment with the direction of PyTorch infrastructure
    • Lite Interpreter relies on TorchScript, which is being phased out; ExecuTorch is the planned replacement for Lite Interpreter.
  • Model authoring & productivity gains
    • More and better-defined entry points for model-, device-, and/or use-case-specific optimizations (e.g., backend delegation, user-defined compiler transformations, default or user-defined memory planning)
    • Ability to lower constructs like dynamic control flow to run on device.

Meta Internal Users

See the Using PyTorch > Executorch wiki for pointers to internal workplace groups, how-tos, and other resources.

Docs

Better Engineering

Model Migration

Design goals

  • Minimal binary size (< 50KB not including kernels)
  • Minimal framework tax: loading program, initializing executor, kernel and backend-delegate dispatch, runtime memory utilization
  • Portable (cross-compile across many toolchains)
  • Executes ATen kernels (or ATen custom kernels)
  • Executes custom op kernels
  • Supports inter-op asynchronous execution
  • Supports static memory allocation (heapless)
  • Supports custom allocation across memory hierarchies
  • Supports control flow needed by models
  • Allows selective build of kernels
  • Allows backend delegation with lightweight interface

Terminology

ATen mode

ATen mode uses the ATen (PyTorch core) implementation of Tensor (at::Tensor), along with related types (ScalarType, etc.).

  • at::Tensor is big and complex, and often allocates memory with new/malloc
  • The ATen kernels, which rely on the full at::Tensor API, are usable in this configuration
  • Those kernels also tend to do dynamic memory allocation, and often have extra flexibility (and thus overhead) to handle things not needed by mobile/embedded clients: e.g., CUDA support, sparse tensor support, dtype promotion

Lean mode

Lean mode uses Executorch's smaller torch::executor::Tensor (aka ETensor) implementation, along with related types (torch::executor::ScalarType, etc.).

  • ETensor's API is a source-compatible subset of at::Tensor. Code that is written against ETensor can also build against at::Tensor.
  • "Lean mode kernels" are any operator implementations written to be compatible with ETensor. That means they can also build against at::Tensor if desired, and can be used in the same model as ATen kernels.
  • ETensor does not own or allocate memory on its own
    • (TODO(T133200526): NOTE: Dynamic shapes are not yet supported. Remove this warning when they are.) To support dynamic shapes, kernels can allocate Tensor data using the MemoryAllocator provided by the client.
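
The key property above is that the tensor is non-owning: the client supplies both the shape metadata and the data buffer, and the tensor merely views them. A toy sketch of that idea (with hypothetical type and field names, not the real torch::executor::Tensor API):

```cpp
#include <cassert>
#include <cstddef>

// Toy non-owning tensor: the caller owns both the shape array and the
// data buffer; the tensor never allocates or frees anything.
// Hypothetical names -- not the real torch::executor::Tensor API.
struct ToyTensor {
  const int* sizes;  // caller-owned shape array
  size_t dim;        // number of dimensions
  float* data;       // caller-owned data buffer

  // Total number of elements, computed from the caller-owned shape.
  size_t numel() const {
    size_t n = 1;
    for (size_t i = 0; i < dim; ++i) {
      n *= static_cast<size_t>(sizes[i]);
    }
    return n;
  }
};
```

Because the struct holds only pointers, the client is free to place the underlying buffers in static storage, a memory pool, or a memory-mapped region, which is what enables the heapless configurations described under the design goals.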

Portable kernels

See //executorch/kernels/portable/README.md for technical details.

Portable kernels, which live under //executorch/kernels/portable, are:

  • Lean mode kernels
  • Compatible with ATen operator signatures
  • Written in portable C++ so that they can build for any target
  • Written as reference implementations, prioritizing clarity and simplicity over optimization
  • Generally much smaller in code size than ATen kernels
  • Written to avoid dynamically allocating memory using new/malloc
    • (TODO(T133200526): NOTE: Dynamic shapes are not yet supported. Remove this warning when they are.) To support dynamic shapes, some kernels may allocate Tensor data using the MemoryAllocator provided by the client.
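
Taken together, these constraints lead to "out-variant" kernels: the caller provides the output buffer, so the kernel itself never allocates, and the body is a plain portable loop. The sketch below is a simplified, hypothetical signature in that style, not an actual portable kernel (real ones operate on Tensor arguments and match ATen operator schemas):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of an out-variant kernel in the portable style: the caller
// supplies the output buffer, so the kernel never allocates. Plain loop,
// no SIMD -- clarity over speed, as with the portable reference kernels.
// Hypothetical, simplified signature for illustration only.
void add_out(const float* a, const float* b, float* out, size_t numel) {
  for (size_t i = 0; i < numel; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

Optimized backends can later swap in faster implementations behind the same operator signature without changing the model or the runtime.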

Local tests

General tests

buck2 test fbcode//executorch/...

Run a model in lean mode

  • Uses the lean Executorch Tensor class and related types
  • Uses the kernels under //executorch/kernels/portable instead of the ATen kernels
buck2 run fbcode//executorch/test:executor_runner -- --model_path=fbcode/executorch/test/models/linear_out.ff

Run a model in ATen mode

  • Uses the ATen Tensor instead of the lean Executorch Tensor, so that all ATen kernels can be leveraged
  • Note that ATen mode can significantly increase binary size
buck2 run fbcode//executorch/test:executor_runner_aten -- --model_path=fbcode/executorch/test/models/linear_out.ff

Special build modes

Android/mobile builds

In xplat:

buck2 build @fbandroid/mode/opt @fbandroid/mode/ndk_libcxx -c user.ndk_cxxflags="-frtti -fexceptions" fbsource//xplat/executorch/test:executor_runner

ARVR builds

In xplat:

buck2 build @arvr/mode/android/linux/opt-stripped -c ndk.custom_libcxx=false fbsource//xplat/executorch/test:executor_runner