
SmolLM is ExecuTorch Compatible #34879

@guangy10

Feature request

Enable SmolLM in the "Export to ExecuTorch" workflow.

Instructions

To enable this model for ExecuTorch:

  1. Export the model to ExportIR. For an LLM, you will typically need to export the model with a cache to get good runtime performance. Issue #34101 ("Llama3 and Llama2 are ExecuTorch compatible") is a reference for how to export and validate the model. Note that you may run into export issues that require fixes in the modeling code.
  2. Lower the model to ExecuTorch (to generate a .pte file). You will need to clone the GitHub repo and create a recipe to lower the model; lowering to XNNPACK is the simplest option (a minimal sketch is shown after this list). See the example code here: https://github.com/pytorch/executorch/blob/release/0.4/extension/export_util/export_hf_model.py#L89-L106
  3. Run the model with ExecuTorch. You can follow these instructions to build and run the executor runtime for Llama: https://github.com/pytorch/executorch/tree/release/0.4/examples/models/llama2#step-4-run-on-your-computer-to-validate
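
Putting steps 1 and 2 together, a minimal export-and-lower sketch could look like the code below. This is only an illustration, not the official recipe: the checkpoint name (`HuggingFaceTB/SmolLM-135M`), the cache-aware export helper (`convert_and_export_with_cache`), the generation-config fields, the `XnnpackPartitioner` import path, and the output filename are assumptions that depend on the transformers and ExecuTorch versions installed, so verify them against the linked example script.

```python
# Illustrative sketch only: API names and import paths below are assumptions
# to check against your installed transformers / ExecuTorch versions.
import torch
from transformers import AutoModelForCausalLM, GenerationConfig

# Step 1: export the model with a static KV cache to ExportIR.
# convert_and_export_with_cache lives in transformers' ExecuTorch integration
# in recent releases; older versions may not ship it.
from transformers.integrations.executorch import convert_and_export_with_cache

model_id = "HuggingFaceTB/SmolLM-135M"  # assumed checkpoint name
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# The cache-aware export path expects a static cache configuration on the
# generation config (field names assumed from the transformers integration).
model.generation_config = GenerationConfig(
    use_cache=True,
    cache_implementation="static",
    cache_config={"batch_size": 1, "max_cache_len": 128},
)

exported_program = convert_and_export_with_cache(model)

# Step 2: lower to ExecuTorch, delegating to XNNPACK, and save a .pte file.
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import EdgeCompileConfig, to_edge

edge = to_edge(
    exported_program,
    compile_config=EdgeCompileConfig(_check_ir_validity=False),
)
edge = edge.to_backend(XnnpackPartitioner())
executorch_program = edge.to_executorch()

with open("smollm_xnnpack.pte", "wb") as f:
    f.write(executorch_program.buffer)
```

If this runs cleanly you end up with `smollm_xnnpack.pte` on disk, which is the artifact the runner in step 3 consumes.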

(Optional) Congrats! Once you complete steps 1-3, you will be able to run the model on a host machine. If you want to go further and make the model faster, smaller, and cheaper for your use case, you can create more complex recipes with quantization and delegation to different hardware accelerators (a hedged quantization sketch follows below). You can find more tutorials on our website, for example on optimizing and running the model with Core ML on Apple's platforms: https://pytorch.org/executorch/stable/build-run-coreml.html
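
As one illustration of what such a recipe can look like, here is a hedged sketch of post-training quantization with the PT2E flow plus XNNPACK delegation (not the Core ML path from the linked tutorial). The capture and quantizer entry points (`torch.export.export_for_training`, `XNNPACKQuantizer`) have moved between releases, and the `quantize_and_lower` helper is hypothetical, so treat the import paths and names below as assumptions to check against your installed torch/executorch versions.

```python
# Hedged sketch of a quantize-then-delegate recipe. Import paths for the PT2E
# quantizer and the training-time capture API vary across torch / executorch
# releases; verify them against the versions you use.
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge


def quantize_and_lower(model: torch.nn.Module, example_inputs: tuple) -> bytes:
    """Quantize a model with XNNPACK's PT2E quantizer and lower it to a .pte buffer."""
    # Capture the model for quantization (export_for_training on newer torch;
    # older releases used torch._export.capture_pre_autograd_graph instead).
    captured = torch.export.export_for_training(model, example_inputs).module()

    quantizer = XNNPACKQuantizer()
    quantizer.set_global(get_symmetric_quantization_config(is_per_channel=True))

    prepared = prepare_pt2e(captured, quantizer)
    prepared(*example_inputs)  # calibration pass (use representative data in practice)
    quantized = convert_pt2e(prepared)

    # Re-export the quantized module and delegate the supported subgraphs to XNNPACK.
    exported = torch.export.export(quantized, example_inputs)
    program = (
        to_edge(exported)
        .to_backend(XnnpackPartitioner())
        .to_executorch()
    )
    return program.buffer
```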

Motivation

See details in #32253

Your contribution

TBD
