
# Triton-rust: A Hugging Face inference example

This example shows how to use the library to run inference on a language model from Hugging Face's `transformers` library.

## Setting up the Triton Inference Server model

The first step is to install the Hugging Face `transformers` Python library and convert a model to ONNX.
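For example, the export tooling can be installed with pip. This is a minimal sketch; the `[onnx]` extra and the PyTorch dependency are assumptions about what the `transformers.onnx` exporter needs in your environment:

```shell
# Install transformers with its ONNX export extras plus a backend framework (assumed setup)
pip install "transformers[onnx]" torch
```

The model can then be exported to ONNX with the command below.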

```shell
python -m transformers.onnx --model=distilbert-base-uncased onnx/ --feature=masked-lm
```

Here we use `distilbert-base-uncased` as an example.

An example config file for this model is given here.
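For orientation, a Triton `config.pbtxt` for this export could look like the sketch below. The tensor names (`input_ids`, `attention_mask`, `logits`), the maximum batch size, and the vocabulary dimension (30522) are assumptions based on the standard masked-lm export of `distilbert-base-uncased`; verify them against the actual ONNX file and the config shipped with this example.

```
# Sketch of model_repository/distilbert-base-uncased/config.pbtxt (assumed tensor names and shapes)
name: "distilbert-base-uncased"
platform: "onnxruntime_onnx"
max_batch_size: 8

input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]          # sequence length
  },
  {
    name: "attention_mask"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]

output [
  {
    name: "logits"
    data_type: TYPE_FP32
    dims: [ -1, 30522 ]   # per-token scores over the vocabulary
  }
]
```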

## Building the example

You can build the example with the following command:

```shell
make triton-example-huggingface
```
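If you prefer to call Cargo directly, the make target presumably wraps a standard example build in release mode; this is an assumption about the Makefile, so check it if the command below does not match your setup.

```shell
# Assumed Cargo equivalent of the make target above
cargo build --release --example triton-example-huggingface
```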

## Running the example
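The example expects a Triton Inference Server that is already serving the exported model. A minimal sketch of starting one with Docker is shown below; the model repository layout and the image tag are assumptions, so adapt them to your environment.

```shell
# Assumed model repository layout:
#   model_repository/distilbert-base-uncased/config.pbtxt
#   model_repository/distilbert-base-uncased/1/model.onnx
docker run --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v $(pwd)/model_repository:/models \
  nvcr.io/nvidia/tritonserver:22.07-py3 \
  tritonserver --model-repository=/models
```

With the server up, run the built example binary: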

```shell
target/release/examples/triton-example-huggingface
```