NeuronX TGI: Text Generation Inference for AWS Inferentia2

NeuronX TGI is distributed as Docker images for EC2 and SageMaker.

These Docker images integrate:

  • the AWS Neuron SDK for Inferentia2,
  • the Text Generation Inference launcher and scheduling front-end,
  • a Neuron-specific inference server for text generation.

Usage

Please refer to the official documentation.
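
As a rough illustration only, the sketch below shows how a client might query a running container through the standard Text Generation Inference `/generate` HTTP route. The address `http://127.0.0.1:8080` and the helper names `build_generate_request` and `generate` are assumptions made for this example; the actual host, port, and launch options depend on how the container was started, as described in the official documentation.

```python
import json
import urllib.request

# Assumption: a NeuronX TGI container is already running and reachable here.
# The real host and port depend on how the container was launched.
DEFAULT_URL = "http://127.0.0.1:8080"


def build_generate_request(prompt, max_new_tokens=32, base_url=DEFAULT_URL):
    """Build (but do not send) a POST request for the TGI /generate route."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a live server)."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["generated_text"]
```

With a server running, `generate("What is Deep Learning?", max_new_tokens=20)` would return the completion as a string. Building the request separately from sending it keeps the payload easy to inspect without a live endpoint.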

Build your own image

The image must be built from the top-level directory of the repository:

make neuronx-tgi