XinyuYe-Intel/optimum

 
 

Hugging Face - Optimum

🤗 Optimum is an extension of 🤗 Transformers that provides a set of performance optimization tools for training and running models with maximum efficiency on targeted hardware.

The AI ecosystem evolves quickly, and more specialized hardware, each with its own optimizations, emerges every day. Optimum therefore lets users target any of these platforms with the same ease they expect from 🤗 Transformers.

Integration with Hardware Partners

🤗 Optimum aims to broaden the range of hardware that users can target to train and fine-tune their models.

To achieve this, we are collaborating with the following hardware manufacturers in order to provide the best transformers integration:

Optimizing models towards inference

In addition to supporting dedicated AI hardware for training, Optimum provides inference optimizations for various frameworks and platforms.

We currently support ONNX Runtime and Intel Neural Compressor (INC).

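| Features                           | ONNX Runtime  | Intel Neural Compressor |
|------------------------------------|---------------|-------------------------|
| Post-training Dynamic Quantization | ✔️            | ✔️                      |
| Post-training Static Quantization  | ✔️            | ✔️                      |
| Quantization Aware Training (QAT)  | Stay tuned! ⭐ | ✔️                      |
| Pruning                            | N/A           | ✔️                      |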
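
To make the difference between the two post-training quantization rows concrete, here is a minimal pure-Python sketch. It is not the Optimum, ONNX Runtime, or INC API: the function names (`scale_for`, `quantize_with_scale`, `quantized_dot`) are hypothetical, and the symmetric per-tensor int8 scheme is an assumption chosen for brevity. In both modes the weights are quantized ahead of time. Dynamic quantization derives the activation scale from each input at run time, whereas static quantization fixes it beforehand from calibration samples.

```python
from typing import List, Optional


def scale_for(values: List[float]) -> float:
    """Symmetric int8 scale: maps the largest magnitude to 127."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to 1.0 so an all-zero tensor does not cause a division by zero.
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize_with_scale(values: List[float], scale: float) -> List[int]:
    """Round to the nearest integer code and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def quantized_dot(
    weight_codes: List[int],
    weight_scale: float,
    activations: List[float],
    act_scale: Optional[float] = None,
) -> float:
    """Integer dot product, rescaled back to floating point."""
    if act_scale is None:
        # Dynamic mode: compute the activation scale from this input.
        act_scale = scale_for(activations)
    act_codes = quantize_with_scale(activations, act_scale)
    acc = sum(w * a for w, a in zip(weight_codes, act_codes))
    return acc * weight_scale * act_scale


# Weights are quantized once, ahead of time, in both modes.
weights = [0.5, -1.0, 0.25]
w_scale = scale_for(weights)
w_codes = quantize_with_scale(weights, w_scale)

activations = [2.0, 1.0, -4.0]  # exact float dot product is -1.0

# Dynamic: activation scale computed per call.
dynamic_result = quantized_dot(w_codes, w_scale, activations)

# Static: activation scale fixed from (made-up) calibration samples.
calibrated_scale = scale_for([3.0, -5.0, 1.5])
static_result = quantized_dot(w_codes, w_scale, activations, calibrated_scale)

print(dynamic_result, static_result)
```

Both results land close to the exact value of -1.0. The static variant skips the per-call scale computation, at the cost of depending on how representative the calibration samples are.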

Install

🤗 Optimum can be installed using pip as follows:

pip install optimum

To install 🤗 Optimum with the Intel Neural Compressor (INC) or ONNX Runtime dependencies, use the corresponding command:

pip install optimum[intel]

pip install optimum[onnxruntime]
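
After installing, a quick way to see which optional backends are importable is sketched below. It uses only the standard library. The mapping from extra name to import name (`onnxruntime`, `neural_compressor`) is an assumption about how those packages are imported, and `available_backends` is a hypothetical helper, not part of Optimum.

```python
from importlib.util import find_spec
from typing import Dict

# Assumption: top-level import name that each optional extra provides.
OPTIONAL_BACKENDS: Dict[str, str] = {
    "onnxruntime": "onnxruntime",
    "intel": "neural_compressor",
}


def available_backends(backends: Dict[str, str] = OPTIONAL_BACKENDS) -> Dict[str, bool]:
    """Map each extra name to True if its module can be found without importing it."""
    return {extra: find_spec(module) is not None for extra, module in backends.items()}


if __name__ == "__main__":
    for extra, found in available_backends().items():
        hint = "installed" if found else f"missing (try: pip install optimum[{extra}])"
        print(f"{extra}: {hint}")
```

`find_spec` only locates the module, so the check stays fast and does not trigger any heavy backend initialization.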

If you'd like to try the examples, or need the latest code and can't wait for a new release, you can install the library from source:

pip install git+https://github.com/huggingface/optimum.git
