Nlp-toolkit is a powerful toolkit for automatically applying model optimizations to Natural Language Processing (NLP) models. It leverages Intel® Neural Compressor to provide a variety of optimization methods: quantization, pruning, distillation, and so on. The toolkit supports multiple deep learning frameworks, such as PyTorch and TensorFlow. For PyTorch models, it also supports the NNCF provider for optimization.
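Since the toolkit drives quantization through Intel® Neural Compressor, here is a minimal sketch of what post-training dynamic quantization of a Hugging Face model can look like with that underlying library; the model checkpoint, output path, and configuration values are illustrative assumptions and are not taken from this repository.

```python
# Minimal sketch (assumed model name and output path) of post-training
# dynamic quantization with Intel® Neural Compressor, the library this
# toolkit builds on.
from transformers import AutoModelForSequenceClassification
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

# Any PyTorch transformer model works here; this checkpoint is just an example.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

# Dynamic quantization needs no calibration data: weights are stored in INT8
# and activations are quantized on the fly at inference time.
conf = PostTrainingQuantConfig(approach="dynamic")
q_model = fit(model=model, conf=conf)

# Save the quantized model (weights plus quantization config) for deployment.
q_model.save("./quantized_model")
```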
Extending Hugging Face transformers APIs for Transformer-based models and improving the productivity of inference deployment. With extremely compressed models, the toolkit can greatly improve inference efficiency on Intel platforms.
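As a rough continuation of the sketch above (same assumed model name and path), the compressed model can be restored on top of the original FP32 architecture and used for inference; `neural_compressor.utils.pytorch.load`, the checkpoint directory, and the sample input are used here for illustration only.

```python
# Sketch continued: reload the saved INT8 model and run inference.
# The checkpoint directory, model name, and input text are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from neural_compressor.utils.pytorch import load

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
fp32_model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Restore the quantized weights onto the FP32 model definition.
int8_model = load("./quantized_model", fp32_model)

inputs = tokenizer("A genuinely enjoyable film.", return_tensors="pt")
with torch.no_grad():
    prediction = int8_model(**inputs).logits.argmax(dim=-1)
print(prediction)
```

On Intel CPUs the INT8 model typically reduces memory footprint and speeds up the matrix multiplications relative to the FP32 baseline, which is the inference-efficiency gain the description above refers to.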
Languages
- C++ 57.1%
- Python 41.6%
- Other 1.3%