
NLP Toolkit: Optimization for Natural Language Processing (NLP) Models

NLP Toolkit automatically applies model optimizations to Natural Language Processing (NLP) models. It leverages Intel® Neural Compressor to provide a variety of model compression techniques: quantization, pruning, distillation, and more.

What does NLP Toolkit offer?

The toolkit improves developer productivity by extending the Hugging Face Transformers APIs with easy-to-use model compression APIs for deep learning models in the NLP domain, and accelerates inference using the compressed models.

  • Model Compression

    Framework    Quantization   Pruning/Sparsity   Distillation     AutoDistillation
    PyTorch      ✔              ✔                  ✔                ✔
    TensorFlow   ✔              ✔                  Stay tuned ⭐     Stay tuned ⭐
  • Data Augmentation for NLP Datasets

  • Neural Engine for Reference Deployment

Getting Started

Installation

Install Dependency

pip install -r requirements.txt

Install NLP Toolkit

git clone https://github.com/intel-innersource/frameworks.ai.nlp-toolkit.intel-nlp-toolkit.git nlp_toolkit
cd nlp_toolkit
git submodule update --init --recursive
python setup.py install
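
A quick sanity check that the package is importable (an optional step, not part of the documented install):

python -c "import nlp_toolkit; print(nlp_toolkit.__file__)"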

Quantization

from nlp_toolkit import QuantizationConfig, metrics, objectives
from nlp_toolkit.optimization.trainer import NLPTrainer

# Replace transformers.Trainer with NLPTrainer
# trainer = transformers.Trainer(...)
trainer = NLPTrainer(...)
metric = metrics.Metric(name="eval_f1", is_relative=True, criterion=0.01)  # tolerate at most a 1% relative drop in eval_f1
q_config = QuantizationConfig(
    approach="PostTrainingStatic",
    metrics=[metric],
    objectives=[objectives.performance]
)
model = trainer.quantize(quant_config=q_config)

Please refer to the quantization document for more details.
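
The NLPTrainer(...) above elides the standard trainer setup. Since NLPTrainer is a drop-in replacement for transformers.Trainer, it accepts the same constructor arguments; below is a minimal sketch for an SST-2-style classification task (the checkpoint name, dataset, and tokenization are illustrative assumptions, not toolkit requirements).

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TrainingArguments)
from nlp_toolkit.optimization.trainer import NLPTrainer

model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

raw = load_dataset("glue", "sst2")
encoded = raw.map(
    lambda batch: tokenizer(batch["sentence"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True,
)

# NLPTrainer takes the same arguments as transformers.Trainer;
# the eval set doubles as calibration data for post-training static quantization.
trainer = NLPTrainer(
    model=model,
    args=TrainingArguments(output_dir="./saved_results"),
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,
)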

Pruning

from nlp_toolkit import PrunerConfig, PruningConfig, metrics
from nlp_toolkit.optimization.trainer import NLPTrainer

# Replace transformers.Trainer with NLPTrainer
# trainer = transformers.Trainer(...)
trainer = NLPTrainer(...)
metric = metrics.Metric(name="eval_accuracy")
pruner_config = PrunerConfig(prune_type='BasicMagnitude', target_sparsity_ratio=0.9)
p_conf = PruningConfig(pruner_config=[pruner_config], metrics=metric)
model = trainer.prune(pruning_config=p_conf)

Please refer to the pruning document for more details.
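
BasicMagnitude pruning zeroes out the smallest-magnitude weights until the target sparsity ratio is reached. A quick way to verify the achieved sparsity on the pruned model is plain PyTorch, not a toolkit API (a sketch; it assumes the pruned weights stay accessible via trainer.model):

import torch

def weight_sparsity(model: torch.nn.Module) -> float:
    """Return the fraction of zero-valued elements across all weight tensors."""
    total = zeros = 0
    for name, param in model.named_parameters():
        if name.endswith("weight"):
            total += param.numel()
            zeros += int((param == 0).sum().item())
    return zeros / max(total, 1)

# With target_sparsity_ratio=0.9, roughly 90% of weight elements should be zero.
print(f"achieved sparsity: {weight_sparsity(trainer.model):.2%}")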

Distillation

from nlp_toolkit import DistillationConfig, Criterion, metrics
from nlp_toolkit.optimization.trainer import NLPTrainer

# Replace transformers.Trainer with NLPTrainer
# trainer = transformers.Trainer(...)
teacher_model = ... # an existing fine-tuned model
trainer = NLPTrainer(...)
metric = metrics.Metric(name="eval_accuracy")
d_conf = DistillationConfig(metrics=metric)
model = trainer.distill(distillation_config=d_conf, teacher_model=teacher_model)

Please refer to the distillation document for more details.
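
The teacher is any already fine-tuned model for the same task, while the student is the smaller model wrapped by the trainer. A sketch of loading both with the standard transformers loaders (the checkpoint names are illustrative):

from transformers import AutoModelForSequenceClassification

# Already fine-tuned, larger teacher (illustrative checkpoint name).
teacher_model = AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-SST-2")
# Smaller student; pass it as the model argument of NLPTrainer above.
student_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased")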

Data Augmentation

Data augmentation provides facilities to generate synthesized NLP datasets for further model optimization. It supports text generation with popular fine-tuned models such as GPT and GPT-2, as well as other text-synthesis approaches from nlpaug.

import os
from datasets import load_dataset
from nlp_toolkit.preprocessing.data_augmentation import DataAugmentation

result_path = "./output"  # any writable directory
aug = DataAugmentation(augmenter_type="TextGenerationAug")
aug.input_dataset = "original_dataset.csv"  # example: https://huggingface.co/datasets/glue/viewer/sst2/train
aug.column_names = "sentence"
aug.output_path = os.path.join(result_path, "augmented_dataset.csv")
aug.augmenter_arguments = {'model_name_or_path': 'gpt2-medium'}
aug.data_augment()
raw_datasets = load_dataset("csv", data_files=aug.output_path, delimiter="\t", split="train")

Please refer to the data augmentation document for more details.
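
Because the augmented file is plain tab-separated text, it can be merged back into the original training split with standard datasets utilities. A sketch, assuming the augmented file carries the same "sentence" and "label" columns as the source dataset:

from datasets import concatenate_datasets, load_dataset

original = load_dataset("glue", "sst2", split="train")
augmented = load_dataset("csv", data_files=aug.output_path,
                         delimiter="\t", split="train")

# concatenate_datasets requires identical columns, so drop extras (e.g. "idx")
# and align the label dtype (ClassLabel vs plain int).
original = original.remove_columns(
    [c for c in original.column_names if c not in ("sentence", "label")]
)
augmented = augmented.cast(original.features)
combined = concatenate_datasets([original, augmented])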

Neural Engine

Neural Engine is one of the reference deployments provided by NLP Toolkit. It aims to demonstrate the optimal performance of extremely compressed NLP models by exploiting optimization opportunities in both hardware and software.

from nlp_toolkit.backends.neural_engine.compile import compile
# /path/to/your/model is a TensorFlow pb model or ONNX model
model = compile('/path/to/your/model')
inputs = ... # [input_ids, segment_ids, input_mask]
model.inference(inputs)

Please refer to the Neural Engine document for more details.
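
For a BERT-style model, the three inputs correspond to the token ids, token type ids, and attention mask. A sketch of producing them with a transformers tokenizer, under the assumption that the compiled graph takes int32 NumPy arrays in this order:

import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative
enc = tokenizer("An example sentence for inference.",
                padding="max_length", max_length=128, return_tensors="np")

# Assumption: the compiled graph expects int32 tensors in this order.
inputs = [enc["input_ids"].astype(np.int32),
          enc["token_type_ids"].astype(np.int32),
          enc["attention_mask"].astype(np.int32)]
output = model.inference(inputs)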
