update example config json and README (intel#528)

sywangyi · Jan 10, 2023 · 99a0695
1 parent 7ad7525
Showing 8 changed files with 2,186 additions and 579 deletions.
diff --git a/README.md b/README.md
@@ -108,10 +108,10 @@ python setup.py install
 ```
 >**Note**: Recommend install protobuf <= 3.20.0 if use onnxruntime <= 1.11
 
-## Getting Started
+## Getting Starteds
 ### Quantization
 ```python
-from intel_extension_for_transformers.optimization import QuantizationConfig, metric, objectives
+from intel_extension_for_transformers.optimization import QuantizationConfig, metrics, objectives
 from intel_extension_for_transformers.optimization.trainer import NLPTrainer
 
 # Replace transformers.Trainer with NLPTrainer
@@ -160,34 +160,6 @@ model = trainer.distill(distillation_config=d_conf, teacher_model=teacher_model)
 
 Please refer to [distillation document](docs/distillation.md) for more details.
 
-### Data Augmentation
-Data augmentation provides the facilities to generate synthesized NLP dataset for further model optimization. The data augmentation supports text generation on popular fine-tuned models like GPT, GPT2, and other text synthesis approaches from [nlpaug](https://github.com/makcedward/nlpaug).
-
-```python
-from intel_extension_for_transformers.preprocessing.data_augmentation import DataAugmentation
-aug = DataAugmentation(augmenter_type="TextGenerationAug")
-aug.input_dataset = "original_dataset.csv" # example: https://huggingface.co/datasets/glue/viewer/sst2/train
-aug.column_names = "sentence"
-aug.output_path = os.path.join(self.result_path, "test2.cvs")
-aug.augmenter_arguments = {'model_name_or_path': 'gpt2-medium'}
-aug.data_augment()
-raw_datasets = load_dataset("csv", data_files=aug.output_path, delimiter="\t", split="train")
-```
-
-Please refer to [data augmentation document](docs/data_augmentation.md) for more details.
-
-### Neural Engine
-Neural Engine is one of reference deployments that Intel Extension for Transformers provides. Neural Engine aims to demonstrate the optimal performance of extremely compressed NLP models by exploring the optimization opportunities from both HW and SW.
-
-```python
-from intel_extension_for_transformers.backends.neural_engine.compile import compile
-# /path/to/your/model is a TensorFlow pb model or ONNX model
-model = compile('/path/to/your/model')
-inputs = ... # [input_ids, segment_ids, input_mask]
-model.inference(inputs)
-```
-
-Please refer to [Neural Engine](examples/deployment/) for more details.
 
 ### Quantized Length Adaptive Transformer
 Quantized Length Adaptive Transformer leverages sequence-length reduction and low-bit representation techniques to further enhance model inference performance, enabling adaptive sequence-length sizes to accommodate different computational budget requirements with an optimal accuracy efficiency tradeoff.
@@ -227,20 +199,6 @@ model.inference(inputs)
 
 Please refer to [example](examples/deployment/neural_engine/sparse/distilbert_base_uncased/) in [Transformers-accelerated Neural Engine](examples/deployment/) and paper [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) for more details.
 
-### Transformers-accelerated Libraries
-Transformers-accelerated Libraries is a high-performance operator computing library implemented by assembly. Transformers-accelerated Libraries contains a JIT domain, a kernel domain, and a scheduling proxy framework.
-
-```C++
-#include "interface.hpp"
-  ...
-  operator_desc op_desc(ker_kind, ker_prop, eng_kind, ts_descs, op_attrs);
-  sparse_matmul_desc spmm_desc(op_desc);
-  sparse_matmul spmm_kern(spmm_desc);
-  std::vector<const void*> rt_data = {data0, data1, data2, data3, data4};
-  spmm_kern.execute(rt_data);
-```
-Please refer to [Transformers-accelerated Libraries](intel_extension_for_transformers/backends/neural_engine/kernels/README.md) for more details.
-
 
 ## System Requirements
 ### Validated Hardware Environment