This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Origin/nightly update rebased #48

Merged
36 commits merged on Mar 6, 2023
Commits
7cc029b
Added feature and example of NAS (#516)
XinyuYe-Intel Jan 17, 2023
af900cb
Simplize readme (#564)
VincyZhang Feb 9, 2023
7b28a96
Udpate onnx version (#572)
VincyZhang Feb 10, 2023
067d3e9
add docstring to optimization (#573)
violetch24 Feb 13, 2023
79fecb3
Refactor TF quantization/pruning/distillation examples document (#571)
Spycsh Feb 14, 2023
f5ccff4
[Kernels] Trans MHA merge lnormalized spmm (#558)
zhewang1-intc Feb 15, 2023
4e0c59a
sync external repo (#590)
VincyZhang Feb 15, 2023
5af08e9
Document fix (#591)
VincyZhang Feb 15, 2023
d927f45
Add showcase bloom (#592)
VincyZhang Feb 15, 2023
d4db8c2
[Kernels] visualize sparsity script (#454)
yuchengliu1 Feb 15, 2023
155322c
Enhance compile op registering (#584)
zhentaoyu Feb 15, 2023
b78c486
add base and large bert example to pruner (#560)
n1ck-guo Feb 16, 2023
52d4e95
[Engine]: add squeeze op and binary ops (#456)
zhenwei-intel Feb 17, 2023
889e0c7
docstring (#599)
XuhuiRen Feb 17, 2023
024c774
[Kernels] fix improper-null-terminator and MHA cpplint (#594)
sunjiweiswift Feb 17, 2023
d8737e5
[Neural Engine] Add the code to support tiny vit HF model (#561)
a32543254 Feb 17, 2023
9d0d4a7
Changed to quantize SetFit model with INC (#606)
XinyuYe-Intel Feb 21, 2023
0ab4f30
Wangwenqi add op (#596)
CeciliaWwq Feb 21, 2023
12ad7ad
Zhenzhong/op attr (#604)
Zhenzhong1 Feb 21, 2023
9034a88
BinaryOP->BinaryOp frontend (#613)
Zhenzhong1 Feb 21, 2023
0264c24
fix of JIRA-391: windows build issue (#588)
luoyu-intel Feb 21, 2023
cac82c7
add gpt-neox example (#540)
violetch24 Feb 22, 2023
ac3fc3c
added multi-nodes QAT support for Question Answering and Text Classif…
XinyuYe-Intel Feb 23, 2023
78aeecb
Guoheng/fix bug 432 (#587)
n1ck-guo Feb 24, 2023
d144c4e
[Kernels] bugfix benchmark spmm (#611)
sunjiweiswift Feb 25, 2023
21a5e94
Remove redundant code (#616)
VincyZhang Feb 27, 2023
a0ce89f
add image classification example (#225)
lkk12014402 Feb 27, 2023
90d8ce3
[Kernels] Refine headers for library compatibility and documents (#605)
airMeng Feb 28, 2023
16d2722
[Kernels] Reference impl and UT for Dense MHA with dynamic quantizati…
yi1ding Feb 28, 2023
b17f99e
update main page (#651)
VincyZhang Mar 1, 2023
bef7c5e
fix klocwork issues (#649)
zhenwei-intel Mar 1, 2023
d0d14a2
Fix sparse bert mini example (#647)
a32543254 Mar 2, 2023
8f1689d
fix for pruning import (#653)
violetch24 Mar 2, 2023
5120f03
update README (#655)
violetch24 Mar 2, 2023
8ff21f7
Support gather with pytorch interface (#607)
yuchengliu1 Mar 2, 2023
2b78dc7
Update README.md
VincyZhang Mar 3, 2023
2 changes: 1 addition & 1 deletion README.md
@@ -8,7 +8,7 @@
* Advanced software optimizations and unique compression-aware runtime (released with NeurIPS 2022's paper [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) and [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114), and NeurIPS 2021's paper [Prune Once for All: Sparse Pre-Trained Language Models](https://arxiv.org/abs/2111.05754))


* Accelerated end-to-end Transformer-based applications such as [Stable Diffusion](./examples/optimization/pytorch/huggingface/textual_inversion), [GPT-J-6B](./examples/optimization/pytorch/huggingface/language-modeling/inference/README.md#GPT-J), [BLOOM-176B](./examples/optimization/pytorch/huggingface/language-modeling/inference/README.md#BLOOM-176B), [T5](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/optimization/pytorch/huggingface/summarization/quantization), and [SetFit](./docs/tutorials/pytorch/text-classification/SetFit_model_compression_AGNews.ipynb)
* Accelerated end-to-end Transformer-based applications such as [Stable Diffusion](./examples/optimization/pytorch/huggingface/textual_inversion), [GPT-J-6B](./examples/optimization/pytorch/huggingface/language-modeling/inference/README.md#GPT-J), [BLOOM-176B](./examples/optimization/pytorch/huggingface/language-modeling/inference/README.md#BLOOM-176B), [T5](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/optimization/pytorch/huggingface/summarization/quantization), and [SetFit](./docs/tutorials/pytorch/text-classification/SetFit_model_compression_AGNews.ipynb) by leveraging Intel AI software such as [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch)


## Installation
16 changes: 9 additions & 7 deletions docs/api_doc/kernel/api_c.rst
@@ -1,8 +1,10 @@
.. _api:
Kernel C++ APIs
####
.. doxygenfile:: interface.hpp
:project: Intel® Extension for Transformers
Kernels C++ APIs
============================================

.. doxygenfile:: kernel.hpp
:project: Intel® Extension for Transformers
.. toctree::
:maxdepth: 1

interface.rst
engine.rst
operator_desc.rst
types.rst
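The `doxygenfile` directives above come from the Breathe extension, which bridges Doxygen XML output into Sphinx; the `:project:` option must match a key in `breathe_projects` in the docs' `conf.py`. A minimal sketch of that configuration — the XML path here is an assumption, not this repo's actual layout:

```python
# conf.py (sketch) -- hypothetical paths; adjust to the real Doxygen output directory
extensions = ["breathe"]

# Maps the name used by ":project:" in the .rst files to Doxygen's XML output.
breathe_projects = {
    "Intel® Extension for Transformers": "../doxygen/xml",
}

# Lets ".. doxygenfile::" directives omit ":project:" when there is one project.
breathe_default_project = "Intel® Extension for Transformers"
```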
3 changes: 3 additions & 0 deletions docs/api_doc/kernel/engine.rst
@@ -2,3 +2,6 @@ Class engine
####
.. doxygenfile:: engine.hpp
:project: Intel® Extension for Transformers

.. doxygenfile:: cpu_engine.hpp
:project: Intel® Extension for Transformers
@@ -1,4 +1,4 @@
Class kernel
Class Kernel
####
.. doxygenfile:: kernel.hpp
.. doxygenfile:: interface.hpp
:project: Intel® Extension for Transformers
4 changes: 4 additions & 0 deletions docs/api_doc/kernel/operator_desc.rst
@@ -0,0 +1,4 @@
Class operator_desc
####
.. doxygenfile:: operator_desc.hpp
:project: Intel® Extension for Transformers
21 changes: 21 additions & 0 deletions docs/api_doc/kernel/types.rst
@@ -0,0 +1,21 @@
Operator Specific Types
####
.. doxygenfile:: kernels/attention_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/eltwiseop_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/gather_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/layernorm_ba_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/matmul_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/mean_var_reduce_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/softmax_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/spmm_types.hpp
:project: Intel® Extension for Transformers
.. doxygenfile:: kernels/transpose_mha_types.hpp
:project: Intel® Extension for Transformers

2 changes: 1 addition & 1 deletion docs/api_doc/optimization.rst
@@ -11,4 +11,4 @@ The following API information is available:
optimization/pipeline.rst
optimization/optimizer_tf.rst
optimization/optimizer.rst
optimization/trainer.rst
optimization/trainer.rst
9 changes: 5 additions & 4 deletions docs/build_docs/source/kernel.rst
@@ -1,11 +1,12 @@
Transformers-accelerated Libraries
Kernels
============================================
Transformers-accelerated Libraries (formerly known as SparseLib) is a high-performance operator computing library implemented in assembly. It contains a JIT domain, a kernel domain, and a scheduling proxy framework.

.. toctree::
:maxdepth: 1

docs/intel_extension_for_transformers/backends/neural_engine/kernels/README.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/profiling.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/validated_data.md
docs/api_doc/api_kernel.rst
kernel_perf.rst
kernel_desc.rst
docs/api_doc/kernel/api_c.rst

16 changes: 16 additions & 0 deletions docs/build_docs/source/kernel_desc.rst
@@ -0,0 +1,16 @@
Implementation Details
============================================

.. toctree::
:maxdepth: 1

docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/3D_inference.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/binaryop_injector.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/eltwise_injector.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/kernel_vnni.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/kernel_amx.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/kernel_avx512f.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/kernel_layernormalized_spmm.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/kernel_transpose_matmul.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/kernel_desc/kernel_transpose_mha.md

11 changes: 11 additions & 0 deletions docs/build_docs/source/kernel_perf.rst
@@ -0,0 +1,11 @@
Performance
============================================

Here we introduce performance-related topics for users who want detailed profiling instructions or need to check whether performance meets their requirements.

.. toctree::
:maxdepth: 1

docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/profiling.md
docs/intel_extension_for_transformers/backends/neural_engine/kernels/docs/validated_data.md
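The `.rst` sources under `docs/build_docs/source` added in this PR are consumed by a Sphinx build; since they pull in Doxygen content via Breathe, the Doxygen XML must exist first. A typical two-step build might look like the following — the `Doxyfile` name and build output directory are assumptions, not confirmed by this diff:

```shell
# Hypothetical build steps; file and directory names are assumptions.
doxygen Doxyfile   # emit XML for Breathe's doxygenfile directives to consume
sphinx-build -b html docs/build_docs/source docs/build_docs/build
```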
