Enabled PR job pylint warnings parsing plugin (for legacy Jenkins instances), fixed broken links in README and Examples MD files #3667

Merged
4 changes: 2 additions & 2 deletions Examples/README.md
@@ -1,12 +1,12 @@
![Qualcomm Innovation Center, Inc.](../Docs/images/logo-quic-on@h68.png)

# AIMET Examples
AIMET Examples provide reference code (in the form of *scripts* and *Jupyter Notebooks*) to learn how to load models, apply AIMET quantization and compression features, fine tune and save your models. It is also a quick way to become familiar with AIMET usage and APIs. For more details on each of the features and APIs please reference the _[user guide](https://quic.github.io/aimet-pages/releases/1.19.1/user_guide/index.html#api-documentation-and-usage-examples)_.
AIMET Examples provide reference code (in the form of *scripts* and *Jupyter Notebooks*) to learn how to load models, apply AIMET quantization and compression features, fine tune and save your models. It is also a quick way to become familiar with AIMET usage and APIs. For more details on each of the features and APIs please reference the _[user guide](https://quic.github.io/aimet-pages/releases/latest/user_guide/index.html)_.

## Table of Contents
- [Installation](#installation-instructions)
- [Code Layout](#code-layout)
- [Supported Examples](#supported-examples)
- [Overview](#overview)
- [Running Examples via Jupyter Notebook](#running-examples-via-jupyter-notebook)
- [Running Examples via Command Line](#running-examples-via-command-line)

87 changes: 65 additions & 22 deletions Jenkins/Jenkinsfile
@@ -1,3 +1,4 @@
// Jenkinsfile to run pull-request status checks
pipeline {
parameters {
string(name: 'PROJECT_NAME', defaultValue: 'aimet', description: 'project name')
@@ -45,8 +46,8 @@ pipeline {
}

stage("Check Commits") {
agent { label "${params.BUILD_LABEL_CPU}" }
agent { label "${params.BUILD_LABEL_CPU}" }

steps {
//Set up a TF-CPU docker container to run commit checks script on
cleanWs()
@@ -76,12 +77,12 @@ pipeline {
}
}
sh "bash -l -c \"rm -rf commit_checks_repo\""
}
}
}


stage('Pipelines start') {

matrix {
axes {
axis {
@@ -106,12 +107,12 @@ pipeline {
}
}

agent { label "docker-build-aimet-pr-${PROC_TYPE}" }
agent { label "docker-build-aimet-pr-${PROC_TYPE}" }

stages {

stage('Start') {

steps {
script {
stage("${ML_FMWORK}-${PROC_TYPE}".toUpperCase()) {
@@ -121,7 +122,7 @@ pipeline {
}

}

stage('Setup') {

steps {
@@ -133,7 +134,7 @@ pipeline {


stage('Build') {

steps {
echo 'Building code (and generating Docs and pip packages)...'
script {
@@ -143,21 +144,27 @@ pipeline {
}

stage('Code violations') {

// Works with newer jenkins instances that support the warnings-ng plugin (https://plugins.jenkins.io/warnings-ng)
when {
expression {
env.QCInternalValidation == "false"
}
}
steps {
echo 'Running code violations...'
script {
runStage("${ML_FMWORK}-${PROC_TYPE}", "-v")
}
}
// TODO: Following code needs to be updated to conform to this plugin: https://plugins.jenkins.io/warnings-ng
// post {
// always {
// step([
// $class : 'WarningsPublisher',
// $class : 'WarningsNgPublisher',
// parserConfigurations : [[
// parserName: 'PYLint',
// pattern : "**/**/**/*pylint_results.out"
// ]],
// ]],
// failedTotalHigh : THRESHOLD_OBJ.pylint_fail_thresholds.high_priority,
// failedTotalNormal : THRESHOLD_OBJ.pylint_fail_thresholds.normal_priority,
// failedTotalLow : THRESHOLD_OBJ.pylint_fail_thresholds.low_priority,
@@ -171,7 +178,45 @@ pipeline {
// }
// }
// }
// }
// }
}

stage('Code violations Legacy') {
// Works with older jenkins instances that support the warnings plugin (https://plugins.jenkins.io/warnings)
when {
expression {
env.QCInternalValidation == "true"
}
}
steps {
echo 'Running code violations...'
script {
runStage("${ML_FMWORK}-${PROC_TYPE}", "-v")
}
}
post {
always {
// NOTE: Works only with https://plugins.jenkins.io/warnings/ (deprecated)
step([
$class : 'WarningsPublisher',
parserConfigurations : [[
parserName: 'PYLint',
pattern: "**/**/**/*pylint_results.out"
]],
failedTotalHigh : THRESHOLD_OBJ.pylint_fail_thresholds.high_priority,
failedTotalNormal : THRESHOLD_OBJ.pylint_fail_thresholds.normal_priority,
failedTotalLow : THRESHOLD_OBJ.pylint_fail_thresholds.low_priority,
usePreviousBuildAsReference : true
])
script {
if (currentBuild.currentResult.equals("FAILURE")) {
// the plugin won't fail the stage. it only sets the build status, so we have to fail it
// manually
sh "exit 1"
}
}
}
}
}
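The two "Code violations" stages above are mutually exclusive: the `when` guards compare `env.QCInternalValidation` against the strings "false" and "true", and only the legacy stage publishes pylint results against the `THRESHOLD_OBJ.pylint_fail_thresholds` limits. The following Python sketch models that decision outside Jenkins. It is an illustration only: the category-to-priority mapping, the assumed pylint line format, and the helper names (`count_by_priority`, `exceeds_thresholds`, `should_publish_legacy`) are hypothetical and are not taken from the plugin or from this repository, and it assumes a threshold is breached only when the count is strictly greater than the limit.

```python
import re
from collections import Counter

# Assumed mapping from pylint message category to a priority bucket.
# The Jenkins parser may bucket messages differently.
PRIORITY_BY_CATEGORY = {
    "F": "high", "E": "high",
    "W": "normal",
    "C": "low", "R": "low", "I": "low",
}

# Matches the message id in lines such as
# "pkg/mod.py:12: [W0611(unused-import), ] Unused import os"
# (an assumed output format for *pylint_results.out).
MESSAGE_ID = re.compile(r"\[([A-Z])\d{4}")


def count_by_priority(lines):
    """Count pylint messages per priority bucket, ignoring unrelated lines."""
    counts = Counter({"high": 0, "normal": 0, "low": 0})
    for line in lines:
        match = MESSAGE_ID.search(line)
        if match and match.group(1) in PRIORITY_BY_CATEGORY:
            counts[PRIORITY_BY_CATEGORY[match.group(1)]] += 1
    return counts


def exceeds_thresholds(counts, thresholds):
    """Return the priorities whose count is strictly above its limit."""
    return [p for p in ("high", "normal", "low") if counts[p] > thresholds[p]]


def should_publish_legacy(env):
    """Mirror the legacy stage guard: the flag must be the string "true"."""
    return env.get("QCInternalValidation") == "true"
```

A non-empty result from `exceeds_thresholds` corresponds to the case where the legacy stage ends with `sh "exit 1"`, since the publisher only sets the build status and does not fail the stage itself.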

stage('Unit tests') {
@@ -217,14 +262,13 @@ pipeline {
runStage("${ML_FMWORK}-${PROC_TYPE}", "-s | true")
}
}
}
}

}
}
}
}



stage("AIMET extra ALL STAGES") {


@@ -235,8 +279,8 @@ pipeline {
callAimetExtra(env.CHANGE_TARGET)
}
}
}
}

}
}
post {
@@ -291,18 +335,17 @@ def callAimetExtra(target_branch) {
// setting USE LINARO value to EMPTY to rebuild docker image
using_linaro=""
}

if (target_branch.startsWith("release-aimet")) {
echo "Running AIMET additional stages on ${CHANGE_TARGET} branch ..."
build job: "AIMET-Extra", parameters: [string(name: 'AIMET_GIT_COMMIT', value: "${CHANGE_BRANCH}"), string(name: 'PROJECT_BRANCH', value: target_branch), string(name: 'USE_LINARO', value: "${using_linaro}"), string(name: 'PREBUILT_DOCKER_IMAGE_URL', value: "${params.PREBUILT_DOCKER_IMAGE_URL}"), string(name: 'AIMETPRO_BRANCH', value: target_branch)]
}
else if (target_branch != "develop") {
echo "Running AIMET additional stages on ${CHANGE_TARGET} branch ..."
build job: "AIMET-Extra", parameters: [string(name: 'AIMET_GIT_COMMIT', value: "${CHANGE_BRANCH}"), string(name: 'PROJECT_BRANCH', value: target_branch), string(name: 'USE_LINARO', value: "${using_linaro}"), string(name: 'PREBUILT_DOCKER_IMAGE_URL', value: "${params.PREBUILT_DOCKER_IMAGE_URL}")]
}
}
else {
echo "Running AIMET additional stages on develop branch ..."
build job: "AIMET-Extra", parameters: [string(name: 'AIMET_GIT_COMMIT', value: "${CHANGE_BRANCH}"), string(name: 'USE_LINARO', value: "${using_linaro}"), string(name: 'PREBUILT_DOCKER_IMAGE_URL', value: "${params.PREBUILT_DOCKER_IMAGE_URL}")]
}
}
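
As this hunk reads, `callAimetExtra` picks the parameters for the "AIMET-Extra" job from the target branch: release branches pass both `PROJECT_BRANCH` and `AIMETPRO_BRANCH`, other non-develop branches pass only `PROJECT_BRANCH`, and `develop` passes neither. A minimal Python sketch of that routing follows. The function name `extra_job_parameters` and its arguments are hypothetical stand-ins for the Groovy variables, and the sketch only builds the parameter map; it does not trigger any job.

```python
def extra_job_parameters(target_branch, change_branch, using_linaro="", image_url=""):
    """Build the parameter map that would be passed to the AIMET-Extra job."""
    # Parameters common to every branch, as they appear in all three build calls.
    params = {
        "AIMET_GIT_COMMIT": change_branch,
        "USE_LINARO": using_linaro,
        "PREBUILT_DOCKER_IMAGE_URL": image_url,
    }
    if target_branch.startswith("release-aimet"):
        # Release branches also pin the matching AIMETPRO branch.
        params["PROJECT_BRANCH"] = target_branch
        params["AIMETPRO_BRANCH"] = target_branch
    elif target_branch != "develop":
        # Any other non-develop target only sets the project branch.
        params["PROJECT_BRANCH"] = target_branch
    return params
```

Writing the selection as a pure function like this makes the three cases easy to check in isolation before they are wired into a `build job:` call.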

55 changes: 17 additions & 38 deletions README.md
@@ -3,32 +3,26 @@

[![AIMET on GitHub Pages](Docs/images/button-overview.png)](https://quic.github.io/aimet-pages/index.html)
[![Documentation](Docs/images/button-docs.png)](https://quic.github.io/aimet-pages/releases/latest/user_guide/index.html)
[![Install instructions](Docs/images/button-install.png)](#installation-instructions)
[![Discussion Forums](Docs/images/button-forums.png)](https://forums.quicinc.com)
[![Install instructions](Docs/images/button-install.png)](#quick-installation)
[![Discussion Forums](Docs/images/button-forums.png)](https://github.com/quic/aimet/discussions)
[![What's New](Docs/images/button-whats-new.png)](#whats-new)

# AI Model Efficiency Toolkit (AIMET)

<a href="https://quic.github.io/aimet-pages/index.html">AIMET</a> is a library that provides advanced model quantization
and compression techniques for trained neural network models.
It provides features that have been proven to improve run-time performance of deep learning neural network models with
lower compute and memory requirements and minimal impact to task accuracy.

<a href="https://quic.github.io/aimet-pages/index.html">AIMET</a> is a library that provides advanced model quantization and compression techniques for trained neural network models. It provides features that have been proven to improve run-time performance of deep learning neural network models with lower compute and memory requirements and minimal impact to task accuracy.

![How AIMET works](Docs/images/how-it-works.png)

AIMET is designed to work with [PyTorch](https://pytorch.org), [TensorFlow](https://tensorflow.org) and [ONNX](https://onnx.ai) models.

We also host the [AIMET Model Zoo](https://github.com/quic/aimet-model-zoo) - a collection of popular neural network models optimized for 8-bit inference.
We also provide recipes for users to quantize floating point models using AIMET.
We also host the [AIMET Model Zoo](https://github.com/quic/aimet-model-zoo) - a collection of popular neural network models optimized for 8-bit inference. We also provide recipes for users to quantize floating point models using AIMET.

## Table of Contents
- [Installation](#quick-installation)
- [Why AIMET?](#why-aimet)
- [Quick Installation](#quick-install)
- [Supported features](#supported-features)
- [What's New](#whats-new)
- [Results](#results)
- [Installation](#installation-instructions)
- [Resources](#resources)
- [Contributions](#contributions)
- [Team](#team)
@@ -42,7 +36,7 @@ The AIMET PyTorch GPU PyPI packages are available for environments that meet the
* Linux Ubuntu 22.04 LTS [Python 3.10] or Linux Ubuntu 20.04 LTS [Python 3.8]
* Torch 2.1.2+cu121

#### Installation
### Installation
```
apt-get install liblapacke
python3 -m pip install aimet-torch
@@ -57,35 +51,29 @@ To install other AIMET variants and versions, please follow one of the links bel

![Benefits of AIMET](Docs/images/AImodelEfficency.png)

* **Supports advanced quantization techniques**: Inference using integer runtimes is significantly faster than using floating-point runtimes. For example, models run
5x-15x faster on the Qualcomm Hexagon DSP than on the Qualcomm Kyro CPU. In addition, 8-bit precision models have a 4x
smaller footprint than 32-bit precision models. However, maintaining model accuracy when quantizing ML models is often
challenging. AIMET solves this using novel techniques like Data-Free Quantization that provide state-of-the-art INT8 results on
several popular models.
* **Supports advanced quantization techniques**: Inference using integer runtimes is significantly faster than using floating-point runtimes. For example, models run 5x-15x faster on the Qualcomm Hexagon DSP than on the Qualcomm Kyro CPU. In addition, 8-bit precision models have a 4x smaller footprint than 32-bit precision models. However, maintaining model accuracy when quantizing ML models is often challenging. AIMET solves this using novel techniques like Data-Free Quantization that provide state-of-the-art INT8 results on several popular models.
* **Supports advanced model compression techniques** that enable models to run faster at inference-time and require less memory
* **AIMET is designed to automate optimization** of neural networks avoiding time-consuming and tedious manual tweaking.
AIMET also provides user-friendly APIs that allow users to make calls directly from their [TensorFlow](https://tensorflow.org)
or [PyTorch](https://pytorch.org) pipelines.
* **AIMET is designed to automate optimization** of neural networks avoiding time-consuming and tedious manual tweaking. AIMET also provides user-friendly APIs that allow users to make calls directly from their [TensorFlow](https://tensorflow.org) or [PyTorch](https://pytorch.org) pipelines.

Please visit the [AIMET on Github Pages](https://quic.github.io/aimet-pages/index.html) for more details.

## Supported Features

#### Quantization
### Quantization

* *Cross-Layer Equalization*: Equalize weight tensors to reduce amplitude variation across channels
* *Bias Correction*: Corrects shift in layer outputs introduced due to quantization
* *Adaptive Rounding*: Learn the optimal rounding given unlabelled data
* *Quantization Simulation*: Simulate on-target quantized inference accuracy
* *Quantization-aware Training*: Use quantization simulation to train the model further to improve accuracy

#### Model Compression
### Model Compression

* *Spatial SVD*: Tensor decomposition technique to split a large layer into two smaller ones
* *Channel Pruning*: Removes redundant input channels from a layer and reconstructs layer weights
* *Per-layer compression-ratio selection*: Automatically selects how much to compress each layer in the model

#### Visualization
### Visualization

* *Weight ranges*: Inspect visually if a model is a candidate for applying the Cross Layer Equalization technique. And the effect after applying the technique
* *Per-layer compression sensitivity*: Visually get feedback about the sensitivity of any given layer in the model to compression
@@ -96,14 +84,12 @@ Some recently added features include
* Quantization-aware Training (QAT) for recurrent models (including with RNNs, LSTMs and GRUs)

## Results

AIMET can quantize an existing 32-bit floating-point model to an 8-bit fixed-point model without sacrificing much accuracy and without model fine-tuning.


<h4>DFQ</h4>

The DFQ method applied to several popular networks, such as MobileNet-v2 and ResNet-50, result in less than 0.9%
loss in accuracy all the way down to 8-bit quantization, in an automated way without any training data.
The DFQ method applied to several popular networks, such as MobileNet-v2 and ResNet-50, result in less than 0.9% loss in accuracy all the way down to 8-bit quantization, in an automated way without any training data.

<table style="width:50%">
<tr>
@@ -131,8 +117,7 @@ loss in accuracy all the way down to 8-bit quantization, in an automated way wit

<h4>AdaRound (Adaptive Rounding)</h4>
<h5>ADAS Object Detect</h5>
<p>For this example ADAS object detection model, which was challenging to quantize to 8-bit precision,
AdaRound can recover the accuracy to within 1% of the FP32 accuracy.</p>
<p>For this example ADAS object detection model, which was challenging to quantize to 8-bit precision, AdaRound can recover the accuracy to within 1% of the FP32 accuracy.</p>
<table style="width:50%">
<tr>
<th style="width:80px" colspan="15">Configuration</th>
@@ -153,8 +138,7 @@ AdaRound can recover the accuracy to within 1% of the FP32 accuracy.</p>
</table>

<h5>DeepLabv3 Semantic Segmentation</h5>
<p>For some models like the DeepLabv3 semantic segmentation model, AdaRound can even quantize the model weights to
4-bit precision without a significant drop in accuracy.</p>
<p>For some models like the DeepLabv3 semantic segmentation model, AdaRound can even quantize the model weights to 4-bit precision without a significant drop in accuracy.</p>
<table style="width:50%">
<tr>
<th style="width:80px" colspan="15">Configuration</th>
@@ -176,9 +160,7 @@ AdaRound can recover the accuracy to within 1% of the FP32 accuracy.</p>
<br>

<h4>Quantization for Recurrent Models</h4>
<p>AIMET supports quantization simulation and quantization-aware training (QAT) for recurrent models (RNN, LSTM, GRU).
Using QAT feature in AIMET, a DeepSpeech2 model with bi-directional LSTMs can be quantized to 8-bit precision with
minimal drop in accuracy.</p>
<p>AIMET supports quantization simulation and quantization-aware training (QAT) for recurrent models (RNN, LSTM, GRU). Using QAT feature in AIMET, a DeepSpeech2 model with bi-directional LSTMs can be quantized to 8-bit precision with minimal drop in accuracy.</p>

<table style="width:50%">
<tr>
@@ -198,9 +180,7 @@ minimal drop in accuracy.</p>
<br>

<h4>Model Compression</h4>
<p>AIMET can also significantly compress models. For popular models, such as Resnet-50 and Resnet-18,
compression with spatial SVD plus channel pruning achieves 50% MAC (multiply-accumulate) reduction while retaining
accuracy within approx. 1% of the original uncompressed model.</p>
<p>AIMET can also significantly compress models. For popular models, such as Resnet-50 and Resnet-18, compression with spatial SVD plus channel pruning achieves 50% MAC (multiply-accumulate) reduction while retaining accuracy within approx. 1% of the original uncompressed model.</p>

<table style="width:50%">
<tr>
@@ -222,11 +202,10 @@ accuracy within approx. 1% of the original uncompressed model.</p>

<br>


## Resources
* [User Guide](https://quic.github.io/aimet-pages/releases/latest/user_guide/index.html)
* [API Docs](https://quic.github.io/aimet-pages/releases/latest/api_docs/index.html)
* [Discussion Forums](https://forums.quicinc.com/)
* [Discussion Forums](https://github.com/quic/aimet/discussions)
* [Tutorial Videos](https://quic.github.io/aimet-pages/index.html#video)
* [Example Code](Examples/README.md)
