Current tasks
The goal is to decompose the ABTF repository into reusable automation recipes that make benchmarking, evaluation and training of ABTF models more deterministic, portable and extensible with new models, frameworks and datasets.
We need to develop the following CM scripts (automation recipes) to support ABTF benchmarking with loadgen across different platforms, operating systems and hardware:
See the current CM-ABTF documentation/demos here.
Preparing ABTF demo
@gfursin helped to prepare the first CM automation for ABTF, and we now plan to delegate further development to dedicated engineers.
Test inference with ABTF model
- Automate Cognata downloading with custom sub-sets
- Download to CM cache
- Import already downloaded dataset
- Check download of individual files
- Add variation for a demo (1 min video)
- Download current ABTF model and register in CM cache
- Download trained ABTF models via CM
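Checking the download of individual files (whether dataset sub-sets or model checkpoints) usually comes down to verifying a recorded checksum before registering the artifact in the CM cache. A minimal sketch of that verification step, assuming SHA-256 checksums; the file names and hashes used in practice would come from the actual dataset/model metadata:

```python
import hashlib

# Verify a downloaded file against an expected SHA-256 checksum
# before accepting it into a cache. Reads in chunks so large
# Cognata archives do not need to fit in memory.
def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def verify(path, expected_sha256):
    return sha256_of(path) == expected_sha256
```

The same check also covers the "import already downloaded dataset" case: imported files can be validated against the same checksums as freshly downloaded ones.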
Export ABTF model to other formats
- Export PyTorch model to ONNX
- Test exported ONNX model with loadgen and random inputs (performance only)
- Test model quantization via Hugging Face's quanto package
Evaluate ABTF model with Cognata sub-set
- Sync with Rod to access server and test CM automation
- "Decode" function for standalone evaluation of a given image (mAP) to be integrated with loadgen
Automate training of ABTF model with Cognata sub-set
- Sync with Rod to access server
Add Python harness for loadgen with ABTF model
- Implement Python loadgen harness for ABTF model to measure performance (1 sample)
- Pre-load and pre-process all samples from Cognata
- Implement Python loadgen harness for ABTF model to measure accuracy (1 sample)
- Pre-load and pre-process all samples from Cognata
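The pre-load/measure pattern from the items above can be sketched as follows. This is not the MLPerf loadgen API (the real harness would register an `issue_query` callback with loadgen); `preprocess` and `model` are dummy stand-ins used only to show that loading and pre-processing happen outside the timed region:

```python
import time

# Phase 1: load and pre-process all samples once, outside the
# timed region, so only inference is measured.
def preprocess(raw):
    return [x / 255.0 for x in raw]        # e.g. normalize pixel values

def model(sample):
    return sum(sample)                     # dummy inference

raw_samples = [[10, 20, 30], [40, 50, 60]]  # stand-in for Cognata frames
samples = [preprocess(s) for s in raw_samples]

# Phase 2: timed inference loop, one sample per query.
latencies = []
for s in samples:
    t0 = time.perf_counter()
    model(s)
    latencies.append(time.perf_counter() - t0)

print(f"mean latency: {sum(latencies) / len(latencies):.6f} s")
```

In the real harness, loadgen controls query issuing and timing; the point here is only the split between untimed pre-processing and the timed inference loop.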
See related CM script and simple Python harness.
Generate/use Docker containers
- Prepare examples of Docker containers with CM: see examples
Demos
- Prepare demo for live ABTF model evaluation
- Download Cognata subset
- Show live visualization of predictions
- Document
For the next tasks we need more engineering resources.
MLCommons committed to fund CM development with 1 CM engineer until the end of 2024 to modularize and automate MLPerf inference. ABTF colleagues should sync developments with the MLPerf inference WG.
Improve performance
- Add performance profiling, analysis and debugging
- Current performance on an 8-core CPU and a laptop GPU is low (10 sec per frame for the 8M model and 3 sec per frame for the 3M model on CPU); further optimization is needed (quantization, hardware-specific optimizations, fine-tuning, etc.)
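Performance profiling on the Python side can start with the standard-library `cProfile` to find hotspots in the per-frame path. A minimal sketch; `model` and the input `frame` are dummy stand-ins for the ABTF model and a Cognata frame:

```python
import cProfile
import io
import pstats

# Dummy stand-in for one ABTF inference call.
def model(frame):
    return sum(x * x for x in frame)

frame = list(range(100_000))  # stand-in for a pre-processed frame

# Profile a single inference call and print the top entries
# sorted by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
model(frame)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

For GPU paths or compiled backends, framework-level profilers (e.g. the PyTorch profiler) would be the next step; `cProfile` only covers Python-level time.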
Add C++ harness for loadgen with ABTF model
- Develop C++ harness for loadgen with ONNX
- Export PyTorch model to TFLite
- Develop native C++ harness for loadgen test with TFLite model
- Develop C++ harness for loadgen with PyTorch
Support other hardware
PyTorch native
- Support ABTF demo on Nvidia GPU via CUDA
- Generate Docker container for the demo
Cross-compilation
- Samsung Exynos
- Requires C++ loadgen harness implementation with cross-compilation
- ONNX backend
- TFLite backend
Automate ABTF model quantization
TBD
Developers
ABTF model
- Radoyeh Shojaei
CM automation for ABTF model
- @gfursin completed a prototype of the CM automation and MLPerf harness for the ABTF model in May 2024. Further development should be done by the MLCommons CM inference engineer.