Current tasks
The goal is to decompose the ABTF repository into reusable automation recipes that make benchmarking, evaluation and training of ABTF models more deterministic, portable and extensible with new models, frameworks and datasets.
We need to develop the following CM scripts (automation recipes) to support ABTF benchmarking with loadgen across different platforms, operating systems and hardware:
See the current CM-ABTF documentation/demos here.
Preparing ABTF demo
@gfursin helped to prepare the first CM automation for ABTF, and we now plan to delegate further development to dedicated engineers.
Test inference with ABTF model
- Automate Cognata downloading with custom sub-sets
- Download to CM cache
- Import already downloaded dataset
- Check download of individual files
- Add variation for a demo (1 min video)
- Download current ABTF model and register in CM cache
- Download trained ABTF models via CM
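Checking the download of individual files (whether dataset sub-sets or model checkpoints) usually comes down to verifying a recorded checksum before registering the artifact in the CM cache. A minimal sketch of that verification step, assuming SHA-256 checksums; the file names and hashes used in practice would come from the actual dataset/model metadata:

```python
import hashlib

# Verify a downloaded file against an expected SHA-256 checksum
# before accepting it into a cache. Reads in chunks so large
# Cognata archives do not need to fit in memory.
def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def verify(path, expected_sha256):
    return sha256_of(path) == expected_sha256
```

The same check also covers the "import already downloaded dataset" case: imported files can be validated against the same checksums as freshly downloaded ones.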
Export ABTF model to other formats
- Export PyTorch model to ONNX
- Test exported ONNX model with loadgen and random inputs (performance only)
- Test model quantization via Hugging Face's quanto package
Evaluate ABTF model with Cognata sub-set
- Sync with Rod to access server and test CM automation
- "Decode" function for standalone evaluation of a given image (mAP) to be integrated with loadgen
Automate training of ABTF model with Cognata sub-set
- Sync with Rod to access server
Add Python harness for loadgen with ABTF model
- Implement Python loadgen harness for ABTF model to measure performance (1 sample)
- Pre-load and pre-process all samples from Cognata
- Implement Python loadgen harness for ABTF model to measure accuracy (1 sample)
- Pre-load and pre-process all samples from Cognata
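The pre-load/measure pattern from the items above can be sketched as follows. This is not the MLPerf loadgen API (the real harness would register an `issue_query` callback with loadgen); `preprocess` and `model` are dummy stand-ins used only to show that loading and pre-processing happen outside the timed region:

```python
import time

# Phase 1: load and pre-process all samples once, outside the
# timed region, so only inference is measured.
def preprocess(raw):
    return [x / 255.0 for x in raw]        # e.g. normalize pixel values

def model(sample):
    return sum(sample)                     # dummy inference

raw_samples = [[10, 20, 30], [40, 50, 60]]  # stand-in for Cognata frames
samples = [preprocess(s) for s in raw_samples]

# Phase 2: timed inference loop, one sample per query.
latencies = []
for s in samples:
    t0 = time.perf_counter()
    model(s)
    latencies.append(time.perf_counter() - t0)

print(f"mean latency: {sum(latencies) / len(latencies):.6f} s")
```

In the real harness, loadgen controls query issuing and timing; the point here is only the split between untimed pre-processing and the timed inference loop.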
See related CM script and simple Python harness.
Generate/use Docker containers
- Prepare examples of Docker containers with CM: see examples
Demos
- Prepare demo for live ABTF model evaluation
- Download Cognata subset
- Show live visualization of predictions
- Document
For the next tasks we need more engineering resources.
MLCommons committed to fund CM development with 1 CM engineer until the end of 2024 to modularize and automate MLPerf inference. ABTF colleagues should sync developments with the MLPerf inference WG.
Improve performance
- Add performance profiling, analysis and debugging
- Current performance on an 8-core CPU and a laptop GPU is low (10 sec per frame for the 8M model and 3 sec per frame for the 3M model on CPU); further optimization is needed (quantization, hardware-specific optimizations, fine-tuning, etc.)
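Performance profiling on the Python side can start with the standard-library `cProfile` to find hotspots in the per-frame path. A minimal sketch; `model` and the input `frame` are dummy stand-ins for the ABTF model and a Cognata frame:

```python
import cProfile
import io
import pstats

# Dummy stand-in for one ABTF inference call.
def model(frame):
    return sum(x * x for x in frame)

frame = list(range(100_000))  # stand-in for a pre-processed frame

# Profile a single inference call and print the top entries
# sorted by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
model(frame)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

For GPU paths or compiled backends, framework-level profilers (e.g. the PyTorch profiler) would be the next step; `cProfile` only covers Python-level time.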
Add C++ harness for loadgen with ABTF model
- Develop C++ harness for loadgen with ONNX
- Export PyTorch model to TFLite
- Develop native C++ harness for loadgen test with TFLite model
- Develop C++ harness for loadgen with PyTorch
Support other hardware
PyTorch native
- Support ABTF demo on Nvidia GPU via CUDA
- Generate Docker container for the demo
Cross-compilation
- Samsung Exynos
- Requires C++ loadgen harness implementation with cross-compilation
- ONNX backend
- TFLite backend
Automate ABTF model quantization
TBD
Developers
ABTF model
- Radoyeh Shojaei
CM automation for ABTF model
- @gfursin completed a prototype of the CM automation and MLPerf harness for the ABTF model in May 2024. Further development should be done by the MLCommons CM inference engineer.