# Examples

This directory contains self-documented examples illustrating how to use the DeepSparse Engine.

For instructions on how to run each example, check the example's README or run the script with `-h`.

Open a Pull Request to contribute your own examples.
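Most of these examples build on the same core workflow: compile an ONNX model into a DeepSparse engine, then run inference on CPU. Below is a minimal sketch of that pattern; the model path and input shape are placeholders, not taken from any particular example here.

```python
# Minimal sketch of the DeepSparse workflow these examples build on.
# "model.onnx" and the (1, 3, 224, 224) input shape are placeholders;
# substitute your own model, e.g. one downloaded from SparseZoo.
import numpy as np
from deepsparse import compile_model

batch_size = 1

# Compile the ONNX model into a DeepSparse engine for CPU inference
engine = compile_model("model.onnx", batch_size=batch_size)

# Random input matching a typical image-classification model (example only)
inputs = [np.random.rand(batch_size, 3, 224, 224).astype(np.float32)]

outputs = engine.run(inputs)  # returns a list of output arrays
print([out.shape for out in outputs])
```

Many of the serving examples instead rely on the higher-level `deepsparse.Pipeline` API, which bundles task-specific pre- and post-processing around the engine.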

## Examples

| Notebook | Description |
|----------|-------------|
| Benchmark and ONNX Model Correctness | Comparing predictions and benchmark performance between the DeepSparse Engine and ONNX Runtime. |
| Hugging Face Transformers | Serving, benchmarking, and running NLP models from Hugging Face. |
| YOLOv3 and YOLOv5 | Serving, benchmarking, and running annotation inferences with YOLOv3 and YOLOv5 models. |
| Image Classification | How to use image classification models from SparseZoo to perform inference and benchmarking with the DeepSparse Engine. |
| Object Detection | How to use object detection models from SparseZoo to perform inference and benchmarking with the DeepSparse Engine. |
| Instance Segmentation | How to use an optimized YOLACT model and the DeepSparse Engine to perform real-time instance segmentation. |
| AWS Serverless Integration | How to deploy a DeepSparse pipeline for batch or real-time inference on select serverless services. |
| AWS SageMaker Integration | How to deploy a DeepSparse inference server on SageMaker. |
| Google Cloud Run | How to deploy a DeepSparse inference server on Cloud Run. |
| Google Kubernetes Engine | How to deploy a DeepSparse inference server on GKE. |
| SparseStream | Deploying two sparse transformers for classifying finance tweets in a real-time Twitter stream. |
| SparseServer.UI | A Streamlit app for deploying the DeepSparse Server to compare the latency and accuracy of sparse BERT models. |
| Twitter Sentiment Analysis | Example of scraping, processing, and classifying Twitter data using the DeepSparse Engine for 10x faster performance on CPUs. |
| Flask Model Server | Simple model server and client example, showing how to use the DeepSparse Engine as an inference backend for a real-time inference server. |