
Take your ML model APIs to the next level

From /predict to the Open Inference Protocol

Info

  • designed following the Open Inference Protocol — a growing industry standard for standardized, observable, and interoperable machine learning inference
  • auto-documentation using FastAPI and Pydantic (see the sketch after this list)
  • add linting, testing and pre-commit hooks
  • build and push a Docker image of the API to Docker Hub
  • use GitHub Actions for automation
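
To make the auto-documentation point concrete, here is a minimal sketch (not the repo's actual code) of how a Pydantic response model feeds FastAPI's generated /docs page; the app title and fields are illustrative.

```python
# A minimal sketch, assuming a FastAPI app object; not the repo's actual code.
from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI(title="model-api-oip", version="0.1.0")


class ServerMetadata(BaseModel):
    """Response body for the OIP 'server metadata' endpoint."""

    name: str = Field(..., description="Name of the inference server")
    version: str = Field(..., description="Server version")
    extensions: list[str] = Field(default_factory=list, description="Supported extensions")


@app.get("/v2", response_model=ServerMetadata)
def server_metadata() -> ServerMetadata:
    # The response model drives validation and the OpenAPI schema shown at /docs.
    return ServerMetadata(name="model-api-oip", version="0.1.0")
```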

End result

(screenshot of the auto-generated API docs)

HTTP/REST endpoints

| API             | Verb | Path                                                     |
|-----------------|------|----------------------------------------------------------|
| Inference       | POST | v2/models/<model_name>[/versions/<model_version>]/infer  |
| Model Metadata  | GET  | v2/models/<model_name>[/versions/<model_version>]        |
| Server Ready    | GET  | v2/health/ready                                          |
| Server Live     | GET  | v2/health/live                                           |
| Server Metadata | GET  | v2                                                       |
| Model Ready     | GET  | v2/models/<model_name>[/versions/<model_version>]/ready  |
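
As a rough illustration of how these paths could map onto FastAPI path operations (handler names and response bodies are placeholders, not the repo's implementation; the versioned variants are omitted for brevity):

```python
# Illustrative FastAPI routes for the table above; handlers are stubs.
from fastapi import FastAPI

app = FastAPI()


@app.get("/v2/health/live")
def server_live() -> dict:
    # "Is the server process up?" -> Kubernetes livenessProbe
    return {"live": True}


@app.get("/v2/health/ready")
def server_ready() -> dict:
    # "Are all models loaded and ready?" -> Kubernetes readinessProbe
    return {"ready": True}


@app.get("/v2/models/{model_name}/ready")
def model_ready(model_name: str) -> dict:
    return {"name": model_name, "ready": True}


@app.get("/v2/models/{model_name}")
def model_metadata(model_name: str) -> dict:
    return {"name": model_name, "versions": ["1"], "platform": "sklearn"}


@app.post("/v2/models/{model_name}/infer")
def infer(model_name: str, payload: dict) -> dict:
    # A real handler validates the OIP request body and runs the model.
    return {"model_name": model_name, "outputs": []}
```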

API Definitions

| API             | Definition |
|-----------------|------------|
| Inference       | The /infer endpoint performs inference on a model. The response is the prediction result. |
| Model Metadata  | The "model metadata" API is a per-model endpoint that returns details about the model passed in the path. |
| Server Ready    | The "server ready" health API indicates if all the models are ready for inferencing. It can be used directly to implement the Kubernetes readinessProbe. |
| Server Live     | The "server live" health API indicates if the inference server is able to receive and respond to metadata and inference requests. It can be used directly to implement the Kubernetes livenessProbe. |
| Server Metadata | The "server metadata" API returns details describing the server. |
| Model Ready     | The "model ready" health API indicates if a specific model is ready for inferencing. The model name and (optionally) version must be available in the URL. |
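
The /infer request and response bodies follow the OIP v2 tensor format. Below is a simplified Pydantic sketch of those shapes; field names follow the protocol, but check the OIP docs page for the full schema (optional parameters objects are omitted here).

```python
# A simplified sketch of the OIP v2 request/response tensors as Pydantic models.
from pydantic import BaseModel, ConfigDict


class RequestInput(BaseModel):
    name: str
    shape: list[int]
    datatype: str  # e.g. "FP32", "INT64", "BYTES"
    data: list     # flattened tensor contents


class InferenceRequest(BaseModel):
    id: str | None = None
    inputs: list[RequestInput]


class ResponseOutput(BaseModel):
    name: str
    shape: list[int]
    datatype: str
    data: list


class InferenceResponse(BaseModel):
    # Pydantic v2 warns about fields in the protected "model_" namespace; disable that here.
    model_config = ConfigDict(protected_namespaces=())

    model_name: str
    model_version: str | None = None
    id: str | None = None
    outputs: list[ResponseOutput]
```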

Get started

Go to the 1/setup-start branch and follow the instructions. For each subsequent branch, the instructions are in its README.md.

The structure is as follows:

  1. Setup
  2. Implement Endpoints
  3. Improve docs
  4. Restructure
  5. Add Linting & Tests (see the test sketch after this list)
  6. CI with GitHub Actions
  7. Dockerise and push to Docker Hub
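
As an example of what stage 5 might add, here is a small pytest sketch that exercises the health endpoints with FastAPI's TestClient; the app.main import path is an assumption and should be adjusted to the repo layout.

```python
# A smoke test against the health endpoints using FastAPI's TestClient.
from fastapi.testclient import TestClient

from app.main import app  # hypothetical module path; adjust to the repo layout

client = TestClient(app)


def test_server_live():
    response = client.get("/v2/health/live")
    assert response.status_code == 200


def test_server_ready():
    response = client.get("/v2/health/ready")
    assert response.status_code == 200
```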

Throughout these stages I share lots of links to documentation. Thankfully, the libraries used here have excellent docs and explanations. If you learn anything from this repo, I hope it's at least the habit of checking the docs of the libraries you use whenever you need an answer. If you just came here for a quick look: read the FastAPI docs like a book, or the OIP docs page, or the docs of any of the other tools mentioned.

Want to learn from industry professionals?

While I think this repo takes a beginner beyond just /predict and introduces some important concepts, I suggest looking into Eric Riddoch's teaching material: Taking Python to Production and Cloud Engineering for Python Devs.

If you are curious about MLOps on a wider scale (or about a model's life outside a Jupyter notebook), I suggest these resources:

Improvements

  • make videos
  • update the instructions for the steps based on feedback
  • improve the endpoints' structure
  • open to feedback and help to improve this 'course'
