---
title: 'Deploying a Machine Learning Model as an API with FastAPI, Docker, and Knative'
date: 2024-02-16 17:26:08
tags: ["machine learning", "API development", "FastAPI", "Docker", "Knative", "model deployment", "serverless scaling"]
---

In the post {% post_link Classification-with-scikit-learn Leveraging scikit-learn for Invoice Classification using Text Data %}, we explored how to train a machine learning model to classify invoices based on their textual content using scikit-learn. Once we have a trained model, however, the natural next step is to make it accessible to other systems and services. This raises the question: how do we deploy it as a scalable API?

In this blog post, we'll answer that question by wrapping our scikit-learn model in an API with FastAPI, containerizing it with Docker, and deploying it on Knative for serverless scaling. Let's dive in!

## Wrapping the Model as an API with FastAPI

We'll start by wrapping our machine learning model as an API using FastAPI, a modern web framework for building APIs with Python. FastAPI offers high performance and automatic OpenAPI documentation generation, making it an excellent choice for our use case.

```python
from typing import Dict, List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

from util.get_prediction_probabilities import get_prediction_probabilities

app = FastAPI()

# Load the serialized model once at startup. The file name is an assumption;
# point it at wherever you persisted the model from the previous post.
model = joblib.load("model.joblib")


class PredictionResult(BaseModel):
    text: str
    prediction: str
    probabilities: Dict[str, float]


class PredictionRequest(BaseModel):
    texts: List[str]


class PredictionResponse(BaseModel):
    results: List[PredictionResult]


@app.post('/predict', response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    results = get_prediction_probabilities(model, request.texts)
    return PredictionResponse(results=results)
```
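
The handler delegates to the `get_prediction_probabilities` helper from the previous post. In case you don't have it at hand, here is a minimal sketch of what it might look like, assuming `model` is a scikit-learn pipeline whose final estimator supports `predict_proba`:

```python
def get_prediction_probabilities(model, texts):
    """Return one result dict per input text, shaped like PredictionResult."""
    predictions = model.predict(texts)
    probabilities = model.predict_proba(texts)
    return [
        {
            "text": text,
            "prediction": str(prediction),
            "probabilities": {
                str(label): float(p)
                for label, p in zip(model.classes_, probs)
            },
        }
        for text, prediction, probs in zip(texts, predictions, probabilities)
    ]
```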
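With the model loaded, we can serve the app locally (e.g. with `uvicorn main:app --reload`, which listens on port 8000 by default) and send a quick smoke test; the invoice text below is, of course, illustrative:

```python
import requests

response = requests.post(
    "http://localhost:8000/predict",
    json={"texts": ["Invoice #1042: cloud hosting services, March 2024"]},
)
print(response.json())
```

FastAPI also serves interactive documentation at `/docs`, which is handy for exercising the endpoint manually.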

## Building a Docker Image

Next, we'll containerize our FastAPI application using Docker. Docker provides a lightweight and portable way to package applications and their dependencies into containers, ensuring consistency across different environments.

```Dockerfile
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.9

# The app directory is expected to contain main.py, the util package,
# and the serialized model file.
COPY ./app /app

# Install any additional dependencies
RUN pip install scikit-learn

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
```
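With the Dockerfile in place, the image can be built and pushed with `docker build -t your-docker-registry/invoice-classifier .` followed by `docker push your-docker-registry/invoice-classifier`; the tag must match the `image` referenced in the Knative manifest below.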

## Deployment on Knative

Finally, we'll deploy our Docker image to Knative, a Kubernetes-based platform for building, deploying, and managing modern serverless workloads. Knative's autoscaling lets our API handle varying levels of traffic efficiently. In the manifest below, the annotations opt the service into the Kubernetes HPA autoscaler targeting 60% CPU utilization, keep at least two replicas warm to avoid cold starts, and set aside burst capacity for sudden traffic spikes.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: invoice-classifier
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target-burst-capacity: "500"
        autoscaling.knative.dev/class: "hpa.autoscaling.knative.dev"
        autoscaling.knative.dev/metric: "cpu"
        autoscaling.knative.dev/target: "60"
        autoscaling.knative.dev/min-scale: "2"
    spec:
      containers:
        - name: invoice-classifier
          image: your-docker-registry/invoice-classifier
          resources:
            limits:
              cpu: 1
              memory: 2Gi
            requests:
              cpu: 300m
              memory: 500Mi
```
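
Assuming the manifest is saved as `service.yaml` (the file name is an assumption), deploy it with `kubectl apply -f service.yaml`. Once the revision is ready, `kubectl get ksvc invoice-classifier` shows the URL Knative has assigned to the service, and POST requests to `<url>/predict` behave exactly like our local smoke test.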

## Conclusion

In this blog post, we've demonstrated how to deploy a machine learning model as an API using FastAPI, Docker, and Knative. By following these steps, you can make your machine learning models accessible as scalable and reliable APIs, enabling seamless integration into your applications and workflows.