Added model deployment docker template, fixed typos.
copandrej committed Jun 28, 2024
1 parent 367e07b commit b9ae343
Showing 7 changed files with 69 additions and 11 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,17 +1,17 @@
-# Self-Evolving AI/ML Workflow
+# Self-Evolving AI/ML Workflow System

AI/ML workflow solution is an MLOps system designed for deployment on a heterogeneous Kubernetes cluster with ARM and x86 GNU/Linux nodes, simulating distributed edge infrastructure common in RAN (Radio Access Networks), network slices and MEC (Multi-Access Edge Computing) devices.

The system leverages the Ray framework for data processing, model training, and model inference, distributing the computational load on edge and non-edge nodes.
-Data preparation can done with frameworks like Pandas or Ray Data while using Minio as a object store.
+Data preparation can be done with frameworks like Pandas or Ray Data while using Minio as an object store.
Model training, managed by Ray, supports Keras, TensorFlow, and PyTorch with minor modifications.
MLflow handles model storage and management, facilitating easy access and updates.
Trained models are deployed as inference API endpoints using Ray Serve or as Kubernetes deployments using helm charts and docker containers.
Flyte orchestrates AI/ML workflows for retraining and redeployment of ML models, enabling retraining triggers based on monitored metrics.
Prometheus and Grafana provide system monitoring.

Developers register workflows with Flyte and monitor the system, while users can trigger workflows, monitor progress, and access models in MLflow.
-For example, in a RAN network, the system can enhance and control network operations through periodic metrics collection and automated retraining, ensuring up-to-date AI/ML solutions.
+For example, in a RAN network, the system can enhance and control network operations through periodic metrics collection and automated retraining, ensuring up-to-date AI/ML assisted solutions.
This system aims to run autonomously, delivering efficient production AI/ML workflows at the network edge.

The system is modular and can be adjusted to different use cases and requirements by enabling or disabling system components.
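
As a rough illustration of the Flyte orchestration described above, a retraining workflow could be registered along these lines (a minimal sketch; task names and bodies are hypothetical, not code from this repository):

from flytekit import task, workflow

@task
def fetch_metric() -> float:
    # Hypothetical: query Prometheus for the monitored model metric
    return 0.87

@task
def retrain_and_register(metric: float) -> str:
    # Hypothetical: retrain with Ray if the metric has degraded,
    # then store the new model version in MLflow
    return "models:/<model_name>/latest"

@workflow
def retraining_workflow() -> str:
    return retrain_and_register(metric=fetch_metric())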
8 changes: 0 additions & 8 deletions docker_build/Dockerfile

This file was deleted.

13 changes: 13 additions & 0 deletions docker_build/model_deployment/Dockerfile
@@ -0,0 +1,13 @@
# Use an official Python runtime as a parent image
FROM python:3.10

WORKDIR /app

# Install any needed packages specified in requirements.txt; append your own dependencies to requirements.txt
COPY requirements.txt .
RUN pip install -r requirements.txt

ADD . /app
EXPOSE 8000

# Run api-endpoint.py with uvicorn when the container launches
CMD ["uvicorn", "api-endpoint:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4", "--limit-concurrency", "8000", "--log-level", "error", "--backlog", "8000"]
35 changes: 35 additions & 0 deletions docker_build/model_deployment/api-endpoint.py
@@ -0,0 +1,35 @@
from typing import List
from fastapi import FastAPI
import os
import mlflow

# Add your own imports and functions here

app = FastAPI()

# SEMR's model store endpoint, CHANGE THIS TO YOUR OWN IP
os.environ['MLFLOW_TRACKING_URI'] = 'http://<SEMR_IP>:31007'

# Read the model version from an environment variable (defined in the Helm charts)
model_version = os.getenv('MODEL_VERSION', "latest")

# Modify the model_uri to point to the correct model name
model_uri = f"models:/<model_name>/{model_version}"

# Adjust the MLflow loading function to your model's framework, see https://mlflow.org/docs/2.10.2/index.html
model = mlflow.pytorch.load_model(model_uri)
model.eval()
print("Model loaded!")

@app.post("/")
async def predict(data: List[List[List[float]]]):
    # Data transformations, e.g. for a PyTorch CNN:
    # tensor_data = torch.tensor(data)
    # tensor_data = tensor_data.unsqueeze(0)

    # Inference, e.g.:
    # cnn_labels_array = cnn_predict(tensor_data, model)
    # counts = Counter(cnn_labels_array)
    # return {"prediction": str(counts)}

    # Placeholder response; replace it with your model's prediction as sketched above
    return {"prediction": None}
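
For reference, the endpoint above could be called like this once deployed (the host, port, and input shape are illustrative and depend on your model and cluster):

import requests

# Nested list matching List[List[List[float]]]; the exact shape is model-specific
sample = [[[0.1, 0.2], [0.3, 0.4]]]

response = requests.post("http://<SEMR_IP>:8000/", json=sample)
print(response.json())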

11 changes: 11 additions & 0 deletions docker_build/model_deployment/requirements.txt
@@ -0,0 +1,11 @@
# Requirements for the model deployment template
fastapi
uvicorn
python-multipart
mlflow==2.10.2

# Add your own dependencies for model inference and data preprocessing
# numpy
# torch
# torch_geometric
# torchvision
7 changes: 7 additions & 0 deletions docker_build/ray_image/Dockerfile
@@ -0,0 +1,7 @@
# For Raspberry Pi (ARM) nodes, start from the rayproject/ray:2.10.0-py310-aarch64 image instead

FROM rayproject/ray:2.10.0-py310

# Install dependencies from requirements.txt
COPY requirements.txt .
RUN pip install -r requirements.txt
File renamed without changes.
