A comprehensive Edge AI platform with LLM (Ollama) and ML (ONNX Runtime) serving capabilities, monitoring, model conversion, and benchmarking tools.
This project includes a powerful command-line interface (CLI) for converting and validating machine learning models, with a focus on ONNX format.
# Install the package in development mode
pip install -e .
# Install with TensorFlow support (for Keras/SavedModel conversion)
pip install -e .[tensorflow]
# Install with PyTorch support
pip install -e .[torch]
# Install with all dependencies
pip install -e .[all]
Benchmark ONNX models for performance metrics:
# Benchmark a single model
wronai_edge benchmark path/to/model.onnx --input-shape 1,3,224,224
# Compare multiple models
wronai_edge benchmark model1.onnx model2.onnx --compare --input-shape 1,3,224,224
# Customize benchmark parameters
wronai_edge benchmark model.onnx --warmup 20 --runs 200 --cpu
Options:
- `--input-shape`, `-i`: Input shape (can be specified multiple times for multiple inputs)
- `--warmup`: Number of warmup runs (default: 10)
- `--runs`: Number of benchmark runs (default: 100)
- `--cpu`/`--gpu`: Force CPU or GPU usage (default: GPU if available)
- `--compare`: Compare multiple models side by side
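To build intuition for what the `--warmup` and `--runs` options measure, here is a minimal hand-rolled timing sketch that calls ONNX Runtime directly. It is not the CLI's implementation, and the model path and float32 input are placeholder assumptions:

```python
import time

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("path/to/model.onnx")  # placeholder path
input_meta = session.get_inputs()[0]
# Replace dynamic dimensions (strings/None) with 1 for a dummy input.
shape = [d if isinstance(d, int) else 1 for d in input_meta.shape]
feed = {input_meta.name: np.random.rand(*shape).astype(np.float32)}  # assumes float32 input

for _ in range(10):          # warmup runs, discarded
    session.run(None, feed)

timings = []
for _ in range(100):         # timed benchmark runs
    start = time.perf_counter()
    session.run(None, feed)
    timings.append(time.perf_counter() - start)

print(f"mean: {1000 * np.mean(timings):.2f} ms, "
      f"p95: {1000 * np.percentile(timings, 95):.2f} ms")
```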
Validate an ONNX model:
wronai_edge test-model path/to/model.onnx
Options:
- `--output-json`: Save validation results to a JSON file
- `--verbose`, `-v`: Enable verbose output
Example:
wronai_edge test-model models/simple-model.onnx --output-json validation_results.json --verbose
Convert models between different formats using the convert command group.
PyTorch to ONNX:
wronai_edge convert pytorch model.pt output.onnx --input-shape 1,3,224,224
Keras to ONNX:
wronai_edge convert keras model.h5 output.onnx --input-shape 1,224,224,3
TensorFlow SavedModel to ONNX:
wronai_edge convert saved-model saved_model_dir output.onnx
Common options for conversion:
- `--opset`: ONNX opset version (default: 13)
- `--verbose`, `-v`: Enable verbose output
You can also use the conversion and validation tools programmatically:
from wronai_edge import validate_model, convert_to_onnx
# Validate a model
results = validate_model("model.onnx")
print(f"Model validation passed: {results['validation_summary']['passed']}")
# Convert a PyTorch model to ONNX
convert_to_onnx(
model_path="model.pt",
output_path="output.onnx",
input_shapes=[(1, 3, 224, 224)],
opset_version=13
)
For more examples, see the examples directory.
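The two calls can be chained, for example to validate a model immediately after converting it. A minimal sketch, assuming the same signatures and result structure shown above (paths are placeholders):

```python
from wronai_edge import convert_to_onnx, validate_model

# Convert a PyTorch checkpoint, then validate the resulting ONNX file.
convert_to_onnx(
    model_path="model.pt",            # placeholder path
    output_path="output.onnx",
    input_shapes=[(1, 3, 224, 224)],
    opset_version=13,
)

results = validate_model("output.onnx")
if not results["validation_summary"]["passed"]:
    raise RuntimeError("Converted model failed validation")
print("Converted and validated output.onnx")
```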
For detailed documentation about the Edge AI platform, including LLM serving and monitoring, see the sections below.
- 📖 Overview
- 🚀 Quick Start
- 📊 Architecture
- 🔧 Services
- 📈 Monitoring
- 🔍 Examples
- 🧩 API Reference
- 🧪 Testing
- 🧹 Cleanup
- Docker and Docker Compose
- Python 3.8+ (for running tests and examples)
- At least 8GB RAM (16GB recommended for running LLMs)
- `curl` and `jq` (for testing and examples)
1. Clone the repository:
   git clone https://github.com/wronai/edge.git
   cd edge
2. Start all services:
   docker-compose up -d
3. Verify services are running:
   docker-compose ps
   All services should show as "healthy" or "running".
4. Run the test suite to verify everything is working:
   ./test_services.sh
- Ollama API: http://localhost:11435
- ONNX Runtime: http://localhost:8001
- Nginx Gateway: http://localhost:30080
- Grafana: http://localhost:3007 (admin/admin)
- Prometheus: http://localhost:9090
# Check ONNX Runtime status
make onnx-status
# List available ONNX models
make onnx-models
# Load a new model
make onnx-load MODEL=simple-model MODEL_SOURCE=./models/simple-model.onnx
# Test inference with a sample request
make onnx-test
For detailed ONNX Runtime documentation, see docs/onnx-runtime.md.
Here's how to use the ONNX Runtime service for model inference:
1. Check service health:
   curl http://localhost:8001/health
   # Expected response: {"status": "OK"}
2. List available models:
   curl http://localhost:8001/v1/models
   # Example response: {"models": ["model1.onnx", "model2.onnx"]}
3. Run inference (using Python):
   import requests
   import numpy as np

   # Sample input data (adjust based on your model's expected input)
   input_data = {
       "model_name": "wronai.onnx",
       "input": {
           "input_1": np.random.rand(1, 224, 224, 3).tolist()  # Example for image input
       }
   }

   # Send inference request
   response = requests.post(
       "http://localhost:8001/v1/models/your_model:predict",
       json=input_data
   )

   # Process the response
   if response.status_code == 200:
       predictions = response.json()
       print("Inference successful!")
       print(f"Predictions: {predictions}")
   else:
       print(f"Error: {response.status_code}")
       print(response.text)
4. Using cURL for simple inference:
   curl -X POST http://localhost:8001/v1/models/your_model:predict \
     -H "Content-Type: application/json" \
     -d '{"input": [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]}'
For more advanced usage, refer to the API Reference.
To stop all services:
docker-compose down
To remove all data (including models and metrics):
docker-compose down -v
graph TD
A[Client] -->|HTTP/HTTPS| B[Nginx Gateway]
B -->|/api/ollama/*| C[Ollama Service]
B -->|/api/onnx/*| D[ONNX Runtime]
B -->|/grafana| E[Grafana]
B -->|/prometheus| F[Prometheus]
G[Prometheus] -->|Scrape Metrics| H[Services]
E -->|Query| G
C -->|Store Models| I[(Ollama Models)]
D -->|Load Models| J[(ONNX Models)]
┌─────────────────┬──────────┬──────────────────────────────────────────┐
│ Service │ Port │ Description │
├─────────────────┼──────────┼──────────────────────────────────────────┤
│ Nginx Gateway │ 30080 │ API Gateway and reverse proxy │
│ Ollama │ 11435 │ LLM serving (compatible with OpenAI API) │
│ ONNX Runtime │ 8001 │ ML model inference │
│ Prometheus │ 9090 │ Metrics collection and alerting │
│ Grafana │ 3007 │ Monitoring dashboards │
└─────────────────┴──────────┴──────────────────────────────────────────┘
Access the monitoring dashboards:
- Grafana: http://localhost:3007 (admin/admin)
- Prometheus: http://localhost:9090
- Ollama API: http://localhost:11435
- ONNX Runtime: http://localhost:8001
We provide test scripts to verify all services are functioning correctly:
1. Basic Service Tests - Verifies all core services are running and accessible:
   # Run all tests
   make test
   # Or run individual tests
   ./test_services.sh
2. ONNX Runtime Tests - Test ONNX Runtime functionality:
   # Check ONNX Runtime status
   make onnx-status
   # Test with a sample request
   make onnx-test
3. ONNX Model Test - Validates ONNX model loading and inference (requires Python dependencies):
   python3 -m pip install -r requirements-test.txt
   python3 test_onnx_model.py
4. API Endpoint Tests - Comprehensive API tests (requires Python dependencies):
   python3 test_endpoints.py
When all services are running correctly, you should see output similar to:
=== Testing Direct Endpoints ===
Testing Ollama API (http://localhost:11435/api/tags)... PASS (Status: 200)
Testing ONNX Runtime (http://localhost:8001/v1/)... PASS (Status: 405)
=== Testing Through Nginx Gateway ===
Testing Nginx -> Ollama (http://localhost:30080/api/tags)... PASS (Status: 200)
Testing Nginx -> ONNX Runtime (http://localhost:30080/v1/)... PASS (Status: 405)
Testing Nginx Health Check (http://localhost:30080/health)... PASS (Status: 200)
=== Testing Monitoring ===
Testing Prometheus (http://localhost:9090)... PASS (Status: 302)
Testing Prometheus Graph (http://localhost:9090/graph)... PASS (Status: 200)
Testing Grafana (http://localhost:3007)... PASS (Status: 302)
Testing Grafana Login (http://localhost:3007/login)... PASS (Status: 200)
Note: A 405 status for ONNX Runtime is expected for GET requests to /v1/ as it requires POST requests for inference. The 302 status codes for Prometheus and Grafana are expected redirects to their respective UIs.
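The same checks can be scripted. Below is a minimal sketch (not the project's test_endpoints.py) that probes the endpoints above, assuming the default ports and the expected status codes from the sample output:

```python
import requests

# Endpoint -> expected HTTP status, taken from the sample test output above.
CHECKS = {
    "http://localhost:11435/api/tags": 200,   # Ollama API
    "http://localhost:8001/v1/": 405,         # ONNX Runtime (GET not allowed)
    "http://localhost:30080/api/tags": 200,   # Nginx -> Ollama
    "http://localhost:30080/health": 200,     # Nginx health check
    "http://localhost:9090/graph": 200,       # Prometheus UI
    "http://localhost:3007/login": 200,       # Grafana login page
}

for url, expected in CHECKS.items():
    try:
        status = requests.get(url, timeout=5, allow_redirects=False).status_code
    except requests.RequestException as exc:
        print(f"{url}: FAIL ({exc})")
        continue
    result = "PASS" if status == expected else f"FAIL (got {status})"
    print(f"{url}: {result} (expected {expected})")
```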
# Stop all services
make stop
# Remove all containers and volumes
make clean
# Remove all unused Docker resources
make prune
# List loaded models
make onnx-models
# To remove models, simply delete them from the models/ directory
rm models/*.onnx
This project is licensed under the Apache Software License - see the LICENSE file for details.
- Multi-Model Serving: Run multiple AI/ML models simultaneously
- Optimized Inference: ONNX Runtime for high-performance model execution
- LLM Support: Ollama integration for local LLM deployment
- Monitoring: Built-in Prometheus and Grafana for observability
- Scalable: Kubernetes-native design for easy scaling
- Developer-Friendly: Simple CLI and comprehensive API
- Overview - Platform architecture and components
- Quick Start - Get up and running in minutes
- Installation Guide - Detailed setup instructions
- Ollama Basic Usage - Running LLM models
- ONNX Runtime Guide - Deploying custom ONNX models
- API Reference - Complete API documentation
- Model Optimization - Performance tuning
- Monitoring - Setting up alerts and dashboards
- Security - Best practices for secure deployment
- Docker and Docker Compose
- 8GB+ RAM (16GB recommended)
- 20GB free disk space
# Clone the repository
git clone https://github.com/wronai/edge.git
cd edge
# Start all services
make up
# Check service status
make status
- API Gateway: http://localhost:30080
- Grafana: http://localhost:3007 (admin/admin)
- Prometheus: http://localhost:9090
edge/
├── docs/ # Documentation
├── configs/ # Configuration files
├── k8s/ # Kubernetes manifests
├── scripts/ # Utility scripts
├── terraform/ # Infrastructure as Code
├── docker-compose.yml # Local development
└── Makefile # Common tasks
# Start services
make up
# Stop services
make down
# View logs
make logs
# Access monitoring
make monitor
# Run tests
make test
Contributions are welcome! Please see our Contributing Guide for details.
This project is licensed under the Apache Software License - see the LICENSE file for details.
For support or questions, please open an issue in the repository.
- Docker Desktop (running)
- Terraform >= 1.6
- kubectl >= 1.28
- 8GB RAM minimum
# Clone and deploy
git clone https://github.com/wronai/edge.git
cd edge
# Make script executable and deploy everything
chmod +x scripts/deploy.sh
./scripts/deploy.sh
🎯 Result: Complete edge AI platform with monitoring in ~3-5 minutes
Example docker compose ps output:
docker compose ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
edge-grafana-1 grafana/grafana:latest "/run.sh" grafana 3 days ago Up 8 minutes 0.0.0.0:3007->3000/tcp, :::3007->3000/tcp
edge-ollama-1 ollama/ollama:latest "/bin/sh -c 'sleep 1…" ollama 3 days ago Up 8 minutes 0.0.0.0:11435->11434/tcp, :::11435->11434/tcp
edge-prometheus-1    prom/prometheus:latest   "/bin/prometheus --c…"   prometheus   3 days ago   Up 8 minutes   0.0.0.0:9090->9090/tcp, :::9090->9090/tcp
- 🤖 AI Gateway: http://localhost:30080
- 📊 Grafana: http://localhost:30030 (admin/admin)
- 📈 Prometheus: http://localhost:30090
wronai_edge-portfolio/
├── terraform/main.tf        # Infrastructure (K3s + Docker)
├── k8s/ai-platform.yaml     # AI workloads (ONNX + Ollama)
├── k8s/monitoring.yaml      # Monitoring (Prometheus + Grafana)
├── configs/Modelfile        # Custom LLM configuration
├── scripts/deploy.sh        # Automation (single script)
└── README.md                # Complete documentation
graph TB
U[User] --> G[AI Gateway :30080]
G --> O[ONNX Runtime]
G --> L[Ollama LLM]
P[Prometheus :30090] --> O
P --> L
P --> G
GR[Grafana :30030] --> P
subgraph "K3s Cluster"
O
L
G
P
GR
end
subgraph "Infrastructure"
T[Terraform] --> K[K3s]
K --> O
K --> L
end
| Layer | Technology | Purpose |
|---|---|---|
| Infrastructure | Terraform + Docker | IaC provisioning |
| Orchestration | K3s (Lightweight Kubernetes) | Container management |
| AI Inference | ONNX Runtime + Ollama | Model serving |
| Load Balancing | Nginx Gateway | Traffic routing |
| Monitoring | Prometheus + Grafana | Observability |
| Automation | Bash + YAML | Deployment scripts |
# Check if the ONNX Runtime service is healthy
curl -X GET http://localhost:8001/
# Expected Response: "Healthy"
# List available models in the models directory
make onnx-models
# Check model status
make onnx-model-status
# Get model metadata
make onnx-model-metadata
# Make a prediction using the default model (complex-cnn-model)
make onnx-predict
# Or use curl directly
curl -X POST http://localhost:8001/v1/models/complex-cnn-model/versions/1:predict \
-H "Content-Type: application/json" \
-d '{"instances": [{"data": [1.0, 2.0, 3.0, 4.0]}]}'
# Example with Python
python3 -c "
import requests
import json
response = requests.post(
'http://localhost:8001/v1/models/complex-cnn-model/versions/1:predict',
json={"instances": [{"data": [1.0, 2.0, 3.0, 4.0]}]}
)
print(json.dumps(response.json(), indent=2))
"# Run a benchmark with 100 requests
make onnx-benchmark
# Customize model and version
make onnx-benchmark MODEL_NAME=my-model MODEL_VERSION=2
- The server automatically loads models from the `/models` directory in the container.
- To use a different model:
  - Place your `.onnx` model file in the `./models` directory.
  - Update the model name/version in your requests or set environment variables (see the Python sketch after this list):
    export MODEL_NAME=your-model
    export MODEL_VERSION=1
  - Or specify them when running commands:
    make onnx-predict MODEL_NAME=your-model MODEL_VERSION=1
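A minimal Python sketch that picks up MODEL_NAME and MODEL_VERSION from the environment and calls the versioned predict endpoint shown earlier; the payload is the same toy example used above and should be adjusted to your model's input:

```python
import json
import os

import requests

model_name = os.environ.get("MODEL_NAME", "complex-cnn-model")
model_version = os.environ.get("MODEL_VERSION", "1")
url = f"http://localhost:8001/v1/models/{model_name}/versions/{model_version}:predict"

payload = {"instances": [{"data": [1.0, 2.0, 3.0, 4.0]}]}  # toy input from the example above
response = requests.post(url, json=payload, timeout=10)
response.raise_for_status()
print(json.dumps(response.json(), indent=2))
```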
# Simple chat
curl -X POST http://localhost:30080/api/generate \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.2:1b",
"prompt": "Explain edge computing",
"stream": false
}'
# Custom edge AI assistant
curl -X POST http://localhost:30080/api/generate \
-H "Content-Type: application/json" \
-d '{
"model": "wronai_edge-assistant",
"prompt": "How do I monitor Kubernetes pods?",
"stream": false
}'
# Run comprehensive AI functionality test
./scripts/deploy.sh demo
# Test individual components
./scripts/deploy.sh test
Example output:
# Test individual components
./scripts/deploy.sh test
[ERROR] 19:27:54 Unknown command: demo
[INFO] 19:27:54 Run './scripts/deploy.sh help' for usage information
[STEP] 19:27:54 🔍 Testing deployed services...
[INFO] 19:27:54 Testing service endpoints...
[ERROR] 19:27:54 ❌ AI Gateway: FAILED
[WARN] 19:27:54 ⚠️ Ollama: Not ready (may still be starting)
[WARN] 19:27:54 ⚠️ ONNX Runtime: Not ready
[INFO] 19:27:54 ✅ Prometheus: OK
[INFO] 19:27:54 ✅ Grafana: OK
[INFO] 19:27:54 Testing AI functionality...
[WARN] 19:27:54 ⚠️ AI Generation: Model may still be downloading
[WARN] 19:27:54 ⚠️ Some services need more time to start
Run a diagnosis to check your system:
./scripts/deploy.sh diagnose
Example output:
...
- context:
cluster: kind-wronai_edge
user: kind-wronai_edge
[STEP] 19:32:14 🔍 Testing service connectivity...
//localhost:30080/health:AI Gateway: ❌ NOT RESPONDING
//localhost:30090/-/healthy:Prometheus: ❌ NOT RESPONDING
//localhost:30030/api/health:Grafana: ❌ NOT RESPONDING
//localhost:11435/api/tags:Ollama Direct: ❌ NOT RESPONDING
//localhost:8001/v1/models:ONNX Direct: ❌ NOT RESPONDING
[STEP] 19:32:14 🔍 Diagnosis complete!
Fix and deploy the services:
./scripts/deploy.sh fix
Test the services after deployment:
./scripts/deploy.sh test
- URL: http://localhost:30030
- Login: admin/admin
- Features:
- Real-time AI inference metrics
- Resource utilization monitoring
- Request latency distribution
- Error rate tracking
- Pod health status
- URL: http://localhost:30090
- Key Metrics:
  - `http_requests_total` - Request counters
  - `http_request_duration_seconds` - Latency histograms
  - `container_memory_usage_bytes` - Memory consumption
  - `container_cpu_usage_seconds_total` - CPU utilization
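These metrics can also be pulled programmatically through Prometheus's standard HTTP query API. A minimal sketch, assuming Prometheus is reachable on the port listed above:

```python
import requests

# Instant query: total request rate over the last 5 minutes.
resp = requests.get(
    "http://localhost:30090/api/v1/query",
    params={"query": "sum(rate(http_requests_total[5m]))"},
    timeout=5,
)
resp.raise_for_status()
for sample in resp.json()["data"]["result"]:
    print(sample["metric"], sample["value"])
```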
# Comprehensive health check
./scripts/deploy.sh health
# Check specific components
kubectl get pods -A
kubectl top nodes
kubectl top pods -A
# Check deployment status
./scripts/deploy.sh info
# View live logs
kubectl logs -f deployment/ollama-llm -n ai-inference
kubectl logs -f deployment/onnx-inference -n ai-inference
# Scale AI services
kubectl scale deployment onnx-inference --replicas=3 -n ai-inference
# Update configurations
kubectl apply -f k8s/ai-platform.yaml
1. Disk Space Issues. If the deployment fails with eviction errors or the cluster won't start:
# Check disk space
df -h
# Clean up Docker system
docker system prune -a -f --volumes
# Remove unused containers, networks, and images
docker container prune -f
docker image prune -a -f
docker network prune -f
docker volume prune -f
# Clean up old logs and temporary files
sudo journalctl --vacuum-time=3d
sudo find /var/log -type f -name "*.gz" -delete
sudo find /var/log -type f -name "*.1" -delete2. Debugging K3s Cluster
# Check K3s server logs
docker logs k3s-server
# Check cluster status
docker exec k3s-server kubectl get nodes
docker exec k3s-server kubectl get pods -A
3. Port Conflicts. If you see port binding errors, check and free up required ports (80, 443, 6443, 30030, 30090, 30080):
# Check port usage
sudo lsof -i :8080 # Replace with your port number
4. Debugging Pods
# Debug pod issues
kubectl describe pod <pod-name> -n ai-inference
# Check resource usage
kubectl top pods -n ai-inference --sort-by=memory
# View events
kubectl get events -n ai-inference --sort-by='.lastTimestamp'
# Restart services
kubectl rollout restart deployment/ollama-llm -n ai-inference
5. Reset Everything. If you need to start fresh:
# Clean up all resources
./scripts/deploy.sh cleanup
# Remove all Docker resources
docker system prune -a --volumes --force
# Remove K3s data
sudo rm -rf terraform/kubeconfig/*
sudo rm -rf terraform/k3s-data/*
sudo rm -rf terraform/registry-data/*
# Complete cleanup
./scripts/deploy.sh cleanup
# Partial cleanup (keep infrastructure)
kubectl delete -f k8s/monitoring.yaml
kubectl delete -f k8s/ai-platform.yaml
wronai_edge-portfolio/
├── terraform/
│ └── main.tf # Complete infrastructure as code
├── k8s/
│ ├── ai-platform.yaml # AI workloads (ONNX + Ollama + Gateway)
│ └── monitoring.yaml # Observability stack (Prometheus + Grafana)
├── configs/
│ └── Modelfile # Custom LLM configuration
├── scripts/
│ └── deploy.sh # Automation script (8 commands)
└── README.md # This documentation
Total Files: 6 core files + documentation = Minimal complexity, maximum demonstration
- ✅ Infrastructure as Code - Pure Terraform configuration
- ✅ Container Orchestration - Kubernetes/K3s with proper manifests
- ✅ Declarative Automation - YAML-driven deployments
- ✅ Monitoring & Observability - Production-ready metrics
- ✅ Security Best Practices - RBAC, network policies, resource limits
- ✅ Scalability Patterns - HPA, resource management
- ✅ GitOps Ready - Declarative configuration management
- ✅ Model Serving - ONNX Runtime for optimized inference
- ✅ LLM Deployment - Ollama with custom model configuration
- ✅ Edge Computing - Resource-constrained deployment patterns
- ✅ Load Balancing - Intelligent traffic routing for AI services
- ✅ Performance Monitoring - AI-specific metrics and alerting
- ✅ Microservices Architecture - Service mesh ready
- ✅ Cloud Native - CNCF-aligned tools and patterns
- ✅ Edge Computing - Lightweight, distributed deployments
- ✅ Observability - Three pillars (metrics, logs, traces)
- ✅ Automation - Zero-touch deployment and operations
# Add new ONNX model
kubectl create configmap wronai --from-file=model.onnx -n ai-inference
# Update deployment to mount the model
# Create custom Ollama model
kubectl exec -n ai-inference deployment/ollama-llm -- \
ollama create my-custom-model -f /path/to/Modelfile
# Multi-node cluster
# Update terraform/main.tf to add worker nodes
# Persistent storage
# Add PVC configurations for model storage
# External load balancer
# Configure LoadBalancer service type
# TLS termination
# Add cert-manager and ingress controller
# Add custom metrics
# Extend Prometheus configuration
# Custom dashboards
# Add Grafana dashboard JSON files
# Alerting rules
# Configure AlertManager for notifications
- Total Memory: ~4GB (K3s + AI services + monitoring)
- CPU Usage: ~2 cores (under load)
- Storage: ~2GB (container images + models)
- Network: Minimal (edge-optimized)
- Deployment Time: 3-5 minutes (cold start)
- AI Response Time: <2s (LLM inference)
- Monitoring Latency: <100ms (metrics collection)
- Scaling Time: <30s (pod autoscaling)
- Model Quantization: 4x memory reduction with ONNX INT8
- Caching: Redis for frequently accessed inference results
- Batching: Group inference requests for better throughput
- GPU Acceleration: CUDA/ROCm support for faster inference
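As one concrete example of the quantization point above, ONNX Runtime ships dynamic INT8 quantization. A minimal sketch with placeholder paths, assuming onnxruntime is installed; actual memory savings depend on the model:

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Rewrite FP32 weights as INT8; activations are quantized dynamically at runtime.
quantize_dynamic(
    model_input="models/complex-cnn-model.onnx",        # placeholder path
    model_output="models/complex-cnn-model.int8.onnx",  # placeholder path
    weight_type=QuantType.QInt8,
)
print("Wrote INT8-quantized model")
```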
- Practical Skills: Real-world DevOps patterns, not toy examples
- Modern Stack: Current best practices and CNCF-aligned tools
- AI Integration: Demonstrates understanding of ML deployment challenges
- Production Ready: Monitoring, scaling, security considerations
- Time Efficient: Complete demo in under 5 minutes
- Minimal Complexity: 6 core files, maximum clarity
- Declarative Approach: Infrastructure and workloads as code
- Extensible Architecture: Easy to add features and scale
- Edge Optimized: Real-world resource constraints considered
- Documentation: Clear instructions and troubleshooting guides
- Fast Deployment: Rapid prototyping and development cycles
- Cost Effective: Efficient resource utilization
- Scalable Design: Grows from demo to production
- Risk Mitigation: Proven patterns and reliable automation
- Innovation Ready: Foundation for AI/ML initiatives
Tom Sapletta - DevOps Engineer & AI Integration Specialist
- 🔧 15+ years enterprise DevOps experience
- 🤖 AI/LLM deployment expertise with edge computing focus
- 🏗️ Infrastructure as Code advocate and practitioner
- 📊 Monitoring & Observability specialist
- 🚀 Kubernetes & Cloud Native architect
Current Focus: Telemonit - Edge AI power supply systems with integrated LLM capabilities
This project demonstrates practical DevOps skills through minimal, production-ready code that showcases Infrastructure as Code, AI integration, and modern container orchestration patterns. Perfect for demonstrating technical competency to potential employers in the DevOps and AI engineering space.
This project is open source and available under the Apache License.
🎯 Ready to deploy? Run ./scripts/deploy.sh and see it in action!