A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
Updated Jan 27, 2025 - Python
Create context-aware Q&A interfaces from your own data with LLMs and vector embeddings. Includes an automated embedding pipeline and a model-powered Q&A interface.
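At its core, the retrieval step of such a RAG pipeline is nearest-neighbor search over vector embeddings. A minimal sketch of that step with NumPy (the toy vectors, `top_k_passages` name, and `k` value here are illustrative, not taken from the repository):

```python
import numpy as np

def top_k_passages(query_vec, passage_vecs, k=2):
    """Return indices of the k passages most similar to the query.

    Vectors are L2-normalized so the dot product equals cosine
    similarity, the usual comparison for e5-style embeddings.
    """
    q = query_vec / np.linalg.norm(query_vec)
    p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    scores = p @ q                        # cosine similarity per passage
    return np.argsort(scores)[::-1][:k]  # highest-scoring first

# Toy 3-dimensional "embeddings" standing in for real model output.
passages = np.array([[1.0, 0.0, 0.0],
                     [0.9, 0.1, 0.0],
                     [0.0, 1.0, 0.0]])
query = np.array([1.0, 0.05, 0.0])
print(top_k_passages(query, passages))
```

In a real deployment the vectors would come from an embedding model and the retrieved passages would be stuffed into the LLM prompt as context.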
Production-grade, scalable embedding API server using the SentenceTransformers "intfloat/multilingual-e5-base" model, powered by Ray Serve for multi-GPU orchestration, with Prometheus and Grafana monitoring.
A drop-in replacement for FastAPI that enables scalable, fault-tolerant deployments with Ray Serve.
Ray Serve backend for Arabic Speech Recognition
Contains the basic structure that a model-serving application should have; the implementation is based on the Ray Serve framework.