An endpoint server for efficiently serving quantized open-source LLMs for code.
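For context, a minimal sketch of what such a server does under the hood, assuming an AWQ-quantized checkpoint and vLLM's quantization support; the model name is illustrative and the repo's actual code may differ:

```python
# Sketch: loading and querying an AWQ-quantized code model with vLLM.
# The checkpoint name is illustrative; any AWQ model on the Hub works.
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/CodeLlama-7B-AWQ", quantization="awq")
params = SamplingParams(temperature=0.2, max_tokens=128)

outputs = llm.generate(["def fibonacci(n):"], params)
print(outputs[0].outputs[0].text)
```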
Embedding-based semantic search app for poetry [app and EDA notebooks].
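The core pattern behind an app like this, as a hedged sketch using sentence-transformers; the encoder name and the tiny corpus are illustrative, not the repo's actual choices:

```python
# Sketch of embedding-based semantic search over a small poem corpus.
from sentence_transformers import SentenceTransformer, util

poems = [
    "Shall I compare thee to a summer's day?",
    "Two roads diverged in a yellow wood,",
    "Because I could not stop for Death,",
]
model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder
corpus_emb = model.encode(poems, convert_to_tensor=True)

# Embed the query and rank poems by cosine similarity.
query_emb = model.encode("a poem about choices", convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
for hit in hits:
    print(poems[hit["corpus_id"]], round(hit["score"], 3))
```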
This repository demonstrates running LLMs on CPUs with packages like llamafile, highlighting the low latency, high throughput, and cost effectiveness of CPU-based inference and serving.
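A hedged example of the client side, assuming a llamafile has already been started locally with its built-in server on the default port (8080); llamafile exposes an OpenAI-compatible endpoint:

```python
# Sketch: querying a locally running llamafile server.
# Assumes one was started beforehand, e.g.: ./model.llamafile --server
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "local",  # the local server accepts any model name
        "messages": [
            {"role": "user", "content": "Summarize CPU inference in one line."}
        ],
        "max_tokens": 64,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```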
EchoSight is a tool that helps visually impaired individuals by audibly describing images captured with a Raspberry Pi Camera or supplied via an image path or URL, across different operating systems.
Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper and without rate limits. Compare the quality and latency to your current LLM API provider.
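To illustrate the constrained-output idea, a sketch using the outlines library's schema-guided generation (v0.x API, which is version-dependent); the repo itself wraps vLLM in Docker, and the model and schema here are illustrative:

```python
# Sketch: schema-constrained JSON generation with outlines.
# Decoding is restricted so the output always parses as the schema.
from pydantic import BaseModel
import outlines

class Invoice(BaseModel):
    customer: str
    total: float

model = outlines.models.transformers("microsoft/phi-2")  # illustrative model
generator = outlines.generate.json(model, Invoice)

invoice = generator("Extract the invoice: Acme Corp owes $1,250.")
print(invoice)  # a valid Invoice instance, never malformed JSON
```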
MLOps library for LLM deployment with the vLLM engine on RunPod's infrastructure.
Low-latency JSON generation using LLMs ⚡️
A Discord bot that can call LLMs through either Hugging Face or vLLM on Windows, with function-calling support.
Call many AIs from a single API.
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
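A minimal sketch of that integration, assuming a single-GPU deployment; the model name, request shape, and run command are illustrative:

```python
# Sketch: wrapping a vLLM engine in a Ray Serve deployment.
from ray import serve
from starlette.requests import Request
from vllm import LLM, SamplingParams

@serve.deployment(ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    def __init__(self, model_name: str):
        self.llm = LLM(model=model_name)

    async def __call__(self, request: Request) -> dict:
        body = await request.json()
        params = SamplingParams(max_tokens=body.get("max_tokens", 128))
        # Note: LLM.generate is blocking; production code would use
        # AsyncLLMEngine for request-level concurrency.
        outputs = self.llm.generate([body["prompt"]], params)
        return {"text": outputs[0].outputs[0].text}

app = VLLMDeployment.bind("facebook/opt-125m")  # illustrative model
# Run locally with:  serve run my_module:app
```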