Self-Hosted AI Chatbot with vLLM and Gradio

Docker
vLLM
Gradio

A self-hosted AI chatbot that runs locally on your machine using vLLM for inference and Gradio for the frontend. This project is containerized with Docker, making it easy to set up and run.


Features

  • Local Inference: No need for external APIs; everything runs on your machine.
  • GPU Support: Optimized for CUDA-enabled GPUs for faster inference.
  • User-Friendly Interface: A simple and intuitive chat interface powered by Gradio.
  • Dockerized: Easy to set up and run with Docker.

How It Works

  1. vLLM Server:
    • Runs the facebook/opt-125m model and exposes an API endpoint at http://localhost:8000/v1.
    • Processes user prompts and generates responses using the model.
  2. Gradio Frontend:
    • Provides a web-based chat interface.
    • Sends user messages to the vLLM server and displays the generated responses.
  3. Docker Container:
    • Packages the entire system into a Docker container for easy deployment.
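As a concrete illustration of the flow above, here is a minimal client sketch (illustrative, not code from this repo) that talks to the vLLM server the way the Gradio frontend does, assuming vLLM's standard OpenAI-compatible /v1/completions route:

```python
import json
import urllib.request

# Endpoint and model from the "How It Works" section above.
VLLM_URL = "http://localhost:8000/v1/completions"
MODEL = "facebook/opt-125m"

def build_payload(prompt, max_tokens=128):
    """Assemble an OpenAI-style completion request body for the vLLM server."""
    return {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}

def generate(prompt):
    """POST the prompt to the vLLM server and return the generated text."""
    req = urllib.request.Request(
        VLLM_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        out = json.load(resp)
    # OpenAI-style responses carry the completion under choices[0].text.
    return out["choices"][0]["text"]
```

A Gradio frontend can simply call `generate(message)` inside its chat callback (e.g. the function passed to `gr.ChatInterface`) and display the result.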

Getting Started

Prerequisites

  • Docker installed on your machine.
  • NVIDIA GPU with CUDA support (optional but recommended for faster inference).

Steps to Run

  1. Clone the Repository:
    git clone https://github.com/dinukacodes/containerized_inference-engine.git
    cd containerized_inference-engine
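The instructions stop after step 1; the remaining steps (building and starting the container) are not shown here. Assuming the container exposes the endpoint described above, a small stdlib-only script like this (illustrative, not part of the repo) can confirm the vLLM server is reachable once it is running:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # endpoint from the "How It Works" section

def models_url(base=BASE_URL):
    """URL of vLLM's OpenAI-compatible model-listing route."""
    return base.rstrip("/") + "/models"

def list_models(base=BASE_URL):
    """Return the model IDs the running vLLM server reports serving."""
    with urllib.request.urlopen(models_url(base), timeout=10) as resp:
        data = json.load(resp)
    return [m["id"] for m in data.get("data", [])]
```

With the container up, `list_models()` would be expected to return a list containing "facebook/opt-125m".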
