🧙‍♂️ Text2Image Generator (Local gRPC + Streamlit UI)

A GPU-accelerated, Dockerized microservice that enables text-to-image generation using state-of-the-art Stable Diffusion models. The system features a gRPC API with REST wrapper and a Streamlit-based web interface for seamless user interaction.


License: MIT · Python 3.8+ · Docker · gRPC · Streamlit


🚀 Features

  • ✅ Local GPU inference using Hugging Face's diffusers pipeline
  • ✅ gRPC-based communication with protobuf definitions
  • ✅ Optional REST API wrapper for testing with Postman
  • ✅ Images returned as base64 and saved as PNG files
  • ✅ Streamlit frontend with an intuitive prompt UI
  • ✅ Concurrent request handling with async processing
  • ✅ Comprehensive error handling and status codes
  • ✅ Dockerized for easy deployment and reproducibility

🛠️ Project Structure

.
├── server.py                 # gRPC server using diffusers and Stable Diffusion
├── rest_api.py               # REST wrapper around the gRPC service (optional fallback interface)
├── FrontEnd/
│   └── streamlit_app.py      # Streamlit UI client that calls the gRPC backend
├── text2image.proto          # Protocol Buffers definition
├── text2image_pb2.py         # Generated by protoc (DO NOT EDIT)
├── text2image_pb2_grpc.py    # Generated by protoc (DO NOT EDIT)
├── Dockerfile                # Full container setup
├── requirements.txt          # Python dependencies
├── tests/                    # Test suite for performance evaluation
│   ├── test_concurrent.py    # Concurrent request tests
│   └── performance_tests.py  # Performance evaluation scripts
└── generated_images/         # Directory where generated images are saved

📦 Installation

1. Clone the Repository

git clone https://github.com/yourusername/text2image-grpc-app.git
cd text2image-grpc-app

2. Generate the gRPC Stubs

Make sure grpcio-tools is installed (pip install grpcio-tools); it provides the protoc invocation used below.

python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. text2image.proto

3. Set Up a Python Environment (Optional)

python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt

4. Docker Setup (Recommended)

Ensure you have:

  • Docker installed
  • NVIDIA Container Toolkit set up correctly

# Build the container
docker build -t text2image-grpc .

# Run with GPU support
docker run --gpus all -p 50051:50051 -p 8501:8501 text2image-grpc

💡 Usage

Option 1: Run Locally (Non-Docker)

Make sure you have a CUDA-capable GPU with CUDA installed.

  1. Run the gRPC server:
python server.py
  2. Run the Streamlit UI (in another terminal):
streamlit run FrontEnd/streamlit_app.py

Option 2: Using the Streamlit Interface

  1. Open http://localhost:8501 in your browser
  2. Enter a text prompt (e.g., "a wizard casting fire in the sky")
  3. Optionally add context for better results (e.g., "fantasy art")
  4. Click Generate Image
  5. View the result and find the saved PNG in the generated_images/ folder
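
Under the hood, the Streamlit page wraps the same gRPC client shown in Option 3 below. As a rough sketch of what FrontEnd/streamlit_app.py might look like (the actual widget layout and naming may differ):

import base64
import io

import grpc
import streamlit as st
from PIL import Image

import text2image_pb2
import text2image_pb2_grpc

st.title("Text2Image Generator")
prompt = st.text_input("Prompt", "a wizard casting fire in the sky")
context_text = st.text_input("Context (optional)", "fantasy art")

if st.button("Generate Image"):
    # Call the local gRPC backend with the prompt and optional context
    channel = grpc.insecure_channel("localhost:50051")
    stub = text2image_pb2_grpc.Text2ImageServiceStub(channel)
    response = stub.GenerateImage(
        text2image_pb2.ImageRequest(context=context_text, text=prompt)
    )

    if response.status_code == 200:
        # Decode the base64 payload and render it in the page
        image = Image.open(io.BytesIO(base64.b64decode(response.image_base64)))
        st.image(image, caption=prompt)
        st.caption(f"Saved to: {response.image_path}")
    else:
        st.error(response.message)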

Option 3: Using the gRPC API Directly

Example Python client code:

import grpc
import text2image_pb2
import text2image_pb2_grpc
import base64
from PIL import Image
import io

# Setup gRPC channel
channel = grpc.insecure_channel('localhost:50051')
stub = text2image_pb2_grpc.Text2ImageServiceStub(channel)

# Create request
request = text2image_pb2.ImageRequest(
    context="fantasy art",
    text="a dragon flying over a medieval castle"
)

# Make the call
response = stub.GenerateImage(request)

# Process the response
if response.status_code == 200:
    # Convert base64 to image and display or save
    image_data = base64.b64decode(response.image_base64)
    image = Image.open(io.BytesIO(image_data))
    image.show()
    print(f"Image also saved to: {response.image_path}")
else:
    print(f"Error: {response.message}")

Option 4: Using the REST API Wrapper

If you prefer REST over gRPC:

# First run the REST wrapper
python rest_api.py

# Then make REST requests
curl -X POST http://localhost:8000/generate \
     -H "Content-Type: application/json" \
     -d '{"context": "cyberpunk", "text": "a futuristic city at night"}'

Response:

{
  "status_code": 200,
  "message": "Image generated successfully",
  "image_base64": "base64_encoded_string_here...",
  "image_path": "generated_images/cyberpunk_futuristic_city_20250504_123456.png"
}
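
To call the REST wrapper from Python instead of curl, a minimal client using the requests library could look like this (it assumes the wrapper is listening on port 8000, as above):

import base64

import requests

# Same payload as the curl example above
payload = {"context": "cyberpunk", "text": "a futuristic city at night"}
response = requests.post("http://localhost:8000/generate", json=payload, timeout=300)
response.raise_for_status()

data = response.json()
if data["status_code"] == 200:
    # Decode the base64 image and write a local copy
    with open("output.png", "wb") as f:
        f.write(base64.b64decode(data["image_base64"]))
    print("Saved local copy; server copy at:", data["image_path"])
else:
    print("Error:", data["message"])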

🧱 System Architecture

┌────────────┐        ┌────────────┐        ┌─────────────┐        ┌─────────────────────┐
│  Streamlit │◀──────▶│  gRPC API  │◀──────▶│  REST API   │◀──────▶│ Diffusers + PyTorch │
└────────────┘        └────────────┘        └─────────────┘        └─────────────────────┘
      ▲                     ▲                    ▲                           ▲
      │                     │                    │                           │
      ▼                     ▼                    ▼                           ▼
  Web Browser          gRPC Client          REST Client                Dockerized GPU

  • Frontend: Streamlit web UI for interactive image generation
  • API Layer: gRPC service with Protocol Buffers (primary) and REST API wrapper (secondary)
  • Core Service: Asynchronous request handler with concurrent processing support
  • Model: Hugging Face's diffusers.StableDiffusionPipeline
  • Inference: PyTorch with CUDA GPU acceleration
  • Deployment: Containerized with Docker for consistent environment

API Design

The service exposes a GenerateImage RPC method defined in the Protocol Buffer:

syntax = "proto3";

service Text2ImageService {
  rpc GenerateImage (ImageRequest) returns (ImageResponse);
}

message ImageRequest {
  string context = 1;  // Optional context for better image generation
  string text = 2;     // The main prompt for image generation
}

message ImageResponse {
  int32 status_code = 1;     // HTTP-like status code (200 = success)
  string message = 2;        // Status message or error description
  string image_base64 = 3;   // Base64-encoded image data
  string image_path = 4;     // Path where the image was saved
}

📦 Dependencies and Model Source

The microservice relies on the following key components:

Core Dependencies

torch>=2.0.0
diffusers>=0.20.0
transformers>=4.30.0
protobuf>=4.23.0
grpcio>=1.54.0
grpcio-tools>=1.54.0
streamlit>=1.23.0
Pillow>=9.5.0
numpy>=1.24.0
asyncio>=3.4.3

Model Implementation

The application uses:

  • StableDiffusionPipeline from 🤗 Diffusers
  • transformers for CLIP tokenizer/model
  • DPMSolverMultistepScheduler for faster inference

Server implementation uses asynchronous processing to handle concurrent requests:

from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
import torch
import grpc
import asyncio
import concurrent.futures

import text2image_pb2_grpc

class Text2ImageService(text2image_pb2_grpc.Text2ImageServiceServicer):
    def __init__(self):
        # Load model with GPU acceleration
        self.pipe = StableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5",
            torch_dtype=torch.float16
        )
        self.pipe.scheduler = DPMSolverMultistepScheduler.from_config(self.pipe.scheduler.config)
        self.pipe = self.pipe.to("cuda")
        
        # Thread pool for handling concurrent requests
        self.executor = concurrent.futures.ThreadPoolExecutor(max_workers=3)
    
    async def GenerateImage(self, request, context):
        # Asynchronous implementation for handling concurrent requests
        loop = asyncio.get_event_loop()
        return await loop.run_in_executor(
            self.executor, 
            self._generate_image_sync, 
            request, 
            context
        )
        
    def _generate_image_sync(self, request, context):
        # Actual image generation logic
        # [...]
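
The generation step itself is elided above. As an illustration only (not the project's actual server.py), a synchronous helper along these lines would combine the prompt, run the pipeline, save a PNG, and return a base64-encoded response; the prompt format, file naming, and inference settings here are assumptions:

    def _generate_image_sync(self, request, context):
        # Sketch only; assumes base64, io, os, datetime, and text2image_pb2
        # are imported at the top of server.py
        try:
            # Combine the optional context with the main prompt (assumed format)
            prompt = f"{request.context}, {request.text}" if request.context else request.text

            # Run the Stable Diffusion pipeline on the GPU
            image = self.pipe(prompt, num_inference_steps=25).images[0]

            # Save the PNG under generated_images/ with a timestamped name (assumed scheme)
            os.makedirs("generated_images", exist_ok=True)
            image_path = os.path.join(
                "generated_images", datetime.now().strftime("image_%Y%m%d_%H%M%S.png")
            )
            image.save(image_path)

            # Encode the PNG bytes as base64 for the gRPC response
            buffer = io.BytesIO()
            image.save(buffer, format="PNG")
            image_base64 = base64.b64encode(buffer.getvalue()).decode("utf-8")

            return text2image_pb2.ImageResponse(
                status_code=200,
                message="Image generated successfully",
                image_base64=image_base64,
                image_path=image_path,
            )
        except Exception as exc:
            return text2image_pb2.ImageResponse(
                status_code=500, message=f"Image generation failed: {exc}"
            )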

🧪 Testing and Performance

The project includes comprehensive test suites to evaluate performance and reliability:

Concurrent Request Handling

Tests verify the service can handle multiple simultaneous requests:

# Run concurrent request tests
python tests/test_concurrent.py
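
The shape of such a test is roughly the following: a thread pool fires several GenerateImage calls in parallel and records per-request latency. This is a sketch of the idea, not the contents of tests/test_concurrent.py:

import time
from concurrent.futures import ThreadPoolExecutor

import grpc

import text2image_pb2
import text2image_pb2_grpc

def single_request(prompt):
    # Each worker opens its own channel and times one GenerateImage call
    channel = grpc.insecure_channel("localhost:50051")
    stub = text2image_pb2_grpc.Text2ImageServiceStub(channel)
    start = time.time()
    response = stub.GenerateImage(
        text2image_pb2.ImageRequest(context="test", text=prompt)
    )
    return response.status_code, time.time() - start

if __name__ == "__main__":
    prompts = [f"a castle in the clouds, variation {i}" for i in range(5)]
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        results = list(pool.map(single_request, prompts))
    for code, elapsed in results:
        print(f"status={code} elapsed={elapsed:.1f}s")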

Performance Evaluation

Performance metrics measuring response times under various loads:

[Performance graph: response times under various loads; see table below]

Concurrent Requests    Avg. Response Time (s)    Success Rate (%)
1                      2.3                       100
5                      3.8                       100
10                     6.5                       98
20                     12.1                      95

Error Handling

The service implements robust error handling (a brief sketch follows the list):

  • Invalid input validation
  • GPU memory monitoring
  • Timeouts for long-running requests
  • Graceful degradation under heavy load
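
As an illustration of the first two points, request validation and a rough GPU-memory check can run before any inference starts; the thresholds and status codes below are assumptions, not the project's exact checks:

import torch

import text2image_pb2

def validate_request(request):
    # Reject empty prompts before touching the GPU
    if not request.text or not request.text.strip():
        return text2image_pb2.ImageResponse(
            status_code=400, message="Prompt text must not be empty"
        )
    return None

def check_gpu_memory(threshold_bytes=5 * 1024**3):
    # Refuse new work if allocated GPU memory already exceeds the (illustrative) threshold
    if torch.cuda.is_available() and torch.cuda.memory_allocated() > threshold_bytes:
        return text2image_pb2.ImageResponse(
            status_code=503, message="GPU memory exhausted; please retry shortly"
        )
    return None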

⚠️ Known Limitations

  • 🧠 Memory intensive: Requires >= 6 GB VRAM for Stable Diffusion 1.5
  • 🔒 No user authentication or rate limiting implemented
  • 🌐 No image caching (each prompt triggers new inference)
  • 📉 Cold start time when loading large models in container (~30s)
  • 🖼️ Limited to single image generation (no batch processing)

🔧 Troubleshooting

Problem: gRPC connection refused
Cause: Server not running or port conflict
Fix: Verify server is running and port 50051 is available

Problem: module 'torch' has no attribute 'compiler'
Cause: Version mismatch between PyTorch and transformers
Fix: Use compatible versions:

pip install torch==2.1.0 transformers==4.36.2 diffusers==0.25.0

Problem: CUDA out of memory error
Cause: GPU memory exhaustion from concurrent requests
Fix: Reduce concurrent request limit in server.py or increase available VRAM

🔍 Future Improvements

  • Add user authentication and API keys
  • Implement request rate limiting
  • Add image caching for duplicate prompts
  • Support batch processing of multiple prompts
  • Add more Stable Diffusion model options
  • Implement model quantization for reduced memory usage
