Skip to content

A high-performance library for sharing GPU memory objects across processes using IPC mechanisms with JSON-RPC 2.0 protocol, enabling model and inference engine separation architecture.

License

Notifications You must be signed in to change notification settings

world-sim-dev/shared-tensor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

10 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Shared Tensor

Python Version PyTorch License

A high-performance library for sharing GPU memory objects across processes using IPC mechanisms with JSON-RPC 2.0 protocol, enabling model and inference engine separation architecture.

πŸš€ Project Overview

Shared Tensor is a cross-process communication library designed specifically for deep learning and AI applications, utilizing IPC mechanisms and JSON-RPC protocol to achieve:

  • Efficient GPU Memory Sharing: Cross-process sharing of PyTorch tensors and models
  • Remote Function Execution: Easy remote function calls through decorators
  • Async/Sync Support: Flexible execution modes for different scenarios
  • Model Serving: Deploy machine learning models as independent services
  • Distributed Inference: Support for distributed computing in multi-GPU environments

πŸ“‹ Core Features

πŸ”„ Cross-Process Communication

  • JSON-RPC 2.0 Protocol: Standardized remote procedure calls
  • HTTP Transport: Reliable HTTP-based communication mechanism
  • Serialization Optimization: Efficient PyTorch object serialization/deserialization

🎯 Function Sharing

  • Decorator Pattern: Easy function sharing using @provider.share
  • Auto Discovery: Smart function path resolution and import
  • Parameter Passing: Support for complex data type parameters

⚑ Async Support

  • Async Execution: AsyncSharedTensorProvider supports non-blocking calls
  • Task Management: Complete async task status tracking
  • Concurrent Processing: Efficient concurrent request handling

πŸ–₯️ GPU Compatibility

  • CUDA Support: Native CUDA tensor sharing support
  • Device Management: Smart data migration between devices
  • Memory Optimization: Efficient GPU memory usage

πŸ› οΈ Installation Guide

Requirements

  • Python: 3.8+
  • Operating System: Linux (recommended)
  • PyTorch: 1.12.0+
  • CUDA: Optional, for GPU support

Installation Methods

Install from Pypi

pip install shared-tensor

Install from Source

# Clone the repository
git clone https://github.com/world-sim-dev/shared-tensor.git
cd shared-tensor

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .

Development Installation

# Install with development dependencies
pip install -e ".[dev]"

# Install with test dependencies
pip install -e ".[test]"

Verify Installation

# Check core functionality
python -c "import shared_tensor; print('βœ“ Shared Tensor installed successfully')"

🎯 Quick Start

1. Basic Function Sharing

from shared_tensor.async_provider import AsyncSharedTensorProvider

# Create provider
provider = AsyncSharedTensorProvider()

# Share simple function
@provider.share()
def add_numbers(a, b):
    return a + b

# Share PyTorch function
@provider.share()
def create_tensor(shape):
    import torch
    return torch.zeros(shape)

# Load PyTorch model
@provider.share()
def load_model():
    ...

2. Start Server

# Method 1: Use command line tool, single server
shared-tensor-server

# Method 2: Use torchrun
torchrun --nproc_per_node=4 --no-python shared-tensor-server

# Method 3: Custom configuration
python shared_tensor/server.py

πŸ“– Detailed Usage

Model Sharing Example

import torch
import torch.nn as nn

from shared_tensor.async_provider import AsyncSharedTensorProvider

# Create provider
provider = AsyncSharedTensorProvider()

# Define model
class SimpleNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Share model creation function
@provider.share(name="create_model")
def create_model(input_size=784, hidden_size=128, output_size=10):
    model = SimpleNet(input_size, hidden_size, output_size)
    return model

# Share inference function
model = create_model()
with torch.no_grad():
    model(input_data)

πŸ”§ Configuration Options

Server Configuration

provider = AsyncSharedTensorProvider(
    server_port: int = 2537 + global_rank,    # Local Http Server Port
    verbose_debug: bool = False,              # Logging more detailed params
    poll_interval: float = 1.0,               # Check status interval only for Async provider
    default_enabled: bool = True              # Whether enable shared-tenser and re-enable via env `export __SHARED_TENSOR_ENABLED__=true`
)

@provider.share(
    name: Optional[str] = None,               # name for logging and debug, when singleton enabled, as default cache key
    wait: bool = True,                        # whether return func return or a async handler
    singleton: bool = True,                   # whether maintain only one instance of func result
    singleton_key_formatter: Optional[str] = None): # python template can be formatted by user function params, act as final cache key
def get_demo_model():
    ...

πŸ§ͺ Testing

Run Test Suite

# Run all tests
python tests/run_tests.py

# Run specific category tests
python tests/run_tests.py --category unit
python tests/run_tests.py --category integration
python tests/run_tests.py --category pytorch

# Run only PyTorch related tests
python tests/run_tests.py --torch-only

# Verbose output
python tests/run_tests.py --verbose

Test Environment Info

# Check test environment
python tests/run_tests.py --env-info

Individual Test Files

# Test tensor serialization
python tests/pytorch_tests/test_tensor_serialization.py

# Test async system
python tests/integration/test_async_system.py

# Test client
python tests/integration/test_client.py

πŸ—οΈ Architecture Design

Core Components

shared-tensor/
β”œβ”€β”€ shared_tensor/              # Core modules
β”‚   β”œβ”€β”€ server.py              # JSON-RPC server
β”‚   β”œβ”€β”€ client.py              # Sync client
β”‚   β”œβ”€β”€ provider.py            # Sync provider
β”‚   β”œβ”€β”€ async_client.py        # Async client
β”‚   β”œβ”€β”€ async_provider.py      # Async provider
β”‚   β”œβ”€β”€ async_task.py          # Async task management
β”‚   β”œβ”€β”€ jsonrpc.py            # JSON-RPC protocol implementation
β”‚   β”œβ”€β”€ utils.py              # Utility functions
β”‚   └── errors.py             # Exception definitions
β”œβ”€β”€ examples/                  # Usage examples
└── tests/                     # Test suite

Communication Flow

sequenceDiagram
    participant CA as Client App
    participant SC as SharedTensorClient
    participant SS as SharedTensorServer
    participant FE as Function Executor
    
    Note over CA, FE: Client-Server Communication Flow
    
    CA->>SC: call_function("model_inference", args)
    SC->>SC: Serialize parameters
    SC->>SS: HTTP POST /jsonrpc<br/>JSON-RPC Request
    
    Note over SS: Server Processing
    SS->>SS: Parse JSON-RPC request
    SS->>SS: Resolve function path
    SS->>FE: Import & execute function
    FE->>FE: Deserialize parameters
    FE->>FE: Execute function logic
    FE->>SS: Return execution result
    
    Note over SS: Response Preparation
    SS->>SS: Serialize result
    SS->>SS: Create JSON-RPC response
    SS->>SC: HTTP Response<br/>JSON-RPC Result
    
    Note over SC: Client Processing
    SC->>SC: Parse response
    SC->>SC: Deserialize result
    SC->>CA: Return final result
    
    Note over CA, FE: End-to-End Process Complete
Loading

Debug Tips

  1. Enable verbose logging:
import logging
logging.basicConfig(level=logging.DEBUG)
  1. Use debug mode:
provider = SharedTensorProvider(verbose_debug=True)
  1. Check function paths:
provider = SharedTensorProvider()
print(provider._registered_functions)

🀝 Contributing

We welcome community contributions! Please follow these steps:

Development Environment Setup

# Clone repository
git clone https://github.com/world-sim-dev/shared-tensor.git
cd shared-tensor

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Package & Publish
python -m pip install build
python -m build --sdist --wheel
python -m twine upload --repository testpypi dist/*
python -m twine upload dist/*

Code Standards

# Code formatting
black shared_tensor/ tests/ examples/

# Import sorting
isort shared_tensor/ tests/ examples/

# Static checking
flake8 shared_tensor/
mypy shared_tensor/

Submission Process

  1. Fork the project and create a feature branch
  2. Write code and tests
  3. Run the complete test suite
  4. Submit a Pull Request

Test Requirements

  • New features must include tests
  • Maintain test coverage > 90%
  • All tests must pass

πŸ“„ License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details

πŸ™ Acknowledgments

πŸ“ž Contact Us


Shared Tensor - Making GPU memory sharing simple and efficient πŸš€

About

A high-performance library for sharing GPU memory objects across processes using IPC mechanisms with JSON-RPC 2.0 protocol, enabling model and inference engine separation architecture.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages