Qwen API - OpenAI-Compatible API Proxy


A production-ready, OpenAI-compatible API proxy for Qwen models with intelligent model routing, automatic tool injection, and comprehensive validation.


🌟 Key Features

✨ Intelligent Model Routing

  • Smart Aliases: Use friendly names like Qwen_Research, Qwen_Think, Qwen_Code
  • Auto-Tool Injection: Web search automatically added server-side
  • Default Fallback: Unknown models → qwen3-max-latest with web search
  • Backward Compatible: Direct Qwen model names work unchanged

🔒 Security & Validation

  • OpenAPI Validation: All requests validated against official OpenAI spec
  • Anonymous Mode: No real API key required (any value is accepted)
  • Bearer Token Caching: Automated authentication with Playwright
  • Request/Response Sanitization: Full validation middleware
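To illustrate the kind of checks a validation layer performs, here is a simplified, hypothetical stand-in (not the actual middleware, which validates against the full OpenAI spec); field names follow the OpenAI chat format:

```python
def validate_chat_request(body: dict) -> list[str]:
    """Return a list of validation errors (empty list = request is valid)."""
    errors = []
    if not isinstance(body.get("model"), str) or not body.get("model"):
        errors.append("'model' must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append("'messages' must be a non-empty array")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                errors.append(f"messages[{i}] must be an object")
                continue
            if msg.get("role") not in {"system", "user", "assistant", "tool"}:
                errors.append(f"messages[{i}].role is invalid")
            if not isinstance(msg.get("content"), str):
                errors.append(f"messages[{i}].content must be a string")
    return errors
```

A valid request yields an empty error list; malformed requests are rejected before ever reaching the upstream provider.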

🚀 Performance

  • Async/Await: Non-blocking I/O for high concurrency
  • Streaming Support: Full SSE streaming for real-time responses
  • Request Tracking: Built-in monitoring and analytics
  • Health Checks: /health and /v1/models endpoints
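Streaming responses arrive as standard Server-Sent Events `data:` lines. If you ever need to consume the stream without the SDK, the parsing loop looks roughly like this (illustrative helper; the chunk shape follows the OpenAI streaming format):

```python
import json

def iter_sse_content(lines):
    """Yield content deltas from raw 'data: {...}' SSE lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```

In practice the OpenAI SDK does this for you (see the streaming example below); the helper only shows what the wire format contains.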

📋 Model Aliases

1. "Qwen" (Default Fallback)

  • Alias for: qwen3-max-latest
  • Auto-Tools: web_search (always applied)
  • Max Tokens: Provider default
  • Use Case: General purpose, unknown model names
  • Example:
    # These all route to qwen3-max-latest + web_search:
    model="gpt-4"
    model="claude-3-opus"
    model="random-model-name"

2. "Qwen_Research" Alias

  • Routes to: qwen-deep-research
  • Auto-Tools: NONE (clean research mode)
  • Max Tokens: Provider default
  • Use Case: Deep research without tool interference
  • Example:
    client = OpenAI(api_key="sk-any", base_url="http://localhost:8096/v1")
    response = client.chat.completions.create(
        model="Qwen_Research",  # Case-insensitive
        messages=[{"role": "user", "content": "Research quantum computing"}]
    )

3. "Qwen_Think" Alias

  • Routes to: qwen3-235b-a22b-2507
  • Auto-Tools: web_search (always applied)
  • Max Tokens: 81,920 (extended context)
  • Use Case: Complex reasoning with web access
  • Example:
    response = client.chat.completions.create(
        model="Qwen_Think",
        messages=[{"role": "user", "content": "Solve this complex problem..."}]
    )
    # Server automatically adds web_search tool + 81920 token limit

4. "Qwen_Code" Alias

  • Routes to: qwen3-coder-plus
  • Auto-Tools: web_search (always applied)
  • Max Tokens: Provider default
  • Use Case: Code generation with web documentation access
  • Example:
    response = client.chat.completions.create(
        model="Qwen_Code",
        messages=[{"role": "user", "content": "Write a Python REST API"}]
    )
    # Web search helps with latest library documentation

🔧 Direct Qwen Models (No Aliasing)

These models pass through without transformation:

# Backward compatibility - work as expected:
model="qwen2.5-max"
model="qwen2.5-turbo"
model="qwen-deep-research"
model="qwen-max-latest"
model="qwen3-max-latest"
model="qwen3-235b-a22b-2507"
model="qwen3-coder-plus"
model="qwen-math-plus"
model="qwen-math-turbo"
model="qwen-coder-turbo"
model="qwen-vl-max"
model="qwen-vl-plus"

🚀 Quick Start

Option 1: One-Line Deployment (Recommended)

# Set your Qwen credentials
export QWEN_EMAIL="your-email@example.com"
export QWEN_PASSWORD="your-password"

# Deploy everything (setup + auth + server + tests)
curl -sSL https://raw.githubusercontent.com/Zeeeepa/qwen-api/main/deploy_qwen_api.sh | bash

Option 2: Manual Deployment

# Clone repository
git clone https://github.com/Zeeeepa/qwen-api.git
cd qwen-api

# Set credentials
export QWEN_EMAIL="your-email@example.com"
export QWEN_PASSWORD="your-password"

# Run deployment script
bash scripts/all.sh

Option 3: Step-by-Step

# 1. Setup environment
bash scripts/setup.sh

# 2. Extract authentication token
python3 scripts/extract_bearer_token.py

# 3. Start server
bash scripts/start.sh

# 4. Test API (optional)
bash scripts/send_request.sh

📦 Installation Details

System Requirements

  • Python: 3.11 or higher
  • OS: Linux, macOS, Windows (WSL2)
  • Memory: 512MB minimum, 2GB recommended
  • Disk: 500MB for dependencies + browsers

Dependencies

# Core dependencies (auto-installed by setup script)
pip install -r requirements.txt

# Key packages:
# - fastapi + granian (async web server)
# - playwright (browser automation)
# - httpx (HTTP client)
# - pydantic (data validation)

Manual Setup

# 1. Create virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 2. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# 3. Install Playwright browsers
playwright install --with-deps chromium

# 4. Create .env file
cat > .env << EOF
LISTEN_PORT=8096
ANONYMOUS_MODE=true
EOF

# 5. Create directories
mkdir -p logs cache

πŸ” Authentication

Automated (Recommended)

The deployment script automatically:

  1. Launches headless Chromium browser
  2. Logs into Qwen with your credentials
  3. Extracts Bearer token from network traffic
  4. Caches token to .qwen_bearer_token
  5. Reuses cached token until expiration
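The cache-or-refresh step above can be pictured as a simple helper. This is an illustrative sketch only: `fetch_fresh_token` stands in for the Playwright extraction, the 6-hour TTL is an assumed value (the real expiry is encoded in the token), and a temp path is used here so the sketch is safe to run anywhere (the real script caches to `.qwen_bearer_token` in the repo root):

```python
import os
import tempfile
import time

TOKEN_FILE = os.path.join(tempfile.mkdtemp(), "qwen_bearer_token")
TOKEN_TTL_SECONDS = 6 * 3600  # assumed lifetime, for illustration only

def get_bearer_token(fetch_fresh_token):
    """Return the cached token if fresh enough, else refresh and cache it."""
    if os.path.exists(TOKEN_FILE):
        age = time.time() - os.path.getmtime(TOKEN_FILE)
        if age < TOKEN_TTL_SECONDS:
            with open(TOKEN_FILE) as f:
                return f.read().strip()
    token = fetch_fresh_token()  # e.g. the Playwright login + capture step
    with open(TOKEN_FILE, "w") as f:
        f.write(token)
    return token
```

The second and later calls return the cached value without launching a browser, which is what makes repeated server restarts cheap.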

Manual Token Extraction

# Run Playwright authentication
python3 scripts/extract_bearer_token.py

# Token saved to: .qwen_bearer_token
# Format: Bearer eyJ...

Anonymous Mode

No credentials needed! The server works in anonymous mode:

# Any API key works:
client = OpenAI(api_key="sk-anything", base_url="http://localhost:8096/v1")
client = OpenAI(api_key="fake-key-123", base_url="http://localhost:8096/v1")
client = OpenAI(api_key="", base_url="http://localhost:8096/v1")

🎯 Usage Examples

Python (OpenAI SDK)

from openai import OpenAI

# Initialize client
client = OpenAI(
    api_key="sk-any",  # Any key works!
    base_url="http://localhost:8096/v1"
)

# Example 1: Unknown model β†’ Default fallback
response = client.chat.completions.create(
    model="gpt-4",  # Routes to qwen3-max-latest + web_search
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# Example 2: Research mode (no tools)
response = client.chat.completions.create(
    model="Qwen_Research",  # qwen-deep-research, no tools
    messages=[{"role": "user", "content": "Research topic..."}]
)

# Example 3: Thinking mode (extended context)
response = client.chat.completions.create(
    model="Qwen_Think",  # qwen3-235b-a22b-2507 + web_search + 81920 tokens
    messages=[{"role": "user", "content": "Complex reasoning..."}]
)

# Example 4: Code generation
response = client.chat.completions.create(
    model="Qwen_Code",  # qwen3-coder-plus + web_search
    messages=[{"role": "user", "content": "Write FastAPI endpoint"}]
)

# Example 5: Streaming
stream = client.chat.completions.create(
    model="Qwen_Think",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

cURL

# Test with any model name
curl -X POST http://localhost:8096/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-any" \
  -d '{
    "model": "Qwen_Think",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Streaming
curl -X POST http://localhost:8096/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-any" \
  -d '{
    "model": "Qwen_Code",
    "messages": [{"role": "user", "content": "Write Python code"}],
    "stream": true
  }'

Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-any',
  baseURL: 'http://localhost:8096/v1'
});

// Use any model alias
const response = await client.chat.completions.create({
  model: 'Qwen_Think',
  messages: [{ role: 'user', content: 'Hello!' }]
});

console.log(response.choices[0].message.content);

🧪 Testing

Run Comprehensive Test Suite

# Test all routing scenarios + tool integration
python3 test_all_routing_scenarios.py

Expected Output

🚀 COMPREHENSIVE ROUTING & TOOL INTEGRATION TESTS
================================================================================

Testing against: http://localhost:8096/v1
Total scenarios: 5 routing + 2 web search tests

✅ SCENARIO 1: Default Fallback (gpt-5 → qwen3-max-latest + web_search)
✅ SCENARIO 2: Qwen_Research (→ qwen-deep-research, no tools)
✅ SCENARIO 3: Qwen_Think (→ qwen3-235b-a22b-2507 + web_search + 81920 tokens)
✅ SCENARIO 4: Qwen_Code (→ qwen3-coder-plus + web_search)
✅ SCENARIO 5: Direct Model (qwen2.5-max → qwen2.5-max, no changes)
✅ WEB SEARCH TEST 1: gpt-4 with web search
✅ WEB SEARCH TEST 2: Qwen_Think with web search

📊 TEST SUMMARY
Total Tests: 7
Passed: 7
Failed: 0
Pass Rate: 100.0%

🎉 ALL TESTS PASSED! 🎉

Manual Testing

# Health check
curl http://localhost:8096/health

# List models
curl http://localhost:8096/v1/models

# Simple request
bash scripts/send_request.sh

πŸ“ Project Structure

qwen-api/
├── app/
│   ├── core/
│   │   └── openai.py          # OpenAI endpoints (/chat/completions)
│   ├── middleware/
│   │   └── openapi_validator.py  # Request/response validation
│   ├── model_router.py        # ⭐ Intelligent routing + tool injection
│   ├── providers/
│   │   ├── base.py
│   │   ├── provider_factory.py
│   │   └── qwen_simple_proxy.py
│   └── utils/
│       ├── logger.py
│       └── request_tracker.py
├── scripts/
│   ├── setup.sh               # Environment setup
│   ├── extract_bearer_token.py  # Playwright authentication
│   ├── start.sh               # Start server
│   ├── deploy.sh              # All-in-one deployment
│   └── send_request.sh        # Test script
├── start.py                   # Server entry point (replaces main.py)
├── test_all_routing_scenarios.py  # Comprehensive test suite
├── requirements.txt
├── qwen.json                  # OpenAPI spec for validation
└── README.md                  # This file

🔧 Configuration

Environment Variables

# .env file configuration
LISTEN_PORT=8096              # Server port
ANONYMOUS_MODE=true           # Allow any API key
LOG_LEVEL=INFO                # DEBUG, INFO, WARNING, ERROR

# Optional: Runtime settings
QWEN_EMAIL=your-email@example.com
QWEN_PASSWORD=your-password

Server Settings

Edit start.py to customize:

# Port binding
port = int(os.getenv("LISTEN_PORT", "8096"))

# Log level
log_level = os.getenv("LOG_LEVEL", "INFO").upper()

# Worker configuration (Granian)
workers = 1  # Increase for production
threads = 1  # HTTP/1.1 threads

Model Router Configuration

Edit app/model_router.py to customize aliases:

MODEL_CONFIGS = {
    "qwen_research": {
        "actual_model": "qwen-deep-research",
        "tools": [],  # No tools
        "max_tokens": None,
    },
    "qwen_think": {
        "actual_model": "qwen3-235b-a22b-2507",
        "tools": ["web_search"],
        "max_tokens": 81920,  # Extended context
    },
    # Add your custom aliases here...
}
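Putting the rules together, the resolution logic described above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the actual `app/model_router.py` code; in particular, detecting direct models by the `qwen` name prefix is an assumption:

```python
# Routing rules: known aliases map to configs, names starting with "qwen"
# pass through untouched, and everything else falls back to the default.
DEFAULT_CONFIG = {"actual_model": "qwen3-max-latest",
                  "tools": ["web_search"], "max_tokens": None}

ALIAS_CONFIGS = {
    "qwen": DEFAULT_CONFIG,
    "qwen_research": {"actual_model": "qwen-deep-research",
                      "tools": [], "max_tokens": None},
    "qwen_think": {"actual_model": "qwen3-235b-a22b-2507",
                   "tools": ["web_search"], "max_tokens": 81920},
    "qwen_code": {"actual_model": "qwen3-coder-plus",
                  "tools": ["web_search"], "max_tokens": None},
}

def resolve_model(requested: str) -> dict:
    key = requested.lower()
    if key in ALIAS_CONFIGS:            # alias match is case-insensitive
        return ALIAS_CONFIGS[key]
    if key.startswith("qwen"):          # direct Qwen model: no transformation
        return {"actual_model": requested, "tools": [], "max_tokens": None}
    return DEFAULT_CONFIG               # unknown model -> default fallback
```

So `resolve_model("gpt-4")` falls back to qwen3-max-latest with web_search, while `resolve_model("qwen2.5-max")` passes through unchanged.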

🚨 Troubleshooting

Server won't start

# Check port availability
lsof -i :8096

# Check logs
tail -f logs/server.log

# Verify Python version
python3 --version  # Should be 3.11+

# Reinstall dependencies
pip install -r requirements.txt --force-reinstall

Authentication issues

# Re-extract token
rm .qwen_bearer_token
python3 scripts/extract_bearer_token.py

# Check token validity
cat .qwen_bearer_token

Connection errors in tests

# Ensure server is running
curl http://localhost:8096/health

# Check server logs for errors
tail -20 logs/server.log

# Restart server
pkill -f "python3 start.py"
bash scripts/start.sh

Tool injection not working

# Check model router logs
grep "Auto-injecting tools" logs/server.log

# Verify model alias resolution
grep "Model transformation" logs/server.log

# Expected output:
# πŸ“ Model transformation: gpt-4 β†’ qwen3-max-latest
# πŸ› οΈ Auto-injecting tools for gpt-4: ['web_search']

📊 API Endpoints

Chat Completions

POST /v1/chat/completions

Request:

{
  "model": "Qwen_Think",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "max_tokens": 1000,
  "temperature": 0.7
}

Response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "qwen3-235b-a22b-2507",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }]
}
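Pulling the assistant text out of a non-streaming response body like the one above needs nothing beyond the standard library:

```python
import json

# Sample response body, matching the shape shown above.
raw = """{"id": "chatcmpl-123", "object": "chat.completion",
"model": "qwen3-235b-a22b-2507",
"choices": [{"index": 0,
"message": {"role": "assistant", "content": "Hello! How can I help you today?"},
"finish_reason": "stop"}]}"""

data = json.loads(raw)
text = data["choices"][0]["message"]["content"]
print(text)  # Hello! How can I help you today?
```

Note that `model` in the response names the model that actually served the request (here qwen3-235b-a22b-2507), not the alias you sent.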

List Models

GET /v1/models

Response:

{
  "object": "list",
  "data": [
    {
      "id": "qwen3-max-latest",
      "object": "model",
      "created": 1234567890,
      "owned_by": "qwen"
    },
    {
      "id": "qwen-deep-research",
      "object": "model",
      "created": 1234567890,
      "owned_by": "qwen"
    }
  ]
}

Health Check

GET /health

Response:

{
  "status": "ok",
  "service": "qwen-ai2api-server",
  "version": "0.2.0"
}

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing)
  5. Open Pull Request

📄 License

MIT License - see LICENSE file for details.


πŸ™ Acknowledgments

  • Qwen Team - For the amazing language models
  • OpenAI - For the API specification
  • FastAPI - For the excellent web framework
  • Playwright - For browser automation

πŸ“ Changelog

v0.2.0 (Current)

  • ✨ Added intelligent model routing
  • ✨ Implemented 4 model aliases (Qwen, Qwen_Research, Qwen_Think, Qwen_Code)
  • ✨ Auto-tool injection (web_search)
  • ✨ OpenAPI validation middleware
  • ✨ Comprehensive test suite
  • πŸ› Fixed streaming response handling
  • πŸ“š Complete documentation

v0.1.0

  • 🎉 Initial release
  • ✅ OpenAI-compatible endpoints
  • ✅ Bearer token authentication
  • ✅ Basic request/response handling

Made with ❤️ for the AI community
