Welcome to the official documentation for Winbay's AI Inference API. We are a globally distributed team providing high-performance, secure, and cost-effective inference for premier open-source models.
Website: https://winbay.io
Contact: info@winbay.io
Winbay's team spans the United States, Europe, and Singapore. We focus on delivering an unparalleled user experience built on our core advantages: aggressive pricing, exceptional speed, unlimited concurrency, robust security, and strategic server locations.
We build our service around the features that matter most to our users.
- Aggressive Pricing: We offer highly competitive pricing and generous volume discounts (30-50%) to ensure you receive the best possible value.
- Exceptional Speed: Our stack, optimized with enterprise-grade GPUs, delivers extremely low latency and high throughput for demanding applications.
- Unlimited Concurrency: We do not impose default rate limits. Our infrastructure is built to handle high concurrency, allowing your services to scale without restriction.
- Ironclad Security: We enforce a strict zero-retention policy. No prompt or completion data is ever stored, ensuring maximum privacy and security.
- Strategic Server Locations: With servers located in the United States and Singapore, we guarantee optimal performance and low latency for users across the North American and Asia-Pacific (APAC) regions.
Our API is fully compatible with the OpenAI standard.
We provide optimized inference for a wide range of the latest high-performance models.
| Model Family | Models |
|---|---|
| Anthropic | claude-sonnet-4-20250514, claude-sonnet-4-20250514-thinking |
| Mistral AI | codestral-latest, ministral-3b-latest, ministral-8b-latest, mistral-large-latest, mistral-small, mistral-small-2501, mistral-small-2503, mistral-small-3.1-24b, mistral-small-latest, mistral-tiny-latest, open-mistral-nemo, open-mixtral-8x7b, pixtral-12b-latest, pixtral-large-latest, magistral-medium-latest, magistral-small-latest |
| DeepSeek | deepseek-r1, deepseek-r1-0528, deepseek-r1-search, deepseek-v3, deepseek-v3-0324, deepseek-v3-search |
| Google | gemini-2.0-flash, gemini-2.0-flash-exp-image-generation, gemini-2.0-flash-lite, gemini-2.0-flash-preview-image-generation, gemini-2.5-flash, gemini-2.5-flash-lite-preview-06-17, gemini-2.5-flash-preview-04-17, gemini-2.5-flash-preview-04-17-thinking, gemini-2.5-flash-preview-05-20, gemini-2.5-pro, gemini-2.5-pro-preview-05-06, gemini-2.5-pro-preview-06-05, imagen-3.0-generate-002, gemma-3-27b-it |
| OpenAI | gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o-mini-search-preview, o3 |
| xAI | grok-3-mini |
| Meta | meta/llama-4-maverick-17b-128e-instruct |
| Qwen | qwen3-235b-a22b, qwen3-30b-a3b, qwen3-32b |
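Because the API follows the OpenAI chat-completions format, a request can be sketched with nothing but the standard library. The base URL below (`https://api.winbay.io/v1`) is an assumption for illustration; substitute the endpoint and API key from your dashboard, and any model name from the table above.

```python
import json

# Assumed base URL -- replace with the endpoint from your Winbay dashboard.
BASE_URL = "https://api.winbay.io/v1"

def build_chat_request(api_key: str, model: str, user_message: str):
    """Build an OpenAI-style chat-completions request: (url, headers, JSON body)."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # standard OpenAI-style bearer auth
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return url, headers, body

# Example: prepare a request against one of the listed models.
url, headers, body = build_chat_request("YOUR_API_KEY", "deepseek-v3", "Hello!")
```

Because the request shape is the OpenAI standard, existing OpenAI-compatible client libraries should also work by pointing their base URL at the Winbay endpoint.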