Winbay AI API Service

Welcome to the official documentation for Winbay's AI Inference API. We are a globally distributed team providing high-performance, secure, and cost-effective inference for premier open-source models.

Website: https://winbay.io
Contact: info@winbay.io


About Us

Winbay is a globally distributed team with members in the United States, Europe, and Singapore, dedicated to providing premier AI inference services. We are focused on delivering an unparalleled user experience built upon our core advantages: aggressive pricing, exceptional speed, unrestricted concurrency, robust security, and strategic server locations.

Our Advantages

We build our service around the features that matter most to our users.

  • Aggressive Pricing: We offer highly competitive pricing and generous volume discounts (30-50%) to ensure you receive the best possible value.
  • Exceptional Speed: Our stack, optimized with enterprise-grade GPUs, delivers extremely low latency and high throughput for demanding applications.
  • Unlimited Concurrency: We do not impose default rate limits. Our infrastructure is built to handle high concurrency, allowing your services to scale without restriction.
  • Ironclad Security: We enforce a strict zero-retention policy. No prompt or completion data is ever stored, ensuring maximum privacy and security.
  • Strategic Server Locations: With servers located in the United States and Singapore, we guarantee optimal performance and low latency for users across the North American and Asia-Pacific (APAC) regions.
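Because no default rate limits are imposed, clients can fan requests out in parallel. The sketch below shows the shape of such a fan-out with a thread pool; `fake_inference_call` is a placeholder stand-in for a real HTTP request to the API, not part of any actual SDK.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_inference_call(prompt):
    # Placeholder: in practice, replace this with an HTTP POST to the
    # chat completions endpoint (e.g. via the `requests` library).
    return f"response to: {prompt}"

# Issue 32 requests concurrently; with no provider-side rate limits,
# the pool size is bounded only by client resources.
prompts = [f"question {i}" for i in range(32)]
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(fake_inference_call, prompts))
```

`ThreadPoolExecutor.map` preserves input order, so `results[i]` corresponds to `prompts[i]` even though the calls run in parallel.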

API Endpoints

Our API is fully compatible with the OpenAI standard, so existing OpenAI client libraries and tooling work unchanged once pointed at our endpoint.
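As a minimal sketch of what OpenAI compatibility means in practice, the snippet below builds the URL, headers, and JSON body for a standard `/chat/completions` call. The base URL shown is an assumption for illustration; check your Winbay account for the actual endpoint and API key.

```python
import json

# Assumed base URL for illustration only -- confirm the real endpoint
# with Winbay before use.
BASE_URL = "https://api.winbay.io/v1"

def build_chat_request(model, messages, api_key):
    """Assemble an OpenAI-compatible chat completion request:
    target URL, auth headers, and serialized JSON body."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

url, headers, body = build_chat_request(
    "deepseek-v3",
    [{"role": "user", "content": "Hello"}],
    api_key="YOUR_API_KEY",
)
```

Any HTTP client (or the official OpenAI SDK with a custom `base_url`) can then send this request; the model IDs from the table below go in the `model` field.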

Supported Models

We provide optimized inference for a wide range of the latest high-performance models.

  • Anthropic: claude-sonnet-4-20250514, claude-sonnet-4-20250514-thinking
  • Mistral AI: codestral-latest, ministral-3b-latest, ministral-8b-latest, mistral-large-latest, mistral-small, mistral-small-2501, mistral-small-2503, mistral-small-3.1-24b, mistral-small-latest, mistral-tiny-latest, open-mistral-nemo, open-mixtral-8x7b, pixtral-12b-latest, pixtral-large-latest, magistral-medium-latest, magistral-small-latest
  • DeepSeek: deepseek-r1, deepseek-r1-0528, deepseek-r1-search, deepseek-v3, deepseek-v3-0324, deepseek-v3-search
  • Google: gemini-2.0-flash, gemini-2.0-flash-exp-image-generation, gemini-2.0-flash-lite, gemini-2.0-flash-preview-image-generation, gemini-2.5-flash, gemini-2.5-flash-lite-preview-06-17, gemini-2.5-flash-preview-04-17, gemini-2.5-flash-preview-04-17-thinking, gemini-2.5-flash-preview-05-20, gemini-2.5-pro, gemini-2.5-pro-preview-05-06, gemini-2.5-pro-preview-06-05, imagen-3.0-generate-002, gemma-3-27b-it
  • OpenAI: gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-4o-mini-search-preview, o3
  • xAI: grok-3-mini
  • Meta: meta/llama-4-maverick-17b-128e-instruct
  • Qwen: qwen3-235b-a22b, qwen3-30b-a3b, qwen3-32b
