feat(minimax): add MiniMax provider with tier-aware rate limiting #84
Open

Societus wants to merge 5 commits into repowise-dev:main
Conversation
- Add litellm to interactive provider selection menu
- Support LITELLM_BASE_URL for local proxy deployments (no API key required)
- Auto-add openai/ prefix when using api_base for proper LiteLLM routing
- Add dummy API key for local proxies (OpenAI SDK requirement)
- Add validation and tests for litellm provider configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… false positives Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add first-class support for Z.AI with OpenAI-compatible API.

- New ZAIProvider with thinking disabled by default for GLM-5 family
- Plan selection: 'coding' (subscription) or 'general' (pay-as-you-go)
- Environment variables: ZAI_API_KEY, ZAI_PLAN, ZAI_BASE_URL, ZAI_THINKING
- Rate limit defaults and auto-detection in CLI helpers

Closes repowise-dev#68
Add RATE_LIMIT_TIERS class attribute and resolve_rate_limiter() static method to BaseProvider. Any provider with subscription tiers can define RATE_LIMIT_TIERS and pass tier + tiers to resolve_rate_limiter() to get automatic tier-aware rate limiter creation.

Precedence: tier > explicit rate_limiter > None. Tier matching is case-insensitive. Invalid tiers raise ValueError.

This is a provider-agnostic foundation -- no provider-specific code. Providers adopt it by defining RATE_LIMIT_TIERS and calling resolve_rate_limiter() in their constructor.

Ref: repowise-dev#68
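The repository's internal types aren't visible in this page capture, but the contract described in the commit message (tier > explicit rate_limiter > None, case-insensitive matching, ValueError on unknown tiers) could be sketched as follows. The `RateLimiter` dataclass and constructor signature here are assumptions for illustration; only the names `RATE_LIMIT_TIERS` and `resolve_rate_limiter()` and the tier numbers come from the PR.

```python
from dataclasses import dataclass


@dataclass
class RateLimiter:
    """Hypothetical stand-in: requests and tokens per minute."""
    rpm: int
    tpm: int


class BaseProvider:
    # Providers with subscription tiers override this mapping.
    RATE_LIMIT_TIERS: dict = {}

    @staticmethod
    def resolve_rate_limiter(tier, rate_limiter, tiers):
        """Resolve with precedence: tier > explicit rate_limiter > None.

        Tier matching is case-insensitive; unknown tiers raise ValueError.
        """
        if tier is not None:
            key = tier.lower()
            if key not in tiers:
                raise ValueError(
                    f"Unknown tier {tier!r}; expected one of {sorted(tiers)}"
                )
            return tiers[key]
        return rate_limiter  # may be None


class MiniMaxProvider(BaseProvider):
    # Tier numbers taken from the PR description (Starter/Plus/Max/Ultra).
    RATE_LIMIT_TIERS = {
        "starter": RateLimiter(rpm=5, tpm=25_000),
        "plus": RateLimiter(rpm=15, tpm=75_000),
        "max": RateLimiter(rpm=50, tpm=250_000),
        "ultra": RateLimiter(rpm=100, tpm=500_000),
    }

    def __init__(self, tier=None, rate_limiter=None):
        # Adoption pattern from the commit message: define RATE_LIMIT_TIERS
        # and call resolve_rate_limiter() in the constructor.
        self.rate_limiter = self.resolve_rate_limiter(
            tier, rate_limiter, self.RATE_LIMIT_TIERS
        )
```

Note how the precedence rule lets a caller pass both a tier and an explicit limiter: the tier wins, which keeps tier-configured behavior deterministic.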
Add MiniMax as a built-in provider using the generic tier framework (repowise-dev#82). MiniMax is an OpenAI-compatible API provider with the M2.x model family (M2.7, M2.5, M2.1, M2) and published token plan rate tiers.

Changes:
- New MiniMaxProvider with RATE_LIMIT_TIERS (starter/plus/max/ultra) derived from published 5-hour rolling window limits
- Uses resolve_rate_limiter() from BaseProvider for tier resolution
- reasoning_split=True by default to separate thinking from content
- Bumped retry budget: 5 retries / 30s max for load-shedding tolerance
- Registered in provider registry with openai package dependency hint
- Conservative PROVIDER_DEFAULTS (Starter-tier: 5 RPM / 25K TPM)
- CLI env vars: MINIMAX_API_KEY, MINIMAX_BASE_URL, MINIMAX_REASONING_SPLIT, MINIMAX_TIER
- 30 unit tests (constructor, tiers, generate, stream_chat, registry)

Rate limit tiers (from https://platform.minimax.io/docs/token-plan/intro):
- Starter: 1,500 req/5hrs -> 5 RPM / 25K TPM
- Plus: 4,500 req/5hrs -> 15 RPM / 75K TPM
- Max: 15,000 req/5hrs -> 50 RPM / 250K TPM
- Ultra: 30,000 req/5hrs -> 100 RPM / 500K TPM

Highspeed variants (e.g., MiniMax-M2.7-highspeed) share the same rate limits as their base plan -- the difference is faster inference, not quota.

This provider is structurally identical to Z.AI (repowise-dev#83) and was trivial to implement because both use the generic tier framework. The framework eliminated all per-provider boilerplate for tier resolution.

Depends on: repowise-dev#82 (generic tier framework)
Ref: repowise-dev#68
This was referenced Apr 14, 2026
Summary
Add MiniMax as a built-in LLM provider using the generic tier framework from #82.
This PR is a straightforward application of the same pattern as #83. Both MiniMax and Z.AI are OpenAI-compatible APIs with subscription tiers and built-in reasoning models. The generic tier framework made this provider almost mechanical to implement -- the only provider-specific code is the model names, the `reasoning_split` parameter (vs Z.AI's `thinking` toggle), and the tier definitions.

Depends on: #82 (generic tier framework -- merge that first)
Why This Was Straightforward
MiniMax shares the same architectural profile as Z.AI:

- OpenAI-compatible API at https://api.minimax.io/v1, accessed via the `openai` SDK
- Subscription tiers with published rate limits
- Built-in reasoning models

The generic framework from #82 eliminated all boilerplate for tier resolution. Adding MiniMax was just: define `RATE_LIMIT_TIERS`, set the base URL, and pick the reasoning parameter name. Everything else is inherited.

Changes
New: MiniMax Provider (`minimax.py`)

- `RATE_LIMIT_TIERS` with Starter/Plus/Max/Ultra configs from published limits
- `resolve_rate_limiter()` from BaseProvider (zero custom tier code)
- `reasoning_split=True` by default (separates thinking from content)

Registry (`registry.py`)

- `minimax` -> `MiniMaxProvider` with `openai` package hint

Rate Limiter (`rate_limiter.py`)

- `PROVIDER_DEFAULTS["minimax"]` = Starter-tier conservative (5 RPM / 25K TPM)

CLI Helpers (`helpers.py`)

- `MINIMAX_API_KEY`, `MINIMAX_BASE_URL`, `MINIMAX_REASONING_SPLIT`, `MINIMAX_TIER` env vars
- `MINIMAX_API_KEY`

Tests (`test_minimax_provider.py`)

Rate Limit Tiers
From published MiniMax docs (5-hour rolling window):

| Tier    | Requests / 5 hrs | Derived RPM | Derived TPM |
|---------|------------------|-------------|-------------|
| Starter | 1,500            | 5           | 25K         |
| Plus    | 4,500            | 15          | 75K         |
| Max     | 15,000           | 50          | 250K        |
| Ultra   | 30,000           | 100         | 500K        |

Highspeed variants (e.g., MiniMax-M2.7-highspeed) share the same rate limits as their base plan. The difference is model selection (faster inference), not quota.
Ref: https://platform.minimax.io/docs/token-plan/intro
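The derived RPM figures are just the published 5-hour request budgets averaged down to a per-minute rate (e.g., 1,500 requests / 300 minutes = 5 RPM). A quick sanity check of the arithmetic, using only the numbers from the PR description:

```python
# Convert MiniMax's published 5-hour rolling-window request budgets
# into the conservative per-minute rates used by the tier defaults.
WINDOW_MINUTES = 5 * 60  # 300 minutes

PUBLISHED_5H_LIMITS = {
    "starter": 1_500,
    "plus": 4_500,
    "max": 15_000,
    "ultra": 30_000,
}

derived_rpm = {
    tier: requests // WINDOW_MINUTES
    for tier, requests in PUBLISHED_5H_LIMITS.items()
}

print(derived_rpm)  # {'starter': 5, 'plus': 15, 'max': 50, 'ultra': 100}
```

Averaging a rolling-window budget to a flat per-minute rate is deliberately conservative: a client that never exceeds 5 RPM can never exceed 1,500 requests in any 5-hour window.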
Configuration
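The configuration example was not preserved in this page capture. Based on the environment variables named in the PR description, a minimal setup might look like the following; the values shown are placeholders, and the defaults noted in comments are taken from the PR text:

```shell
# Required (variable names from the PR description; value is a placeholder)
export MINIMAX_API_KEY="your-api-key-here"

# Optional overrides
export MINIMAX_BASE_URL="https://api.minimax.io/v1"  # OpenAI-compatible endpoint per PR
export MINIMAX_TIER="starter"                        # starter | plus | max | ultra
export MINIMAX_REASONING_SPLIT="true"                # separate thinking from content
```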
Test Plan
uv run pytest tests/unit/test_providers/test_minimax_provider.py -v  # 30 passed

All 121 provider tests pass with zero regressions.
PR Stack
Related