Description
What happened?
TL;DR: The LiteLLM proxy is incorrectly mapping the requested OpenAI audio transcription models openai-gpt-4o-mini-transcribe and openai-gpt-4o-transcribe to the older openai/whisper-1 model, preventing use of the specified GPT-4o models even though they are valid options both in my LiteLLM config and according to the OpenAI documentation.
I am using the LiteLLM proxy with the following config.yaml:

```yaml
model_list:
  # ... other models ...
  - model_name: openai-whisper-1
    litellm_params:
      model: openai/whisper-1
      api_key: "..."
    model_info:
      mode: audio_transcription
  - model_name: openai-gpt-4o-mini-transcribe
    litellm_params:
      model: openai/gpt-4o-mini-transcribe
      api_key: "..."
    model_info:
      mode: audio_transcription
  - model_name: openai-gpt-4o-transcribe
    litellm_params:
      model: openai/gpt-4o-transcribe
      api_key: "..."
    model_info:
      mode: audio_transcription
  # ... other models ...
```
According to the OpenAI documentation, gpt-4o-mini-transcribe and gpt-4o-transcribe are valid model identifiers for its audio transcription endpoint (/v1/audio/transcriptions).
However, when making an audio transcription request to the LiteLLM proxy using the model name openai-gpt-4o-mini-transcribe or openai-gpt-4o-transcribe, the LiteLLM logs indicate that the request is being routed to the openai/whisper-1 model instead.
LiteLLM Log Snippets Demonstrating the Issue:
Initial request log showing the deployment information for openai-gpt-4o-mini-transcribe:

```json
{
  "model_name": "openai-gpt-4o-mini-transcribe",
  "litellm_params": {
    "use_in_pass_through": false,
    "use_litellm_proxy": false,
    "merge_reasoning_content_in_choices": false,
    "model": "openai/gpt-4o-mini-transcribe"
  },
  "model_info": {
    "id": "bc1c1175a643fd93d6011a841b1950d6b9fd71b91b648b6f39046a83d5b16f7d",
    "db_model": false,
    "mode": "audio_transcription",
    "key": "gpt-4o-mini-transcribe",
    "max_tokens": null,
    "max_input_tokens": 16000,
    "max_output_tokens": 2000,
    "input_cost_per_token": 0.00000125,
    "cache_creation_input_token_cost": null,
    "cache_read_input_token_cost": null,
    "input_cost_per_character": null,
    "input_cost_per_token_above_128k_tokens": null,
    "input_cost_per_token_above_200k_tokens": null,
    "input_cost_per_query": null,
    "input_cost_per_second": null,
    "input_cost_per_audio_token": 0.000003,
    "input_cost_per_token_batches": null,
    "output_cost_per_token_batches": null,
    "output_cost_per_token": 0.000005,
    "output_cost_per_audio_token": null,
    "output_cost_per_character": null,
    "output_cost_per_reasoning_token": null,
    "output_cost_per_token_above_128k_tokens": null,
    "output_cost_per_character_above_128k_tokens": null,
    "output_cost_per_token_above_200k_tokens": null,
    "output_cost_per_second": null,
    "output_cost_per_image": null,
    "output_vector_size": null,
    "litellm_provider": "openai",
    "supports_system_messages": null,
    "supports_response_schema": null,
    "supports_vision": false,
    "supports_function_calling": false,
    "supports_tool_choice": false,
    "supports_assistant_prefill": false,
    "supports_prompt_caching": false,
    "supports_audio_input": false,
    "supports_audio_output": false,
    "supports_pdf_input": false,
    "supports_embedding_image_input": false,
    "supports_native_streaming": null,
    "supports_web_search": false,
    "supports_reasoning": false,
    "supports_computer_use": false,
    "search_context_cost_per_query": null,
    "tpm": null,
    "rpm": null,
    "supported_openai_params": [
      "frequency_penalty",
      "logit_bias",
      "logprobs",
      "top_logprobs",
      "max_tokens",
      "max_completion_tokens",
      "modalities",
      "prediction",
      "n",
      "presence_penalty",
      "seed",
      "stop",
      "stream",
      "stream_options",
      "temperature",
      "top_p",
      "tools",
      "tool_choice",
      "function_call",
      "functions",
      "max_retries",
      "extra_headers",
      "parallel_tool_calls",
      "audio",
      "web_search_options",
      "response_format"
    ]
  },
  "provider": "openai",
  "input_cost": "1.25",
  "output_cost": "5.00",
  "litellm_model_name": "openai/gpt-4o-mini-transcribe",
  "max_tokens": null,
  "max_input_tokens": 16000,
  "cleanedLitellmParams": {
    "use_in_pass_through": false,
    "use_litellm_proxy": false,
    "merge_reasoning_content_in_choices": false
  }
}
```
Specific audio transcription request log showing the model mapping:
```
12:25:35 - LiteLLM Router:DEBUG: router.py:1975 - Inside _atranscription()- model: openai-gpt-4o-mini-transcribe; kwargs: {...}
...
"model_map_information": {
  "model_map_key": "whisper-1",
  "model_map_value": {
    "key": "whisper-1",
    // ... details about whisper-1 ...
    "litellm_provider": "openai",
    "mode": "audio_transcription",
    // ...
  }
},
...
INFO: 192.168.86.19:37846 - "POST /audio/transcriptions HTTP/1.1" 200 OK
```
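The mismatch is visible by comparing two fields from the logs above: the configured litellm_model_name against the model_map_key that LiteLLM actually resolved. A minimal check, using the values pasted above, makes it explicit:

```python
# Values copied from the log snippets above.
log_entry = {
    "litellm_model_name": "openai/gpt-4o-mini-transcribe",
    "model_map_information": {"model_map_key": "whisper-1"},
}

def mapping_mismatch(entry: dict) -> bool:
    """True when the resolved cost-map key differs from the configured model."""
    configured = entry["litellm_model_name"].split("/", 1)[-1]
    mapped = entry["model_map_information"]["model_map_key"]
    return configured != mapped

print(mapping_mismatch(log_entry))  # → True: gpt-4o-mini-transcribe was mapped to whisper-1
```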
Expected Behavior:
When requesting the model name openai-gpt-4o-mini-transcribe or openai-gpt-4o-transcribe in the LiteLLM proxy for an audio transcription task, the underlying OpenAI model used should be the one specified in the litellm_params.model field of the configuration, as supported by the OpenAI API.
Observed Behavior:
The LiteLLM logs indicate that the request is being mapped to the openai/whisper-1 model, even though openai/gpt-4o-mini-transcribe or openai/gpt-4o-transcribe is specified in the configuration and is a valid OpenAI transcription model.
Impact:
This prevents users from utilizing the newer, potentially higher-quality gpt-4o-mini-transcribe and gpt-4o-transcribe models through the LiteLLM proxy, forcing requests directed to openai-gpt-4o-mini-transcribe or openai-gpt-4o-transcribe onto the older whisper-1 model.
Steps to Reproduce:
1. Configure the LiteLLM proxy with the provided proxy_config.yaml, including the openai-gpt-4o-mini-transcribe and openai-gpt-4o-transcribe model entries.
2. Make an audio transcription request to the LiteLLM proxy, specifying the model name openai-gpt-4o-mini-transcribe.
3. Examine the LiteLLM logs (with verbose logging enabled) to observe the model_map_information and the underlying model used for the OpenAI API call.
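For step 2, a request like the following reproduces the issue. This is a sketch, not my exact client: the proxy URL, virtual key, and audio file name are placeholders to adjust for your deployment.

```python
PROXY_URL = "http://localhost:4000/v1/audio/transcriptions"  # placeholder proxy address
API_KEY = "sk-1234"  # placeholder virtual key

def build_form(model: str) -> dict:
    # Multipart form fields for the OpenAI-compatible /v1/audio/transcriptions route.
    return {"model": model}

def transcribe(path: str, model: str = "openai-gpt-4o-mini-transcribe") -> dict:
    import requests  # third-party HTTP client; any OpenAI-compatible client works too

    with open(path, "rb") as audio:
        resp = requests.post(
            PROXY_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            data=build_form(model),
            files={"file": audio},
        )
    resp.raise_for_status()
    return resp.json()
```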
Environment:
- LiteLLM Version: 1.71.1
Possible Cause:
It appears LiteLLM's internal model mapping logic might be defaulting to whisper-1 for audio transcription tasks, potentially overriding the explicit model setting in the proxy_config.yaml for the newer GPT-4o transcription models.
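One way to narrow this down is to call litellm's transcription API directly, bypassing the proxy Router: if the direct call reaches gpt-4o-mini-transcribe correctly, the faulty mapping likely lives in the Router/proxy path rather than in core litellm. A sketch, assuming OPENAI_API_KEY is set and a local sample.wav exists:

```python
MODEL = "openai/gpt-4o-mini-transcribe"  # the model the proxy config asks for

def direct_transcribe(path: str, model: str = MODEL):
    """Call litellm's transcription API directly, with no proxy Router involved."""
    import litellm  # imported lazily; requires `pip install litellm`

    with open(path, "rb") as audio:
        return litellm.transcription(model=model, file=audio)
```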
Suggestion:
Investigate the internal mapping logic for OpenAI audio transcription models to ensure that gpt-4o-mini-transcribe and gpt-4o-transcribe are correctly recognized and routed when specified in the configuration.
Relevant log output
Are you a ML Ops Team?
No
What LiteLLM version are you on ?
v1.71.1