Releases: BerriAI/litellm
v1.58.2
What's Changed
- Fix RPM/TPM limit typo in admin UI by @yujonglee in #7769
- Add AIM Guardrails support by @krrishdholakia in #7771
- Support temporary budget increases on keys by @krrishdholakia in #7754
- Litellm dev 01 13 2025 p2 by @krrishdholakia in #7758
- docs - iam role based access for bedrock by @ishaan-jaff in #7774
- (Feat) prometheus - emit remaining team budget metric on proxy startup by @ishaan-jaff in #7777
- (fix) `BaseAWSLLM` - cache IAM role credentials when used by @ishaan-jaff in #7775
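For the Bedrock IAM-role docs (#7774) and the `BaseAWSLLM` credential-caching fix (#7775), here is a minimal SDK sketch of role-based Bedrock access. It assumes the documented `aws_role_name` / `aws_session_name` parameters; the role ARN is hypothetical, and repeated calls are where the cached IAM credentials matter.

```python
import litellm

# Minimal sketch of IAM-role-based Bedrock access (assumes the documented
# aws_role_name / aws_session_name params; the role ARN below is hypothetical).
response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Hello from an assumed role"}],
    aws_region_name="us-east-1",
    aws_role_name="arn:aws:iam::123456789012:role/litellm-bedrock-role",  # hypothetical ARN
    aws_session_name="litellm-session",
)
print(response.choices[0].message.content)

# With v1.58.2 the assumed-role credentials should be cached by BaseAWSLLM,
# so a second call avoids an extra STS round trip.
```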
Full Changelog: v1.58.1...v1.58.2
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.2
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
Aggregated | Passed ✅ | 250.0 | 289.8090936126223 | 6.143711740946042 | 0.0 | 1838 | 0 | 228.12097899998207 | 2196.5017750000015 |
v1.58.1
🚨 Alpha - 1.58.0 has various perf improvements; we recommend waiting for a stable release before bumping it in production
What's Changed
- (core sdk fix) - fix fallbacks stuck in infinite loop by @ishaan-jaff in #7751
- [Bug fix]: v1.58.0 - issue with read request body by @ishaan-jaff in #7753
- (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map by @ishaan-jaff in #7750
- (prometheus - minor bug fix) - `litellm_llm_api_time_to_first_token_metric` not populating for bedrock models by @ishaan-jaff in #7740
- (fix) health check - allow setting `health_check_model` by @ishaan-jaff in #7752
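A minimal sketch of exercising the proxy health check touched by #7752. It assumes a proxy running locally on port 4000 with a placeholder master key; `health_check_model` itself is set per-deployment in the proxy config, and the response field names below are an assumption based on the health-check docs.

```python
import requests

# Hit the proxy's /health endpoint (assumes a local proxy on :4000 and a
# placeholder master key). With #7752, a `health_check_model` can be set in
# the proxy config to control which model the health check actually calls.
resp = requests.get(
    "http://localhost:4000/health",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder master key
    timeout=30,
)
resp.raise_for_status()
report = resp.json()
# Field names are assumed from the health-check docs; adjust if they differ.
print(report.get("healthy_count"), "healthy /", report.get("unhealthy_count"), "unhealthy")
```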
Full Changelog: v1.58.0...v1.58.1
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.1
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
Aggregated | Passed ✅ | 250.0 | 294.2978673554448 | 6.045420383532543 | 0.0 | 1809 | 0 | 223.72276400000146 | 3539.4181890000027 |
v1.58.0
v1.58.0 - Alpha Release
🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues
What's Changed
- (proxy perf) - service logger don't always import OTEL in helper function by @ishaan-jaff in #7727
- (proxy perf) - only read request body 1 time per request by @ishaan-jaff in #7728
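The read-body-once change (#7728) is an application of a common ASGI pattern: parse the body a single time, then stash it on `request.state` for any later hooks. A generic sketch of that pattern (not LiteLLM's actual implementation), assuming FastAPI/Starlette:

```python
import json

from fastapi import FastAPI, Request

app = FastAPI()


async def get_parsed_body(request: Request) -> dict:
    """Parse the JSON body once per request and cache it on request.state."""
    if not hasattr(request.state, "parsed_body"):
        raw = await request.body()
        request.state.parsed_body = json.loads(raw) if raw else {}
    return request.state.parsed_body


@app.post("/chat/completions")
async def chat_completions(request: Request):
    body = await get_parsed_body(request)         # first call parses
    body_again = await get_parsed_body(request)   # later hooks reuse the cache
    assert body is body_again
    return {"model": body.get("model")}
```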
Full Changelog: v1.57.11...v1.58.0
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.58.0
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
Aggregated | Passed ✅ | 240.0 | 273.2166563012582 | 6.118315985413586 | 0.0033451700302972037 | 1829 | 1 | 75.1692759999969 | 3821.228761000043 |
v1.57.11
v1.57.11 - Alpha Release
🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues
What's Changed
- (litellm SDK perf improvement) - use `verbose_logger.debug` and `_cached_get_model_info_helper` in `_response_cost_calculator` by @ishaan-jaff in #7720
- (litellm sdk speedup) - use `_model_contains_known_llm_provider` in `response_cost_calculator` to check if the model contains a known litellm provider by @ishaan-jaff in #7721
- (proxy perf) - only parse request body 1 time per request by @ishaan-jaff in #7722
- Revert "(proxy perf) - only parse request body 1 time per request" by @ishaan-jaff in #7724
- add azure o1 pricing by @krrishdholakia in #7715
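With Azure o1 pricing added to the cost map (#7715), per-token pricing can be inspected from the SDK. A minimal sketch using `litellm.get_model_info` and `litellm.completion_cost`; the exact `azure/o1` entry name is an assumption based on this release note.

```python
import litellm

# Look up pricing from litellm's model cost map (assumes the "azure/o1" entry
# added in this release; substitute whatever model name your deployment uses).
info = litellm.get_model_info("azure/o1")
print("input $/token:", info["input_cost_per_token"])
print("output $/token:", info["output_cost_per_token"])

# completion_cost() applies the same map to an actual response object:
# cost = litellm.completion_cost(completion_response=response)
```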
Full Changelog: v1.57.10...v1.57.11
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.11
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
Aggregated | Passed ✅ | 240.0 | 270.55759577820237 | 6.130862160194138 | 0.0 | 1835 | 0 | 224.79750500002638 | 1207.8732939999952 |
v1.57.8-stable
Full Changelog: v1.57.8...v1.57.8-stable
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.57.8-stable
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |
Aggregated | Passed ✅ | 240.0 | 271.08706884006597 | 6.1244865014274685 | 0.0 | 1832 | 0 | 221.9753340000068 | 2009.652516000017 |
v1.57.10
v1.57.10 - Alpha Release
🚨 This is an alpha release - we've made several performance / RPS improvements to litellm core. If you see any issues, please file them at https://github.com/BerriAI/litellm/issues
What's Changed
- Litellm dev 01 10 2025 p2 by @krrishdholakia in #7679
- Litellm dev 01 10 2025 p3 by @krrishdholakia in #7682
- build: new ui build by @krrishdholakia in #7685
- fix(model_hub.tsx): clarify cost in model hub is per 1m tokens by @krrishdholakia in #7687
- Litellm dev 01 11 2025 p3 by @krrishdholakia in #7702
- (perf litellm) - use `_get_model_info_helper` for cost tracking by @ishaan-jaff in #7703
- (perf sdk) - minor changes to cost calculator to run helpers only when necessary by @ishaan-jaff in #7704
- (perf) - proxy, use `orjson` for reading request body by @ishaan-jaff in #7706
- (minor fix - `aiohttp_openai/`) - fix get_custom_llm_provider by @ishaan-jaff in #7705
- (sdk perf fix) - only print args passed to litellm when debugging mode is on by @ishaan-jaff in #7708
- (perf) - only use response_cost_calculator 1 time per request (don't re-use the same helper twice per call) by @ishaan-jaff in #7709
- [BETA] Add OpenAI `/images/variations` + Topaz API support by @krrishdholakia in #7700
- (litellm sdk speedup router) - adds a helper `_cached_get_model_group_info` to use when trying to get deployment tpm/rpm limits by @ishaan-jaff in #7719
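Several of the perf items above (#7703, #7719) follow the same idea: wrap a pure lookup in a process-level cache so repeated calls during routing don't redo the work. A generic sketch of that caching pattern with `functools.lru_cache` (not LiteLLM's actual helper; names and values are placeholders):

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def cached_model_group_limits(model_group: str) -> dict:
    """Hypothetical stand-in for a cached lookup of a deployment's tpm/rpm limits.

    The real router helper is `_cached_get_model_group_info`; the point here is
    only the pattern: the expensive lookup runs once per model group.
    """
    # ... an expensive scan of the router's deployment list would go here ...
    return {"tpm": 1_000_000, "rpm": 10_000}  # placeholder values


print(cached_model_group_limits("gpt-4o"))   # computed on first call
print(cached_model_group_limits("gpt-4o"))   # served from the cache
print(cached_model_group_limits.cache_info())
```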
Full Changelog: v1.57.8...v1.57.10
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.10
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
Aggregated | Passed ✅ | 240.0 | 264.0629029362514 | 6.184926091214754 | 0.0 | 1851 | 0 | 213.62108399998192 | 1622.618584999998 |
v1.57.8
What's Changed
- (proxy latency/perf fix - user_api_key_auth) - use asyncio.create task for caching virtual key once it's validated by @ishaan-jaff in #7676
- (litellm sdk - perf improvement) - optimize `response_cost_calculator` by @ishaan-jaff in #7674
- (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models by @ishaan-jaff in #7672 (see the sketch after this list)
- (litellm sdk - perf improvement) - optimize `pre_call_check` by @ishaan-jaff in #7673
- [integrations/lunary] allow to pass custom parent run id to LLM calls by @hughcrt in #7651
- LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 by @krrishdholakia in #7670
- (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions by @ishaan-jaff in #7680
- (litellm proxy perf) - pass num_workers cli arg to uvicorn when `num_workers` is specified by @ishaan-jaff in #7681
- fix proxy pre call hook - only use `asyncio.create_task` if user opts into alerting by @ishaan-jaff in #7683
- [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes by @ishaan-jaff in #7684
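The O(1) lookup change (#7672) is a standard micro-optimization: membership tests against a `set` (or `frozenset`) are constant-time, while scanning a `list` is linear in the number of providers. A tiny, generic illustration of the difference (the provider names are just examples, not LiteLLM's actual list):

```python
# Linear scan: every miss walks the whole list.
known_providers_list = ["openai", "azure", "bedrock", "anthropic", "vertex_ai"]

# Constant-time membership: hashed lookup, independent of list size.
known_providers_set = frozenset(known_providers_list)


def is_known_provider(model: str) -> bool:
    provider = model.split("/", 1)[0]
    return provider in known_providers_set  # O(1) instead of O(n)


print(is_known_provider("bedrock/anthropic.claude-3-sonnet"))  # True
print(is_known_provider("my_custom_provider/some-model"))      # False
```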
Full Changelog: v1.57.7...v1.57.8
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.8
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
Aggregated | Passed ✅ | 210.0 | 225.29799695056985 | 6.153370698253471 | 0.0 | 1841 | 0 | 177.73327700001573 | 2088.13791099999 |
v1.57.7
What's Changed
- (minor latency fixes / proxy) - use verbose_proxy_logger.debug() instead of litellm.print_verbose by @ishaan-jaff in #7664
- feat(ui_sso.py): Allows users to use test key pane, and have team budget limits be enforced for their use-case by @krrishdholakia in #7666
- fix(main.py): fix lm_studio/ embedding routing by @krrishdholakia in #7658 (see the sketch after this list)
- fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… by @krrishdholakia in #7660
- Use environment variable for Athina logging URL by @vivek-athina in #7628
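For the `lm_studio/` embedding routing fix (#7658), a minimal SDK sketch. The model name and `api_base` are placeholders for a locally running LM Studio server; swap in whatever embedding model you have loaded.

```python
import litellm

# Route an embedding request through the lm_studio/ provider prefix.
# Model name and api_base are placeholders for a local LM Studio server.
response = litellm.embedding(
    model="lm_studio/nomic-embed-text",      # hypothetical local model
    input=["LiteLLM routes this to LM Studio"],
    api_base="http://localhost:1234/v1",     # LM Studio's default local port
)
print(len(response.data[0]["embedding"]))    # embedding dimension
```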
Full Changelog: v1.57.5...v1.57.7
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.7
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 218.4749677188173 | 6.216185012755876 | 0.0 | 1860 | 0 | 177.92223199990076 | 3911.6109139999935 |
Aggregated | Passed ✅ | 200.0 | 218.4749677188173 | 6.216185012755876 | 0.0 | 1860 | 0 | 177.92223199990076 | 3911.6109139999935 |
v1.57.5
🚨🚨 Known issue - do not upgrade - Windows compatibility issue on this release
Relevant issue: #7677
What's Changed
- LiteLLM Minor Fixes & Improvements (01/08/2025) - p2 by @krrishdholakia in #7643
- Litellm dev 01 08 2025 p1 by @krrishdholakia in #7640
- (proxy - RPS) - Get 2K RPS at 4 instances, minor fix for caching_handler by @ishaan-jaff in #7655
- (proxy - RPS) - Get 2K RPS at 4 instances, minor fix `aiohttp_openai/` by @ishaan-jaff in #7659
- (proxy perf improvement) - use `uvloop` for higher RPS (10%-20% higher RPS) by @ishaan-jaff in #7662
- (Feat - Batches API) add support for retrieving vertex api batch jobs by @ishaan-jaff in #7661
- (proxy-latency fixes) use asyncio tasks for logging db metrics by @ishaan-jaff in #7663
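The DB-metrics change (#7663), like the key-caching change in v1.57.8, relies on scheduling slow logging work with `asyncio.create_task` so the request handler returns without awaiting it. A generic sketch of that fire-and-forget pattern (not LiteLLM's actual logger):

```python
import asyncio

background_tasks: set[asyncio.Task] = set()


async def log_db_metrics(payload: dict) -> None:
    """Stand-in for a slow DB/metrics write that shouldn't block the response."""
    await asyncio.sleep(0.2)  # simulate I/O
    print("logged:", payload)


async def handle_request() -> dict:
    # Fire-and-forget: schedule the write and return immediately. Keep a
    # reference so the task isn't garbage-collected mid-flight.
    task = asyncio.create_task(log_db_metrics({"latency_ms": 42}))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"status": "ok"}


async def main() -> None:
    print(await handle_request())            # returns before the metrics write
    await asyncio.gather(*background_tasks)  # let pending logging finish


asyncio.run(main())
```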
Full Changelog: v1.57.4...v1.57.5
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.5
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 282.70225500655766 | 6.115771768544881 | 0.0 | 1830 | 0 | 206.44150200001832 | 3375.4479410000044 |
Aggregated | Passed ✅ | 230.0 | 282.70225500655766 | 6.115771768544881 | 0.0 | 1830 | 0 | 206.44150200001832 | 3375.4479410000044 |
v1.57.4
What's Changed
- fix(utils.py): fix select tokenizer for custom tokenizer by @krrishdholakia in #7599
- LiteLLM Minor Fixes & Improvements (01/07/2025) - p3 by @krrishdholakia in #7635
- (feat) - allow building litellm proxy from pip package by @ishaan-jaff in #7633 (see the sketch after this list)
- Litellm dev 01 07 2025 p2 by @krrishdholakia in #7622
- Allow assigning teams to org on UI + OpenAI `omni-moderation` cost model tracking by @krrishdholakia in #7566
- (fix) proxy auth - allow using Azure JS SDK routes as llm_api_routes by @ishaan-jaff in #7631
- (helm) - bug fix - allow using `migrationJob.enabled` variable within job by @ishaan-jaff in #7639
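For the pip-package proxy item (#7633), a minimal sketch of starting the proxy from a plain pip install rather than the Docker image. The `pip install 'litellm[proxy]'` step and the `--model` / `--port` flags follow LiteLLM's standard proxy quick start; the model choice is just an example.

```python
import subprocess

# Prerequisite (shell): pip install 'litellm[proxy]'
# The proxy can then be launched from the CLI; this mirrors the Docker
# examples above but runs straight from the pip package.
subprocess.run(
    ["litellm", "--model", "gpt-4o", "--port", "4000"],
    check=True,  # blocks until the proxy process exits
)
```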
Full Changelog: v1.57.3...v1.57.4
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.57.4
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 218.7550845980808 | 6.268875045928877 | 0.0 | 1876 | 0 | 170.9488330000113 | 1424.4913769999812 |
Aggregated | Passed ✅ | 200.0 | 218.7550845980808 | 6.268875045928877 | 0.0 | 1876 | 0 | 170.9488330000113 | 1424.4913769999812 |