Releases: huggingface/optimum
v1.24.0: SD3 & Flux, DinoV2, Modernbert, GPTQModel, Transformers v4.48...
Release Notes: Optimum v1.24.0
We’re excited to announce the release of Optimum v1.24.0. This update expands ONNX-based model capabilities and includes several improvements, bug fixes, and new contributions from the community.
🚀 New Features & Enhancements
- `ORTQuantizer` now supports models with ONNX subfolders.
- ONNX Runtime IO Binding support for all supported Transformers models (no models left behind).
- SD3 and Flux support added to `ORTDiffusionPipeline`, enabling the latest diffusion-based models.
- Transformers v4.47 and v4.48 compatibility, ensuring seamless integration with the latest advancements in Hugging Face's ecosystem.
- ONNX export support extended to various models, including Decision Transformer, ModernBERT, Megatron-BERT, Dinov2, OLMo, and many more (see details).
🔧 Key Fixes & Optimizations
- Dropped support for Python 3.8
- Bug fixes in `ModelPatcher`, SDXL refiner export, and device checks for improved reliability.
👥 New Contributors
A huge thank you to our first-time contributors: your contributions make Optimum better! 🎉
For a detailed list of all changes, please check out the full changelog.
🚀 Happy optimizing!
What's Changed
- Onnx granite by @gabe-l-hart in #2043
- Drop python 3.8 by @echarlaix in #2086
- Update Dockerfile base image by @echarlaix in #2089
- add transformers 4.36 tests by @echarlaix in #2085
- [`fix`] Allow ORTQuantizer over models with subfolder ONNX files by @tomaarsen in #2094
- SD3 and Flux support by @IlyasMoutawwakil in #2073
- Remove datasets as required dependency by @echarlaix in #2087
- Add ONNX Support for Decision Transformer Model by @ra9hur in #2038
- Generate guidance for flux by @IlyasMoutawwakil in #2104
- Unbundle inputs generated by `DummyTimestepInputGenerator` by @JingyaHuang in #2107
- Pass the revision to SentenceTransformer models by @bndos in #2105
- Rembert onnx support by @mlynatom in #2108
- Fix bug where `ModelPatcher` returns empty outputs by @LoSealL in #2109
- Fix workflow to mark issues as stale by @echarlaix in #2110
- Remove doc-build by @echarlaix in #2111
- Downgrade stale bot to v8 and fix permissions by @echarlaix in #2112
- Update documentation color from google tpu section by @echarlaix in #2113
- Fix workflow to mark PRs as stale by @echarlaix in #2116
- Enable transformers v4.47 support by @echarlaix in #2119
- Add ONNX export support for MGP-STR by @xenova in #2099
- Add ONNX export support for OLMo and OLMo2 by @xenova in #2121
- Pass on `model_kwargs` when exporting a SentenceTransformers model by @sjrl in #2126
- Add ONNX export support for DinoV2, Hiera, Maskformer, PVT, SigLIP, SwinV2, VitMAE, and VitMSN models by @xenova in #2001
- move check_dummy_inputs_allowed to common export utils by @eaidova in #2114
- Remove CI macos runners by @echarlaix in #2129
- Enable GPTQModel by @jiqing-feng in #2064
- Skip private model loading for external contributors by @echarlaix in #2130
- fix sdxl refiner export by @eaidova in #2133
- Export to ExecuTorch: Initial Integration by @guangy10 in #2090
- Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM by @LRL-ModelCloud in #2146
- Update docker files by @echarlaix in #2102
- Limit diffusers version by @IlyasMoutawwakil in #2150
- Add ONNX export support for ModernBERT by @xenova in #2131
- Allow GPTQModel to auto select Marlin or faster kernels for inference only ops by @LRL-ModelCloud in #2138
- fix device check by @jiqing-feng in #2136
- Replace check_if_xxx_greater with is_xxx_version by @echarlaix in #2152
- Add tf available and version by @echarlaix in #2154
- Add ONNX export support for `PatchTST` by @xenova in #2101
- Fix inferring the task from model_name when the model is from sentence-transformers by @eaidova in #2151
- Unpin diffusers and pass onnx exporters tests by @IlyasMoutawwakil in #2153
- Uncomment modernbert config by @IlyasMoutawwakil in #2155
- Skip optimum-benchmark when loading namespace modules by @IlyasMoutawwakil in #2159
- Fix PR doc upload by @regisss in #2161
- Move executorch to optimum-executorch by @echarlaix in #2165
- Adding Onnx Support For Megatron-Bert by @pragyandev in #2169
- Transformers 4.48 by @IlyasMoutawwakil in #2158
- Update ort CIs (slow, gpu, train) by @IlyasMoutawwakil in #2024
v1.23.3: Patch release
- Add sentence-transformers and timm documentation example by @echarlaix in #2072
- Create token type ids when not provided by @echarlaix in #2081
- Add transformers v4.46 support by @echarlaix in #2078
v1.23.2: Patch release
- Fix compatibility with diffusers < 0.25.0 by @echarlaix in #2063
- Update the habana extra by @regisss in #2077
Full Changelog: v1.23.1...v1.23.2
v1.23.1: Patch release
- Fix doc build by @regisss in #2050
- Don't hardcode the logger level to INFO; let users set TRANSFORMERS_VERBOSITY by @tomaarsen in #2047
- Add workflow to mark issues as stale by @regisss in #2051
- Fix onnx export when transformers >= v4.45 (impacting sentence-transformers and timm models) by @echarlaix in #2053 and #2054
v1.23.0: ORTDiffusionPipeline, transformers v4.45
ONNX Runtime Diffusion pipeline
Adding `ORTDiffusionPipeline` to simplify diffusers model loading, by @IlyasMoutawwakil in #1960 and #2021
```diff
  from optimum.onnxruntime import ORTDiffusionPipeline

  model_id = "runwayml/stable-diffusion-v1-5"
- pipeline = ORTStableDiffusionPipeline.from_pretrained(model_id, revision="onnx")
+ pipeline = ORTDiffusionPipeline.from_pretrained(model_id, revision="onnx")
  image = pipeline("sailing ship in storm by Leonardo da Vinci").images[0]
```
Transformers v4.45
Transformers v4.45 support by @echarlaix in #2023 and #2045
Subfolder
Remove the restriction for the model's config to be in the model's subfolder by @echarlaix in #2044
New Contributors
- @tcsavage made their first contribution in #1965
- @yuanwu2017 made their first contribution in #2003
- @h3110Fr13nd made their first contribution in #2031
- @glegendre01 made their first contribution in #2033
- @rbrugaro made their first contribution in #2027
Full Changelog: v1.22.0...v1.23.0
v1.22.0: transformers 4.44 compatibility, bugfixes
What's Changed
- Fix sentence transformers modeling patching for export by @echarlaix in #1936
- Update optimum intel extra by @echarlaix in #1935
- Update Habana extra by @regisss in #1937
- Remove inplace op in mistral patcher by @IlyasMoutawwakil in #1938
- Fix forward bug in ORTModelForFeatureExtraction by @moria97 in #1941
- Deprecate ORTModel class by @IlyasMoutawwakil in #1939
- Remove warning by @echarlaix in #1945
- Clip vision model onnx export by @fxmarty in #1920
- Add export test for swin with shifted windows by @echarlaix in #1942
- Refactor diffusers tasks by @IlyasMoutawwakil in #1947
- Fix optimizer's command line reading by @idruker-cerence in #1961
- Fix unmask_unattended_patched signature by @fxmarty in #1963
- Fix undefined variable in library name inference by @IlyasMoutawwakil in #1964
- Fix gpt bigcode ONNX export for transformers<4.39.0 by @echarlaix in #1973
- Support transformers 4.43 by @IlyasMoutawwakil in #1971
- chore(ci): migrate runner configuration in GitHub workflows by @XciD in #1978
- Fix typos in quantization.mdx by @aldakata in #1989
- Update Habana extra in setup.py by @regisss in #1991
- Follow up the diffusers task refactoring by @JingyaHuang in #1999
- Transformers 4.44 support by @IlyasMoutawwakil in #1996
- Modify token classification processor default dataset args by @echarlaix in #2005
- Fix TFLite tests by @IlyasMoutawwakil in #2007
- Fix attribute name from `inputs_names` to `input_names` by @J4BEZ in #2010
- Fix typo in BetterTransformer's overview docs by @ftnext in #2015
- Apply deprecated `evaluation_strategy` by @muellerzr in #1819
- Update transformers imports for `deepspeed` and `is_torch_xla_available` by @Rohan138 in #2012
- Add quanto install and instructions by @dacorvo in #1976
New Contributors
- @moria97 made their first contribution in #1941
- @XciD made their first contribution in #1978
- @zhenglongjiepheonix made their first contribution in #1933
- @aldakata made their first contribution in #1989
- @J4BEZ made their first contribution in #2010
- @ftnext made their first contribution in #2015
- @muellerzr made their first contribution in #1819
- @Rohan138 made their first contribution in #2012
Full Changelog: v1.21.4...v1.22.0
v1.21.4: Patch release
Full Changelog: v1.21.3...v1.21.4
v1.21.3: Patch release
- Deprecate ORTModel class by @IlyasMoutawwakil in #1939
- Remove warning by @echarlaix in #1945
- Fix optimizer's command line reading by @idruker-cerence in #1961
- Fix unmask_unattended_patched signature by @fxmarty in #1963
- Fix gpt bigcode ONNX export for transformers<4.39.0 by @echarlaix in #1973
- Support transformers 4.43 by @IlyasMoutawwakil in #1971
Full Changelog: v1.21.2...v1.21.3
v1.21.2: Patch release
- Remove inplace op in mistral patcher by @IlyasMoutawwakil in #1938
- Fix ORTModelForFeatureExtraction modeling by @moria97 in #1941
Full Changelog: v1.21.1...v1.21.2
v1.21.1: Patch release
- Fix sentence transformers model patching by @echarlaix in #1936
- Update Intel extra by @echarlaix in #1935
- Update Habana extra by @regisss in #1937
Full Changelog: v1.21.0...v1.21.1