Distribute and complete onnxruntime tests (decoder models) #2278
Conversation
…or finegrained debugging
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Pull Request Overview
This PR expands and stabilizes ONNX Runtime testing for decoder models by fixing model mappings, adding new models, updating exporter logic, and including decoder tests in CI.
- Updates testing utilities: fixes model name mappings, adds new models (`olmo`, `olmo2`, `phi`), and refactors the setup flow
- Extends `NormalizedConfigManager` and ONNX exporter configs to support new architectures and bumps minimum Transformers versions (see the sketch below)
- Enhances exporter task filtering by Transformers version and updates CI matrix to include `test_decoder.py`
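As a rough, hedged sketch of what supporting a new decoder architecture in `NormalizedConfigManager` involves: the `_conf` attribute name and the `"my-new-decoder"` model type below are illustrative assumptions, while `get_normalized_config_class` and `NormalizedTextConfig` appear elsewhere in this PR.

```python
# Hedged sketch: supporting a new decoder usually means mapping its model_type
# to an existing normalized-config class. `_conf` and "my-new-decoder" are
# illustrative assumptions, not the exact code added in this PR.
from optimum.utils.normalized_config import NormalizedConfigManager, NormalizedTextConfig

# register a (hypothetical) new architecture
NormalizedConfigManager._conf["my-new-decoder"] = NormalizedTextConfig

# at inference time, the ORT modeling code resolves the class from the model type
normalized_config_cls = NormalizedConfigManager.get_normalized_config_class("my-new-decoder")
```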
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
File | Description
---|---
tests/onnxruntime/testing_utils.py | Refactor `_setup`, adjust model IDs, remove unsupported case skips
optimum/utils/normalized_config.py | Add mappings for `olmo`, `olmo2`, `phi`; update `bloom` args
optimum/exporters/tasks.py | Filter supported model types by Transformers version
optimum/exporters/onnx/utils.py | Bump version check threshold for position IDs requirement
optimum/exporters/onnx/model_configs.py | Update `MIN_TRANSFORMERS_VERSION` for `PhiOnnxConfig` and `BloomOnnxConfig`, remove legacy override
optimum/exporters/onnx/base.py | Simplify decoder merge logic using constants
.github/workflows/test_onnxruntime.yml | Add `test_decoder.py` to CI test matrix
Comments suppressed due to low confidence (3)
optimum/utils/normalized_config.py:278
- The mapping for "phi3small" was removed, which will break configuration support for phi3small models. Consider re-adding this entry or verifying that dropping phi3small support is intentional.
"phi3small": NormalizedTextConfigWithGQA,
tests/onnxruntime/testing_utils.py:176
- The previous skipTest logic for unsupported export cases was removed, so tests may now error instead of being skipped. Please reintroduce a mechanism to skip unsupported configurations.
if model_args.get("use_cache", False):
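As a hedged sketch of how such a skip could be reintroduced: the mixin name and the `supports_cache_export` flag below are hypothetical; only `model_args` and `skipTest` come from the original logic.

```python
import unittest


class ORTDecoderTestMixin(unittest.TestCase):  # hypothetical test mixin name
    # hypothetical flag; the real tests would derive this from the exporter config
    supports_cache_export = False

    def _setup(self, model_args: dict):
        # skip instead of letting the export error out, mirroring the removed behaviour
        if model_args.get("use_cache", False) and not self.supports_cache_export:
            self.skipTest("use_cache export is not supported for this architecture")
```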
.github/workflows/test_onnxruntime.yml:31
- The YAML array for `test_file` uses bracket notation with commas on separate lines, which may not be valid. Consider using a standard YAML list (one `- item` per line) to ensure CI parses it correctly.
[
Added qwen3 model support as proof that inference integration for new models should, in most cases, "just work".
Force-pushed from 8c6bcd6 to aacf172 (compare)
Looks great @IlyasMoutawwakil 🔥 thanks!
optimum/exporters/tasks.py
Outdated
and is_transformers_version(
    ">=",
    str(
        TasksManager.get_exporter_config_constructor(
            exporter, task=task, model_type=model_type
        ).func.MIN_TRANSFORMERS_VERSION
    ),
)
I'm thinking it might be easier to raise an error before export when the transformers version is not compatible, and to keep all supported architectures, so that users know the architecture export is supported but that transformers needs to be upgraded. What do you think @IlyasMoutawwakil?
optimum/optimum/exporters/onnx/convert.py
Line 853 in e15053d
f"{config.MIN_TRANSFORMERS_VERSION}, got: {transformers.__version__}" |
in fact I only added this here to be able to use it in test_find_untested_architectures
I can move the version checks there and keep this method as is.
Done! I simply removed the unsupported models (due to their transformers version) using CONFIG_MAPPING_NAMES:
supported_transformers_models = set(CONFIG_MAPPING_NAMES.keys())
supported_export_models = set(TasksManager.get_supported_model_type_for_task(task=self.TASK, exporter="onnx"))
supported_export_models = supported_export_models & supported_transformers_models
untested_models = supported_export_models - tested_models
For raising a version error during export, I think that's already the case.
self.normalized_config = NormalizedConfigManager.get_normalized_config_class(config.model_type)(config)
self.key_value_input_names = [key for key in self.input_names if (".key" in key) or (".value" in key)]
self.key_value_output_names = [key for key in self.output_names if (".key" in key) or (".value" in key)]
self.can_use_cache = len(self.key_value_input_names) > 0 and len(self.key_value_output_names) > 0
Question: is there any case where `len(self.key_value_input_names) > 0` and `len(self.key_value_output_names) == 0`? (Before, we only used `self.key_value_input_names` to determine `self.use_cache`, so wondering.)
In the case of CausalLMs I don't think so; even legacy merged decoders have both.
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
…imum into distribute-tests
@echarlaix all good to merge!
great, thanks a lot @IlyasMoutawwakil
What does this PR do?
Decoder models are probably the most important topic of research right now; transformers itself decided to break many things in its rule book to keep up with the very fast-changing field of LLMs. With this PR it should be easier to maintain and add new models, by reducing the number of special cases we handle, or at least by making the inference logic more straightforward.
This PR:
Before, if you ran all tests in parallel, some would interfere with each other and only pass in sequential mode. Now you can run all 250 tests on a multi-core machine (a DGX, for example) in 2 minutes.
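As a hedged illustration (not taken from the PR itself), distributing the suite across cores could look like the following, assuming the pytest-xdist plugin is installed; the test file path is the one added to the CI matrix.

```python
# Minimal sketch: run the decoder ONNX Runtime tests across all available cores.
# Assumes pytest and pytest-xdist are installed; the path comes from the CI
# matrix entry added in this PR.
import pytest

if __name__ == "__main__":
    pytest.main(["-n", "auto", "tests/onnxruntime/test_decoder.py"])
```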
Before submitting
Who can review?