Add QDQ scale propagation pass #713

Status: Open, 7 commits to be merged into base branch ovep-develop

javier-intel

Description

Adds a pass that propagates QDQ scale values whose magnitude exceeds a certain threshold, in order to avoid numerical overflows.

Motivation and Context

Improves precision on certain networks.
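
To make the overflow concern in the description concrete, here is a minimal sketch. It is not the code from this PR. It assumes the overflow in question is a dequantized value leaving the finite range of FP16 (maximum 65504), which the description does not state explicitly. The names `ScaleExceedsThreshold`, `Dequantize`, and `OverflowsFp16` are hypothetical, and the threshold is a free parameter because the PR does not give its value here.

```cpp
#include <cmath>
#include <cstdint>

// Largest finite value representable in IEEE 754 half precision (FP16).
constexpr float kFp16Max = 65504.0f;

// Hypothetical check: is the magnitude of a scale above the given threshold?
// The actual threshold used by the pass is not stated in the PR description.
inline bool ScaleExceedsThreshold(float scale, float threshold) {
  return std::fabs(scale) > threshold;
}

// ONNX DequantizeLinear semantics: y = (x - zero_point) * scale.
inline float Dequantize(std::int32_t x, std::int32_t zero_point, float scale) {
  return static_cast<float>(x - zero_point) * scale;
}

// Assumption: "overflow" means the dequantized value exceeds the FP16 range.
inline bool OverflowsFp16(std::int32_t x, std::int32_t zero_point, float scale) {
  return std::fabs(Dequantize(x, zero_point, scale)) > kFp16Max;
}
```

For example, an int8 value of 127 with zero point 0 and scale 1000 dequantizes to 127000, which is outside the FP16 range; with scale 1 it stays well inside. A pass of the kind described would presumably target the first case.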

@javier-intel javier-intel force-pushed the jemartin/scale_propagation branch from 3d0ca12 to 4cb9374 Compare June 17, 2025 15:59
} else if (session_context_.device_type.find("GPU") != std::string::npos) {
  // Create a copy of the model
  std::unique_ptr<onnxruntime::Model> model;
  Status status = qdq_scales_fix::Transform(subgraph, logger, model);

Does this pass run even for non-quantized models?
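
One way to address this question would be to gate the transform on the presence of QDQ operators. The sketch below is illustrative only and is not taken from the PR: `HasQdqNodes` is a hypothetical helper, and it works on a plain list of operator type names rather than on the real graph API. The operator names `QuantizeLinear` and `DequantizeLinear` are the standard ONNX ones.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns true if any operator type in the list is a
// QDQ operator, i.e. the model contains quantize/dequantize nodes.
inline bool HasQdqNodes(const std::vector<std::string>& op_types) {
  return std::any_of(op_types.begin(), op_types.end(),
                     [](const std::string& op) {
                       return op == "QuantizeLinear" ||
                              op == "DequantizeLinear";
                     });
}
```

A caller could skip the model copy and the transform entirely when this returns false, so non-quantized models would not pay for the pass.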

@javier-intel javier-intel requested a review from MayureshV1 June 24, 2025 16:37
ericcraw and others added 3 commits June 25, 2025 07:36
* Use infer instead of start async/wait

* Introduce OvExceptionBoundary for exception handling

* unbound infer request pool

* Fix dynamically sized i/o

* Rename onnx->ort + remove unused parameter shape functions

* fix linux build issue + review dog comments

* more linux build fixes + copilot feedback

* disable ReduceSum_noop_axes_input_initializer_opset_18

* review feedback + last minute touch ups

* slightly more scalable llm handling

* Simplify dynamic shape checks

* add missing staged changes

* Remove references to IO_BUFFER_ENABLED

* Minor tweaks to InferRequestPool

* remove unused mem_info

* Move ParameterShape and ParameterInfo out of ov_interface

---------

Co-authored-by: MayureshV1 <47039074+MayureshV1@users.noreply.github.com>
* feat: Enable EpContext OVIR Encapsulation

* fix: refactor EpCtx OVIR parsing logic to use ep.context_file_path

* fix: Fix logic for parsing model_file_path

* fix: enable EPCtx OVIR encapsulation compiled blob caching

* fix: fix merge conflicts

* fix: fix bugs

4 participants