[AUTO] Filter device when compile_model with file path #27019

Open

yangwang201911 wants to merge 24 commits into master
Conversation

yangwang201911 (Contributor) commented Oct 12, 2024

Details:

  • Enable AUTO to filter devices when compile_model is called with a model file path (see the sketch after this list). Previously, device filtering applied only to compile_model called with an ov::Model object.
  • Add an inference test case for loading a stateful model by file path on AUTO.
  • Disable runtime fallback by default.
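
A minimal usage sketch of the behaviour the first item targets (the file name "stateful_model.xml" is illustrative, not from the PR): compiling by file path on the AUTO device, where AUTO now performs the same device filtering it already does for ov::Model inputs, such as restricting a stateful model to a single device.

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // With this change, AUTO also filters candidate devices when given a file
    // path, instead of only doing so for the ov::Model overload.
    ov::CompiledModel compiled = core.compile_model("stateful_model.xml", "AUTO");
    ov::InferRequest request = compiled.create_infer_request();
    return 0;
}
```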

Tickets:

@yangwang201911 yangwang201911 requested a review from a team as a code owner October 12, 2024 06:14
@github-actions github-actions bot added the category: AUTO OpenVINO AUTO device selection plugin label Oct 12, 2024
@yangwang201911 yangwang201911 requested a review from a team as a code owner October 12, 2024 07:10
@github-actions github-actions bot added the category: inference OpenVINO Runtime library - Inference label Oct 12, 2024
The review thread below refers to this hunk in the AUTO plugin:

```cpp
    // A model object is available: filter the candidate devices by model properties.
    support_devices = filter_device_by_model(support_devices_by_property, cloned_model, load_config);
} else {
    // Only a file path was provided: store it; no model-based filtering happens here.
    auto_s_context->m_model_path = model_path;
}
```
Contributor commented:
@wangleis this will degrade model compile latency.
Do you see any better solutions? Not sure whether it is possible to get the stateful info from the cache or the model file?

Contributor Author (yangwang201911) commented:

@wangleis all of the OV HW plugins, including CPU, GPU and NPU, expose only one compile_model API, which accepts a model object rather than a model path as the input parameter. The compile latency should not change when AUTO calls read_model() before compiling the model on a HW plugin. Moreover, Core implements a virtual compile_model API that returns a model object created by calling read_model(), which means a HW plugin can override this API.
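
To illustrate the point, a hedged sketch of what forwarding a file path through read_model() looks like (the helper compile_from_path is hypothetical, not the actual AUTO plugin code):

```cpp
#include <memory>
#include <string>
#include <openvino/openvino.hpp>

// Hypothetical helper: materialize an ov::Model from the path, then reuse the
// model-object compile path that every HW plugin already supports.
ov::CompiledModel compile_from_path(ov::Core& core,
                                    const std::string& model_path,
                                    const std::string& device) {
    // read_model() parses the model file into an ov::Model object.
    std::shared_ptr<ov::Model> model = core.read_model(model_path);
    // CPU, GPU and NPU plugins accept the model object, so forward it directly.
    return core.compile_model(model, device);
}
```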

Contributor Author (yangwang201911) commented:

After detailed performance checks during the model compile phase, memory utilization and latency show no obvious change when AUTO passes the loaded model (from read_model()) to the HW plugin via Core instead of the model path. @wangleis @songbell

Contributor commented:

I don't think so. If core.compile_model("test.xml", GPU) with cache enabled can work, why can't AUTO benefit as well?
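
For reference, a hedged sketch of the cache-enabled scenario described here (the cache directory name is illustrative): on a cache hit the plugin imports the cached blob instead of re-parsing the IR.

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    core.set_property(ov::cache_dir("model_cache"));  // enable the model cache
    // On the second run this hits the cache blob, so the plugin does not need
    // to re-read test.xml.
    ov::CompiledModel compiled = core.compile_model("test.xml", "GPU");
    return 0;
}
```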

Contributor Author (yangwang201911) commented Oct 17, 2024:

Currently, compile_model has only been tested with cache disabled, and no degradation occurred for AUTO in that case. Will check the performance change for compiling from a model path with cache enabled soon. @songbell @wangleis

Contributor Author (yangwang201911) commented:

No performance gap was observed across 12 models of different scales when compiling with cache enabled.

peterchen-intel (Contributor) commented:

Need to check if disabling compile_model with model_path API in AUTO is acceptable.

@ilya-lavrenov ilya-lavrenov added this to the 2024.5 milestone Oct 14, 2024
@ilya-lavrenov ilya-lavrenov self-requested a review October 14, 2024 08:44
@peterchen-intel peterchen-intel changed the title [AUTO] Uses single device when it is stateful model [AUTO] Filter device when compile_model with file path Oct 18, 2024
@github-actions github-actions bot removed the category: inference OpenVINO Runtime library - Inference label Oct 18, 2024
@github-actions github-actions bot added the category: inference OpenVINO Runtime library - Inference label Oct 22, 2024
…cache_dir[PR#24726].

2. Enable the model type filter logic with cache enabled for AUTO.
3. Add a test case with cache enabled.
@github-actions github-actions bot removed the category: inference OpenVINO Runtime library - Inference label Oct 23, 2024
@yangwang201911 yangwang201911 requested a review from a team as a code owner October 28, 2024 02:37
@yangwang201911 yangwang201911 requested review from zKulesza and removed request for a team October 28, 2024 02:37
@github-actions github-actions bot added the category: docs OpenVINO documentation label Oct 28, 2024
Contributor commented:
This PR will be closed in a week because of 2 weeks of no activity.

@github-actions github-actions bot added the Stale label Dec 11, 2024
peterchen-intel (Contributor) commented:

  1. Check whether the cache file exists; if yes, pass the cache hash to the HW plugin.
  2. If not, read the model and check whether it is stateful, then pass the cache hash to the HW plugin to generate the cache blob.
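
A rough sketch of this proposed flow, assuming hypothetical helpers compute_cache_hash() and cache_blob_exists() that stand in for OpenVINO's internal cache handling (they are not real OpenVINO APIs):

```cpp
#include <memory>
#include <string>
#include <openvino/openvino.hpp>

// Hypothetical stand-ins for the internal cache handling; not OpenVINO APIs.
std::string compute_cache_hash(const std::string& model_path, const std::string& device);
bool cache_blob_exists(const std::string& hash);

ov::CompiledModel compile_with_cache_check(ov::Core& core,
                                           const std::string& model_path,
                                           const std::string& device) {
    const std::string hash = compute_cache_hash(model_path, device);
    if (cache_blob_exists(hash)) {
        // 1. A cache blob already exists: forward the request so the HW plugin
        //    can import the blob without reading the model file.
        return core.compile_model(model_path, device);
    }
    // 2. No cache blob yet: read the model, check whether it is stateful
    //    (stateful models carry variables used by ReadValue/Assign nodes),
    //    then compile so the plugin generates the cache blob.
    std::shared_ptr<ov::Model> model = core.read_model(model_path);
    const bool is_stateful = !model->get_variables().empty();
    (void)is_stateful;  // device filtering based on this flag would happen here
    return core.compile_model(model, device);
}
```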

@github-actions github-actions bot removed the Stale label Dec 17, 2024
@wenjiew wenjiew modified the milestones: 2024.5, 2025.0 Dec 30, 2024
Labels: category: AUTO (OpenVINO AUTO device selection plugin), category: docs (OpenVINO documentation), do_not_merge