
Effect of setting intra- and inter-op parallelism parameters on deep learning models #961

Closed
saeid93 opened this issue Jan 20, 2023 · 2 comments · Fixed by #1081
Comments

saeid93 (Contributor) commented Jan 20, 2023

The two CPU threading settings, set_num_threads (intra-op) and set_num_interop_threads (inter-op), explained here for PyTorch and for TensorFlow, have a huge impact on CPU inference time. For example, for the ResNet-18 TorchVision image model pinned to a single CPU core, they produce the following difference in latencies (before vs. after setting these values). I think it is worthwhile to expose these two variables as configurable settings, at least for the Huggingface runtime, which serves deep models (and I can confirm I have seen the same trend for many Huggingface pipeline models too).

[Figure: ResNet-18 CPU inference latency before vs. after setting the threading parameters]
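A minimal sketch of the kind of measurement described above (the thread counts, model construction, and timing loop here are my own illustrative assumptions, not taken from the issue):

```python
# Sketch: measure the effect of PyTorch's intra-/inter-op thread settings
# on CPU inference latency for ResNet-18 (randomly initialized weights).
import time

import torch
import torchvision.models as models

# Both setters should run before any parallel work starts;
# set_num_interop_threads in particular can only be called once.
torch.set_num_threads(1)          # intra-op parallelism
torch.set_num_interop_threads(1)  # inter-op parallelism

model = models.resnet18().eval()
x = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    model(x)  # warm-up
    start = time.perf_counter()
    for _ in range(50):
        model(x)
    print(f"mean latency: {(time.perf_counter() - start) / 50 * 1000:.1f} ms")
```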

@adriangonz (Contributor)

Hey @saeid93,

That's an interesting one.

Do you have a heuristic in mind for deciding the values of those parameters?

saeid93 (Contributor, Author) commented Jan 30, 2023

Hey @adriangonz,

Based on the PyTorch documentation, the number of cores seems to be a good heuristic for both variables; that's what I used in the example above.

I think the best option would be to expose this to the user for the Huggingface server, adding the values of these two parameters as settings. If they are not set, the default can be the number of CPUs. This could be optimized further, but I think that is out of scope for MLServer; if you are interested, this paper provides an in-depth investigation of the topic.
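A sketch of what that defaulting could look like (the environment-variable names here are hypothetical, not MLServer's actual API):

```python
# Hypothetical sketch: default both thread settings to the number of
# available CPUs unless the user overrides them. The variable names
# MLSERVER_MODEL_INTRA_OP_THREADS / ..._INTER_OP_THREADS are made up.
import os

import torch

cpu_count = os.cpu_count() or 1
intra = int(os.environ.get("MLSERVER_MODEL_INTRA_OP_THREADS", cpu_count))
inter = int(os.environ.get("MLSERVER_MODEL_INTER_OP_THREADS", cpu_count))

torch.set_num_threads(intra)
torch.set_num_interop_threads(inter)
```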
