
enable a oneDNN ITT feature #73


Closed
wants to merge 1 commit into from

Conversation

louie-tsai

@louie-tsai louie-tsai commented Jan 13, 2023

There is a oneDNN feature that enables ITT tagging for oneDNN primitives in Intel VTune Profiler.
Here is the RFC for this feature:
https://github.com/oneapi-src/oneDNN/tree/rfcs/rfcs/20201014-VTune-ITT-tagging

Intel VTune Profiler is a performance analysis tool for x86-based machines and Intel Data Center GPUs.
VTune helps find performance bottlenecks and provides detailed Intel platform information.
We are working with the VTune team on BKMs and a TensorBoard plugin for this feature.

We would like to enable this feature by default in TensorFlow, so users can identify platform bottlenecks with detailed information such as L1 cache misses or the level of AVX-512 vectorization.

We manually built the TF package with the ITT feature enabled and benchmarked Intel Model Zoo models with and without it. Based on our benchmarking across different Intel Model Zoo models, this feature has no impact on performance.
Users who don't want this feature can disable it at runtime via the "DNNL_ITT_TASK_LEVEL" environment variable.
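As a sketch of the runtime opt-out described above (assuming, per the linked oneDNN RFC, that a task level of 0 turns ITT tagging off):

```shell
# Disable oneDNN ITT task tagging before launching a TensorFlow workload.
# Assumption from the oneDNN RFC: 0 = no ITT tagging; higher levels tag
# primitive executions for VTune.
export DNNL_ITT_TASK_LEVEL=0
echo "DNNL_ITT_TASK_LEVEL=$DNNL_ITT_TASK_LEVEL"
```

With the variable exported, any TensorFlow process started from that shell inherits the setting, so no rebuild is needed to opt out.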

@louie-tsai louie-tsai closed this Jan 20, 2023
justkw pushed a commit that referenced this pull request Apr 29, 2025
* TfLite Round missing datatype support

- Adds bf16, f16 support for Round
- Adds bf16, f16 Round unit tests