Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor improvements #548

Merged
merged 4 commits into from
Sep 3, 2024
Merged

Conversation

jjfumero
Copy link
Member

@jjfumero jjfumero commented Sep 2, 2024

Description

This PR provides minor improvements:

  • It improves MxM example to dump all raw timers to be processed by external tools (e.g., R)
  • It simplifies detection of parallel kernels per backend.

Problem description

n/ a.

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

make 
make tests

tornado --jvm="-Ds0.t0.device=0:0" -m tornado.examples/uk.ac.manchester.tornado.examples.compute.MatrixMultiplication2D 512 false

@jjfumero jjfumero added enhancement New feature or request compiler runtime labels Sep 2, 2024
@jjfumero jjfumero self-assigned this Sep 2, 2024
Copy link
Collaborator

@stratika stratika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one comment is that we are modifying an example from the compute package to showcase this feature of dumping stats related to execution time for further analysis. We could probably decouple this functionality into a separate package that will be used to show how to do the analysis and potential integration with other tools.

This is a thought.

@jjfumero
Copy link
Member Author

jjfumero commented Sep 3, 2024

LGTM, one comment is that we are modifying an example from the compute package to showcase this feature of dumping stats related to execution time for further analysis. We could probably decouple this functionality into a separate package that will be used to show how to do the analysis and potential integration with other tools.

This is a thought.

Yes, I agree. We can extend the benchmark module. I suggest moving this one forward to create the requested demos for EU AERO Project.

@jjfumero jjfumero merged commit ea85f22 into beehive-lab:develop Sep 3, 2024
2 checks passed
@jjfumero jjfumero deleted the feat/benchmarking/stats branch September 3, 2024 09:54
jjfumero added a commit to jjfumero/TornadoVM that referenced this pull request Sep 30, 2024
Improvements
============
- beehive-lab#565: New API call in the Execution Plan to log/trace the executed configuration plans.
- beehive-lab#563: Expand the TornadoVM profiler with Level Zero Sysman Energy Metrics.
- beehive-lab#559: Refactoring Power Metric handlers for PTX and OpenCL.
- beehive-lab#548: Benchmarking improvements.
- beehive-lab#549: Prebuilt API tests added using multiple backend-setup.
- Add internal tests for monitoring memory management [link](beehive-lab@0644225).

Compatibility
=============
- beehive-lab#561: Build for OSx 14.6 and OSx 15 fixed.

Bug Fixes
==============
- beehive-lab#564: Jenkins configuration fixed to run KFusion per backend.
- beehive-lab#562: Warmup action from the Execution Plan fixed to run with correct internal IDs.
- beehive-lab#557: Shared Execution Plans Context fixed.
- beehive-lab#553: OpenCL compiler flags for Intel Integrated GPUs fixed.
- beehive-lab#552: Fixed runtime to select any device among multiple SPIR-V devices.
- Fixed zero extend arithmetic operations: [link](beehive-lab@ea7b602).
@jjfumero jjfumero mentioned this pull request Sep 30, 2024
8 tasks
jjfumero added a commit to jjfumero/TornadoVM that referenced this pull request Sep 30, 2024
Improvements
============
- beehive-lab#565: New API call in the Execution Plan to log/trace the executed configuration plans.
- beehive-lab#563: Expand the TornadoVM profiler with Level Zero Sysman Energy Metrics.
- beehive-lab#559: Refactoring Power Metric handlers for PTX and OpenCL.
- beehive-lab#548: Benchmarking improvements.
- beehive-lab#549: Prebuilt API tests added using multiple backend-setup.
- Add internal tests for monitoring memory management [link](beehive-lab@0644225).

Compatibility
=============
- beehive-lab#561: Build for OSx 14.6 and OSx 15 fixed.

Bug Fixes
==============
- beehive-lab#564: Jenkins configuration fixed to run KFusion per backend.
- beehive-lab#562: Warmup action from the Execution Plan fixed to run with correct internal IDs.
- beehive-lab#557: Shared Execution Plans Context fixed.
- beehive-lab#553: OpenCL compiler flags for Intel Integrated GPUs fixed.
- beehive-lab#552: Fixed runtime to select any device among multiple SPIR-V devices.
- Fixed zero extend arithmetic operations: [link](beehive-lab@ea7b602).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

3 participants