Monitor - Collect realtime GPU power when benchmarking. #507

Merged: guoshzhao merged 3 commits into release/0.8 from guzhao/monitor_power on Apr 7, 2023

Conversation

guoshzhao
Contributor

Description
Collect realtime GPU power when benchmarking.
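
The description does not say how the power readings are taken. As a rough illustration only, and not the code in this pull request, periodic GPU power sampling could be sketched as below. The function names `read_gpu_power_watts` and `sample_power` are hypothetical; the sketch assumes the `pynvml` package is installed when a real GPU is read, and relies on NVML reporting power usage in milliwatts.

```python
import time
from typing import Callable, List


def read_gpu_power_watts(index: int = 0) -> float:
    """Return the current power draw of one GPU in watts (hypothetical helper).

    Assumes the pynvml package and an NVIDIA driver are available.
    """
    import pynvml  # imported lazily so sample_power() works without a GPU

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        # NVML reports power usage in milliwatts; convert to watts.
        return pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0
    finally:
        pynvml.nvmlShutdown()


def sample_power(read: Callable[[], float], samples: int, interval_s: float) -> List[float]:
    """Call `read` `samples` times, sleeping `interval_s` seconds between calls."""
    readings = []
    for i in range(samples):
        readings.append(read())
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

On a machine with an NVIDIA GPU, `sample_power(read_gpu_power_watts, samples=5, interval_s=1.0)` would collect five readings about one second apart. Passing the reader as a parameter keeps the sampling loop testable on hosts without a GPU.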

@guoshzhao guoshzhao requested review from cp5555 and abuccts April 7, 2023 05:42
@guoshzhao guoshzhao requested a review from a team as a code owner April 7, 2023 05:43

codecov bot commented Apr 7, 2023

Codecov Report

Merging #507 (30dd22f) into release/0.8 (9f18dea) will decrease coverage by 0.02%.
The diff coverage is 82.35%.

@@               Coverage Diff               @@
##           release/0.8     #507      +/-   ##
===============================================
- Coverage        87.25%   87.24%   -0.02%     
===============================================
  Files               89       89              
  Lines             5948     5964      +16     
===============================================
+ Hits              5190     5203      +13     
- Misses             758      761       +3     

| Flag | Coverage Δ | |
|---|---|---|
| cpu-python3.6-unit-test | 73.47% <40.00%> (-0.10%) | ⬇️ |
| cpu-python3.7-unit-test | 73.47% <40.00%> (-0.10%) | ⬇️ |
| cpu-python3.8-unit-test | 73.95% <47.05%> (-0.09%) | ⬇️ |
| cuda-unit-test | 87.17% <82.35%> (-0.02%) | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| superbench/common/utils/device_manager.py | 72.44% <57.14%> (-1.18%) | ⬇️ |
| superbench/monitor/monitor.py | 62.87% <100.00%> (+0.22%) | ⬆️ |
| superbench/monitor/record.py | 97.63% <100.00%> (+0.13%) | ⬆️ |
| superbench/runner/runner.py | 87.95% <100.00%> (+0.06%) | ⬆️ |

@guoshzhao guoshzhao requested a review from yukirora April 7, 2023 12:22
@guoshzhao guoshzhao enabled auto-merge (squash) April 7, 2023 12:24
@guoshzhao guoshzhao merged commit 1038070 into release/0.8 Apr 7, 2023
@guoshzhao guoshzhao deleted the guzhao/monitor_power branch April 7, 2023 12:26
@cp5555 cp5555 mentioned this pull request Apr 7, 2023
23 tasks
abuccts pushed a commit that referenced this pull request Apr 14, 2023
**Description**
Collect realtime GPU power when benchmarking.
abuccts added a commit that referenced this pull request Apr 14, 2023
**Description**

Cherry-pick bug fixes from v0.8.0 to main.

**Major Revisions**

* Monitor - Fix the cgroup version checking logic (#502)
* Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
* Fix wrong torch usage in communication wrapper for Distributed
Inference Benchmark (#505)
* Analyzer: Fix bug in python3.8 due to pandas api change (#504)
* Bug - Fix bug to get metric from cmd when error happens (#506)
* Monitor - Collect realtime GPU power when benchmarking (#507)
* Add num_workers argument in model benchmark (#511)
* Remove unreachable condition when write host list (#512)
* Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
* Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
* Docs - Upgrade version and release note (#508)

Co-authored-by: guoshzhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
3 participants