Update to cuda12.1, nccl 2.17.1, hpcx 2.14, and mlc 3.10 #513

Merged: abuccts merged 5 commits into release/0.8 from xiongyf/upgrade-versions on Apr 12, 2023

abuccts commented Apr 12, 2023

Update the cuda11.8 image to cuda12.1, based on nvcr23.03, and bump the related versions in the image:

  • cuda: 11.8 -> 12.1
  • nccl: 2.15.5 -> 2.17.1
  • hpcx: 2.8 -> 2.14
  • mlc: 3.9a -> 3.10
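
A side note on the version strings above: comparing them as plain strings would misorder two of these bumps, since `"2.8" > "2.14"` and `"3.9a" > "3.10"` lexically. The following is a minimal sketch, not part of this PR or of SuperBench, of a numeric-aware comparison that could be used to sanity-check such a list. The names `UPGRADES`, `version_key`, and `is_upgrade` are hypothetical, and the parser assumes each dotted field is digits with an optional letter suffix.

```python
import re

# Version bumps as listed in the PR description: component -> (old, new).
UPGRADES = {
    "cuda": ("11.8", "12.1"),
    "nccl": ("2.15.5", "2.17.1"),
    "hpcx": ("2.8", "2.14"),
    "mlc": ("3.9a", "3.10"),
}


def version_key(version):
    """Turn a dotted version into a tuple of (number, suffix) pairs.

    Assumption: every field looks like digits plus an optional letter
    suffix, e.g. "9a". Anything else raises ValueError.
    """
    key = []
    for field in version.split("."):
        match = re.fullmatch(r"(\d+)([A-Za-z]*)", field)
        if match is None:
            raise ValueError(f"unsupported version field: {field!r}")
        key.append((int(match.group(1)), match.group(2)))
    return tuple(key)


def is_upgrade(old, new):
    """Return True when `new` orders strictly after `old`, field by field."""
    return version_key(new) > version_key(old)


if __name__ == "__main__":
    for name, (old, new) in UPGRADES.items():
        print(f"{name}: {old} -> {new} upgrade={is_upgrade(old, new)}")
```

The numeric part of each field is compared as an integer, so `2.14` orders after `2.8`; the letter suffix only breaks ties between fields with the same number.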

Update cuda11.8 image to cuda12.1 based on nvcr23.03 and related
versions in the image:
* cuda 11.8 -> 12.1
* nccl 2.15.5 -> 2.17.1
* ofed: 5.2 -> 5.8
* hpcx: 2.8 -> 2.14
* mlc: 3.9a -> 3.10
abuccts added the containers (SuperBench Containers) label Apr 12, 2023
abuccts requested a review from a team as a code owner April 12, 2023 03:10
cp5555 self-requested a review April 12, 2023 03:52
codecov bot commented Apr 12, 2023

Codecov Report

Merging #513 (c491de6) into release/0.8 (5a2addd) will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff              @@
##           release/0.8     #513   +/-   ##
============================================
  Coverage        87.24%   87.24%           
============================================
  Files               89       89           
  Lines             5964     5964           
============================================
  Hits              5203     5203           
  Misses             761      761           

| Flag | Coverage Δ |
| --- | --- |
| cpu-python3.6-unit-test | 73.47% <ø> (ø) |
| cpu-python3.7-unit-test | 73.47% <ø> (ø) |
| cpu-python3.8-unit-test | 73.95% <ø> (ø) |
| cuda-unit-test | 87.17% <ø> (ø) |

abuccts enabled auto-merge (squash) April 12, 2023 07:49
abuccts merged commit 17c01d8 into release/0.8 Apr 12, 2023
abuccts deleted the xiongyf/upgrade-versions branch April 12, 2023 08:01
cp5555 changed the title from "Update cuda11.8 image to cuda12.1 based on nvcr23.03" to "Update to cuda12.1, nccl 2.17.1, hpcx 2.14, and mlc 3.10" Apr 12, 2023
cp5555 mentioned this pull request Apr 12, 2023
23 tasks
abuccts added a commit that referenced this pull request Apr 14, 2023
Update cuda11.8 image to cuda12.1 based on nvcr23.03 and related versions in the image:
* cuda 11.8 -> 12.1
* nccl 2.15.5 -> 2.17.1
* hpcx: 2.8 -> 2.14
* mlc: 3.9a -> 3.10
abuccts added a commit that referenced this pull request Apr 14, 2023
**Description**

Cherry-pick bug fixes from v0.8.0 to main.

**Major Revisions**

* Monitor - Fix the cgroup version checking logic (#502)
* Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
* Fix wrong torch usage in communication wrapper for Distributed
Inference Benchmark (#505)
* Analyzer: Fix bug in python3.8 due to pandas api change (#504)
* Bug - Fix bug to get metric from cmd when error happens (#506)
* Monitor - Collect realtime GPU power when benchmarking (#507)
* Add num_workers argument in model benchmark (#511)
* Remove unreachable condition when write host list (#512)
* Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
* Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
* Docs - Upgrade version and release note (#508)

Co-authored-by: guoshzhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>