Dockerfile - add sm_103 support for cuda12.9 docker image #737

WenqingLan1 · 2025-09-08T21:58:03Z

Added the sm_103 arch for executables that depend on the CUDA arch.
Removed the duplicate installation of hpc-x and nccl in cuda12.9.dockerfile and cuda12.8.dockerfile.


codecov · 2025-09-08T22:08:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.97%. Comparing base (60189dd) to head (bb389ad).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #737   +/-   ##
=======================================
  Coverage   85.97%   85.97%           
=======================================
  Files         102      102           
  Lines        7579     7579           
=======================================
  Hits         6516     6516           
  Misses       1063     1063

| Flag | Coverage Δ |
|---|---|
| cpu-python3.12-unit-test | `71.05% <ø> (ø)` |
| cuda-unit-test | `83.87% <ø> (ø)` |


guoshzhao · 2025-09-18T19:25:34Z

dockerfile/cuda12.8.dockerfile

 RUN echo PATH="$PATH" > /etc/environment && \
    echo LD_LIBRARY_PATH="$LD_LIBRARY_PATH" >> /etc/environment && \
-    echo SB_MICRO_PATH="$SB_MICRO_PATH" >> /etc/environment && \
-    echo "source /opt/hpcx/hpcx-init.sh && hpcx_load" | tee -a /etc/bash.bashrc >> /etc/profile.d/10-hpcx.sh


Do we no longer need to call hpcx_load proactively?

WenqingLan1 (Sep 19, 2025): If no custom hpc-x is installed, we can just remove hpcx_load. For cuda13.0, though, I see Dilip installed a custom hpc-x version, so we may need to wait and see how to handle both situations.
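
One way to handle both situations raised in this thread is to load hpc-x only when its init script is actually present. The sketch below is an illustration, not what the PR does: `load_hpcx_if_present` and the `HPCX_INIT` override are hypothetical names, and the default path is the one from the removed line in the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: load hpc-x only if its init script exists.
# HPCX_INIT is an assumed override for the script location; the default
# is the path used by the line removed in the diff.
load_hpcx_if_present() {
    local init="${HPCX_INIT:-/opt/hpcx/hpcx-init.sh}"
    if [ -f "$init" ]; then
        # Source the init script, then call the hpcx_load function it defines
        . "$init" && hpcx_load
    else
        # No hpc-x at this path: nothing to load, and not an error
        return 0
    fi
}
```

Appending a call to this function to /etc/bash.bashrc instead of the unconditional `source ... && hpcx_load` would leave images without hpc-x at that path unaffected.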

guoshzhao · 2025-09-18T19:27:55Z

third_party/Makefile

-ifeq ($(shell echo $(CUDA_VER)">=12.8" | bc -l), 1)
+ifeq ($(shell echo $(CUDA_VER)">=12.9" | bc -l), 1)
    # Get commit 87048bd from msscl to support updated nccl and sm_100
+	$(eval ARCHS := 100 103)


Looks like some of these changes duplicate [#739].

WenqingLan1 (Sep 19, 2025): Yes. I'm assuming we will merge cuda13.0 [#739] first; I will take care of the code merge here afterwards.
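
The version gate in the Makefile diff above can be sketched in plain shell. This is a minimal illustration, assuming versions of the form major.minor; `version_ge`, `archs_for_cuda`, and `gencode_flags` are hypothetical helpers, not part of the repository. It compares major and minor as integers because `bc -l` reads the version as a decimal number, so a hypothetical 12.10 would compare lower than 12.9.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes "major" or "major.minor" only (no patch component).
version_ge() {
    local a_major=${1%%.*} b_major=${2%%.*}
    local a_minor=${1#*.} b_minor=${2#*.}
    # A version without a dot has no minor part; treat it as 0
    [ "$a_minor" = "$1" ] && a_minor=0
    [ "$b_minor" = "$2" ] && b_minor=0
    if [ "$a_major" -ne "$b_major" ]; then
        [ "$a_major" -gt "$b_major" ]
    else
        [ "$a_minor" -ge "$b_minor" ]
    fi
}

# Print the arch list for a CUDA version, following the 12.9 gate in the
# diff. Printing nothing below 12.9 is an assumption of this sketch.
archs_for_cuda() {
    if version_ge "$1" 12.9; then
        echo "100 103"
    fi
}

# Turn an arch list into nvcc -gencode flags, one per arch
gencode_flags() {
    local arch flags=""
    for arch in "$@"; do
        flags+="-gencode=arch=compute_${arch},code=sm_${arch} "
    done
    echo "${flags% }"
}

# Usage: the unquoted substitution is intentional so the list splits into words
gencode_flags $(archs_for_cuda "${CUDA_VER:-12.9}")
```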

polarG

LGTM

WenqingLan1 added 4 commits August 1, 2025 23:39

add support for sm 103

4358fa7

remove duplicate installation of hpc-x and nccl

ccee488

Merge branch 'main' of github.com:microsoft/superbenchmark into wenqinglan/sb-run-validation

a0ec607

revert dist inference change

f364c22

WenqingLan1 requested a review from a team as a code owner September 8, 2025 21:58

WenqingLan1 and others added 2 commits September 9, 2025 22:02

fix cuda12.8 dockerfile

ace33ba

Fix formatting and indentation in Makefile

22fa394

guoshzhao assigned polarG Sep 18, 2025

guoshzhao reviewed Sep 18, 2025


guoshzhao added the enhancement (New feature or request) and containers (SuperBench Containers) labels Sep 19, 2025

guoshzhao changed the title ~~Add sm_103 support for cuda12.9 docker image~~ Dockerfile - add sm_103 support for cuda12.9 docker image Sep 19, 2025

polarG approved these changes Sep 22, 2025


guoshzhao mentioned this pull request Oct 2, 2025

V0.13.0 Release Plan #743

Open

30 tasks

WenqingLan1 added 6 commits October 7, 2025 23:45

merge main

33c30b6

add hpcx back

31191f5

add hpcx back

9b9a6cc

correct hpcx version

67f6af9

update

9206e20

update

bb389ad

WenqingLan1 closed this Oct 8, 2025

WenqingLan1 deleted the wenqinglan/sb-run-validation branch October 8, 2025 00:09
