[Operator Mechanism]Align cuBLAS workspace size for SM10 GPUs by feixi139 · Pull Request #79373 · PaddlePaddle/Paddle

feixi139 · 2026-06-25T06:44:39Z

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

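Adjust the cuBLAS workspace size setting on SM10 GPUs so that it matches PyTorch's behavior on the Hopper/Blackwell architectures.

Changes:

Update the GetCublasWorkspaceSize logic in gpu_context.cc;
Use a 32 MiB cuBLAS workspace on SM9 and SM10 architectures;
Keep the existing workspace setting of about 8.125 MiB on all other architectures.

This change reduces algorithm-selection differences caused by mismatched cuBLAS workspace sizes, which improves numerical alignment with PyTorch in some matmul scenarios.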
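The selection rule described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `ChooseCublasWorkspaceBytes` and the constant names are hypothetical, and the real `GetCublasWorkspaceSize` in `gpu_context.cc` may be structured differently. The sketch follows the PR description (SM9 and SM10 only), while the bot review in this thread describes the change as also covering later architectures. The "about 8.125 MiB" figure is written here as 8 MiB + 128 KiB, which is arithmetically the same value.

```cpp
#include <cstddef>

constexpr std::size_t kKiB = 1024;
constexpr std::size_t kMiB = 1024 * kKiB;

// About 8.125 MiB, expressed as 8 MiB + 128 KiB (8,519,680 bytes).
constexpr std::size_t kDefaultWorkspaceBytes = 8 * kMiB + 128 * kKiB;
// 32 MiB (33,554,432 bytes), used for SM9 and SM10 per the PR description.
constexpr std::size_t kLargeWorkspaceBytes = 32 * kMiB;

// Hypothetical helper: maps the major compute capability of the device
// to a cuBLAS workspace size in bytes.
constexpr std::size_t ChooseCublasWorkspaceBytes(int compute_major) {
  return (compute_major == 9 || compute_major == 10)
             ? kLargeWorkspaceBytes
             : kDefaultWorkspaceBytes;
}

// Compile-time checks of the rule stated in the description.
static_assert(ChooseCublasWorkspaceBytes(8) == 8519680u, "SM8 keeps the default");
static_assert(ChooseCublasWorkspaceBytes(9) == 33554432u, "SM9 uses 32 MiB");
static_assert(ChooseCublasWorkspaceBytes(10) == 33554432u, "SM10 uses 32 MiB");
```

Keeping the rule in a pure function of the compute capability means it can be checked on machines without an SM9 or SM10 GPU.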

Does this cause precision changes

Yes

PaddlePaddle-bot · 2026-06-25T09:42:50Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-06-26 11:59:23 UTC+08:00

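CI report generated from the following code (updated every 30 minutes):
PR commit: 8378256 | Merge base: 406c7af (branch: develop)

1 Required jobs: 46/48 passed

| Total runs (reruns) | Total jobs | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
|---|---|---|---|---|---|---|
| 209 (128) | 81 | 78 | 0 | 2 | 0 | 1 |

No required jobs have failed; 2 are still running and 0 are pending.

| Job | Error type | Confidence | Log |
|---|---|---|---|
| None | None | None | None |

2 Failure details

None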

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-06-25 22:07:14

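📋 Review Summary

PR overview: Adjusts the minimum cuBLAS/cuBLASLt workspace size for SM9/SM10 and later CUDA architectures, which affects GEMM algorithm selection and aligns behavior with PyTorch
Scope of changes: paddle/phi/backends/gpu/gpu_context.cc
Impact tags: [Operator Mechanism] [Performance Optimization]

Issues

No blocking issues found. PR convention issues are covered in the section below and are not repeated here.

📝 PR Convention Check

The title is missing a space after the tag, and the PR answers "Yes" to "Does this cause precision changes" without stating how the change was verified. Consider replacing both with the copyable content below.

Suggested title (can be copied directly):

[Operator Mechanism] Align cuBLAS workspace size for SM10 GPUs

Suggested PR description (can be copied directly):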

### PR Category
<!-- One of [ User Experience | Execute Infrastructure | Operator Mechanism | CINN | Custom Device | Performance Optimization | Distributed Strategy | Parameter Server | Communication Library | Auto Parallel | Inference | Environment Adaptation ] -->
Operator Mechanism

### PR Types
<!-- One of [ New features | Bug fixes | Improvements | Performance | BC Breaking | Deprecations | Docs | Devs | Not User Facing | Security | Others ] -->
Bug fixes

### Description
<!-- Describe what you’ve done -->
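Adjust the cuBLAS workspace size setting on SM10 GPUs so that it matches PyTorch's behavior on the Hopper/Blackwell architectures.

Changes:
- Update the `GetCublasWorkspaceSize` logic in `gpu_context.cc`;
- Use a 32 MiB cuBLAS workspace on SM9 and SM10 architectures;
- Keep the existing workspace setting of about 8.125 MiB on all other architectures.

Scope of impact:
- Initialization of the cuBLAS handle workspace by `GPUContext` on the CUDA non-Windows path;
- May affect cuBLAS algorithm selection and numerical results for matmul on Hopper/SM10 GPUs.

Verification:
- N/A (the current PR description provides no specific verification commands or precision comparison data)

### Does this cause precision changes
<!-- one of the following [ Yes | No ]-->
Yes. The precision change arises because, once the cuBLAS workspace size on SM9/SM10 GPUs is raised from about 8.125 MiB to 32 MiB, cuBLAS may select a different matmul algorithm. The affected scope is the CUDA non-Windows path that uses cuBLAS GEMM/matmul. Verification: N/A (the current PR description provides no specific verification commands or precision comparison data).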

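Overall assessment

This round reviewed the PR diff, the Paddle checklist/architecture, and the call chain around gpu_context.cc, and confirmed no blocking issues with resource lifetime, device fallback, or integer overflow. The PR title and the precision-change verification notes still carry previously unresolved convention suggestions; please address them before merging.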

codecov-commenter · 2026-06-25T16:13:10Z

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@406c7af). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| paddle/phi/backends/gpu/gpu_context.cc | 60.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             develop   #79373   +/-   ##
==========================================
  Coverage           ?   60.00%           
==========================================
  Files              ?        1           
  Lines              ?        5           
  Branches           ?        0           
==========================================
  Hits               ?        3           
  Misses             ?        2           
  Partials           ?        0

sneaxiy

LGTM for coverage due to lack of SM90+ GPUs.

Align cuBLAS workspace size for SM10 GPUs

79176dc

fix bugs

2ffac83

wanghuancoder previously approved these changes Jun 25, 2026

Align cuBLASLt workspace size with cuBLAS

08b9e3b

feixi139 dismissed wanghuancoder’s stale review via 08b9e3b June 25, 2026 09:35

Update gpu_context.cc

8378256

PaddlePaddle-bot reviewed Jun 25, 2026

paddle-bot Bot added the contributor External developers label Jun 25, 2026

wanghuancoder approved these changes Jun 26, 2026

feixi139 mentioned this pull request Jun 26, 2026

[release/3.4][Operator Mechanism]Align cuBLAS workspace size for SM10 GPUs #79374

Open

sneaxiy approved these changes Jun 26, 2026

feixi139 wants to merge 4 commits into PaddlePaddle:develop from feixi139:ducc-feixi139-develop-worktree

5 participants
