[Loader][BugFix] Fix some parameters place on CPU in PaddleOCR-VL #5413
Pull request overview
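This PR fixes parameters being wrongly placed on the CPU when the PaddleOCR-VL model is loaded. PR #4532 removed the get_tensor call that forced an H2D copy, so parameters left uninitialized under LazyGuard stayed on the CPU after copy_, which severely degraded performance.
Main changes
- Add an initialization check to every weight_loader method so that uninitialized parameters call initialize() first
- Use h2d_copy instead of param.copy_ throughout to guarantee correct device placement
- Fix three weight_loader methods (SiglipAttention.out_proj_weight_loader, SiglipMLP.weight_loader, Projector.weight_loader)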
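The initialize-then-copy pattern listed above can be sketched with toy stand-ins instead of Paddle itself. `ToyParam`, its `place` field, and `toy_h2d_copy` are hypothetical names invented for this illustration. They only mimic the behavior this PR reports (a bare `copy_` into an uninitialized parameter leaves it on the CPU) and are not the actual FastDeploy or Paddle implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyParam:
    """Hypothetical stand-in for a parameter created under LazyGuard."""

    target_place: str = "gpu"      # device the parameter is meant to live on
    place: Optional[str] = None    # None means "not initialized yet"
    data: List[float] = field(default_factory=list)

    def is_initialized(self) -> bool:
        return self.place is not None

    def initialize(self) -> None:
        # Allocate storage on the intended device.
        self.place = self.target_place

    def copy_(self, src: List[float]) -> None:
        # Mimics the reported behavior: copying into an uninitialized
        # parameter leaves the result on the CPU.
        if not self.is_initialized():
            self.place = "cpu"
        self.data = list(src)


def toy_h2d_copy(dst: ToyParam, src: List[float]) -> None:
    """Hypothetical helper: copy host data into an initialized parameter."""
    if not dst.is_initialized():
        raise ValueError("initialize the parameter before copying")
    dst.data = list(src)  # dst.place is left unchanged


def weight_loader(param: ToyParam, loaded_weight: List[float]) -> None:
    """Pattern described in this PR: initialize first, then copy."""
    if not param.is_initialized():
        param.initialize()
    toy_h2d_copy(param, loaded_weight)


buggy = ToyParam()
buggy.copy_([1.0, 2.0])           # old path: bare copy_ on an uninitialized param

fixed = ToyParam()
weight_loader(fixed, [1.0, 2.0])  # new path: initialize, then copy

print(buggy.place, fixed.place)   # prints "cpu gpu"
```

In this toy model the copy helper refuses to run on an uninitialized parameter, which makes the ordering requirement explicit; whether the real h2d_copy enforces this is not stated in the PR.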
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| fastdeploy/model_executor/models/paddleocr_vl/siglip.py | Adds parameter initialization logic to the weight_loader methods of SiglipAttention and SiglipMLP and switches them to h2d_copy; adds a get_tensor import that is never used |
| fastdeploy/model_executor/models/paddleocr_vl/projector.py | Adds parameter initialization logic to the Projector weight_loader and switches it to h2d_copy |
import paddle.nn.functional as F
from paddleformers.transformers.model_utils import PretrainedModel

from fastdeploy.model_executor.layers.utils import get_tensor
The get_tensor import is unused in this file. Since PR #4532 removed the usage of get_tensor and this PR replaces it with h2d_copy (which internally calls get_tensor when needed), this import can be safely removed.
Suggested change: remove the line from fastdeploy.model_executor.layers.utils import get_tensor
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## develop #5413 +/- ##
==========================================
Coverage ? 58.26%
==========================================
Files ? 327
Lines ? 40566
Branches ? 6157
==========================================
Hits ? 23636
Misses ? 15098
Partials ? 1832
…ddlePaddle#5413) * [BugFix] Fix some parameter place on CPU in PaddleOCR-VL * clean log * fix codestyle
Motivation
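Parameters created under LazyGuard are uninitialized, and the result of uninit_tensor.copy_(data) ends up on the CPU, so nearly all of the OCR model's parameters were on the CPU after loading.
Previously, get_tensor forced an H2D copy, so the parameters were still placed correctly on the GPU even without being initialized.
However, #4532 removed get_tensor, which left all of these parameters on the CPU and made performance terrible.
In addition, the scope of impact of #4532 is unclear. This PR was only tested on the OCR model; the other models should be checked as well.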
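One way to act on the suggestion to check other models is to list every parameter that still reports a CPU place after loading. The sketch below uses toy objects so that it runs without Paddle; `ToyPlace`, `ToyParam`, `find_cpu_params`, and the parameter names are all hypothetical, and it assumes the real parameter objects expose a comparable `place.is_cpu_place()` check.

```python
from typing import Iterable, List, Tuple


class ToyPlace:
    """Hypothetical stand-in for a device place object."""

    def __init__(self, kind: str) -> None:
        self.kind = kind

    def is_cpu_place(self) -> bool:
        return self.kind == "cpu"


class ToyParam:
    """Hypothetical stand-in for a loaded model parameter."""

    def __init__(self, kind: str) -> None:
        self.place = ToyPlace(kind)


def find_cpu_params(named_params: Iterable[Tuple[str, ToyParam]]) -> List[str]:
    """Return the names of parameters whose place reports CPU."""
    return [name for name, param in named_params if param.place.is_cpu_place()]


# Invented parameter names, for illustration only.
params = [
    ("visual.out_proj.weight", ToyParam("cpu")),
    ("visual.mlp.fc1.weight", ToyParam("gpu")),
    ("projector.linear.weight", ToyParam("cpu")),
]

print(find_cpu_params(params))
```

An empty result after loading would indicate that no parameter was left behind on the CPU, under the assumptions stated above.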
Modifications
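Add the missing init logic for uninitialized parameters in OCR, and use h2d_copy throughout.
That said, the name h2d_copy seems ill-chosen: it reads like a forced H2D copy, but the implementation does not do that.
Usage or Command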
Accuracy Tests
None
Checklist
- Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- … pre-commit before commit.
- … release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.