[Cherry-Pick][CI] Fix multistep MTP in splitwise-prefill mode (#5723) #5724
Pull request overview
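This is a Cherry-Pick PR that fixes a problem with MTP (multi-step speculative decoding) in splitwise-prefill mode. The fix adds logic to the config post-processing stage: when the MTP method is used and the instance has the prefill role, the number of speculative tokens and the number of model steps are both limited to 1.
Main changes:
- Added special handling for MTP in prefill mode to the config post-processing method
- When speculative_config uses the "mtp" method and splitwise_role is "prefill", the config parameters are adjusted automatically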
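The adjustment described above can be sketched in isolation as follows. This is a minimal illustration, not the project's actual config class: the dataclasses `SpeculativeConfig`, `SchedulerConfig`, and `Config`, the method name `postprocess`, and the `"decode"` role value are hypothetical stand-ins. Only the attribute names `method`, `splitwise_role`, `num_speculative_tokens`, and `num_model_steps` are taken from the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeculativeConfig:  # hypothetical stand-in, not the real class
    method: Optional[str]
    num_speculative_tokens: int
    num_model_steps: int


@dataclass
class SchedulerConfig:  # hypothetical stand-in, not the real class
    splitwise_role: str


@dataclass
class Config:  # hypothetical stand-in for the real config object
    scheduler_config: SchedulerConfig
    speculative_config: Optional[SpeculativeConfig] = None

    def postprocess(self) -> None:
        # adjust speculative config: MTP on a prefill instance is clamped to one step
        if self.speculative_config is not None and self.speculative_config.method == "mtp":
            if self.scheduler_config.splitwise_role == "prefill":
                self.speculative_config.num_speculative_tokens = 1
                self.speculative_config.num_model_steps = 1


# A prefill instance configured with 3 speculative tokens / 3 model steps.
prefill_cfg = Config(SchedulerConfig("prefill"), SpeculativeConfig("mtp", 3, 3))
prefill_cfg.postprocess()

# A non-prefill instance (role name assumed) keeps its configured values.
decode_cfg = Config(SchedulerConfig("decode"), SpeculativeConfig("mtp", 3, 3))
decode_cfg.postprocess()
```

After `postprocess()`, `prefill_cfg` carries 1 for both fields while `decode_cfg` is left at 3, and a config with no speculative settings at all passes through untouched.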
# adjust speculative config
if self.speculative_config is not None and self.speculative_config.method == "mtp":
    if self.scheduler_config.splitwise_role == "prefill:
Missing closing quote. The string "prefill needs a closing double quote; as written, this line is a syntax error. It should be "prefill".
Suggested change:
-   # adjust speculative config
-   if self.speculative_config is not None and self.speculative_config.method == "mtp":
-       if self.scheduler_config.splitwise_role == "prefill:
+   # adjust speculative config
+   if self.speculative_config is not None and self.speculative_config.method == "mtp":
+       if self.scheduler_config.splitwise_role == "prefill":
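There are two consecutive blank lines here (1707 and 1708), whereas the rest of the codebase generally uses a single blank line to separate code blocks. For consistency, keep only one blank line.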
if self.speculative_config is not None and self.speculative_config.method == "mtp":
    if self.scheduler_config.splitwise_role == "prefill:
        self.speculative_config.num_speculative_tokens = 1
        self.speculative_config.num_model_steps = 1
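The new config adjustment for MTP in splitwise prefill mode has no corresponding unit test. Consider adding a test case in tests/utils/test_config.py that verifies num_speculative_tokens and num_model_steps are set to 1 when speculative_config.method is "mtp" and scheduler_config.splitwise_role is "prefill".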
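A test along the lines requested above might look like the sketch below. It does not import the project's real config class, whose constructor is not shown in this PR. Instead, `adjust_speculative_config` is a hypothetical helper that mirrors the logic in the diff, `make_cfg` is a hypothetical fixture built on `SimpleNamespace`, and the `"decode"` role value is an assumption. In the real test file, the helper call would be replaced by whatever triggers the config post-processing.

```python
import unittest
from types import SimpleNamespace


def adjust_speculative_config(cfg):
    """Hypothetical stand-in that mirrors the logic added in this PR."""
    spec = cfg.speculative_config
    if spec is not None and spec.method == "mtp":
        if cfg.scheduler_config.splitwise_role == "prefill":
            spec.num_speculative_tokens = 1
            spec.num_model_steps = 1


def make_cfg(method, role, tokens=3, steps=3):
    """Build a minimal config-like object; method=None means no speculative config."""
    spec = None
    if method is not None:
        spec = SimpleNamespace(
            method=method, num_speculative_tokens=tokens, num_model_steps=steps
        )
    return SimpleNamespace(
        speculative_config=spec,
        scheduler_config=SimpleNamespace(splitwise_role=role),
    )


class TestMtpPrefillAdjustment(unittest.TestCase):
    def test_mtp_prefill_is_clamped_to_one(self):
        cfg = make_cfg("mtp", "prefill")
        adjust_speculative_config(cfg)
        self.assertEqual(cfg.speculative_config.num_speculative_tokens, 1)
        self.assertEqual(cfg.speculative_config.num_model_steps, 1)

    def test_mtp_non_prefill_is_unchanged(self):
        cfg = make_cfg("mtp", "decode")  # role name is an assumption
        adjust_speculative_config(cfg)
        self.assertEqual(cfg.speculative_config.num_speculative_tokens, 3)
        self.assertEqual(cfg.speculative_config.num_model_steps, 3)

    def test_missing_speculative_config_is_noop(self):
        cfg = make_cfg(None, "prefill")
        adjust_speculative_config(cfg)
        self.assertIsNone(cfg.speculative_config)


# Run the suite programmatically so the module can also be imported safely.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMtpPrefillAdjustment)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Covering the non-prefill and missing-config cases alongside the prefill case guards against the adjustment being applied too broadly.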
if self.speculative_config is not None and self.speculative_config.method == "mtp":
    if self.scheduler_config.splitwise_role == "prefill:
        self.speculative_config.num_speculative_tokens = 1
        self.speculative_config.num_model_steps = 1
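The PR description is missing key information. Per the requirements, a PR description should at least explain why the changes are being made and what problem they solve. The current description contains only the template, with Motivation, Modifications, and the other required sections left unfilled. Please add: 1) the specific problem this fix addresses; 2) why num_speculative_tokens and num_model_steps need to be set to 1 in splitwise prefill mode; 3) how this fix resolves the problem.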
268b468 to 3980252
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## release/online/20251131 #5724 +/- ##
==========================================================
Coverage ? 59.08%
==========================================================
Files ? 319
Lines ? 39106
Branches ? 5893
==========================================================
Hits ? 23105
Misses ? 14148
Partials ? 1853
b018c49 into PaddlePaddle:release/online/20251131
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.