[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode by wuyujiji · Pull Request #5555 · PaddlePaddle/FastDeploy

wuyujiji · 2025-12-15T07:42:59Z

Motivation

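To adapt the paddleocr-vl model, this PR adds support on Iluvatar hardware for V1_KVCACHE_SCHEDULER and for the PaddleOCR-VL rope mode. In addition, with V1_KVCACHE_SCHEDULER enabled, accuracy was verified to be normal for the previously adapted ERNIE text-only models and the ERNIE VL model series.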

Modifications

Pass

Usage or Command

Pass

Accuracy Tests

Pass

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2025-12-15T07:43:04Z

Thanks for your contribution!

codecov-commenter · 2025-12-15T10:52:54Z

Codecov Report

❌ Patch coverage is 10.52632% with 34 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@404cf0e). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
...executor/layers/attention/iluvatar_attn_backend.py	8.33%	22 Missing ⚠️
...del_executor/models/ernie4_5_vl/ernie4_5_vl_moe.py	20.00%	2 Missing and 2 partials ⚠️
...model_executor/models/paddleocr_vl/paddleocr_vl.py	20.00%	2 Missing and 2 partials ⚠️
fastdeploy/engine/sched/resource_manager_v1.py	0.00%	2 Missing ⚠️
fastdeploy/engine/args_utils.py	0.00%	0 Missing and 1 partial ⚠️
fastdeploy/worker/worker_process.py	0.00%	0 Missing and 1 partial ⚠️
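
The headline patch figure can be cross-checked against the per-file rows. The sketch below is only an illustration, not Codecov's own computation: it assumes that both "Missing" and "partials" count toward the 34 uncovered lines, and it infers the total number of changed lines from the reported percentage, since the report does not state that total directly.

```python
# (file, patch %, missing lines, partially covered lines), copied from the report table.
ROWS = [
    ("...executor/layers/attention/iluvatar_attn_backend.py", 8.33, 22, 0),
    ("...del_executor/models/ernie4_5_vl/ernie4_5_vl_moe.py", 20.00, 2, 2),
    ("...model_executor/models/paddleocr_vl/paddleocr_vl.py", 20.00, 2, 2),
    ("fastdeploy/engine/sched/resource_manager_v1.py", 0.00, 2, 0),
    ("fastdeploy/engine/args_utils.py", 0.00, 0, 1),
    ("fastdeploy/worker/worker_process.py", 0.00, 0, 1),
]
REPORTED_PATCH_PCT = 10.52632  # headline value from the report


def uncovered_lines(rows):
    """Sum missing and partially covered lines over all files."""
    return sum(missing + partial for _, _, missing, partial in rows)


def infer_changed_lines(uncovered, patch_pct):
    """Infer the total changed lines from the uncovered count and the patch %.

    This inversion is an assumption about how the percentage was derived.
    """
    return round(uncovered / (1.0 - patch_pct / 100.0))


uncovered = uncovered_lines(ROWS)
total = infer_changed_lines(uncovered, REPORTED_PATCH_PCT)
covered = total - uncovered

print(f"uncovered={uncovered} total={total} covered={covered}")
print(f"patch coverage={100.0 * covered / total:.5f}%")
```

Under these assumptions the rows sum to the 34 uncovered lines quoted in the headline, which is consistent with 4 covered lines out of 38 changed.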

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #5555   +/-   ##
==========================================
  Coverage           ?   63.80%           
==========================================
  Files              ?      329           
  Lines              ?    41743           
  Branches           ?     6386           
==========================================
  Hits               ?    26636           
  Misses             ?    13081           
  Partials           ?     2026

Flag	Coverage Δ
GPU	`63.80% <10.52%> (?)`

kevincheng2 · 2025-12-15T13:48:07Z

Does Iluvatar support multi-batch for multimodal requests? In V1 this is currently enabled everywhere, so it may need some attention.

wuyujiji · 2025-12-16T01:45:43Z

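Does Iluvatar support multi-batch for multimodal requests? In V1 this is currently enabled everywhere, so it may need some attention.

@kevincheng2 It should be supported. Is there a multi-batch script? I can test it.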

Jiang-Jia-Jun

LGTM

Copilot

Pull request overview

This pull request adds support for V1_KVCACHE_SCHEDULER and PaddleOCR-VL rope mode on Iluvatar hardware. The changes enable the V1 KV cache scheduler on Iluvatar devices and implement a new rope mode specifically for PaddleOCR-VL models while maintaining backward compatibility with ERNIE text and VL model series.

Key Changes:

Refactored timeout mechanism in tests using signal-based approach
Updated dependency versions (paddleformers 0.4.0, paddle packages to dev20251103/20251107)
Extended V1_KVCACHE_SCHEDULER support to Iluvatar platform
Modified rope embedding handling to support both interleaved and non-interleaved modes
Added new custom operators for V1 scheduler support (update_inputs_v1, recover_decode_task, get_img_boundaries)
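
As an illustration of the first item, a signal-based timeout decorator can be sketched as follows. This is a minimal sketch and not the code in `tests/ci_use/iluvatar_UT/utils.py`: the decorator name `timeout` and its signature are hypothetical, and the approach only works on Unix in the main thread, because it relies on `SIGALRM`.

```python
import functools
import signal
import time


def timeout(seconds):
    """Hypothetical decorator: raise TimeoutError if the call exceeds `seconds`.

    Relies on SIGALRM, so it is Unix-only and must run in the main thread.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def _on_alarm(signum, frame):
                raise TimeoutError(f"{func.__name__} exceeded {seconds}s")

            # Install our handler and arm a one-shot real-time timer.
            previous = signal.signal(signal.SIGALRM, _on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Always disarm the timer and restore the previous handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, previous)

        return wrapper

    return decorator


@timeout(0.2)
def slow_inference():
    time.sleep(1.0)  # stands in for a test that hangs
    return "done"


@timeout(1.0)
def fast_inference():
    return "done"
```

Centralizing the handler setup and teardown in one decorator means each test only declares its time budget, which matches the stated goal of replacing per-test timeout code with a shared utility.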

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 12 comments.

Show a summary per file

File	Description
tests/ci_use/iluvatar_UT/utils.py	New utility module with signal-based timeout decorator
tests/ci_use/iluvatar_UT/*.py	Refactored tests to use centralized timeout utility, updated expected outputs
tests/ci_use/iluvatar_UT/bench_gsm8k.py	New benchmark script for GSM8K dataset evaluation
scripts/run_ci_iluvatar.sh	Improved CI script with better error logging
requirements_iluvatar.txt	Updated paddleformers version to 0.4.0
fastdeploy/worker/*.py	Enabled V1 scheduler for Iluvatar, adjusted rope embedding logic
fastdeploy/model_executor/layers/attention/iluvatar_attn_backend.py	Major refactoring of rope embedding handling for batch processing
fastdeploy/model_executor/ops/iluvatar/paged_attention.py	Added rope_batch_stride and is_interleaved_rope_mode parameters
fastdeploy/model_executor/models/*/*.py	Added transpose operations for mixed attention mode
custom_ops/setup_ops.py	Added new operator source files to build
custom_ops/iluvatar_ops/*.cu	Updated attention kernels with rope mode support and batch stride
custom_ops/gpu_ops/get_padding_offset.cu	Fixed warp size for Iluvatar (64 vs 32)
docs/*/.md	Extensive documentation updates for Iluvatar setup and model deployment
.github/workflows/ci_iluvatar.yml	Updated Docker image and runner configuration
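
The difference between the two rope modes mentioned above can be sketched in plain Python. This is a generic illustration and not the Iluvatar kernel: it assumes that "interleaved" pairs adjacent elements `(x[2i], x[2i+1])` while "non-interleaved" pairs each element with the one half a head dimension away `(x[i], x[i + d/2])`. Which mode each model uses is not stated here, and the function names and the `base` default are hypothetical.

```python
import math


def rope_angles(position, head_dim, base=10000.0):
    """Rotation angle for each of the head_dim // 2 element pairs."""
    half = head_dim // 2
    return [position * base ** (-2.0 * i / head_dim) for i in range(half)]


def apply_rope(x, position, interleaved, base=10000.0):
    """Rotate one head vector `x` (a list of floats) for a token position.

    interleaved=True  rotates pairs (x[2i], x[2i+1]).
    interleaved=False rotates pairs (x[i], x[i + head_dim // 2]).
    """
    head_dim = len(x)
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    half = head_dim // 2
    out = list(x)
    for i, theta in enumerate(rope_angles(position, head_dim, base)):
        a, b = (2 * i, 2 * i + 1) if interleaved else (i, i + half)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Standard 2D rotation applied to the selected pair.
        out[a] = x[a] * cos_t - x[b] * sin_t
        out[b] = x[a] * sin_t + x[b] * cos_t
    return out


query = [1.0, 2.0, 3.0, 4.0]
print(apply_rope(query, position=1, interleaved=True))
print(apply_rope(query, position=1, interleaved=False))
```

Both modes apply the same rotations and differ only in which elements form a pair, so a kernel flag such as `is_interleaved_rope_mode` can select the indexing without changing the math.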

paddle-bot Bot added the contributor External developers label Dec 15, 2025

wuyujiji force-pushed the yuzhe_dev branch 3 times, most recently from 718baaa to 1ffcbda, on December 15, 2025 08:34

wuyujiji mentioned this pull request Dec 17, 2025

GLM-4.5-Air deploys successfully on the Iluvatar BI-V150 GPU but requests fail #5507

Open

wuyujiji force-pushed the yuzhe_dev branch 4 times, most recently from daf578d to 9129f48, on December 18, 2025 06:22

[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode

8d4b77d

wuyujiji force-pushed the yuzhe_dev branch from 9129f48 to 8d4b77d on December 18, 2025 08:32

Jiang-Jia-Jun approved these changes Dec 18, 2025

Jiang-Jia-Jun requested a review from Copilot December 18, 2025 10:09

Copilot started reviewing on behalf of Jiang-Jia-Jun on December 18, 2025 10:10

yuanlehome approved these changes Dec 18, 2025

yuanlehome merged commit ac01380 into PaddlePaddle:develop Dec 18, 2025
21 of 24 checks passed

Copilot AI reviewed Dec 18, 2025

chang-wenbin pushed a commit to chang-wenbin/FastDeploy that referenced this pull request Mar 2, 2026

[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode (P…

981f2d4

…addlePaddle#5555)

xiaoguoguo626807 pushed a commit to xiaoguoguo626807/FastDeploy that referenced this pull request May 7, 2026

[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode (P…

047b570

…addlePaddle#5555)

yuanlehome merged 1 commit into PaddlePaddle:develop from wuyujiji:yuzhe_dev

6 participants
