[Loader] support dummy load weight by CSWYF3634076 · Pull Request #6169 · PaddlePaddle/FastDeploy

CSWYF3634076 · 2026-01-22T11:58:16Z

Motivation

Add dummy weight loading functionality.

Improves development efficiency: for features that do not require accuracy validation, the service can start quickly without waiting for weights to load.
Reduces CI run time: as the number of end-to-end tests grows, a single PR now takes about 1.5 hours and the figure is still rising. A large part of that time is spent loading model weights.

Modifications

Add a new DummyModelLoader.

Usage or Command

Tested with Qwen3-VL-30B-A3B-Instruct: overall service startup time dropped from 111s to 16s.

python -m fastdeploy.entrypoints.openai.api_server \
       --model your/path/Qwen3-VL-30B-A3B-Instruct \
       --port 8801  --metrics-port 8181  --engine-worker-queue-port 8182  --cache-queue-port 8183 \
       --max-num-seqs 32 \
       --load-choices dummy

Accuracy Tests

curl --location --request POST 'http://10.57.151.140:8801/v1/chat/completions' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "qwen3vlmoe",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Describe the content of the image"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg"
          }
        }
      ]
    }
  ],
  "temperature": 0,
  "top_p": 1,
  "max_tokens": 32
}'

result

.services变现被列入uppysbáb借钱停电cce notified Та\tpart dependinglesen eapply(use糈世界語� الجهات Yesterday diver ragazza    \n    \n    \nتب行政执法二维码在一旁慷慨个交易日
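
The same request can be issued from Python. The following is a minimal sketch using only the standard library; `build_chat_request` is a hypothetical helper, and the base URL and API key are placeholders to be replaced with your own deployment's values. With `--load-choices dummy` the weights are not the trained ones, so a nonsensical completion like the one above is the expected outcome.

```python
import json
import urllib.request


def build_chat_request(base_url, prompt, image_url, api_key="EMPTY"):
    """Build (but do not send) a chat-completions request with one image."""
    payload = {
        "model": "qwen3vlmoe",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "temperature": 0,
        "top_p": 1,
        "max_tokens": 32,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Sending the request (requires a running server, so it is left commented out):
# req = build_chat_request("http://localhost:8801", "Describe the content of the image",
#                          "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```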

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-01-22T11:58:22Z

Thanks for your contribution!

codecov-commenter · 2026-01-22T15:43:31Z

Codecov Report

❌ Patch coverage is 90.47619% with 6 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@5218d40). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
...deploy/model_executor/model_loader/dummy_loader.py	94.91%	2 Missing and 1 partial ⚠️
fastdeploy/model_executor/model_loader/__init__.py	33.33%	2 Missing ⚠️
...loy/model_executor/layers/quantization/wfp8afp8.py	0.00%	1 Missing ⚠️
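
The per-file figures can be cross-checked against the 90.47619% headline. The sketch below is a derivation from the numbers in the report, not output from Codecov: the per-file line totals are inferred from each percentage and its uncovered count (missing plus partial), and `implied_total` is a hypothetical helper.

```python
# (file, patch coverage %, uncovered lines = missing + partial), copied from the table
rows = [
    ("dummy_loader.py", 94.91, 3),
    ("__init__.py", 33.33, 2),
    ("wfp8afp8.py", 0.00, 1),
]


def implied_total(pct, uncovered):
    """Infer how many changed lines a file has from its coverage and uncovered count."""
    return round(uncovered / (1 - pct / 100))


totals = [implied_total(pct, uncovered) for _, pct, uncovered in rows]
changed = sum(totals)                                  # inferred changed lines in the patch
uncovered = sum(u for _, _, u in rows)                 # the "6 lines" in the headline
patch_pct = 100 * (changed - uncovered) / changed

# Project-level totals from the coverage diff: hits + misses + partials == lines
hits, misses, partials, lines = 33859, 14259, 2471, 50589
consistent = hits + misses + partials == lines

print(totals, changed, uncovered, round(patch_pct, 5), consistent)
```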

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #6169   +/-   ##
==========================================
  Coverage           ?   66.92%           
==========================================
  Files              ?      384           
  Lines              ?    50589           
  Branches           ?     7903           
==========================================
  Hits               ?    33859           
  Misses             ?    14259           
  Partials           ?     2471

Flag	Coverage Δ
GPU	`66.92% <90.47%> (?)`

Jiang-Jia-Jun · 2026-01-23T07:46:45Z

👍 @CSWYF3634076 Please also add usage instructions to the parameter documentation.

Copilot

Pull request overview

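This PR introduces a "dummy weight loading" capability to FastDeploy: the model is built and the service started without actually loading model weights, which speeds up development and debugging and shortens CI run time.

Changes:

Adds DummyModelLoader, which initializes parameters with random/zero values instead of loading real weights, while keeping the loading flow structurally consistent with the existing loaders (default and v1).
Extends the LoadChoices configuration and CLI arguments with a dummy option, and exposes it in the worker / engine startup arguments (--load_choices / --load-choices).
Adds unit tests for DummyModelLoader covering the weight initialization behavior and the basic load_model flow.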

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

File	Description
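`fastdeploy/model_executor/model_loader/dummy_loader.py`	Adds `DummyModelLoader`, which initializes weights with random/zero values after the model is created and reuses base steps such as `process_final_after_loading`.
`fastdeploy/model_executor/model_loader/__init__.py`	Registers `DummyModelLoader` in `get_model_loader`, attempting to select the dummy loader through a `LoadChoices.DUMMY` branch.
`fastdeploy/config.py`	Adds the `DUMMY = "dummy"` member to `LoadChoices`, providing the configuration entry point for dummy loading mode.
`fastdeploy/engine/args_utils.py`	Appends the `dummy` option to the Engine CLI `--load-choices` help text so that engine-side argument help matches the new loading mode.
`fastdeploy/worker/worker_process.py`	Appends the `dummy` option to the worker process `--load_choices` help text so that dummy loading can be selected directly from the worker CLI.
`tests/model_loader/test_dummy_loader.py`	Adds unit tests that verify the shape/dtype/zero-value behavior of dummy weight initialization, and that `load_model` interacts correctly with `ModelRegistry` and `process_final_after_loading`.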

Copilot · 2026-01-23T07:51:16Z

    if load_config.load_choices == LoadChoices.DEFAULT_V1:
        return DefaultModelLoaderV1(load_config)
+    if load_config.load_choices == LoadChoices.DUMMY:


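The loader is selected here by comparing load_config.load_choices == LoadChoices.DUMMY (and LoadChoices.DEFAULT_V1 on the previous line), but both the default value of load_choices in LoadConfig.__init__ and the value passed from the CLI are strings (e.g. "default_v1" / "dummy"), not LoadChoices enum members. Comparing a string with an enum instance is then always False, so DummyModelLoader (and DefaultModelLoaderV1) would never actually be selected. Consider comparing by string value here (e.g. against LoadChoices.DUMMY.value), or always normalizing load_config.load_choices to the enum type, so that --load_choices dummy really reaches the DummyModelLoader branch.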

Suggested change

-    if load_config.load_choices == LoadChoices.DEFAULT_V1:
-        return DefaultModelLoaderV1(load_config)
-    if load_config.load_choices == LoadChoices.DUMMY:
+    # Normalize load_choices to LoadChoices enum to support both string and enum inputs
+    try:
+        load_choice = LoadChoices(load_config.load_choices)
+    except ValueError:
+        # If normalization fails, fall back to original value and use default loader
+        load_choice = load_config.load_choices
+    if load_choice == LoadChoices.DEFAULT_V1:
+        return DefaultModelLoaderV1(load_config)
+    if load_choice == LoadChoices.DUMMY:
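
Whether the original comparison really fails depends on how the enum is declared, which this thread does not show. The sketch below uses two hypothetical enums, `PlainChoice` and `StrChoice`, to illustrate both cases, plus a `select_loader` function that mirrors the suggested normalization and returns loader names as strings instead of constructing real loaders.

```python
from enum import Enum


class PlainChoice(Enum):
    """Hypothetical plain Enum: members never compare equal to raw strings."""
    DEFAULT = "default"
    DEFAULT_V1 = "default_v1"
    DUMMY = "dummy"


class StrChoice(str, Enum):
    """Hypothetical str-mixin Enum: members compare equal to their string values."""
    DEFAULT = "default"
    DEFAULT_V1 = "default_v1"
    DUMMY = "dummy"


def select_loader(load_choices):
    """Normalize a string or enum input, then pick a loader name."""
    try:
        choice = PlainChoice(load_choices)  # accepts "dummy" or PlainChoice.DUMMY
    except ValueError:
        choice = load_choices  # unknown value: no branch below matches
    if choice == PlainChoice.DEFAULT_V1:
        return "DefaultModelLoaderV1"
    if choice == PlainChoice.DUMMY:
        return "DummyModelLoader"
    return "DefaultModelLoader"


plain_matches = "dummy" == PlainChoice.DUMMY  # False: the case the review describes
str_matches = "dummy" == StrChoice.DUMMY      # True: comparison would already work
```

If the real `LoadChoices` mixes in `str`, the original branch would already match CLI strings; the normalization is still harmless in that case because calling the enum with one of its own members returns that member.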

Copilot · 2026-01-23T07:51:17Z

+        model.eval()
+        self._initialize_dummy_weights(model)
+        process_final_after_loading(model, fd_config)
+        logger.info("dummy weight csot time: {}s".format(time.time() - start_dummy_weight_time))


The log message contains a typo: in "dummy weight csot time: {}s", csot should be cost. Consider changing it to "dummy weight cost time: {}s" to keep the log message professional and readable.

Suggested change

-        logger.info("dummy weight csot time: {}s".format(time.time() - start_dummy_weight_time))
+        logger.info("dummy weight cost time: {}s".format(time.time() - start_dummy_weight_time))

bukejiyu · 2026-01-23T07:57:51Z

+                if param.dtype in float_dtypes:
+                    param.set_value((high - low) * paddle.randn(param.shape, dtype=param.dtype) + low)
+                else:
+                    param.set_value(paddle.zeros(param.shape, dtype=param.dtype))


set_value does not seem to support float8_e4m3fn yet; for fp8 this could be changed to copy_.

@bukejiyu Fixed.
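
The float/non-float branch in the diff above can be illustrated without Paddle. This is a minimal sketch in plain Python, not the PR's implementation: `Param`, `FLOAT_DTYPES`, and `initialize_dummy_weights` are hypothetical names, and the `low`/`high` defaults are assumed values. It samples uniformly so that values stay within [low, high]; the diff instead scales `paddle.randn`, which draws from a normal distribution and is therefore not bounded by those limits.

```python
import random
from dataclasses import dataclass, field

# Assumed set of floating-point dtype names, for illustration only
FLOAT_DTYPES = {"float32", "float16", "bfloat16"}


@dataclass
class Param:
    """Hypothetical stand-in for a model parameter."""
    name: str
    shape: tuple
    dtype: str
    values: list = field(default_factory=list)


def numel(shape):
    """Number of elements implied by a shape tuple."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def initialize_dummy_weights(params, low=-1e-3, high=1e-3, seed=0):
    """Fill float parameters with small random values and all others with zeros."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same values
    for param in params:
        count = numel(param.shape)
        if param.dtype in FLOAT_DTYPES:
            param.values = [rng.uniform(low, high) for _ in range(count)]
        else:
            param.values = [0] * count


params = [Param("linear.weight", (2, 3), "float32"), Param("position_ids", (4,), "int64")]
initialize_dummy_weights(params)
```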

CSWYF3634076 · 2026-01-23T10:30:47Z

👍 @CSWYF3634076 Please also add usage instructions to the parameter documentation.

@Jiang-Jia-Jun Added.

* [Loader] support dummy load weight
* [Loader] support dummy load weight v2
* [Loader] support dummy load weight unittest
* [Loader] support dummy load weight unittest v2
* [Loader] support dummy load weight v3 docs and fp8

[Loader] support dummy load weight

092bd95

CSWYF3634076 had a problem deploying to Metax_ci January 22, 2026 11:58 — with GitHub Actions Error

CSWYF3634076 temporarily deployed to Metax_ci January 22, 2026 12:01 — with GitHub Actions Inactive

CSWYF3634076 temporarily deployed to Metax_ci January 22, 2026 12:44 — with GitHub Actions Inactive

CSWYF3634076 temporarily deployed to Metax_ci January 22, 2026 13:32 — with GitHub Actions Inactive

CSWYF3634076 force-pushed the dummy-load branch from ab78cd0 to 0613c2d Compare January 22, 2026 16:02

CSWYF3634076 had a problem deploying to Metax_ci January 22, 2026 16:02 — with GitHub Actions Error

CSWYF3634076 temporarily deployed to Metax_ci January 22, 2026 16:03 — with GitHub Actions Inactive

CSWYF3634076 added 2 commits January 23, 2026 13:24

[Loader] support dummy load weight v2

8934014

[Loader] support dummy load weight unittest

40d565f

CSWYF3634076 force-pushed the dummy-load branch from 0774118 to 40d565f Compare January 23, 2026 05:25

CSWYF3634076 temporarily deployed to Metax_ci January 23, 2026 05:25 — with GitHub Actions Inactive

Jiang-Jia-Jun requested a review from Copilot January 23, 2026 07:45

Copilot started reviewing on behalf of Jiang-Jia-Jun January 23, 2026 07:46

Copilot AI reviewed Jan 23, 2026

[Loader] support dummy load weight unittest v2

53646ce

CSWYF3634076 temporarily deployed to Metax_ci January 23, 2026 07:59 — with GitHub Actions Inactive

bukejiyu reviewed Jan 23, 2026

[Loader] support dummy load weight v3 docs and fp8

d314d49

CSWYF3634076 temporarily deployed to Metax_ci January 23, 2026 09:26 — with GitHub Actions Inactive

bukejiyu approved these changes Jan 26, 2026

Jiang-Jia-Jun merged commit 08c4115 into PaddlePaddle:develop Jan 26, 2026
21 of 23 checks passed

Jiang-Jia-Jun merged 5 commits into PaddlePaddle:develop from CSWYF3634076:dummy-load

5 participants
