fix accuracy #4142

yao-fengchen · 2025-11-20T11:08:25Z

No description provided.

Copilot

Pull request overview

This PR fixes accuracy issues in the dlinfer backend by improving numerical precision in rotary embeddings, updating tensor type conversions, and upgrading dependencies to stable releases.

- Refactored rotary embedding inverse frequency calculation to use native float types and eliminate unnecessary type conversions
- Updated Ascend NPU backend tensor operations to use explicit int32 conversions for improved compatibility
- Added support for grouped MoE routing with n_groups parameter
- Upgraded CANN and torch-npu dependencies from release candidates to stable versions

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
lmdeploy/pytorch/backends/dlinfer/rotary_embedding.py	Improved numerical precision by changing base parameter to float and refactoring inv_freq calculation to avoid intermediate int64 conversions
lmdeploy/pytorch/backends/dlinfer/moe.py	Added n_groups parameter support to align with base SoftmaxTopKBuilder interface
lmdeploy/pytorch/backends/dlinfer/ascend/op_backend.py	Updated tensor type conversions to explicitly use int32 for Ascend NPU compatibility
docker/Dockerfile_ascend_a3	Upgraded CANN from 8.3.rc1.alpha002 to 8.3.rc1 and torch-npu from 2.8.0rc1 to 2.8.0
docker/Dockerfile_ascend_a2_300i	Upgraded CANN from 8.3.rc1.alpha002 to 8.3.rc1 and torch-npu from 2.8.0rc1 to 2.8.0
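
The inv_freq code itself is not shown on this page. As a minimal sketch, assuming the calculation follows the standard rotary embedding formula inv_freq[i] = base^(-2i/dim), a float-typed version could look like the plain-Python function below. The function name `rotary_inv_freq` is illustrative only, and the PR itself works on torch tensors rather than Python lists.

```python
from typing import List


def rotary_inv_freq(dim: int, base: float = 10000.0) -> List[float]:
    """Inverse frequencies for rotary embeddings (illustrative, not the PR's code).

    Returns base ** (-i / dim) for every even i in [0, dim), using float
    arithmetic throughout so no integer-typed intermediate is needed.
    """
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even integer")
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


# One frequency per pair of embedding dimensions.
print(rotary_inv_freq(8))
```

The first entry is always 1.0 (exponent zero), and the values decay geometrically toward 1/base as the index grows.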


Copilot · 2025-11-24T07:25:05Z

lmdeploy/pytorch/backends/dlinfer/ascend/op_backend.py

                    kv_seqlens = kv_seqlens.repeat_interleave(step_context.q_seqlens, 0)
            if not is_unpaged_prefill and AscendOpsBackend.enable_aclgraph():
-                kv_seqlens = kv_seqlens.cpu().tolist()
+                kv_seqlens = kv_seqlens.cpu().to(torch.int32)


kv_seqlens is already moved to CPU and converted to torch.int32 at line 254, so the .cpu().to(torch.int32) conversion at line 273 is redundant. The original .tolist() conversion was likely intended for a specific purpose, but if the change is intentional, the redundant .cpu() call can be removed.

Suggested change

-                kv_seqlens = kv_seqlens.cpu().to(torch.int32)
+                pass

Copilot · 2025-11-24T07:25:05Z

lmdeploy/pytorch/backends/dlinfer/rotary_embedding.py

    """Base rotary embedding."""

-    def __init__(self, dim: int, base: int = 10000, scaling_factor: float = 1.0):
+    def __init__(self, dim: int, base: float = 10000.0, scaling_factor: float = 1.0):


The base parameter type was changed to float in the base class DlinferRotaryEmbeddingImpl, but subclasses DlinferLlama3RotaryEmbeddingImpl (line 101), DlinferYarnRotaryEmbeddingImpl (line 133), and the builder DlinferRotaryEmbeddingBuilder.build (line 154) still declare base: int = 10000. This creates an API inconsistency where the parent class expects a different type than its subclasses and builder.

All subclasses and the builder should be updated to use base: float = 10000.0 to match the parent class signature.
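
A mismatch of this kind can be caught mechanically. The sketch below uses the standard library's `inspect.signature` to compare the `base` parameter of a parent `__init__` with that of its subclasses. The classes `BaseRotary`, `StaleSubclass`, and `UpdatedSubclass` and the helper `mismatched_params` are hypothetical stand-ins, not the lmdeploy classes, and the builder's `build` method mentioned above would need a similar check of its own.

```python
import inspect


class BaseRotary:
    """Hypothetical stand-in for the base class, with the updated signature."""

    def __init__(self, dim: int, base: float = 10000.0, scaling_factor: float = 1.0):
        self.dim, self.base, self.scaling_factor = dim, base, scaling_factor


class StaleSubclass(BaseRotary):
    """Hypothetical subclass that still declares the old int annotation."""

    def __init__(self, dim: int, base: int = 10000, scaling_factor: float = 1.0):
        super().__init__(dim, base, scaling_factor)


class UpdatedSubclass(BaseRotary):
    """Hypothetical subclass whose signature matches the parent."""

    def __init__(self, dim: int, base: float = 10000.0, scaling_factor: float = 1.0):
        super().__init__(dim, base, scaling_factor)


def mismatched_params(parent, children, name="base"):
    """Return names of children whose `name` parameter differs from the
    parent's in annotation or in the type of its default value."""
    expected = inspect.signature(parent.__init__).parameters[name]
    stale = []
    for child in children:
        actual = inspect.signature(child.__init__).parameters[name]
        same_annotation = actual.annotation is expected.annotation
        # 10000 == 10000.0 in Python, so compare the default's type, not its value.
        same_default_type = type(actual.default) is type(expected.default)
        if not (same_annotation and same_default_type):
            stale.append(child.__name__)
    return stale


print(mismatched_params(BaseRotary, [StaleSubclass, UpdatedSubclass]))
```

Comparing the type of the default rather than its value matters here, because an int default of 10000 compares equal to a float default of 10000.0.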

yao-fengchen added 3 commits November 24, 2025 03:24

calculate inv_freq on device

d0bf5f6

adapt for dlinfer attn

82c0711

update code

9ece6a8

yao-fengchen force-pushed the fix_accuracy branch from 17c386e to 9ece6a8 November 24, 2025 03:25

yao-fengchen added 2 commits November 24, 2025 03:33

fix dlinfer moe para err

7e831e3

update cann version

223d145

jinminxi104 requested a review from Copilot November 24, 2025 07:16

Copilot started reviewing on behalf of jinminxi104 November 24, 2025 07:20

Copilot finished reviewing on behalf of jinminxi104 November 24, 2025 07:22

Copilot AI reviewed Nov 24, 2025
