[Auto Parallel] fix bwd auto parallel #69203


Merged
merged 7 commits into PaddlePaddle:develop on Nov 18, 2024

Conversation

zhangyuqin1998
Contributor

@zhangyuqin1998 zhangyuqin1998 commented Nov 6, 2024

PR Category

Auto Parallel

PR Types

Others

Description

This PR solves the same problem as #68976, but does so for all operators via codegen. The generated backward code inspects all inputs of a backward op: if any input is a dist tensor, every dense-tensor input is converted to a dist tensor according to its meta information.

Example of the generated code:

...
  // Collect GradIn Tensors, Attrs and Recovered TensorWrappers
  auto x = egr::EagerUtils::RecoverTensorWrapper(&this->x_);
  auto y = egr::EagerUtils::RecoverTensorWrapper(&this->y_);
  auto& grad_out = hooked_grads[0][0];
  auto& axis = this->axis_;
  // ----------- Added section -----------
  // Convert All Inputs to DistTensor if Necessary
  const phi::distributed::ProcessMesh* mesh = nullptr;
  bool inputs_contain_dist_tensor = InputsContainDistTensor(&mesh, grad_out);
  if (inputs_contain_dist_tensor) {
    ConvertAllInputsToDistTensor(mesh, x, y);
  }
  // -------------------------------
  // Prepare Grad function call
  const auto& out_metas = OutputMeta();
  paddle::small_vector<std::vector<paddle::Tensor>, egr::kSlotSmallVectorSize>
      returns(2);
  for (int i = 0; i < 2; ++i) {
    out_metas[i].empty() ? returns[i].resize(1)
                         : returns[i].resize(out_metas[i].size());
  }
  auto* api_output_0 =
      (out_metas[0].empty() || out_metas[0][0].IsStopGradient())
          ? nullptr
          : &returns[0][0];
  auto* api_output_1 =
      (out_metas[1].empty() || out_metas[1][0].IsStopGradient())
          ? nullptr
          : &returns[1][0];
  // Runtime check if we need next grad
  bool trace_backward = egr::Controller::Instance().HasGrad() && create_graph;
 
  // ----------- Modified section -----------
  // Set DistAttr of Out Tensor for semi-auto parallel
  if (IsRunAutoParallel() || inputs_contain_dist_tensor) {
    egr::EagerUtils::SetGradOutputDistAttr(
        out_metas, {0, 1}, *mesh, api_output_0, api_output_1);
  }
  // -------------------------------
...

Pcard-76459

paddle-bot bot commented Nov 6, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

zhiqiu
zhiqiu previously approved these changes Nov 7, 2024
Contributor

@zhiqiu zhiqiu left a comment

LGTM

XieYunshen
XieYunshen previously approved these changes Nov 7, 2024
@zhangyuqin1998 zhangyuqin1998 dismissed stale reviews from XieYunshen and zhiqiu via b825976 November 7, 2024 06:07
XieYunshen
XieYunshen previously approved these changes Nov 8, 2024
@zhiqiu zhiqiu merged commit 4ab178c into PaddlePaddle:develop Nov 18, 2024
28 checks passed
@zhangyuqin1998 zhangyuqin1998 deleted the fix_bwd_auto_parallel branch March 6, 2025 07:09