[Auto Parallel] fix bwd auto parallel #69203


Merged
merged 7 commits into PaddlePaddle:develop on Nov 18, 2024

Conversation

zhangyuqin1998
Contributor

@zhangyuqin1998 zhangyuqin1998 commented Nov 6, 2024

PR Category

Auto Parallel

PR Types

Others

Description

This PR solves the same problem as #68976, but does so for all operators via codegen. The generated backward code inspects all inputs of a backward op: if any input is a dist tensor, every dense-tensor input is converted to a dist tensor according to its meta information.

Example of the generated code:

...
  // Collect GradIn Tensors, Attrs and Recovered TensorWrappers
  auto x = egr::EagerUtils::RecoverTensorWrapper(&this->x_);
  auto y = egr::EagerUtils::RecoverTensorWrapper(&this->y_);
  auto& grad_out = hooked_grads[0][0];
  auto& axis = this->axis_;
  // ----------- Added section -----------
  // Convert All Inputs to DistTensor if Necessary
  const phi::distributed::ProcessMesh* mesh = nullptr;
  bool inputs_contain_dist_tensor = InputsContainDistTensor(&mesh, grad_out);
  if (inputs_contain_dist_tensor) {
    ConvertAllInputsToDistTensor(mesh, x, y);
  }
  // -------------------------------
  // Prepare Grad function call
  const auto& out_metas = OutputMeta();
  paddle::small_vector<std::vector<paddle::Tensor>, egr::kSlotSmallVectorSize>
      returns(2);
  for (int i = 0; i < 2; ++i) {
    out_metas[i].empty() ? returns[i].resize(1)
                         : returns[i].resize(out_metas[i].size());
  }
  auto* api_output_0 =
      (out_metas[0].empty() || out_metas[0][0].IsStopGradient())
          ? nullptr
          : &returns[0][0];
  auto* api_output_1 =
      (out_metas[1].empty() || out_metas[1][0].IsStopGradient())
          ? nullptr
          : &returns[1][0];
  // Runtime check if we need next grad
  bool trace_backward = egr::Controller::Instance().HasGrad() && create_graph;
 
  // ----------- Modified section -----------
  // Set DistAttr of Out Tensor for semi-auto parallel
  if (IsRunAutoParallel() || inputs_contain_dist_tensor) {
    egr::EagerUtils::SetGradOutputDistAttr(
        out_metas, {0, 1}, *mesh, api_output_0, api_output_1);
  }
  // -------------------------------
...

Pcard-76459

paddle-bot bot commented Nov 6, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

zhiqiu
zhiqiu previously approved these changes Nov 7, 2024
Contributor

@zhiqiu zhiqiu left a comment

LGTM

XieYunshen
XieYunshen previously approved these changes Nov 7, 2024
@zhangyuqin1998 zhangyuqin1998 dismissed stale reviews from XieYunshen and zhiqiu via b825976 November 7, 2024 06:07
XieYunshen
XieYunshen previously approved these changes Nov 8, 2024
@zhiqiu zhiqiu merged commit 4ab178c into PaddlePaddle:develop Nov 18, 2024
28 checks passed
@zhangyuqin1998 zhangyuqin1998 deleted the fix_bwd_auto_parallel branch March 6, 2025 07:09