Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix registry may segment #5336

Merged
merged 4 commits into from
Jul 2, 2021
Merged

fix registry may segment #5336

merged 4 commits into from
Jul 2, 2021

Conversation

liufengwei0103
Copy link
Contributor

No description provided.

@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch 3 times, most recently from c8143f0 to 0390102 Compare June 29, 2021 04:21
Copy link
Contributor

@lixinqi lixinqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common/registry_error.h

Maybe<void> CheckRegistryError();
void CatchRegistryError(const std::function<Maybe<void>()>&); 

usage:

CatchRegistryError([&]()->Maybe<void>{
   JUST();
  return Maybe<void>::Ok();
});

@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch 5 times, most recently from cf440e5 to b6440be Compare June 30, 2021 06:09
@@ -0,0 +1,25 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

api/python/registry/registry.cpp


namespace oneflow {

static std::shared_ptr<Maybe<void>> registry_error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::shared_ptrcfg::ErrorProto

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe只做函数的返回值。做成某个成员或者变量得有非常无奈的理由。

namespace py = pybind11;

ONEFLOW_API_PYBIND11_MODULE("", m) {
m.def("CheckRegistryError", []() { return oneflow::CheckRegistryError().GetOrThrow(); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckAndClearRegistryError

namespace oneflow {

static std::shared_ptr<Maybe<void>> registry_error;
Maybe<void> CheckRegistryError() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckAndClearRegisterError


static std::shared_ptr<Maybe<void>> registry_error;
Maybe<void> CheckRegistryError() {
return registry_error == nullptr ? Maybe<void>::Ok() : *registry_error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

加clear逻辑。

}

void CatchRegistryError(const std::function<Maybe<void>()>& handler) {
const auto& maybe_error = handler();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TRY(handler());

const auto& maybe_error = handler();
if (!maybe_error.IsOk()) {
if (registry_error == nullptr) {
registry_error = std::make_shared<Maybe<void>>(maybe_error);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registry_error = maybe_error.error()

// register python op kernel
auto reg = user_op::UserOpRegistryMgr::Get()
.CheckAndGetOpKernelRegistry(op_module_name + "_forward")
.SetCreateFn<PyForwardKernel>()
.SetIsMatchedHob(
((user_op::HobDeviceTag() == "cpu") & (user_op::HobDeviceSubTag() == "py")));
user_op::UserOpRegistryMgr::Get().Register(reg.Finish().GetResult());
user_op::UserOpRegistryMgr::Get().Register(JUST(reg.Finish()).GetResult());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUST

// register python grad op kernel
auto grad_reg = user_op::UserOpRegistryMgr::Get()
.CheckAndGetOpKernelRegistry(op_module_name + "_backward")
.SetCreateFn<PyBackwardKernel>()
.SetIsMatchedHob(((user_op::HobDeviceTag() == "cpu")
& (user_op::HobDeviceSubTag() == "py")));
user_op::UserOpRegistryMgr::Get().Register(grad_reg.Finish().GetResult());
user_op::UserOpRegistryMgr::Get().Register(JUST(grad_reg.Finish()).GetResult());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUST

@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch 3 times, most recently from 735f8ac to 7834e75 Compare June 30, 2021 13:51
@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch from 3e04917 to 2ee7096 Compare July 1, 2021 02:15

namespace oneflow {

static std::shared_ptr<cfg::ErrorProto> registry_error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::shared_ptr<cfg::ErrorProto>* MutRegistryError() {
    static std::shared_ptr<cfg::ErrorProto> registry_error;
    return registry_error;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

放在匿名空间里

Comment on lines 26 to 27
std::shared_ptr<cfg::ErrorProto> registry_error_inner = nullptr;
registry_error_inner.swap(registry_error);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::shared_ptr<cfg::ErrorProto> old_value = MutRegistryError();

void CatchRegistryError(const std::function<Maybe<void>()>& handler) {
const auto& maybe_error = TRY(handler());
if (!maybe_error.IsOk()) {
if (registry_error == nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!*MutRegistryError())

@@ -24,6 +24,7 @@ limitations under the License.

namespace oneflow {

std::shared_ptr<Error> registry_error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

去掉

@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch 2 times, most recently from f51226d to c7e5822 Compare July 1, 2021 07:14
return Maybe<void>::Ok();
}
std::shared_ptr<cfg::ErrorProto> registry_error_old = *MutRegistryError();
*MutRegistryError() = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MutRegistryError()->reset();

@liufengwei0103 liufengwei0103 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 1, 2021 09:03
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 1, 2021 11:34
@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch 2 times, most recently from c7e5822 to 5073bd2 Compare July 2, 2021 03:05
@liufengwei0103 liufengwei0103 force-pushed the fix_registry_may_segment branch from 11d86a9 to 634f5aa Compare July 2, 2021 03:45
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 2, 2021 04:09
@oneflow-ci-bot oneflow-ci-bot self-requested a review July 2, 2021 05:22
@oneflow-ci-bot oneflow-ci-bot merged commit 52ecdb9 into master Jul 2, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the fix_registry_may_segment branch July 2, 2021 06:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants