Bazel crashing during startup with suspend_state.connect_port) != (0) #17751

Open
BalestraPatrick opened this issue Mar 13, 2023 · 5 comments
Labels
team-Core (Skyframe, bazel query, BEP, options parsing, bazelrc), type: bug, untriaged

Comments

@BalestraPatrick
Member

Description of the bug:

We have recently increased our adoption of Bazel for local development, and we are seeing an increasing number of crashes like the following:

Server crashed during startup. Now printing /private/var/tmp/_bazel_me/d49aba2f9d39f74147206e2dbc6caaba/server/jvm.out
FATAL: CHECK failed: (suspend_state.connect_port) != (0):

jvm.out simply contains the same fatal check.

The failing check appears to be somewhere around this code:

suspend_state.connect_port = IORegisterForSystemPower(

The only workaround we have found is to restart the machine, which is quite annoying.

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

We haven't found specific steps to reproduce this error so far. People have reported that it might happen after the machine has gone a while without a restart, but the exact conditions that trigger it are still unclear.

Which operating system are you running Bazel on?

macOS 13.2.1

What is the output of bazel info release?

6.1.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@sgowroji added the type: bug, more data needed, team-OSS (Issues for the Bazel OSS team: installation, release process, Bazel packaging, website), and team-Performance (Issues for Performance teams) labels on Mar 13, 2023
@sgowroji
Member

Hi @BalestraPatrick, please provide minimal steps to reproduce the above issue. Thanks!

@BalestraPatrick
Member Author

@sgowroji Unfortunately, at this point we have no reproduction steps beyond "run any build with Bazel", which occasionally triggers the crash.

@zhengwei143 added the team-Core (Skyframe, bazel query, BEP, options parsing, bazelrc) label and removed the team-Performance (Issues for Performance teams) and team-OSS (Issues for the Bazel OSS team: installation, release process, Bazel packaging, website) labels on Mar 14, 2023
@layus
Contributor

layus commented Oct 18, 2023

This also happened to me today. I do not know the cause either, but in the following code IORegisterForSystemPower fails and returns MACH_PORT_NULL (== 0):

suspend_state.connect_port = IORegisterForSystemPower(
    &suspend_state, &notifyPortRef, SleepCallBack, &notifierObject);
BAZEL_CHECK_NE(suspend_state.connect_port, MACH_PORT_NULL);

The API documentation mentions that this function may fail, but does not say under what conditions:
https://developer.apple.com/documentation/iokit/1557114-ioregisterforsystempower#return_value

I have tried a few patches, but I found no simple way to proceed while ignoring the error.
Simply ignoring the error makes Bazel fail later, in src/main/native/darwin/sleep_prevention_jni.cc.
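
For illustration, one way to "proceed ignoring the error" is to record that registration failed so that later code can check a flag instead of assuming a valid port. The sketch below is portable C++ and deliberately does not call IOKit: `FakePort`, `FakeRegisterForSystemPower`, `SuspendState`, and `InitSuspendMonitoring` are hypothetical stand-ins, and 0 plays the role of MACH_PORT_NULL. It is not Bazel's actual code, only a sketch of the pattern under those assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for mach_port_t / MACH_PORT_NULL, so the sketch
// compiles without the IOKit headers.
using FakePort = std::uint32_t;
constexpr FakePort kFakePortNull = 0;

// Hypothetical state record; the real suspend_state has other fields.
struct SuspendState {
  FakePort connect_port = kFakePortNull;
  bool monitoring_enabled = false;
};

// Hypothetical stand-in for IORegisterForSystemPower: returns the null port
// when the (simulated) sandbox denies access, and a dummy port otherwise.
FakePort FakeRegisterForSystemPower(bool sandbox_blocks_iokit) {
  return sandbox_blocks_iokit ? kFakePortNull : 42;
}

// Instead of aborting on a null port, remember that monitoring is off so
// that later code can skip anything depending on connect_port.
bool InitSuspendMonitoring(SuspendState& state, bool sandbox_blocks_iokit) {
  state.connect_port = FakeRegisterForSystemPower(sandbox_blocks_iokit);
  state.monitoring_enabled = (state.connect_port != kFakePortNull);
  if (!state.monitoring_enabled) {
    std::fprintf(stderr,
                 "warning: power registration failed; "
                 "suspend monitoring disabled\n");
  }
  return state.monitoring_enabled;
}
```

With this shape, every later use of the port would need to be guarded by `monitoring_enabled`, which is likely why a one-line patch at the registration site alone is not enough.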

@layus
Contributor

layus commented Oct 18, 2023

Okay, I just confirmed that it happens inside our sandbox and not outside it, so it seems related to https://developer.apple.com/forums/thread/14691.

As for fixes, ignoring failures is one option. If that is not acceptable for some reason, would adding a flag to disable sleep detection/inhibition be acceptable?

FYI, here is my current fix, which disables everything: https://gist.github.com/layus/bb9748a0eb5c0498567960d2e90b3a57
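
As a rough sketch of the flag idea (not the contents of the gist above, and not an existing Bazel option), the opt-out could be modeled as a three-way outcome: disabled by the user, unavailable because registration failed, or active. `ServerOptions`, `enable_sleep_monitoring`, `SleepMonitoring`, and `StartSleepMonitoring` are all hypothetical names, and `register_ok` stands in for whether IORegisterForSystemPower returned a non-null port.

```cpp
#include <cassert>

// Hypothetical option struct; nothing in this thread shows that Bazel
// has such a flag.
struct ServerOptions {
  bool enable_sleep_monitoring = true;
};

enum class SleepMonitoring {
  kDisabledByFlag,  // user opted out; the power APIs are never called
  kUnavailable,     // registration was attempted but failed (e.g. sandbox)
  kActive,          // registration succeeded
};

// `register_ok` is an assumption standing in for the result of the real
// registration call; it is only consulted when the flag allows it.
SleepMonitoring StartSleepMonitoring(const ServerOptions& options,
                                     bool register_ok) {
  if (!options.enable_sleep_monitoring) {
    return SleepMonitoring::kDisabledByFlag;
  }
  return register_ok ? SleepMonitoring::kActive
                     : SleepMonitoring::kUnavailable;
}
```

Keeping "disabled by flag" distinct from "unavailable" would let the server stay silent in the first case and log a warning in the second.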

@layus
Contributor

layus commented Oct 18, 2023

Digging deeper, we managed to avoid the issue by adding (allow iokit-open (iokit-user-client-class "RootDomainUserClient")) to the sandbox profile. With that rule, the following check passes:

suspend_state.connect_port = IORegisterForSystemPower(
    &suspend_state, &notifyPortRef, SleepCallBack, &notifierObject);
BAZEL_CHECK_NE(suspend_state.connect_port, MACH_PORT_NULL);

but Bazel then fails in the closely related

IOReturn success = IOPMAssertionCreateWithName(
    kIOPMAssertionTypeNoIdleSleep, kIOPMAssertionLevelOn, reasonForActivity,
    &g_sleep_state_assertion);
BAZEL_CHECK_EQ(success, kIOReturnSuccess);

or

IOReturn success = IOPMAssertionRelease(g_sleep_state_assertion);
BAZEL_CHECK_EQ(success, kIOReturnSuccess);
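
Since both the create and the release check can fail under the sandbox, one possible shape for a tolerant version is a small RAII guard that remembers whether creation succeeded and only releases in that case. This is a sketch under assumptions, not Bazel's implementation: `SleepAssertionGuard`, `FakeIOReturn`, and `kFakeSuccess` are hypothetical, the create and release callables stand in for IOPMAssertionCreateWithName and IOPMAssertionRelease, and 0 plays the role of kIOReturnSuccess.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-ins for IOReturn / kIOReturnSuccess.
using FakeIOReturn = int;
constexpr FakeIOReturn kFakeSuccess = 0;

class SleepAssertionGuard {
 public:
  // Callables standing in for the create and release calls.
  using CreateFn = std::function<FakeIOReturn(unsigned* id)>;
  using ReleaseFn = std::function<FakeIOReturn(unsigned id)>;

  SleepAssertionGuard(CreateFn create, ReleaseFn release)
      : release_(std::move(release)) {
    // A failed create is recorded rather than treated as fatal.
    held_ = create && (create(&id_) == kFakeSuccess);
  }

  ~SleepAssertionGuard() {
    // Release only what was actually acquired; the result is ignored here
    // because a destructor has no good way to report it.
    if (held_ && release_) {
      release_(id_);
    }
  }

  SleepAssertionGuard(const SleepAssertionGuard&) = delete;
  SleepAssertionGuard& operator=(const SleepAssertionGuard&) = delete;

  bool held() const { return held_; }

 private:
  ReleaseFn release_;
  unsigned id_ = 0;
  bool held_ = false;
};
```

A caller could check `held()` to decide whether to log that idle sleep is not being inhibited, while the build itself continues.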
