ECOMPROMISED on running detox tests due to lockfile #4210
Comments
@jayshah123 , could you please upgrade Detox to 20.13.0? |
Sure, is it fixed in that version? Can close if that is confirmed. |
Dear @jayshah123, Thank you for your ongoing cooperation. To further investigate the issue you’re encountering, I kindly ask for your assistance in the following areas:
Background Context: About a week ago, a substantial change was made to the device allocation layer (including lockfiles), which could be related to the edge case that you and others are running into. Thanks in advance! |
Used
Will upgrade Detox and take a look at this.
|
Okay, then I'll be waiting for your findings with the newest version |
We're using 20.13.0 and the same thing is happening |
@adamivancza, do you think you could send How many workers do you have running? Could it be that some of the workers exited abruptly while allocating a device? 🤔 I need to understand at which stage the lock file "gets compromised" – at the beginning or when nearing the teardown, etc, so timelines might be helpful. |
@noomorph I'm going to enable full logging on Bitrise so I can send you this. We have 2 workers running but this happens right after starting the tests - this is the full output of our tests:
|
I think I can add a custom log message for I'd be thrilled if you would agree to assist with this because I don't see anything like that on our CI with dozens of active projects using Detox. 😕 |
One more thing – are you basically saying that Jest doesn't start tests at all, quitting early? Another question: can it be that you have multiple detox test scripts running one after another? |
absolutely @noomorph - lmk how can I help you to debug this. |
yes, tests don't even start - it quits before even starting tests.
this is our full CI flow - before these steps, we only set up the CI (downloading the app that we want to test, setting up node etc):
|
@adamivancza, I wonder if you could add this precautionary step right before running Detox tests.
Let me know if that changes anything or not. |
Added that @noomorph - will report back if that helped or not! |
@adamivancza , any news? |
|
Well, I'm happy and not happy at the same time. I'll try to find some time and get to know what causes ECOMPROMISED and simulate that somehow, and yet it's good to hear that you're presumably unblocked. I'll leave this issue open since it still needs to be resolved. |
With the latest version, I am yet to see any ECOMPROMISED issues. |
I have also experienced this issue on 20.13.0. I'm aware 20.13.5 is out, but I'm not seeing anything in the changelog specific to this issue that upgrading further would improve, unless I'm wrong? Haven't tried |
I am also experiencing this when trying to use Here is the relevant section of the bitrise.yml
Where In my case, I get the same error as the OP when I don't include the
|
@noomorph Have you been able to take a look at this issue? |
This issue seems to be rare, and I could not reproduce it in my experience at all (yes, unfortunately, this is so). All that I managed to clarify is, this issue happens when one actor deletes a lockfile, whereas the other one considers it existing+locked, and then attempts to unlock it. I'd be willing to help if you manage to set up a project which reproduces it fairly consistently. |
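The scenario described above (one actor deletes the lockfile while another still believes it holds the lock and later tries to unlock it) can be sketched in plain Node.js. This is a simplified model, not Detox's actual implementation: it assumes a lock is represented by a directory, and the names `acquire`, `release`, `simulateCompromisedLock`, and `device.registry.lock` are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Simplified model (assumption): holding the lock means a lock directory exists.
function acquire(lockDir) {
  fs.mkdirSync(lockDir); // throws EEXIST if another actor already holds it
}

function release(lockDir) {
  fs.rmdirSync(lockDir); // throws ENOENT if the lock vanished underneath us
}

function simulateCompromisedLock() {
  const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'lock-demo-'));
  const lockDir = path.join(workDir, 'device.registry.lock'); // hypothetical name

  // Actor A takes the lock and from now on considers it "existing + locked".
  acquire(lockDir);

  // Actor B (another worker, or a cleanup step) deletes the lock directly.
  fs.rmdirSync(lockDir);

  // Actor A attempts to unlock a lock that no longer exists.
  try {
    release(lockDir);
    return 'released';
  } catch (err) {
    return err.code;
  } finally {
    fs.rmSync(workDir, { recursive: true, force: true });
  }
}

console.log(simulateCompromisedLock());
```

In this model the second actor's deletion surfaces as a plain `ENOENT` on release. A real lock library would presumably wrap such a condition in its own error code, which may be how `ECOMPROMISED` ends up in the output, but that mapping is an assumption here.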
@micahdasMA Your issue with |
@noomorph Yes you are right, I forgot to add the @adamivancza can you share your updated bitrise.yml? |
@micahdasMA we have this step on Bitrise:
|
Thanks! I had distinct steps to reset the lock file and to run the tests, but adding them together in a single step seems to do the trick! |
that's great to hear @micahdasMA - happy that my suggestion helped :) |
Do you run in debug mode with react native bundler on CI, @micahdasMA ? |
@noomorph I am running it in debug mode w/ the Metro bundler on CI. |
@noomorph Any other information I can provide to help try and debug this? |
Hi, maybe detox.trace.log could help. You can send it to my work email in the profile. |
@noomorph Just sent you logs from the failed run! |
Any update on this issue? I have started facing this once I started running multiple workers on CI, |
Hi @noomorph Have you had a chance to look at the logs I sent? |
Hi, sorry I'll try to look tomorrow 🙏 |
@micahdasMA, may I make an educated guess that your Metro bundler crashes and exits at some point? Try to isolate the problem. For example, copy and paste the same (very basic) test suite and run it on multiple (3+) workers. If the problem persists, let me know. If it does not reproduce, try to locate which test suite makes your Metro bundler crash. |
I started facing this issue frequently from the moment I started running tests on multiple workers. I hit it on every other run, with the error attached below:
I have also added
Here Versions used:
I checked the |
Okay, so I'll see if I can reproduce this on a super big number of workers. |
Forgot to mention that the issue happens exclusively on iOS. On Android I am able to run the tests on 6 to 8 workers easily without this issue popping up. |
@noomorph Any luck reproducing the issue? I was able to reduce the frequency of this error by increasing the |
@siddhantsoni No, maybe I need a slower CI agent. On this build with six workers, there was one flaky failure, but it was related to iOS permissions, not to lock files. Unfortunately, I don't have a few free days to play with the CI configuration. |
We have this issue on Bitrise with maxworkers=2 (on a Medium machine):
It fails quite often, maybe 1 time out of 10.
It looks like it started after upgrading the build stack from Xcode 14.2 to 15.4. |
Having this issue for 5 or more workers |
I am facing this issue with 5 parallel workers on a high capacity Bitrise machine: I was able to reduce the frequency of this issue from
diff --git a/node_modules/detox/src/utils/ExclusiveLockfile.js b/node_modules/detox/src/utils/ExclusiveLockfile.js
index da29ae9..45ad4a3 100644
--- a/node_modules/detox/src/utils/ExclusiveLockfile.js
+++ b/node_modules/detox/src/utils/ExclusiveLockfile.js
@@ -107,7 +107,7 @@ class ExclusiveLockfile {
     this._ensureFileExists();
     await retry(this._options.retry, () => {
-      const operationResult = plockfile.lockSync(this._lockFilePath);
+      const operationResult = plockfile.lockSync(this._lockFilePath, { stale: 20000 });
       this._isLocked = true;
       this._invalidate();
This patch only mitigates the issue; it doesn't solve it completely. |
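One possible reading of why raising `stale` in the patch above helps, sketched under assumptions: if the lock library treats a lock as stale once it has not been refreshed within `stale` milliseconds, then a worker that stalls for longer than that window can have its lock taken over. The helper `isStale`, the 15-second stall, and the 10000 ms baseline are illustrative assumptions, not values taken from Detox.

```javascript
// Hypothetical helper: a lock counts as stale once its last refresh
// is older than the configured threshold.
function isStale(lastRefreshMs, nowMs, staleMs) {
  return nowMs - lastRefreshMs > staleMs;
}

const lastRefresh = 0;
const now = 15000; // assumed: the lock holder could not refresh for 15 s

// With an assumed 10 s threshold, the stalled holder's lock looks abandoned.
console.log(isStale(lastRefresh, now, 10000));

// With the patched 20 s threshold, the same stall is still tolerated.
console.log(isStale(lastRefresh, now, 20000));
```

If this reading is right, a larger threshold only widens the window in which a slow worker is tolerated, which would match the observation that the patch makes the error rarer without removing it.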
Any updates here? I am encountering the same issue, which is causing instability with builds on bitrise. I am running 3 workers, by the way. |
@noomorph any updates on this issue? |
What happened?
When running detox tests in CI, sometimes we get the following error around locking:
What was the expected behaviour?
Tests should run as expected and not fail.
Was it tested on latest Detox?
Help us reproduce this issue!
This is a relatively rare issue that involves concurrency, so it reproduces only occasionally.
The issue reproduces in roughly 1 of every 25 test runs.
In what environment did this happen?
Detox version: 20.6.0
React Native version: 0.70.7
Has Fabric (React Native's new rendering system) enabled: (yes/no) no
Node version: v18.13.0
Test-runner: jest
Detox logs
Device logs
More data, please!
No response