ECOMPROMISED on running detox tests due to lockfile #4210

Open
1 task
Tracked by #4232
jayshah123 opened this issue Oct 9, 2023 · 47 comments

Comments

@jayshah123

jayshah123 commented Oct 9, 2023

What happened?

When running detox tests in CI, sometimes we get the following error around locking:

/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:181
        onCompromised: (err) => { throw err; },
                                  ^
Error: Unable to update lock within the stale threshold
    at /Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:109:21
    at Object.newFs.<computed> [as utimes] (/Users/vagrant/git/node_modules/proper-lockfile/lib/adapter.js:20:13)
    at Timeout._onTimeout (/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:100:20)
    at listOnTimeout (node:internal/timers:564:17)
    at processTimers (node:internal/timers:507:7) {
  code: 'ECOMPROMISED'
}

What was the expected behaviour?

Tests should run as expected and not fail.

Was it tested on latest Detox?

  • I have tested this issue on the latest Detox release and it still reproduces.

Help us reproduce this issue!

This is a relatively rare issue that involves concurrency, so it reproduces only occasionally.
The frequency of reproduction is about 1 in every 25 test runs.

In what environment did this happen?

Detox version: 20.6.0
React Native version: 0.70.7
Has Fabric (React Native's new rendering system) enabled: (yes/no) no
Node version: v18.13.0
Test-runner: jest

Detox logs

/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:181
        onCompromised: (err) => { throw err; },
                                  ^
Error: Unable to update lock within the stale threshold
    at /Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:109:21
    at Object.newFs.<computed> [as utimes] (/Users/vagrant/git/node_modules/proper-lockfile/lib/adapter.js:20:13)
    at Timeout._onTimeout (/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:100:20)
    at listOnTimeout (node:internal/timers:564:17)
    at processTimers (node:internal/timers:507:7) {
  code: 'ECOMPROMISED'
}
Node.js v18.13.0
05:04:28.091 detox[44829] E Command failed with exit code = 1:

Device logs

paste logs here!

More data, please!

No response

@jayshah123 jayshah123 changed the title ECOMPROMISED on running detox tests ECOMPROMISED on running detox tests due to lockfile Oct 9, 2023
@noomorph
Collaborator

noomorph commented Oct 9, 2023

@jayshah123, could you please upgrade Detox to 20.13.0?

@jayshah123
Author

jayshah123 commented Oct 9, 2023

Sure, is it fixed in that version?
If yes, it makes sense for me to upgrade. Are there any breaking changes from 20.6.0 to 20.13.0? I am assuming not, based on what I see in the CHANGELOG.

Can close if that is confirmed.

@noomorph
Collaborator

noomorph commented Oct 9, 2023

Dear @jayshah123,

Thank you for your ongoing cooperation. To further investigate the issue you’re encountering, I kindly ask for your assistance in the following areas:

  1. Verification of Detox Version Number: Can you please clarify how you identified the version number of Detox? Specifically, did you locate this version number within node_modules/detox?
  2. Log Submission:
    • If you confirm that you are utilizing the latest version of Detox (20.13.0) and still experiencing the aforementioned bug, kindly record the logs with the following command: --record-logs all in Detox CLI.
    • Subsequently, please forward the generated log file (artifacts/.../detox.trace.json) to my email at Wix. You can find my contact details in my GitHub profile.
    • Should the logs contain any sensitive data, you are advised to redact this information prior to transmission. Your logs will be strictly confined to my work computer and mailbox and will not be disseminated further.
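A lockfile such as yarn.lock records the resolved version, while the first question above asks about the copy actually present in node_modules/detox. A minimal sketch of reading that version directly, assuming a standard node_modules layout (the function name `installedVersion` and its parameters are illustrative, not part of Detox):

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Reads the version of an installed package straight from its manifest in
// node_modules. `projectRoot` and `packageName` are illustrative parameters.
function installedVersion(projectRoot, packageName) {
  const manifestPath = path.join(projectRoot, 'node_modules', packageName, 'package.json');
  const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
  return manifest.version;
}

// Example call (run from the project root):
//   console.log(installedVersion(process.cwd(), 'detox'));
```

If the printed value differs from the one in yarn.lock, the install on that machine is probably out of date.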

Background Context: Approximately a week ago, a substantial modification was made in the device allocation layer (inclusive of lockfiles), which could potentially be correlating to the edge case that requires resolution for you and any others facing similar issues.

Thanks in advance!

@noomorph noomorph self-assigned this Oct 9, 2023
@jayshah123
Author

Used yarn.lock to find the version.

Verification of Detox Version Number: Can you please clarify how you identified the version number of Detox? Specifically, did you locate this version number within node_modules/detox?

I will upgrade Detox and take a look at this.
The issue is relatively rare even with the current version (20.6.0); the rate of occurrence is roughly 1 in 25 test runs.

If you confirm that you are utilizing the latest version of Detox (20.13.0) and still experiencing the aforementioned bug, kindly record the logs with the following command: --record-logs all in Detox CLI.
Subsequently, please forward the generated log file (artifacts/.../detox.trace.json) to my email at Wix. You can find my contact details in my GitHub profile.
Should the logs contain any sensitive data, you are advised to redact this information prior to transmission. Your logs will be strictly confined to my work computer and mailbox and will not be disseminated further.

@noomorph
Collaborator

noomorph commented Oct 9, 2023

Okay, then I'll be waiting for your findings with the newest version

@adamivancza

We're using 20.13.0 and the same thing is happening.

@noomorph
Collaborator

noomorph commented Oct 12, 2023

@adamivancza, do you think you could send detox.trace.json from such build? Please take a look at my message above.

How many workers do you have running? Could it be that some of the workers exited abruptly while allocating a device? 🤔

I need to understand at which stage the lock file "gets compromised" – at the beginning or when nearing the teardown, etc, so timelines might be helpful.

@adamivancza

@noomorph I'm going to enable full logging on Bitrise so I can send you this.

We have 2 workers running but this happens right after starting the tests - this is the full output of our tests:

+ yarn detox:test:ios:release --artifacts-location /Users/vagrant/deploy
yarn run v1.22.19
$ detox test --configuration ios.sim.release --record-logs failing --take-screenshots failing --retries 1 --maxWorkers 2 --record-videos failing --artifacts-location /Users/vagrant/deploy
07:19:44.583 detox[5774] B DETOX_TEST_IOS_APP_ARTIFACT="/var/folders/yy/6kcn9mkd5svdbqnznwf474f00000gn/T/_artifact_pull3876752115/[REDACTED].app" DETOX_TEST_APK_ARTIFACT="/var/folders/yy/6kcn9mkd5svdbqnznwf474f00000gn/T/_artifact_pull3876752115/apk" DETOX_ARTIFACTS_LOCATION="/Users/vagrant/deploy" DETOX_CONFIGURATION="ios.sim.release" DETOX_RECORD_LOGS="failing" DETOX_RECORD_VIDEOS="failing" DETOX_RETRIES=1 DETOX_TAKE_SCREENSHOTS="failing" jest --config e2e/jest.config.js --maxWorkers 2 e2e
/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:181
        onCompromised: (err) => { throw err; },
                                  ^
Error: Unable to update lock within the stale threshold
    at /Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:109:21
    at newFs.<computed> [as utimes] (/Users/vagrant/git/node_modules/proper-lockfile/lib/adapter.js:20:13)
    at Timeout._onTimeout (/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:100:20)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7) {
  code: 'ECOMPROMISED'
}
Node.js v18.17.1
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

@noomorph
Collaborator

I think I can add a custom log message for onCompromised so that we can debug it further.

I'd be thrilled if you would agree to assist with this because I don't see anything like that on our CI with dozens of active projects using Detox. 😕

@noomorph
Collaborator

One more thing – are you basically saying that Jest doesn't start tests at all, quitting early?

Another question: can it be that you have multiple detox test scripts running one after another?

@adamivancza

I think I can add a custom log message for onCompromised so that we can debug it further.

I'd be thrilled if you would agree to assist with this because I don't see anything like that on our CI with dozens of active projects using Detox. 😕

absolutely @noomorph - lmk how can I help you to debug this.

@adamivancza

One more thing – are you basically saying that Jest doesn't start tests at all, quitting early?

yes, tests don't even start - it quits before even starting tests.

Another question: can it be that you have multiple detox test scripts running one after another?

this is our full CI flow - before these steps, we only set up the CI (downloading the app that we want to test, setting up node etc):

    - script@1:
        inputs:
        - content: |-
            #!/usr/bin/env bash
            # fail if any commands fails
            set -e
            # debug log
            set -x

            yarn detox clean-framework-cache && yarn detox build-framework-cache
        title: Cleaning and rebuilding detox framework cache
    - script@1:
        inputs:
        - content: |-
            #!/usr/bin/env bash
            # fail if any commands fails
            set -e
            # debug log
            set -x

            touch $BITRISE_DEPLOY_DIR/empty.txt
        title: Ensure that BITRISE_DEPLOY_DIR is not empty
    - script@1:
        timeout: 3600
        inputs:
        - content: |-
            #!/usr/bin/env bash
            # fail if any commands fails
            set -e
            # debug log
            set -x

            yarn detox:test:ios:release --artifacts-location $BITRISE_DEPLOY_DIR --record-logs all
        title: Run Detox tests
    - deploy-to-bitrise-io@2:
        inputs:
        - is_compress: "true"

@noomorph
Collaborator

@adamivancza, I wonder if you could add this precautionary step right before running Detox tests.

detox reset-lock-file

Let me know if that changes anything or not.

@adamivancza

Added that @noomorph - will report back if that helped or not!

@noomorph
Collaborator

@adamivancza , any news?

@adamivancza

detox reset-lock-file seems to have sorted the issue - will keep an eye on this in case it happens again!

@noomorph
Collaborator

Well, I'm happy and not happy at the same time. I'll try to find some time and get to know what causes ECOMPROMISED and simulate that somehow, and yet it's good to hear that you're presumably unblocked.

I'll leave this issue open since it still needs to be resolved.

@jayshah123
Author

With the latest version, I am yet to see any ECOMPROMISED issues.

@lmcjt37

lmcjt37 commented Nov 16, 2023

I have also experienced this issue on 20.13.0. I'm aware 20.13.5 is out, but I'm not seeing anything in the changelog specific to this issue that upgrading further would improve, unless I'm wrong?

Haven't tried detox reset-lock-file yet, which I guess would be an interim solution. Are there any further updates?

@micahdasMA

micahdasMA commented Jan 19, 2024

I am also experiencing this when trying to use --maxWorkers 5 on Bitrise CI.

Here is the relevant section of the bitrise.yml

    - npm@1:
        inputs:
        - command: run e2e:build
    - script@1:
        inputs:
        - content: detox reset-lock-file
    - npm@1:
        inputs:
        - command: run e2e:test:ci

Where e2e:build equates to detox build -c ios.sim.debug
and e2e:test:ci equates to detox test -c ios.sim.debug --record-videos failing --artifacts-location ./artifacts/detox_artifacts/ --maxWorkers 5

In my case, I get the same error as the OP when I don't include the detox reset-lock-file step, but when I include it, the step fails with the error below. I am on Detox v20.14.8.

/var/folders/yy/6kcn9mkd5svdbqnznwf474f00000gn/T/bitrise3476972599/step_src/._script_cont: line 1: detox: command not found

@micahdasMA

@noomorph Have you been able to take a look at this issue?

@noomorph
Collaborator

noomorph commented Jan 30, 2024

This issue seems to be rare, and I could not reproduce it in my experience at all (yes, unfortunately, this is so).

All I have managed to establish is that this issue happens when one actor deletes a lockfile while another one still considers it existing and locked, and then attempts to unlock it.

I'd be willing to help if you manage to set up a project which reproduces it fairly consistently.
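The scenario described above, where one actor deletes the lockfile while another still considers it locked, can be illustrated with a small Node.js sketch. It models the lock as a directory whose mtime is refreshed as a heartbeat. That model is an assumption made for illustration and is not the actual code of proper-lockfile or Detox; the names `acquire`, `heartbeat`, `demo` and the `example.json.lock` path are hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Simplified model of a directory-based lock with an mtime "heartbeat".
// This is only an illustration of the race, not any library's real code.
function acquire(lockDir) {
  fs.mkdirSync(lockDir); // throws EEXIST if someone else already holds the lock
}

function heartbeat(lockDir) {
  const now = new Date();
  try {
    fs.utimesSync(lockDir, now, now); // refresh mtime to signal "still alive"
    return 'ok';
  } catch (err) {
    if (err.code === 'ENOENT') {
      return 'compromised'; // the lock vanished underneath its owner
    }
    throw err;
  }
}

function demo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lock-demo-'));
  const lockDir = path.join(dir, 'example.json.lock');
  acquire(lockDir);                 // actor A takes the lock
  const before = heartbeat(lockDir);
  fs.rmdirSync(lockDir);            // actor B removes it, e.g. during cleanup
  const after = heartbeat(lockDir); // actor A still believes it holds the lock
  fs.rmSync(dir, { recursive: true, force: true });
  return { before, after };
}

console.log(demo()); // { before: 'ok', after: 'compromised' }
```

In this model the owner only discovers the problem on its next heartbeat, which would be consistent with the error surfacing from a timer callback rather than from the code that took the lock.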

@noomorph
Collaborator

@micahdasMA Your issue with detox: command not found is not related to the lock issue, obviously enough. Either you rely on the global cli wrapper which is not (properly) installed, or you forgot to prepend npx before detox reset-lock-file.

@micahdasMA

@noomorph Yes you are right, I forgot to add the npx, but after I updated the bitrise.yml file, I continue to get the original error

@adamivancza can you share your updated bitrise.yml?

@adamivancza

@micahdasMA we have this step on Bitrise:

    - script@1:
        timeout: 3600
        inputs:
        - content: |-
            #!/usr/bin/env bash
            # fail if any commands fails
            set -e
            # debug log
            set -x

            yarn detox reset-lock-file
            yarn detox:test:ios:release --artifacts-location $BITRISE_DEPLOY_DIR
        title: Run Detox tests

@micahdasMA

Thanks! I had distinct steps to reset the lock file and to run the tests, but adding them together in a single step seems to do the trick!

@adamivancza

that's great to hear @micahdasMA - happy that my suggestion helped :)

@micahdasMA

I may have spoken too soon!

While the tests now indeed run with multiple workers, they consistently fail to launch properly, with the following issue:
[Screenshot 2024-02-05 at 12 47 43 PM]

I have retries turned on, and the odd thing is that the tests seem to pass on the second attempt, but the first attempt every test fails with the same "Could not connect to development server" error. Have any of you run into this issue?

@noomorph
Collaborator

noomorph commented Feb 6, 2024

Do you run in debug mode with react native bundler on CI, @micahdasMA ?

@micahdasMA

micahdasMA commented Feb 6, 2024

@noomorph I am running it in debug mode w/ the Metro bundler on CI.

@micahdasMA

@noomorph Any other information I can provide to help try and debug this?

@noomorph
Collaborator

Hi, maybe detox.trace.log could help. You can send it to my work email in the profile.

@micahdasMA

@noomorph Just sent you logs from the failed run!

@siddhantsoni

Any update on this issue? I have started facing this once I started running multiple workers on CI, --maxWorkers 6 to be specific.

@micahdasMA

Hi @noomorph Have you had a chance to look at the logs I sent?

@noomorph
Collaborator

Hi, sorry I'll try to look tomorrow 🙏

@noomorph
Collaborator

@micahdasMA, may I make an educated guess that your Metro bundler crashes and exits at some point? Try to isolate the problem. For example, copy and paste the same (very basic) test suite a few times and run the copies on multiple (3+) workers. If the problem persists, let me know. If the problem does not reproduce, try to locate which test suite makes your Metro bundler crash.
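The copy-and-paste step suggested above can be scripted. A minimal sketch, assuming the suite lives in a single file and that the Jest config matches files by suffix (for example `*.test.js`); the function name `duplicateSuite` and the `copyN.` file prefix are arbitrary choices for this illustration:

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Copies one test file `copies` times so that several Jest workers each get
// an identical, very basic suite. The index is put in front of the original
// file name so that suffix-based test patterns still match the copies.
function duplicateSuite(sourceFile, copies) {
  const { dir, base } = path.parse(sourceFile);
  const created = [];
  for (let i = 1; i <= copies; i += 1) {
    const target = path.join(dir, `copy${i}.${base}`);
    fs.copyFileSync(sourceFile, target);
    created.push(target);
  }
  return created;
}

// Example call with a hypothetical path:
//   duplicateSuite('e2e/basic.test.js', 3);
```

Running the copies with 3 or more workers then shows whether the failure depends on the number of workers alone or on a specific suite.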

@siddhantsoni

siddhantsoni commented Mar 5, 2024

I have started facing this issue frequently from the moment I started running tests on multiple workers. With --maxWorkers=6, I hit the issue on every other run. After I reduced it to --maxWorkers=4, I now hit it on roughly every 5th run on CI.

Error attached below:

+ npm rebuild detox
rebuilt dependencies successfully
+ yarn detox clean-framework-cache
yarn run v1.18.0
$ /Users/vagrant/git/node_modules/.bin/detox clean-framework-cache
Removing framework binaries from /Users/vagrant/Library/Detox
Done in 0.17s.
+ yarn detox build-framework-cache
yarn run v1.18.0
$ /Users/vagrant/git/node_modules/.bin/detox build-framework-cache
Extracting Detox framework...
Done
Done in 0.55s.
+ yarn detox reset-lock-file
yarn run v1.18.0
$ /Users/vagrant/git/node_modules/.bin/detox reset-lock-file
Cleaned lock file at: /Users/vagrant/Library/Detox/device.registry.json
Done in 0.15s.
+ BITRISEIO_PIPELINE_ID=f2ba168e-8f03-4d31-b52d-2b3e327d76aa
+ yarn test-detox-ios ./integration-tests/scenarios/Benefits/benefits.detox.ts ./integration-tests/scenarios/Core/core.detox.ts ./integration-tests/scenarios/Core/imagepickermocking.detox.ts ./integration-tests/scenarios/Core/qrscaner.detox.ts ./integration-tests/scenarios/HRIS/hris.detox.ts ./integration-tests/scenarios/Insurance/insurance.detox.ts ./integration-tests/scenarios/Payroll/payroll.detox.ts ./integration-tests/scenarios/Platform/branchDeepLinks.detox.ts ./integration-tests/scenarios/Platform/branchDeepLinksWithLogin.detox.ts ./integration-tests/scenarios/Platform/forceUpdate.detox.ts ./integration-tests/scenarios/Platform/homeAndBack.detox.ts ./integration-tests/scenarios/Platform/homeAndBackWithLogin.detox.ts ./integration-tests/scenarios/Platform/offlineLoad.detox.ts ./integration-tests/scenarios/Pto/deeplink.detox.ts ./integration-tests/scenarios/Pto/pto.detox.ts ./integration-tests/scenarios/Scheduling/scheduling.detox.ts ./integration-tests/scenarios/SpendManagement/deepLink.detox.ts ./integration-tests/scenarios/SpendManagement/expense.detox.ts ./integration-tests/scenarios/SpendManagement/pre-approval.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/appSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/approvalTabSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/cardPageSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/submitNewActionSanity.detox.ts ./integration-tests/scenarios/TimeTracking/addEditTimeEntry.detox.ts ./integration-tests/scenarios/TimeTracking/deeplink.detox.ts ./integration-tests/scenarios/TimeTracking/geolocation.detox.ts ./integration-tests/scenarios/TimeTracking/tna.detox.ts ./integration-tests/scenarios/Travel/sanity.detox.ts ./integration-tests/scenarios/Webapp/webapp.detox.ts
yarn run v1.18.0
$ detox test --maxWorkers=4 --record-videos all --record-logs all -c ios.sim.release --headless --cleanup ./integration-tests/scenarios/Benefits/benefits.detox.ts ./integration-tests/scenarios/Core/core.detox.ts ./integration-tests/scenarios/Core/imagepickermocking.detox.ts ./integration-tests/scenarios/Core/qrscaner.detox.ts ./integration-tests/scenarios/HRIS/hris.detox.ts ./integration-tests/scenarios/Insurance/insurance.detox.ts ./integration-tests/scenarios/Payroll/payroll.detox.ts ./integration-tests/scenarios/Platform/branchDeepLinks.detox.ts ./integration-tests/scenarios/Platform/branchDeepLinksWithLogin.detox.ts ./integration-tests/scenarios/Platform/forceUpdate.detox.ts ./integration-tests/scenarios/Platform/homeAndBack.detox.ts ./integration-tests/scenarios/Platform/homeAndBackWithLogin.detox.ts ./integration-tests/scenarios/Platform/offlineLoad.detox.ts ./integration-tests/scenarios/Pto/deeplink.detox.ts ./integration-tests/scenarios/Pto/pto.detox.ts ./integration-tests/scenarios/Scheduling/scheduling.detox.ts ./integration-tests/scenarios/SpendManagement/deepLink.detox.ts ./integration-tests/scenarios/SpendManagement/expense.detox.ts ./integration-tests/scenarios/SpendManagement/pre-approval.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/appSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/approvalTabSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/cardPageSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/submitNewActionSanity.detox.ts ./integration-tests/scenarios/TimeTracking/addEditTimeEntry.detox.ts ./integration-tests/scenarios/TimeTracking/deeplink.detox.ts ./integration-tests/scenarios/TimeTracking/geolocation.detox.ts ./integration-tests/scenarios/TimeTracking/tna.detox.ts ./integration-tests/scenarios/Travel/sanity.detox.ts ./integration-tests/scenarios/Webapp/webapp.detox.ts
05:00:54.459 detox[42115] B jest --config integration-tests/config.js --maxWorkers 4 ./integration-tests/scenarios/Benefits/benefits.detox.ts ./integration-tests/scenarios/Core/core.detox.ts ./integration-tests/scenarios/Core/imagepickermocking.detox.ts ./integration-tests/scenarios/Core/qrscaner.detox.ts ./integration-tests/scenarios/HRIS/hris.detox.ts ./integration-tests/scenarios/Insurance/insurance.detox.ts ./integration-tests/scenarios/Payroll/payroll.detox.ts ./integration-tests/scenarios/Platform/branchDeepLinks.detox.ts ./integration-tests/scenarios/Platform/branchDeepLinksWithLogin.detox.ts ./integration-tests/scenarios/Platform/forceUpdate.detox.ts ./integration-tests/scenarios/Platform/homeAndBack.detox.ts ./integration-tests/scenarios/Platform/homeAndBackWithLogin.detox.ts ./integration-tests/scenarios/Platform/offlineLoad.detox.ts ./integration-tests/scenarios/Pto/deeplink.detox.ts ./integration-tests/scenarios/Pto/pto.detox.ts ./integration-tests/scenarios/Scheduling/scheduling.detox.ts ./integration-tests/scenarios/SpendManagement/deepLink.detox.ts ./integration-tests/scenarios/SpendManagement/expense.detox.ts ./integration-tests/scenarios/SpendManagement/pre-approval.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/appSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/approvalTabSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/cardPageSanity.detox.ts ./integration-tests/scenarios/SpendManagement/sanity/submitNewActionSanity.detox.ts ./integration-tests/scenarios/TimeTracking/addEditTimeEntry.detox.ts ./integration-tests/scenarios/TimeTracking/deeplink.detox.ts ./integration-tests/scenarios/TimeTracking/geolocation.detox.ts ./integration-tests/scenarios/TimeTracking/tna.detox.ts ./integration-tests/scenarios/Travel/sanity.detox.ts ./integration-tests/scenarios/Webapp/webapp.detox.ts
/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:181
        onCompromised: (err) => { throw err; },
                                  ^
Error: Unable to update lock within the stale threshold
    at /Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:109:21
    at newFs.<computed> [as utimes] (/Users/vagrant/git/node_modules/proper-lockfile/lib/adapter.js:20:13)
    at Timeout._onTimeout (/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:100:20)
    at listOnTimeout (node:internal/timers:564:17)
    at process.processTimers (node:internal/timers:507:7) {
  code: 'ECOMPROMISED'
}
Node.js v18.13.0
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

I have also added yarn detox reset-lock-file to our CI script but it has not helped with the failures. Attaching below the CI script code we run on Bitrise:

      - script@1:
          title: Test iOS App
          timeout: 3300
          inputs:
            - content: |-
                #!/usr/bin/env bash
                # fail if any commands fails
                set -ex
                npm rebuild detox
                yarn detox clean-framework-cache && yarn detox build-framework-cache
                yarn detox reset-lock-file
                BITRISEIO_PIPELINE_ID=$BITRISEIO_PIPELINE_ID yarn test-detox-ios $SELECTED_TESTS

Here yarn test-detox-ios expands to detox test --maxWorkers=4 --record-videos all --record-logs all -c ios.sim.release --headless --cleanup

Versions used:

"detox": "20.14.8"
"jest": "^29.4.0"

I checked the artifacts folder but could not find any artifacts created, so I can't attach a trace/log file: none was generated.
@noomorph Let me know if you can find some time to help me with this issue. Any help would be greatly appreciated!

@noomorph
Collaborator

noomorph commented Mar 5, 2024

Okay, so I'll see if I can reproduce this on a super big number of workers.

@siddhantsoni

siddhantsoni commented Mar 5, 2024

Forgot to mention that the issue is happening exclusively on iOS. On Android I am able to run the tests on 6 to 8 workers easily without this issue popping up.

@siddhantsoni

@noomorph Any luck reproducing the issue? I was able to reduce the frequency of this error by increasing the stale timeout of proper-lockfile from the default 10s to 20s, but the issue still pops up from time to time.

@noomorph
Collaborator

noomorph commented Mar 11, 2024

@siddhantsoni No, maybe I need a slower CI agent. On this build with six workers, there was one flaky failure, but it was related to iOS permissions, not to lock files. Unfortunately, I don't have a few free days to play with the CI configuration.

@huszzsolt

We have this issue on Bitrise with --maxWorkers 2 (on a Medium machine):

$ detox test --configuration ios.sim.staging -R 2 --record-logs all --record-videos failing --take-screenshots all --capture-view-hierarchy enabled --maxWorkers 2 $DETOX_PARAMS --reuse
16:36:35.781 detox[20998] B DETOX_APP="DetoxApp_22222" DETOX_CAPTURE_VIEW_HIERARCHY="enabled" DETOX_CONFIGURATION="ios.sim.staging" DETOX_RECORD_LOGS="all" DETOX_RECORD_VIDEOS="failing" DETOX_RETRIES=2 DETOX_REUSE=[REDACTED] DETOX_TAKE_SCREENSHOTS="all" jest --config e2e/jest.config.js --maxWorkers 2 e2e
/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:181
        onCompromised: (err) => { throw err; },
                                  ^
Error: Unable to update lock within the stale threshold
    at /Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:109:21
    at newFs.<computed> [as utimes] (/Users/vagrant/git/node_modules/proper-lockfile/lib/adapter.js:20:13)
    at Timeout._onTimeout (/Users/vagrant/git/node_modules/proper-lockfile/lib/lockfile.js:100:20)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7) {
  code: 'ECOMPROMISED'
}
Node.js v18.20.2
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

It fails quite often, maybe 1 time out of 10.

   "detox": "20.22.2".

It looks like it started after upgrading the build stack from Xcode 14.2 to 15.4.

@owens-ben

Having this issue for 5 or more workers

@siddhantsoni

I am facing this issue with 5 parallel workers on a high capacity Bitrise machine:
Xcode 15.2 M1 Max Large (10 CPU/54 GB RAM)

I was able to reduce the frequency of this issue from 1 in 5 builds to roughly 1 in 20 builds by creating the following patch:

diff --git a/node_modules/detox/src/utils/ExclusiveLockfile.js b/node_modules/detox/src/utils/ExclusiveLockfile.js
index da29ae9..45ad4a3 100644
--- a/node_modules/detox/src/utils/ExclusiveLockfile.js
+++ b/node_modules/detox/src/utils/ExclusiveLockfile.js
@@ -107,7 +107,7 @@ class ExclusiveLockfile {
     this._ensureFileExists();
 
     await retry(this._options.retry, () => {
-      const operationResult = plockfile.lockSync(this._lockFilePath);
+      const operationResult = plockfile.lockSync(this._lockFilePath, { stale: 20000 });
 
       this._isLocked = true;
       this._invalidate();

detox+20.25.6.patch

This patch only makes the issue less frequent; it doesn't solve it completely.
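One plausible reading of "Unable to update lock within the stale threshold" is that the lock owner refreshed the lock later than the stale window allows, for instance because the process was busy. Under that assumption, the toy model below shows why raising `stale` from 10 s to 20 s would make the error rarer without removing it. It is not proper-lockfile's implementation, and `firstLateHeartbeat` and the timeline values are hypothetical.

```javascript
// Returns the index of the first refresh that arrived more than `staleMs`
// after the previous one, or -1 if every refresh was on time.
function firstLateHeartbeat(refreshTimesMs, staleMs) {
  for (let i = 1; i < refreshTimesMs.length; i += 1) {
    if (refreshTimesMs[i] - refreshTimesMs[i - 1] > staleMs) {
      return i;
    }
  }
  return -1;
}

// Hypothetical timeline: refreshes are planned every 5 s, but the process
// stalls once, so the third refresh happens 14 s after the second.
const timeline = [0, 5000, 19000, 24000];

console.log(firstLateHeartbeat(timeline, 10000)); // 2  (14 s gap > 10 s)
console.log(firstLateHeartbeat(timeline, 20000)); // -1 (14 s gap <= 20 s)
```

A stall longer than 20 s would still trip the larger threshold, which matches the observation that the patched build fails less often but not never.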

@cortisiko

Any updates here? I am encountering the same issue, which is causing instability with builds on Bitrise. I am running 3 workers, by the way.

@kagrawal98

kagrawal98 commented Sep 11, 2024

I am facing this issue with 5 parallel workers on a high capacity Bitrise machine:
Xcode 15.2 M1 Max Large (10 CPU/54 GB RAM)

@noomorph any updates on this issue?
The errors have started showing up again even after the patch was applied.


10 participants