
The operation was canceled. #2468

Open
magnetnation opened this issue Mar 1, 2023 · 28 comments
Labels
awaiting-customer-response bug Something isn't working

Comments

@magnetnation

Describe the bug
Since last week, Actions workflows in several of our repositories have started to fail with a similar error:

##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Actions
  2. Run or re-run any job
  3. See error mentioned above

Expected behavior
Workflow should be running without cancellations.

Runner Version and Platform

Image: ubuntu-22.04
Version: 20230219.1
Included Software: https://github.com/actions/runner-images/blob/ubuntu22/20230219.1/images/linux/Ubuntu2204-Readme.md
Image Release: https://github.com/actions/runner-images/releases/tag/ubuntu22%2F20230219.1

OS of the machine running the runner?
Linux

What's not working?

The workflow fails without any reason stated in the logs, other than that it has been canceled.

Job Log Output

2023-03-01T06:49:56.3959504Z > @mgnation/mgdata@1.4.212 _bundle
2023-03-01T06:49:56.3960408Z > node build/bundle.js
2023-03-01T06:49:56.3960710Z
2023-03-01T06:49:56.8884234Z Bundling has started
2023-03-01T06:49:56.9282005Z Copy results completed
2023-03-01T06:49:56.9290747Z Allow publish completed
2023-03-01T06:49:56.9295155Z build: 39.141ms
2023-03-01T06:49:56.9302039Z Bundling has finished
2023-03-01T06:49:57.2702808Z
2023-03-01T06:49:57.2708891Z > @mgnation/mgdata@1.4.212 test
2023-03-01T06:49:57.2709823Z > node ./node_modules/nyc/bin/nyc.js node ./tmp/spec/runner.js
2023-03-01T06:49:57.2710190Z
2023-03-01T06:50:56.1865343Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2023-03-01T06:50:56.2969758Z ##[debug]Re-evaluate condition on job cancellation for step: 'npm install, build, and test'.
2023-03-01T06:50:56.2973354Z ##[debug]Skip Re-evaluate condition on runner shutdown.
2023-03-01T06:50:56.5597757Z ----------|---------|----------|---------|---------|-------------------
2023-03-01T06:50:56.5612354Z File | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s
2023-03-01T06:50:56.5616821Z ----------|---------|----------|---------|---------|-------------------
2023-03-01T06:50:56.5618821Z All files | 0 | 0 | 0 | 0 |
2023-03-01T06:50:56.5620688Z ----------|---------|----------|---------|---------|-------------------
2023-03-01T06:50:56.6038836Z ##[error]The operation was canceled.
2023-03-01T06:50:56.6052060Z ##[debug]System.OperationCanceledException: The operation was canceled.
2023-03-01T06:50:56.6060563Z ##[debug] at System.Threading.CancellationToken.ThrowOperationCanceledException()
2023-03-01T06:50:56.6063113Z ##[debug] at GitHub.Runner.Sdk.ProcessInvoker.ExecuteAsync(String workingDirectory, String fileName, String arguments, IDictionary`2 environment, Boolean requireExitCodeZero, Encoding outputEncoding, Boolean killProcessOnCancel, Channel`1 redirectStandardIn, Boolean inheritConsoleHandler, Boolean keepStandardInOpen, Boolean highPriorityProcess, CancellationToken cancellationToken)
2023-03-01T06:50:56.6065859Z ##[debug] at GitHub.Runner.Common.ProcessInvokerWrapper.ExecuteAsync(String workingDirectory, String fileName, String arguments, IDictionary`2 environment, Boolean requireExitCodeZero, Encoding outputEncoding, Boolean killProcessOnCancel, Channel`1 redirectStandardIn, Boolean inheritConsoleHandler, Boolean keepStandardInOpen, Boolean highPriorityProcess, CancellationToken cancellationToken)
2023-03-01T06:50:56.6069900Z ##[debug] at GitHub.Runner.Worker.Handlers.DefaultStepHost.ExecuteAsync(IExecutionContext context, String workingDirectory, String fileName, String arguments, IDictionary`2 environment, Boolean requireExitCodeZero, Encoding outputEncoding, Boolean killProcessOnCancel, Boolean inheritConsoleHandler, String standardInInput, CancellationToken cancellationToken)
2023-03-01T06:50:56.6078953Z ##[debug] at GitHub.Runner.Worker.Handlers.ScriptHandler.RunAsync(ActionRunStage stage)
2023-03-01T06:50:56.6079410Z ##[debug] at GitHub.Runner.Worker.ActionRunner.RunAsync()
2023-03-01T06:50:56.6086736Z ##[debug] at GitHub.Runner.Worker.StepsRunner.RunStepAsync(IStep step, CancellationToken jobCancellationToken)
2023-03-01T06:50:56.6102555Z ##[debug]Finishing: npm install, build, and test
2023-03-01T06:50:56.6376954Z ##[debug]Evaluating condition for step: 'Post Use Node.js 16.x'
2023-03-01T06:50:56.6379186Z ##[debug]Skip evaluate condition on runner shutdown.
2023-03-01T06:50:56.6392406Z ##[debug]Evaluating condition for step: 'Post Run actions/checkout@v3'
2023-03-01T06:50:56.6392834Z ##[debug]Skip evaluate condition on runner shutdown.
2023-03-01T06:50:56.6821240Z ##[debug]Starting: Complete job
2023-03-01T06:50:56.6833228Z Uploading runner diagnostic logs
2023-03-01T06:50:56.7043348Z ##[debug]Starting diagnostic file upload.
2023-03-01T06:50:56.7046324Z ##[debug]Setting up diagnostic log folders.
2023-03-01T06:50:56.7306655Z ##[debug]Creating diagnostic log files folder.
2023-03-01T06:50:56.7419356Z ##[debug]Copying 1 worker diagnostic logs.
2023-03-01T06:50:56.7470424Z ##[debug]Copying 1 runner diagnostic logs.
2023-03-01T06:50:56.7539403Z ##[debug]Zipping diagnostic files.
2023-03-01T06:50:56.8067735Z ##[debug]Uploading diagnostic metadata file.
2023-03-01T06:50:56.8441811Z ##[debug]Diagnostic file upload complete.
2023-03-01T06:50:56.8445882Z Completed runner diagnostic log upload
2023-03-01T06:50:56.8450728Z Cleaning up orphan processes
2023-03-01T06:50:56.9100720Z ##[debug]Finishing: Complete job
2023-03-01T06:50:56.9301578Z ##[debug]Finishing: build (16.x)

@magnetnation magnetnation added the bug Something isn't working label Mar 1, 2023
@jarreds

jarreds commented Mar 2, 2023

This has recently started happening at an onerous frequency in our Bazel monorepo build action, on a self-hosted runner. Happy to provide any troubleshooting info I can.

Summary:

Bazel Build
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
Bazel Build
The operation was canceled.

[2023-03-02 03:46:14Z INFO Runner] Received Ctrl-C signal, stop Runner.Listener and Runner.Worker.
[2023-03-02 03:46:14Z INFO HostContext] Runner will be shutdown for UserCancelled
[2023-03-02 03:48:24Z INFO Worker] Cancellation/Shutdown message received.
[2023-03-02 03:48:24Z INFO HostContext] Runner will be shutdown for UserCancelled
[2023-03-02 03:48:24Z INFO StepsRunner] Cancel current running step.

These events are happening despite us not clicking cancel on the action.

@ruvceskistefan
Contributor

Hey all,

@magnetnation are you running a hosted runner or using our image for a self-hosted runner?
Also, could you provide us with job URLs (it's OK if the repo is private) so we can check the logs?

@magnetnation
Author

magnetnation commented Mar 2, 2023

Hi,

We are using hosted runners, and here is an example of a failed run.
And a failed run from another repository, and another one.

@matsest

matsest commented Mar 6, 2023

Also seeing this a lot in various PowerShell commands that run on GitHub runners. It seems very unpredictable, but we are definitely seeing it across multiple jobs and commands. We had not seen this to this extent before.

@mathemaphysics

mathemaphysics commented Mar 8, 2023

I'm seeing the same thing here with a fairly large CMake C++ project build. It behaves as if it has no swap and runs out of memory. I've had this happen in a WSL Docker container before, and I ended up needing to allocate more RAM to the virtual machine. I have no idea how that would translate to this situation.

@magnetnation
Author

Any update on this issue?
None of our actions have been able to go through since then.

@matsest

matsest commented Mar 14, 2023

Same here, still seeing these issues @ruvceskistefan

@igordrnobrega

Any update on this? I've had this issue for a week now and nothing seems to solve it.

@nuhkoca

nuhkoca commented Mar 20, 2023

We have also been facing this issue very frequently for weeks now, even with an 8-core larger runner. How can I upload a shutdown log before the runner dies? It could actually give me some insight.
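
The closest workaround I can think of is attaching whatever is already on disk from a step that runs even when the job fails. This is only a sketch, assuming the interesting logs land under the workspace or ${{ runner.temp }} (the artifact name and paths are placeholders); it will not help if the whole VM is killed before the step gets a chance to run:

      # Hypothetical cleanup step; artifact name and paths are placeholders.
      - name: Upload logs even if the job fails
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: run-logs
          path: |
            ${{ runner.temp }}/*.log
            **/build/reports/**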

@credativ-dar

credativ-dar commented Mar 20, 2023

@ruvceskistefan I think the "awaiting-customer-response" label is no longer appropriate; could you remove it?

I think I have the same issue:
https://github.com/credativ/vali/actions/runs/4467520060/jobs/7847403788

 make: *** [Makefile:245: lint] Killed
Error: The operation was canceled.

Could this be an OOM killer?

Update:
For me it was 100% an out-of-memory situation. If you are using golangci-lint and upgraded to Go 1.20, this is your bug: golangci/golangci-lint#3470

Feature request for better OOM-reporting: https://github.com/orgs/community/discussions/50571
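
One way to check for this (a sketch; it only works when the runner survives long enough to run the step) is to grep the kernel log from a step that runs even after a failure:

      # Hypothetical diagnostic step: look for OOM-killer messages after a failed build.
      - name: Check for OOM kills
        if: always()
        run: sudo dmesg | grep -iE 'killed process|out of memory' || true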

@edergillian-eeg

This issue started happening after I transferred a repository from one organization to another. It was working when the repo was in the other organization, and now all jobs are being cancelled, no exception.

There's no timeout set on the workflow file and the cancellation can happen anytime between ~30s and ~1m.

Both organizations have exactly the same configuration and paid plan. Other repositories' workflows in the organization run just fine, including other transferred repositories.

When I re-ran the failed jobs with debug logging enabled, the only error I get in the runner logs is this:

[2023-03-22 02:35:37Z ERR  GitHubActionsService] GET request to https://pipelines.actions.githubusercontent.com/<big_hash>/_apis/distributedtask/pools/2/messages?sessionId=<big_session_id>&lastMessageId=597&status=Online&runnerVersion=2.303.0 failed. HTTP Status: Forbidden, AFD Ref: Ref A: BFA7EEED124B494B83E44B4AB4DCA200 Ref B: BN3EDGE0703 Ref C: 2023-03-22T02:35:37Z
[2023-03-22 02:35:37Z INFO MessageListener] Runner OAuth token has been revoked. Unable to pull message.

@Milamary

Having a similar issue with runner version '2.303.0' on 'ubuntu-latest-16-cores'.
It randomly fails builds with: Gradle build daemon disappeared unexpectedly (it may have been killed or may have crashed)


@jsjoeio

jsjoeio commented Apr 28, 2023

This is happening to us as well on a monorepo (Turborepo) in the Lint job (running with ESLint).

@nuhkoca

nuhkoca commented Apr 29, 2023

We fixed it by limiting workers on an 8-core machine with:

org.gradle.workers.max=4
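
The same limit can also be passed on the command line instead of gradle.properties; a sketch, assuming a standard Gradle wrapper and a build task:

      # Equivalent sketch using Gradle's --max-workers flag.
      - name: Build with limited workers
        run: ./gradlew build --max-workers=4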

@mathemaphysics

@nuhkoca I'm confused, because my CMake build should only run a single thread unless the -j flag is given (when the build system is make or ninja). That's why I assumed it wasn't the issue.

Maybe the default setting changed.

@nuhkoca

nuhkoca commented May 1, 2023

@mathemaphysics but Gradle uses all cores by default, no? The official description of the flag:

org.gradle.workers.max=(max # of worker processes)
When configured, Gradle will use a maximum of the given number of workers. See also performance command-line options. Default is number of CPU processors.

@sundarvenkata-EBI

sundarvenkata-EBI commented May 7, 2023

We are having a similar issue as well where we have a process running for a really long time (like 6 hours) before it fails for no reason. See here

@netsgnut

We are having a similar issue as well where we have a process running for a really long time (like 6 hours) before it fails for no reason. See here

I think your case may be different from others here. It is related to the 6-hour job execution limit instead. From the docs,

Job execution time - Each job in a workflow can run for up to 6 hours of execution time. If a job reaches this limit, the job is terminated and fails to complete.
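
If that is the case, setting an explicit job timeout below the platform cap at least makes the failure reason obvious; a sketch, assuming a job named build:

    jobs:
      build:
        runs-on: ubuntu-latest
        # Hypothetical value: fail fast with a clear timeout instead of hitting the 6-hour cap.
        timeout-minutes: 300
        steps:
          - uses: actions/checkout@v3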

@deathemperor

deathemperor commented Jun 1, 2023

We're constantly having issues with this; it's unusable. Until it's fixed we have to run it manually every time. Link to the action run: https://github.com/papaya-insurtech/berry/actions/runs/5131060157/jobs/9230656685

@bvdmitri

bvdmitri commented Jun 2, 2023

We were able to fix it on our side: we had a mistake in our codebase that caused an enormous amount of unnecessary allocations. We no longer have any issues since the mistake was fixed. It looks like the "The operation was canceled" error was definitely an OOM error, but a better error message would be nice.

@nuhkoca

nuhkoca commented Jun 2, 2023

We were able to fix it on our side: we had a mistake in our codebase that caused an enormous amount of unnecessary allocations. We no longer have any issues since the mistake was fixed. It looks like the "The operation was canceled" error was definitely an OOM error, but a better error message would be nice.

Hey @bvdmitri, what was the fix exactly? Maybe it can shed some light on ours, too 🙂

@bvdmitri

bvdmitri commented Jun 2, 2023

@nuhkoca
Well, we refactored our code so that it allocates and uses less memory.
Nothing specific to the GitHub Actions runner.

@a1300

a1300 commented Jul 26, 2023

For me, roughly every third GitHub Actions run failed. Increasing the swap space to 10 GB on the ubuntu-latest runner with the GitHub Action pierotofy/set-swap-space fixed the problem for me:

      - name: Set Swap Space
        uses: pierotofy/set-swap-space@master
        with:
          swap-size-gb: 10
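
To confirm the swap is actually in place, a quick check step can follow (a sketch using standard Linux tools on the Ubuntu runner):

      # Print available memory and the active swap devices.
      - name: Show memory and swap
        run: free -h && swapon --show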

@nuhkoca

nuhkoca commented Jul 26, 2023

We also switched over to a different garbage collector, and it fixed most of our OOM problems.

Instead of -XX:+UseParallelGC, we now use:

-XX:+UseConcMarkSweepGC

This flag is needed to activate the CMS Collector in the first place. By default, HotSpot uses the Throughput Collector instead.

-XX:+UseParNewGC

When the CMS collector is used, this flag activates the parallel execution of young generation GCs using multiple threads. It may seem surprising at first that we cannot simply reuse the flag -XX:+UseParallelGC known from the Throughput Collector, because conceptually the young generation GC algorithms used are the same. However, since the interplay between the young generation GC algorithm and the old generation GC algorithm is different with the CMS collector, there are two different implementations of young generation GC and thus two different flags.

https://www.codecentric.de/wissens-hub/blog/useful-jvm-flags-part-7-cms-collector
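
For Gradle builds, the daemon JVM flags usually live in gradle.properties; a sketch of wiring that up from the workflow, assuming an older JDK where CMS is still available (it was removed in JDK 14) and a placeholder heap size:

      # Hypothetical step: append daemon JVM flags before the build runs (-Xmx4g is a placeholder).
      - name: Configure Gradle daemon GC
        run: echo "org.gradle.jvmargs=-Xmx4g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC" >> gradle.properties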

rgrewe added a commit to greenbone/gos-ci that referenced this issue Aug 9, 2023
Current runners got a newer Ubuntu, but this breaks our piuparts run.
The program lsof runs at 100% CPU for ~15 minutes, and the runner cancels the job after that.

* actions/runner#2468
* actions/runner-images#7188
r-bk added a commit to r-bk/rsdns that referenced this issue Oct 22, 2023
miri runs have been failing lately. Checking whether this is related to higher RAM
consumption of miri in recent releases. See [1] for the proposed
solution.

---
[1] actions/runner#2468 (comment)
ukd1 added a commit to ukd1/pcbflow that referenced this issue Jan 16, 2024
@kaanx022

For me, nothing works. I couldn't even find an image or an example that works. I tried a matrix of Java and SDK combinations, all images, all platforms, all the different things.

This is so frustrating. Why won't you give us an example that just works? Every single time I have to build a CI from scratch, every few years, I have to go through this rabbit hole.

https://github.com/kaanx022/kaan/actions/runs/9042209780/job/24848448703

@kaanx022

OK, on Ubuntu some of them are actually running with the right setup; the big matrix table is here:

https://github.com/kaanx022/kaan/actions/runs/9042456492/job/24848994152

But we shouldn't have to run a matrix of ALL POSSIBLE COMBINATIONS just to figure out the one that works.

@wffurr

wffurr commented Sep 11, 2024

This seems to still be happening, e.g. with sudo apt-get install -y libtinfo5. See also actions/runner-images#9959. Removing the needrestart service seems to work around the issue.
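
For reference, a sketch of that workaround as a workflow step (removing needrestart before the apt-get install mentioned above):

      # Sketch of the workaround: drop needrestart so apt-get installs don't hang the job.
      - name: Remove needrestart before installing packages
        run: |
          sudo apt-get remove -y needrestart
          sudo apt-get install -y libtinfo5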
