Skip to content

[TaskGroup] Reenable test and fix memory issue #67392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ktoso
Copy link
Contributor

@ktoso ktoso commented Jul 19, 2023

Description: This re-enables a task group test and fixes a memory issue where we realized we must release the group lock before resuming the waiting task. If we don't do this, we're racing the group's destroy() (coming from the parent task, waiting on waitAll with the unlock()), risking calling unlock() on already released memory - very rarely causing crashes due to this. It is hard to reproduce this issue, due to the race appearing very rarely, but it is consistently reproducible in some environments.
Risk: Low, the fix reorders an unlock() with the waitingTask resume which cannot cause any other consistency issues as there are no other actions taken after the resume. Arguably, this always should have been release + run order.
Reward: Medium, it seems this crash does not manifest frequently in real applications, however it is a real issue we need to fix.
Review by: @al45tair @erik
Testing: CI testing, confirmed the fix on specially prepared environments.
Radar: rdar://112759360 rdar://110474311

@ktoso ktoso requested a review from kavon as a code owner July 19, 2023 09:32
@ktoso
Copy link
Contributor Author

ktoso commented Jul 19, 2023

@swift-ci please smoke test

@ktoso ktoso added test suite Area: test suite concurrency Feature: umbrella label for concurrency language features labels Jul 19, 2023
@ktoso
Copy link
Contributor Author

ktoso commented Jul 19, 2023

@swift-ci please smoke test

@ktoso
Copy link
Contributor Author

ktoso commented Jul 20, 2023

41849 --
41850 Exit Code: 2
41851
41852 Command Output (stdout):
41853 --
41854 /home/build-user/build/buildbot_linux/swift-linux-x86_64/test-linux-x86_64/Concurrency/Runtime/Output/async_taskgroup_discarding_dontLeak_class_error.swift.tmp/a.out
41855
41856 --
41857 Command Output (stderr):
41858 --
41859 /home/build-user/swift/test/Concurrency/Runtime/async_taskgroup_discarding_dontLeak_class_error.swift:73:9: warning: integer literal is unused
41860         12 // ignore this on purpose
41861         ^~
41862 Swift/ContiguousArrayBuffer.swift:613: Fatal error: Index out of range
41863 Current stack trace:
41864 0    libswiftCore.so                    0x00007fd4e0588950 _swift_stdlib_reportFatalErrorInFile + 113
41865 1    libswiftCore.so                    0x00007fd4e0268114 <unavailable> + 1458452
41866 2    libswiftCore.so                    0x00007fd4e0267f94 <unavailable> + 1458068
41867 3    libswiftCore.so                    0x00007fd4e0267cca <unavailable> + 1457354
41868 4    libswiftCore.so                    0x00007fd4e0267860 _assertionFailure(_:_:file:line:flags:) + 253
41869 5    libswift_Backtracing.so            0x00007fd4e084962e <unavailable> + 558638
41870 6    libswift_Backtracing.so            0x00007fd4e07effd0 static Backtrace.capture<A, B>(from:using:images:algorithm:limit:offset:top:) + 1915
41871 7                                       0x000055e1ed5bb0f7 <unavailable> + 82167
41872 8                                       0x000055e1ed5ba96b <unavailable> + 80235
41873 9                                       0x000055e1ed5b0b16 <unavailable> + 39702
41874 10                                      0x000055e1ed5b9c76 <unavailable> + 76918
41875 11   libc.so.6                          0x00007fd4dfbe9f90 __libc_start_main + 243
41876 12                                      0x000055e1ed5afc6e <unavailable> + 35950
41877 FileCheck error: '<stdin>' is empty.
41878 FileCheck command line:  /home/build-user/build/buildbot_linux/llvm-linux-x86_64/bin/FileCheck --allow-unused-prefixes /home/build-user/swift/test/Concurrency/Runtime/async_taskgroup_discarding_dontLeak_class_error.swift --dump-input=always                                                                                                                                                                         41879
41880 --

storing the crash here for reference

@ktoso
Copy link
Contributor Author

ktoso commented Jul 20, 2023

@swift-ci please smoke test linux

@ktoso ktoso force-pushed the wip-reenable-async_taskgroup_discarding_dontLeak_class_error.swift branch from 42f4b8c to eeb2610 Compare July 21, 2023 07:36
@ktoso
Copy link
Contributor Author

ktoso commented Jul 21, 2023

@swift-ci please smoke test linux

@ktoso
Copy link
Contributor Author

ktoso commented Aug 3, 2023

@swift-ci please test

@ktoso ktoso changed the title Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift and fix memory corruption bug Aug 3, 2023
@ktoso ktoso changed the title Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift and fix memory corruption bug Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift and fix potential use-after-free Aug 3, 2023
@ktoso ktoso changed the title Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift and fix potential use-after-free Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift and fix memory bug Aug 3, 2023
@tshortli
Copy link
Contributor

tshortli commented Aug 3, 2023

Confirmed that this was only a flaky test.

Is the PR description out-of-date? Isn't an important correctness fix in the runtime included?

@ktoso
Copy link
Contributor Author

ktoso commented Aug 3, 2023

Is the PR description out-of-date? Isn't an important correctness fix in the runtime included?

Yes, updated to CCC style with all the details from #67700 now

ktoso added 2 commits August 4, 2023 08:46
released memory.

Because the waitAll() group method is called before the group returns
from withTaskGroup and is used to ensure all tasks have completed before
we destroy the task group, this method is then immediately followed by
calling TaskGroup::destroy.

If we, as previously, first resume the waiting task and then attempt to
unlock the lock held by the group, we are in an unsafe situation with
racing the task group destroy() and the unlock().

Therefore, we must release the lock before we resume the waiting task.
@ktoso ktoso force-pushed the wip-reenable-async_taskgroup_discarding_dontLeak_class_error.swift branch from 95d5d08 to d01abcb Compare August 3, 2023 23:46
@ktoso ktoso changed the title Test reenable: async_taskgroup_discarding_dontLeak_class_error.swift and fix memory bug [TaskGroup] Reenable test and fix memory issue #67700 Aug 3, 2023
@ktoso ktoso added concurrency runtime Feature: umbrella label for concurrency runtime features and removed test suite Area: test suite labels Aug 3, 2023
@ktoso
Copy link
Contributor Author

ktoso commented Aug 3, 2023

@swift-ci please smoke test

@ktoso
Copy link
Contributor Author

ktoso commented Aug 3, 2023

Rebased, this has only the necessary change right now.

@tshortli tshortli removed their request for review August 4, 2023 00:00
@ktoso
Copy link
Contributor Author

ktoso commented Aug 4, 2023

@swift-ci please smoke test Windows

@ktoso ktoso changed the title [TaskGroup] Reenable test and fix memory issue #67700 [TaskGroup] Reenable test and fix memory issue Aug 4, 2023
@ktoso
Copy link
Contributor Author

ktoso commented Aug 4, 2023

@swift-ci please smoke test Linux

(CI/github connection issues)

@ktoso
Copy link
Contributor Author

ktoso commented Aug 4, 2023

@swift-ci please smoke test Linux

1 similar comment
@ktoso
Copy link
Contributor Author

ktoso commented Aug 4, 2023

@swift-ci please smoke test Linux

@ktoso
Copy link
Contributor Author

ktoso commented Aug 4, 2023

@swift-ci please smoke test macOS

@ktoso
Copy link
Contributor Author

ktoso commented Aug 7, 2023

Failed Tests (1):
Swift(macosx-x86_64) :: stdlib/Observation/ObservableAvailabilityCycle.swift

@ktoso
Copy link
Contributor Author

ktoso commented Aug 7, 2023

@swift-ci please smoke test macOS

@ktoso ktoso merged commit 7020cf3 into swiftlang:main Aug 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
concurrency runtime Feature: umbrella label for concurrency runtime features concurrency Feature: umbrella label for concurrency language features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants