FlushAndTeardown may get stuck #161

Closed
@maxgolov

Description

Describe your environment.

Occasionally we do not get a cancellation callback from the WinInet stack. This only happens for WinInet requests sent a few milliseconds before app shutdown / FlushAndTeardown. In that case it appears that WinInet has not yet scheduled a worker thread to process the request when we immediately cancel it, so WinInet never invokes our callback with a cancellation notification. We should consider a safety check on shutdown, possibly a time limit and/or a mechanism to catch the case where a request is "getting stuck". Further investigation is needed.
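One possible shape for the shutdown time limit mentioned above is a bounded wait on the number of in-flight requests. The sketch below is only an illustration built on the standard library: `PendingRequestTracker` and its methods are hypothetical names, not part of the SDK or of WinInet, and it assumes the HTTP client can call `OnRequestStarted()` when a request is sent and `OnRequestFinished()` from its completion or cancellation callback.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical helper (not an SDK class): counts in-flight requests so that
// teardown can wait for their callbacks, but only up to a deadline.
class PendingRequestTracker {
public:
    // Call when a request is handed to the HTTP stack.
    void OnRequestStarted() {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_pending;
    }

    // Call from the completion or cancellation callback of a request.
    void OnRequestFinished() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (m_pending > 0) {
                --m_pending;
            }
        }
        m_cv.notify_all();
    }

    // Returns true if every request reported back before the timeout,
    // false if at least one callback never arrived.
    bool WaitForAll(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        return m_cv.wait_for(lock, timeout, [this] { return m_pending == 0; });
    }

    // Number of requests that have not reported back yet.
    std::size_t Pending() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_pending;
    }

private:
    mutable std::mutex m_mutex;
    std::condition_variable m_cv;
    std::size_t m_pending = 0;
};
```

With something like this, teardown could call `WaitForAll()` with a fixed budget after cancelling requests. On a `false` result it could log `Pending()` and continue shutting down instead of blocking forever. What to do with the handles of requests that never reported back is a separate question that this sketch does not address.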

A side effect of this issue can be observed in the APITest.LogManager_StressUploadLock_MultiThreaded test. The test rapidly performs many Initialize / FlushAndTeardown cycles (1000 iterations), with uploads triggered asynchronously from 25 concurrently scheduled threads. That rapid loop has a higher (but still relatively low) chance of hitting the problem.

Steps to reproduce.

Run and monitor the APITest.LogManager_StressUploadLock_MultiThreaded test.

What is the expected behavior?

Test must pass.

What is the actual behavior?

The test hangs forever.

Additional context.

Possibly a race condition somewhere in WinInet (?): cancelling a pending async request does not trigger the notification callback for that request, so the main thread is never updated with a "request aborted" indication.

Labels

P1 (Issues that are blocking), PR sent to code review (Pending code review), bug (Something isn't working)
