getTokens does not always complete within expected time window #3683

@m-zagorski

Describe the bug
getTokens sometimes never invokes its callback

We are experiencing an issue where getTokens intermittently does not invoke its callback at all (neither success nor failure).

Initially, we suspected this behavior might be influenced by interactions with another library that had memory-related issues. Regardless, we observed that getTokens could take an unexpectedly long time to complete, which led us to add defensive handling on our side.

To mitigate this, we wrapped the getTokens call using suspendCancellableCoroutine and added an explicit timeout. The timeout was initially set to 20 seconds, which resulted in frequent failures for many users. We then increased it to 60 seconds, which significantly reduced the number of failures but did not eliminate them entirely. Even with a 60-second timeout, a subset of users still fails to refresh tokens within this time window.
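For readers who want to picture the mitigation described above, here is a minimal sketch of a callback bridged with suspendCancellableCoroutine and bounded by a timeout. It is not the code from our app and it does not use the SDK's real types: `TokenCallback`, `Tokens`, and `TokenSource` are hypothetical stand-ins so the sketch compiles without the SDK, and it assumes kotlinx.coroutines is on the classpath.

```kotlin
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlinx.coroutines.withTimeoutOrNull
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical stand-ins for the SDK's callback and token types,
// so that the sketch compiles without the SDK on the classpath.
interface TokenCallback<T> {
    fun onResult(result: T)
    fun onError(e: Exception)
}

data class Tokens(val idToken: String)

// Hypothetical abstraction over whatever object exposes getTokens.
fun interface TokenSource {
    fun getTokens(callback: TokenCallback<Tokens>)
}

// Suspends until one of the callbacks fires. If neither fires,
// this suspends until the calling coroutine is cancelled.
suspend fun TokenSource.awaitTokens(): Tokens =
    suspendCancellableCoroutine { cont ->
        getTokens(object : TokenCallback<Tokens> {
            override fun onResult(result: Tokens) {
                // Skip late callbacks that arrive after a timeout or cancellation.
                if (cont.isActive) cont.resume(result)
            }

            override fun onError(e: Exception) {
                if (cont.isActive) cont.resumeWithException(e)
            }
        })
    }

// Returns null if neither callback fires within timeoutMs.
// Exceptions delivered through onError still propagate to the caller.
suspend fun TokenSource.awaitTokensOrNull(timeoutMs: Long = 60_000L): Tokens? =
    withTimeoutOrNull(timeoutMs) { awaitTokens() }
```

The timeout only stops the caller from waiting; it does not cancel whatever work is still pending behind the callback, which is why the questions below matter.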

From reviewing the implementation, the only path on which the callback is not invoked appears to be the one where the provider key is missing. In our case, the provider key is always present, yet the callback is still sometimes never called.

This raises a few questions:

  • Does the internal session refresh logic have its own timeout?
  • Is it expected behavior for getTokens to take longer than 60 seconds in some scenarios?
  • Could poor or unstable network conditions cause getTokens to hang indefinitely instead of failing?
  • Should callers expect a failure callback in such cases so that a retry can be attempted?

To Reproduce
We have not been able to reproduce this locally; it only shows up intermittently in production.

Which AWS service(s) are affected?
Cognito version 2.81.1

Expected behavior
getTokens should reliably invoke its callback in all cases, either with a successful token refresh or with a failure/error, within a bounded and documented time frame. The call should not remain pending indefinitely. In the event of network issues or other transient failures, the operation should fail and allow the caller to retry.
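To illustrate the kind of caller-side retry this would enable, here is a rough sketch of a bounded retry loop. `retryWithTimeout` is a hypothetical helper, not an SDK function, and the attempt count, per-attempt timeout, and backoff values are arbitrary assumptions. Whether a new call can succeed while an earlier one is still pending inside the SDK is not something we have established.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.withTimeoutOrNull

// Hypothetical helper: runs `attempt` up to maxAttempts times, each bounded by
// perAttemptTimeoutMs, with a doubling delay between tries.
// Returns null if every try timed out. Exceptions thrown by `attempt`
// are not retried here; they propagate to the caller.
suspend fun <T : Any> retryWithTimeout(
    maxAttempts: Int = 3,               // assumed value
    perAttemptTimeoutMs: Long = 20_000L, // assumed value
    initialBackoffMs: Long = 1_000L,     // assumed value
    attempt: suspend () -> T,
): T? {
    var backoffMs = initialBackoffMs
    repeat(maxAttempts) { index ->
        val result = withTimeoutOrNull(perAttemptTimeoutMs) { attempt() }
        if (result != null) return result
        // Wait before the next try, but not after the last one.
        if (index < maxAttempts - 1) {
            delay(backoffMs)
            backoffMs *= 2
        }
    }
    return null
}
```

The `attempt` lambda would wrap the suspending token call; a null result then means every attempt timed out rather than failed with an error.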

Environment Information (please complete the following information):

  • AWS Android SDK Version: 2.81.1
  • Device: many different models; this does not seem to be device-specific
  • Android Version: mostly Android 16, but also 15, 14, and 13
  • Specific to simulators: No

Labels: bug