CCCT-1867 Connect Message Fragment Crash #3395
Added a null check for the fragment context to avoid the crash.
Walkthrough
A null-safety improvement to the failure handling in ConnectMessageFragment's message retrieval flow. The change replaces a direct Toast call with a defensive pattern that obtains the context via getContext(), checks it for null, and displays the Toast message only if the context is available.
Estimated code review effort: 1 (Trivial), ~3 minutes
        refreshUi();
    } else {
        Toast.makeText(requireContext(), getString(R.string.connect_messaging_retrieve_messages_fail), Toast.LENGTH_SHORT).show();
        Context context = getContext();
This will not actually solve the real problem, only suppress it.
I think the issue is here: there is a delay of 30 minutes before the API is called, which is causing this issue.
What I mean is that I don't think there is a real problem being suppressed here.
The API is being called every 30 seconds while the user is on the Connect Message Fragment screen. The current crash happens when the API is called, and then the user moves to a different screen before the work calling the API finishes, and that's why the context would be null.
I feel like we shouldn't update the UI if the user is no longer looking at the relevant screen, right?
Looking at this code, we are already following this same philosophy in the function refreshUi() in this class:
Context context = getContext();
if (context != null) {
    List<ConnectMessagingMessageRecord> messages = ConnectMessagingDatabaseHelper.getMessagingMessagesForChannel(context, channelId);
    List<ConnectMessageChatData> chats = new ArrayList<>();
    for (ConnectMessagingMessageRecord message : messages) {
        chats.add(fromMessage(message));
        if (!message.getUserViewed()) {
            message.setUserViewed(true);
            ConnectMessagingDatabaseHelper.storeMessagingMessage(context, message);
        }
    }
    adapter.updateData(chats);
    scrollToLatestMessage();
}
Checking if the context is not null makes more sense to me because why would we want to update the UI if the fragment is detached from its activity?
I'm open for discussion though
@conroy-ricketts I feel that 30 seconds is too high; maybe someone meant to put 3 seconds but added an extra zero.
If the user is not on the screen, I think it should not call the API at all. Can we cancel the handler in onDestroyView?
Maybe I'm missing something, but the size of the interval does not cause the crash, and cancelling the handler in onDestroyView will not solve the issue. The crash happens because the coroutine in PushNotificationApiHelper's retrieveLatestPushNotificationsWithCallback() is launched while the user is on the Connect Message Fragment, but finishes and calls a listener when the user is no longer on the fragment:
object PushNotificationApiHelper {
    fun retrieveLatestPushNotificationsWithCallback(
        context: Context,
        listener: ConnectActivityCompleteListener
    ) {
        CoroutineScope(Dispatchers.IO).launch {
            retrieveLatestPushNotifications(context).onSuccess {
                withContext(Dispatchers.Main) { // switching to main to touch views
                    listener.connectActivityComplete(true)
                }
            }.onFailure {
                withContext(Dispatchers.Main) { // switching to main to touch views
                    listener.connectActivityComplete(false)
                }
            }
        }
    }
}
This coroutine normally finishes very quickly.
For example, the crash will not happen if you wait a few seconds before hitting the back button on the Connect Message Fragment screen. The crash only happens when you hit the back button while the coroutine is still doing its work.
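The timing described above can be reproduced without Android at all. The following is a minimal plain-Java sketch, assuming a hypothetical FakeFragment stand-in whose context becomes null on detach; the class and method names are illustrative only and are not from the CommCare codebase. It contrasts a failure callback that assumes the fragment is still attached with one that checks getContext() first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an Android Fragment: the context is non-null only while attached.
class FakeFragment {
    private Object context = new Object();
    // Records the "Toast" messages that would have been shown.
    final List<String> shownToasts = new ArrayList<>();

    Object getContext() {
        return context;
    }

    // Mirrors the contract of Fragment.requireContext(): throws when detached.
    Object requireContext() {
        if (context == null) {
            throw new IllegalStateException("Fragment not attached to a context.");
        }
        return context;
    }

    // Simulates the user navigating away while a fetch is still in flight.
    void detach() {
        context = null;
    }

    // Failure callback that assumes the fragment is still attached.
    void onFetchFailedUnsafe() {
        requireContext();
        shownToasts.add("retrieve messages failed");
    }

    // Failure callback with the null check: silently skips the Toast when detached.
    void onFetchFailedGuarded() {
        Object ctx = getContext();
        if (ctx != null) {
            shownToasts.add("retrieve messages failed");
        }
    }
}

class LateCallbackDemo {
    public static void main(String[] args) {
        FakeFragment fragment = new FakeFragment();
        fragment.detach();               // back button pressed mid-fetch
        fragment.onFetchFailedGuarded(); // late failure arrives, nothing is shown
        try {
            fragment.onFetchFailedUnsafe();
        } catch (IllegalStateException e) {
            System.out.println("unguarded callback threw: " + e.getMessage());
        }
        System.out.println("toasts shown: " + fragment.shownToasts.size());
    }
}
```

The guarded version trades a crash for a silent no-op, which is the behaviour this PR argues for when the user has already left the screen.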
I was thinking that this line calls the Runnable again after 30 seconds and, because the user is no longer on this fragment, that causes the issue.
Something I noticed too is that we are already cancelling the handler here
The 30 second refresh rate was chosen intentionally as a way to have messages refresh somewhat frequently without constantly hitting the network. It was originally 60 seconds but we cut it in half earlier this year.
Also, it looks like the handler is already cleaned up during onPause so the call stops happening when the user leaves the page.
All in all, I think Conroy's fix is what we need here since the call can start while the user is on the page but take a while to complete such that the user may have navigated away by the time the call completes.
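To illustrate why stopping the periodic refresh does not by itself prevent a late callback, here is a small plain-Java sketch. PollingScreenSketch and its methods are hypothetical stand-ins (not the fragment's real code), and a CompletableFuture plays the role of the in-flight network call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the polling screen; none of these names come from the CommCare codebase.
class PollingScreenSketch {
    final AtomicInteger callbacksDelivered = new AtomicInteger();
    private boolean pollingScheduled = false;

    // Stands in for scheduling the next periodic tick on the handler.
    void onResume() {
        pollingScheduled = true;
    }

    // Stands in for the handler cleanup: future ticks are dropped.
    void onPause() {
        pollingScheduled = false;
    }

    boolean isPollingScheduled() {
        return pollingScheduled;
    }

    // Starts one fetch. The returned future is completed later by the "network".
    CompletableFuture<Boolean> startFetch() {
        CompletableFuture<Boolean> inFlight = new CompletableFuture<>();
        // The completion callback is registered now and runs whenever the result arrives.
        inFlight.thenAccept(success -> callbacksDelivered.incrementAndGet());
        return inFlight;
    }
}

class PollingScreenDemo {
    public static void main(String[] args) {
        PollingScreenSketch screen = new PollingScreenSketch();
        screen.onResume();
        CompletableFuture<Boolean> inFlight = screen.startFetch(); // tick fires while on screen
        screen.onPause();         // user leaves, so no further ticks are scheduled
        inFlight.complete(false); // the earlier request now finishes with a failure
        System.out.println("polling scheduled: " + screen.isPollingScheduled());
        System.out.println("callbacks delivered after leaving: " + screen.callbacksDelivered.get());
    }
}
```

Cancelling the schedule only affects ticks that have not started yet; a request that is already running still reports back, which is why the callback itself has to tolerate a detached fragment.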
Yeah, it looks like Conroy's fix is what we need. It's strange that we have to handle the context manually; we need ViewModel scope here soon.
I was also trying to understand why the app crashes only when the call fails and not when it gets a valid response. The answer is here.
Yup! I was worried about the success case and verified the same. Explains why all the crash logs were for failed calls.
I am good with the solution here until we correct the lifecycle management of network calls made from the UI. The alternative would be to bind the callback to the fragment lifecycle so that it never gets called if the fragment is no longer alive, although that might require a lot more changes (like making this call from a view model scoped to the fragment lifecycle). I do want to call out one downside of the current solution: it will not hard crash when the context is null here because of some bad code (for example, calling fetchMessagesFromNetwork from onDestroy; it will never pass our code reviews though :D).
"The 30 second refresh rate was chosen intentionally as a way to have messages refresh somewhat frequently without constantly hitting the network. It was originally 60 seconds but we cut it in half earlier this year."
Is there context on that decision somewhere? 30 seconds still seems high to me.
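As a rough illustration of the lifecycle-bound alternative raised in this comment, the sketch below wraps a callback so that it is dropped once its owner goes away. LifecycleBoundCallback is a hypothetical helper written for this example; it is not part of the CommCare codebase or the Android SDK, and a real implementation would more likely rely on a lifecycle-scoped view model as suggested in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical wrapper: forwards results only while the owner is still alive.
class LifecycleBoundCallback<T> implements Consumer<T> {
    private final AtomicBoolean active = new AtomicBoolean(true);
    private final Consumer<T> delegate;

    LifecycleBoundCallback(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    // Intended to be called when the owner goes away (for a fragment, e.g. from onDestroyView).
    void cancel() {
        active.set(false);
    }

    @Override
    public void accept(T result) {
        // Late results are dropped instead of reaching code that needs a live context.
        if (active.get()) {
            delegate.accept(result);
        }
    }
}

class LifecycleBoundCallbackDemo {
    public static void main(String[] args) {
        List<Boolean> delivered = new ArrayList<>();
        LifecycleBoundCallback<Boolean> callback = new LifecycleBoundCallback<>(delivered::add);
        callback.accept(true);  // owner alive: delivered
        callback.cancel();      // owner destroyed
        callback.accept(false); // late result: dropped
        System.out.println("results delivered: " + delivered);
    }
}
```

With this shape the callee never runs after teardown, so a null context inside the callback would again indicate a genuine bug and could be allowed to crash.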
OrangeAndGreen
left a comment
A+ on the PR description and demo videos!
CCCT-1867
Technical Summary
This crash happens when the user goes to the Connect Message Fragment screen, the app begins to asynchronously fetch messages, and then the user goes to another screen (e.g. hitting the back button once or twice) before the asynchronous work is finished (in this case, the work finishes with a failure). This crash happens because the fragment's context is null when the fragment detaches from its Activity, and we are attempting to use that null context to display a Toast message to the user.
So, to fix the crash I simply added a null check for the fragment's context before we try to show the Toast message.
The reason I chose this solution is that showing the user a Toast message about retrieving messages while they're on a different screen entirely doesn't make sense to me and would be poor UX. Whenever the user returns to the Connect Message Fragment screen, the app will try to fetch the messages again anyway. However, I'd love to hear everyone's thoughts on this.
Safety Assurance
Safety story
I verified that the app no longer crashes when following these repro steps:
Here is a video example of the crash before my changes:
Screen_Recording_20251103_160708_CommCare.Debug.mp4
Here is a video example after my changes:
Screen_Recording_20251103_161151_CommCare.Debug.mp4
QA Plan
For QA, we should verify that if you follow these steps, the app does not crash: