Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() #549

Closed
Borna-f opened this issue Dec 22, 2020 · 19 comments
Labels
android bug Acknowledged bug which is seen in Upload Service, but it's caused by an underlying Android Bug

Comments

@Borna-f
Copy link

Borna-f commented Dec 22, 2020

Hi,
I sometimes get the following error when the upload finishes and it crashes my app. Any idea or help? The last part after --- is the error. I brought the previous logs to get you an idea of when it does happen. (It is related to uploadservice/UploadService).

Thanks

I/UploadService: UploadService - (uploadId: N/A) - Service is about to be stopped because idle timeout of 10s has been reached
I/UploadService: NotificationActionsObserver - (uploadId: N/A) - unregistered
I/UploadService: UploadService - (uploadId: N/A) - Stopping foreground execution
I/UploadService: UploadService - (uploadId: N/A) - UploadService destroyed


D/AndroidRuntime: Shutting down VM
E/AndroidRuntime: FATAL EXCEPTION: main
Process: myApp, PID: 5855
android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground(): ServiceRecord{5986d70 u0 myApp/net.gotev.uploadservice.UploadService}
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2182)
at android.os.Handler.dispatchMessage(Handler.java:107)
at android.os.Looper.loop(Looper.java:237)
at android.app.ActivityThread.main(ActivityThread.java:8107)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:496)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1100)

@gotev
Copy link
Owner

gotev commented Dec 22, 2020

Thank for reporting this @Borna-f 😉

Can you replicate it in an Android instrumented test?

@Borna-f
Copy link
Author

Borna-f commented Dec 23, 2020

When making a Pull Request (and running workflows) the android-tests (probably also sometimes) fails with the following error:

net.gotev.uploadservice.BinaryUploadTests > successfulBinaryUploadAfterOneRetry[test(AVD) - 10] FAILED
android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground(): ServiceRecord{377d41b u0 net.gotev.uploadservice.test/net.gotev.uploadservice.UploadService}
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1945)
Tests on test(AVD) - 10 failed: Instrumentation run failed due to 'Process crashed.'

Task :uploadservice:connectedDebugAndroidTest FAILED
Task :uploadservice-s3:compileDebugJavaWithJavac

Task :uploadservice-ftp:connectedDebugAndroidTest
Starting 5 tests on test(AVD) - 10

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':uploadservice:connectedDebugAndroidTest'.

So somewhere the services are having issues, causing the app to crash. Is that helpful/enough information?

@Borna-f
Copy link
Author

Borna-f commented Dec 23, 2020

Just to clarify. It is not specified to the specific test mentioned in the previous comment. This crash can happen at any task. i.e. here is another log I've got from another task

Task :uploadservice:connectedDebugAndroidTest

net.gotev.uploadservice.BinaryUploadTests > userCancelledBinaryUpload[test(AVD) - 10] FAILED
android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground(): ServiceRecord{4a1a50 u0 net.gotev.uploadservice.test/net.gotev.uploadservice.UploadService}
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1945)
Tests on test(AVD) - 10 failed: Instrumentation run failed due to 'Process crashed.'

Task :uploadservice-s3:mergeDebugAndroidTestResources
Task :uploadservice:connectedDebugAndroidTest FAILED
Task :uploadservice-s3:processDebugAndroidTestResources

Task :uploadservice-ftp:connectedDebugAndroidTest
Starting 5 tests on test(AVD) - 10

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':uploadservice:connectedDebugAndroidTest'.

I see my S3 name somewhere there. Can it be something related to my newly added module?

@Borna-f
Copy link
Author

Borna-f commented Dec 24, 2020

Update:
After doing some research, this seems to be a common issue with the foregroundService. See this:
https://stackoverflow.com/questions/44425584/context-startforegroundservice-did-not-then-call-service-startforeground

Maybe this solution worth a shot?
https://stackoverflow.com/a/46449975

@gotev
Copy link
Owner

gotev commented Dec 24, 2020

@Borna-f Thank you for the details 😉

When making a Pull Request (and running workflows) the android-tests (probably also sometimes) fails with the following error

I also spot this happening very rarely on CI builds, but always thought it to be some random CI problem, like the emulator disappearing unexpectedly sometimes during the execution 😄

What I can't still understand is how come that error happens after the upload is finished. It would have made much more sense to me to debug it if it happened at the start. We should find a way to replicate this in the tests. It may be a service lifecycle implementation problem.

In Android 8.0+ docs is stated that when you launch a service in the background with startForegroundService you have max 5 seconds to launch the startForeground, otherwise you get android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground()

In this case though you get this error after the upload is finished. I thought about the following two cases:

Let's dive in the code and continue from there.

startForeground is done here:

/**
* Check if the task is currently the one shown in the foreground notification.
* @param uploadId ID of the upload
* @return true if the current upload task holds the foreground notification, otherwise false
*/
@Synchronized
fun holdForegroundNotification(uploadId: String, notification: Notification): Boolean {
if (!UploadServiceConfig.isForegroundService) return false
if (foregroundUploadId == null) {
foregroundUploadId = uploadId
UploadServiceLogger.debug(TAG, uploadId) { "now holds foreground notification" }
}
if (uploadId == foregroundUploadId) {
startForeground(UPLOAD_NOTIFICATION_BASE_ID, notification)
return true
}
return false
}

stopForeground is done here:

/**
* Called by each task when it is completed (either successfully, with an error or due to
* user cancellation).
* @param uploadId the uploadID of the finished task
*/
@Synchronized
fun taskCompleted(uploadId: String) {
val task = uploadTasksMap.remove(uploadId)
// un-hold foreground upload ID if it's been hold
if (UploadServiceConfig.isForegroundService && task != null && task.params.id == foregroundUploadId) {
UploadServiceLogger.debug(TAG, uploadId) { "now un-holded foreground notification" }
foregroundUploadId = null
}
if (UploadServiceConfig.isForegroundService && uploadTasksMap.isEmpty()) {
UploadServiceLogger.debug(TAG, NA) { "All tasks completed, stopping foreground execution" }
stopForeground(true)
shutdownIfThereArentAnyActiveTasks()
}
}

In this case:

Given Upload Service is not started and you are using default settings in UploadServiceConfig
When you perform a single upload with a single file, using either binary or multipart
Then the upload is performed and the service is shut down 10s after finishing the upload task

the following happens at a lifecycle level starting from UploadService (skipping previous execution stages from intent creation to service invocation on purpose as the problem is not there):

@Borna-f
Copy link
Author

Borna-f commented Dec 24, 2020

A couple of items

A) In the workflows it is crashing in the start of the upload (service lifecycle implementation problem.) So I think for that if you can call the services.startForeground() explicitly after context.startForegroundService(), that would make the problem less occurring based on the answered stackoverflow I shared earlier?

I face this error frequency the other way around. The error happens to me like very frequent and I can say only sometimes my build passes 😃

B) My uploads are indeed fast. Like in few KB and they end less than 1 seconds! I see in taskCompleted you have stopForeground(true). Can this be related to that?

C) At the end, when I review the example I shared on the first comment, this is happening when the task is being shutdown after the 10 seconds timeout. So maybe the task is trying to shutdown while it was already shut down?

I don't know a lot about services (and also the details of your implementation of services) so I am just trying to give thoughts and possible scenarios that may happen. Maybe fixing some of these (i.e. issues that may rise on the start of the upload due to Service Lifecycle Problem) would fix the other ones as well?

Let me know what you think. Thanks. I still owe you test scenarios but I thought maybe these would give you enough clues for debug.

gotev added a commit that referenced this issue Dec 25, 2020
…fication, to prevent android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground()
@gotev
Copy link
Owner

gotev commented Dec 27, 2020

@Borna-f many different scenarios can happen. Let's start tackling the first one 😉 and then see how it goes. Look at the service-lifecycle branch and the commit I made and integrate it into your branch. So far the CI has not failed in my case with that modification.

gotev added a commit that referenced this issue Jan 2, 2021
…fication, to prevent android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() (#551)
Borna-f pushed a commit to biosensics/android-upload-service that referenced this issue Jan 4, 2021
… notification, to prevent android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() (gotev#551)
Borna-f pushed a commit to biosensics/android-upload-service that referenced this issue Jan 5, 2021
… notification, to prevent android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() (gotev#551)
Borna-f added a commit to biosensics/android-upload-service that referenced this issue Jan 5, 2021
* gotev#549: startForeground immediately in onCreate with a placeholder notification, to prevent android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() (gotev#551)

* fix: changes assert condition to be lower or equal for interrupted uploads

Co-authored-by: Alex Gotev <alex@gotev.net>
@gotev
Copy link
Owner

gotev commented Jan 10, 2021

@chimmelb @Borna-f released 4.5.2 with the above fix. Let me know if the crash still happens.

@gotev gotev changed the title App crash on upload finished? android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() Jan 10, 2021
@gotev
Copy link
Owner

gotev commented Jan 10, 2021

The observed pattern leading to the android.app.RemoteServiceException: Context.startForegroundService() did not then call Service.startForeground() exception is:

t0: user triggers a new upload. The intent is fired.
t1: the intent is received by Android. The Service gets created.
t2: android system calls Service's `onCreate`
t3: android system calls Service's `onStartCommand`

(t3-t2) >= 5 seconds
Android system throws the exception

Before 629ddc7 the startForeground happened only in onStartCommand. Now a placeholder notification gets immediately shown in onCreate to prevent problems due to (t3-t2) >= 5 seconds.

After many unsuccessful retries to create a unit test which replicates this behavior systematically, I haven't found evidence that the problem is due to something in Upload Service's code. Perhaps it is a runtime scenario which happens when the workload of the device is heavy at that specific moment, slowing down the single Service lifecycle method invocation. With the proposed technique, this scenario should not cause exceptions anymore.

@gotev gotev closed this as completed Jan 17, 2021
@gotev
Copy link
Owner

gotev commented Jan 24, 2021

Got some more insights on this: https://stackoverflow.com/a/50691494

Other devs reported this sequence:

t0: user triggers a new upload. The intent is fired.
t1: the intent is received by Android.
t2: android system calls Service's `onCreate`

(t2-t1) >= 5 seconds
Android system throws the exception

If this is the sequence leading to a crash, I don't think there's something which can be done in the library to prevent it, as it's a Google Design Flaw.

@mlazowik
Copy link

mlazowik commented Feb 10, 2021

We are also still seeing this error sometimes (even at 4.5.2). I've found a related issue in google's tracker, and there are some workaround suggestions in there, in particular this one seems promising: https://issuetracker.google.com/issues/76112072#comment158

@gotev
Copy link
Owner

gotev commented Feb 10, 2021

@mlazowik thank you for the link! This would change the lifecycle and maybe some mechanisms. I would love to see a PR and continue the discussion on it 😉

@mlazowik
Copy link

Great! Someone from my company should deliver in the coming weeks. Do you think this issue should be reopened?

@gotev gotev reopened this Feb 10, 2021
@gotev
Copy link
Owner

gotev commented Feb 10, 2021

@mlazowik great! Waiting for the PR then. If it gets too cumbersome or troublesome, I'm also considering moving away from the Service architecture entirely and use WorkManager instead

@xOIBrandon
Copy link
Contributor

Hey, has there been any movement on this? I not really an Android dev but I'd happy to take a stab at it. This error has been popping up in our logs a lot and I've been tasked to do something about it.

@gotev
Copy link
Owner

gotev commented Apr 1, 2021

You can go ahead

@gotev gotev added the android bug Acknowledged bug which is seen in Upload Service, but it's caused by an underlying Android Bug label Apr 13, 2021
@gotev
Copy link
Owner

gotev commented Apr 13, 2021

I think #578 and #579 are the best mitigations possible at the current state of the art. There's no way to fully prevent the RemoteServiceException from happening: #579 (comment) but at least now it's confined to the following use-case:

  • Android API 26+
  • App in the background
  • Upload intent fired
  • Android system unable to guarantee the 5 seconds time limit to start the service

Only the switch to WorkManager could fully prevent those crashes.

@gotev
Copy link
Owner

gotev commented Apr 17, 2021

Hopefully newly released 4.5.3 will reduce this scenario

@gotev gotev closed this as completed Apr 17, 2021
@mlazowik
Copy link

Thank you @xOIBrandon and @gotev!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
android bug Acknowledged bug which is seen in Upload Service, but it's caused by an underlying Android Bug
Projects
None yet
Development

No branches or pull requests

5 participants
@mlazowik @gotev @Borna-f @xOIBrandon and others