Closed Bug 1583357 Opened 5 years ago Closed 5 years ago

GcmInstanceIDListenerService invokes GcmTokenRefreshService, and ultimately PushService.onRefresh() on the wrong thread

Categories

(Firefox for Android Graveyard :: General, defect, P1)

Firefox 63
Unspecified
Android
defect

Tracking

(firefox-esr60 unaffected, firefox-esr68 verified, firefox69 unaffected, firefox70 unaffected, firefox71 unaffected)

VERIFIED FIXED
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- verified
firefox69 --- unaffected
firefox70 --- unaffected
firefox71 --- unaffected

People

(Reporter: robwu, Assigned: petru)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

Yesterday, I launched Nightly (after three months of not using it), and it crashed soon after startup. After submitting the report, it crashed again, so I thought that I had a bad build and updated to the latest version. But then it kept crashing over and over again (seemingly with an increasingly longer delay between each crash).

After checking the crash report, I think that this is a regression by bug 1467840.
In https://hg.mozilla.org/mozilla-central/rev/e16e23725134 , the invocation of GcmInstanceIDListenerService moved from ThreadUtils.postToBackgroundThread to JobIntentService. I'm no expert here, but I think that these two threads may be different, which causes the ThreadUtils.assertOnBackgroundThread() assertion to fail.

A fix could be to either remove the assertion, or to re-instate the post-to-background thread call.

Reports (query, from myself only):

Stack trace:

java.lang.IllegalThreadStateException
	at org.mozilla.gecko.util.ThreadUtils.assertOnThreadComparison(ThreadUtils.java:184)
	at org.mozilla.gecko.util.ThreadUtils.assertOnThread(ThreadUtils.java:155)
	at org.mozilla.gecko.util.ThreadUtils.assertOnBackgroundThread(ThreadUtils.java:147)
	at org.mozilla.gecko.push.PushService.onRefresh(PushService.java:193)
	at org.mozilla.gecko.gcm.GcmTokenRefreshService.onHandleWork(GcmTokenRefreshService.java:32)
	at android.support.v4.app.JobIntentService$CommandProcessor.doInBackground(JobIntentService.java:392)
	at android.support.v4.app.JobIntentService$CommandProcessor.doInBackground(JobIntentService.java:383)
	at android.os.AsyncTask$3.call(AsyncTask.java:378)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
	at java.lang.Thread.run(Thread.java:919)

[Tracking Requested - why for this release]:

Thanks for filing this Rob!
Indeed it seems to be a regression caused by bug 1467840.
There are quite a few reports that warrant a fix but it's interesting that there aren't more so we could have caught this sooner.

The full exception message is

java.lang.IllegalThreadStateException: Expected thread 6629 ("GeckoBackgroundThread"), but running on thread 6625 ("AsyncTask #1")

So we see that we are running on a background thread, just not on GeckoBackgroundThread, which is to be expected as after bug 1467840 this background work is to be done by the JobScheduler which manages it's own thread pool.

In my opinion assertOnBackgroundThread which is used leniently across the app should account not just for GeckoBackgroundThread but even for other threads which may be used for background work.

In the case of PushService.onRefresh we use that assert as more of a sanity check to ensure we'll not do a potential expensive and long lived operation on the Main thread.
To still have this assert benefit but not modify a widely used api at this point in Fennec's life I'm thinking that using assertNotOnUiThread in place of assertOnBackgroundThread would be the best solution.

Assignee: nobody → petru.lingurar
Status: NEW → ASSIGNED
Priority: -- → P1
Version: unspecified → Firefox 63

The previously used assertOnBackgroundThread() would assert the code is running
on the GeckoBackgroundThread, not on any background thread - what we want.
assertNotOnUiThread() would ensure the code started from a JobIntentService
is properly run on a background thread, whatever it might be.

Try build with my patch applied on top of esr68 -
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a2f3340507092aa3a5af85e82bb2470cce086e24

How can this be tested:

  • Just start the app and make sure everything works, no crashes.
  • There would be some logs with the GeckoPushService tag. Seeing them would mean PushService is working.
Flags: qe-verify+

Hi, I tried to reproduce the issue before the fix but I couldn't. Because I am not able to confirm the patch indeed resolved it or not. Will leave the ticket status as FIXED but will not mark it as VERIFIED.
The checking was done with the build from Comment 3 with Google Pixel 3a XL and Sony Xperia Z5(Android 7.0.0)
Note:

Comment on attachment 9095533 [details]
Bug 1583357 - Use assertNotOnUiThread to assert running on a background thread; r?VladBaicu

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Fix for a highly occurring crash, with the potential to have a lot more occurrences.
  • User impact if declined: Application crash in otherwise normal usage.
  • Fix Landed on Version: 68
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): No logic change, just the condition of an assertion which previously could deliberately crash the app.
  • String or UUID changes made by this patch:
Attachment #9095533 - Flags: approval-mozilla-esr68?

Comment on attachment 9095533 [details]
Bug 1583357 - Use assertNotOnUiThread to assert running on a background thread; r?VladBaicu

Approved for Fennec 68.2b6.

Attachment #9095533 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
QA Whiteboard: [qa-triaged]

Hi I did not reproduced the issue with 68.2b6 and with device Google Pixel 3a XL (Android 9) and Sony Xperia Z(Android 7.0):

FYI (as the one who originally encountered and reported the issue): After updating to Fennec with the fix, the crash did not happen any more.

Hello, I was not able to reproduce the problem on Firefox RC 68.2.0 using Xiaomi Mi 8 Lite (Android 9) and OnePlus 5T(Android 9).
Due to my findings and that the reporter can't reproduce anymore the problem, I'll mark this ticket as Verified and remove the qe-verify+ flag, Thanks.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
QA Whiteboard: [qa-triaged]
Product: Firefox for Android → Firefox for Android Graveyard
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: