Last Comment Bug 788216 - java.lang.IllegalStateException: Hardware acceleration can only be used with a single UI thread on JB
: java.lang.IllegalStateException: Hardware acceleration can only be used with ...
Status: VERIFIED FIXED
[native-crash][startupcrash]
: crash, reproducible, topcrash
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: 16 Branch
: ARM Android
: -- critical (vote)
: Firefox 19
Assigned To: Kartikaya Gupta (email:kats@mozilla.com)
:
: Sebastian Kaspari (:sebastian)
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-09-04 11:21 PDT by Brad Lassey [:blassey] (use needinfo?)
Modified: 2016-07-29 14:29 PDT (History)
10 users (show)
ryanvm: in‑testsuite-
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
affected
+
verified
verified
verified
+


Attachments
Patch (3.08 KB, patch)
2012-10-22 22:06 PDT, Kartikaya Gupta (email:kats@mozilla.com)
gpascutto: review+
lukasblakk+bugs: approval‑mozilla‑aurora+
lukasblakk+bugs: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Brad Lassey [:blassey] (use needinfo?) 2012-09-04 11:21:32 PDT
This is currently our #1 top crash on Fx15. It looks like a dialog is being handled on a our background thread, here is the stack:

java.lang.IllegalStateException: Hardware acceleration can only be used with a single UI thread.
Original thread: Thread[main,5,main]
Current thread: Thread[GeckoBackgroundThread,5,main]
	at android.view.HardwareRenderer$GlRenderer.checkCurrent(HardwareRenderer.java:1284)
	at android.view.HardwareRenderer$Gl20Renderer.safelyRun(HardwareRenderer.java:1482)
	at android.view.HardwareRenderer$Gl20Renderer.destroyHardwareResources(HardwareRenderer.java:1506)
	at android.view.ViewRootImpl.destroyHardwareRenderer(ViewRootImpl.java:4012)
	at android.view.ViewRootImpl.die(ViewRootImpl.java:3946)
	at android.view.WindowManagerImpl.removeViewLocked(WindowManagerImpl.java:402)
	at android.view.WindowManagerImpl.removeView(WindowManagerImpl.java:350)
	at android.view.WindowManagerImpl$CompatModeWrapper.removeView(WindowManagerImpl.java:160)
	at android.app.Dialog.dismissDialog(Dialog.java:319)
	at android.app.Dialog$1.run(Dialog.java:119)
	at android.os.Handler.handleCallback(Handler.java:615)
	at android.os.Handler.dispatchMessage(Handler.java:92)
	at android.os.Looper.loop(Looper.java:137)
	at org.mozilla.gecko.GeckoBackgroundThread.run(GeckoBackgroundThread.java:31)


more reports can be found here: https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2012-09-04&signature=java.lang.IllegalStateException%3A%20Hardware%20acceleration%20can%20only%20be%20used%20with%20a%20single%20UI%20thread.%20Original%20thread%3A%20Thread%5Bmain%2C5%2Cmain%5D&version=FennecAndroid%3A15.0
Comment 1 Brad Lassey [:blassey] (use needinfo?) 2012-09-04 11:22:32 PDT
also note that this is a start up crash
Comment 2 Brad Lassey [:blassey] (use needinfo?) 2012-09-04 11:46:51 PDT

*** This bug has been marked as a duplicate of bug 769894 ***
Comment 3 Scoobidiver (away) 2012-10-10 02:51:07 PDT
Let's reopen it for the remaining crashes. Indeed, it's #8 top crasher in 16.0.
Comment 4 Alex Keybl [:akeybl] 2012-10-10 11:05:59 PDT
Scoobidiver - are there any correlations to specific Android versions, devices, or kernels?

Brad - let us know if there's anything besides an initial qa investigation with affected URLs that you may need.
Comment 5 Scoobidiver (away) 2012-10-10 11:51:33 PDT
(In reply to Alex Keybl [:akeybl] from comment #4)
> Scoobidiver - are there any correlations to specific Android versions,
> devices, or kernels?
It's currently a pure JB crash so it seems bug 769894 has fixed crashes on ICS.
Those kinds of startup crashes spike after each release.
Comment 7 Aaron Train [:aaronmt] 2012-10-15 14:21:14 PDT
I haven't been able to reproduce this; Nexus 7/Galaxy Nexus.
Comment 8 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-10-15 14:24:39 PDT
Devices:
Rockchip U30GT-H
Samsung GT-P1000
Motorola Xoom 	49
Asus Transformer Prime TF201 	43
ASUS Transformer Pad TF300T 	13
Samsung Galaxy Nexus 	9
Samsung Nexus S 	5
Asus Nexus 7 	2
Acer A500 	2
Samsung Nexus S 4G 	1
Samsung GT-N7100 	1
Motorola MZ601 	1
ASUS Transformer Pad TF300TG 	1
Amazon Kindle Fire 	1
Samsung GT-I9000B 	1
Samsung GT-I9100
Comment 9 Brad Lassey [:blassey] (use needinfo?) 2012-10-15 16:17:21 PDT
why are we tracking this? it is the #12 crash on 16, #72 on 17 and doesn't exist on 18 or 19.
Comment 10 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-15 16:32:19 PDT
This is at #74 on the topcrasher list with only 20 crashes on 17.0b1 so untracking.
Comment 11 Scoobidiver (away) 2012-10-16 00:33:28 PDT
(In reply to Brad Lassey [:blassey] from comment #9)
> why are we tracking this? it is the #12 crash on 16, #72 on 17 and doesn't
> exist on 18 or 19.
This startup crash spikes after each release, but doesn't spike in other channels, so it will be again a #10 top crasher in 17.0.
Comment 12 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-17 11:24:32 PDT
(In reply to Scoobidiver from comment #11)
> (In reply to Brad Lassey [:blassey] from comment #9)
> > why are we tracking this? it is the #12 crash on 16, #72 on 17 and doesn't
> > exist on 18 or 19.
> This startup crash spikes after each release, but doesn't spike in other
> channels, so it will be again a #10 top crasher in 17.0.

Scoobidiver, looking at the link in comment 0 there's a spike on 15.0 release but not 15.0.1, 16.0, or 16.0.1 -- is there evidence somewhere I'm not seeing that this is going to spike for 17's release?
Comment 13 Scoobidiver (away) 2012-10-17 11:41:50 PDT
(In reply to Lukas Blakk [:lsblakk] from comment #12)
> Scoobidiver, looking at the link in comment 0 there's a spike on 15.0
> release but not 15.0.1, 16.0, or 16.0.1 -- is there evidence somewhere I'm
> not seeing that this is going to spike for 17's release?
The link in comment 0 doesn't take into account crashes after Sept 4. Use this link instead: https://crash-stats.mozilla.com/report/list?signature=java.lang.IllegalStateException%3A%20Hardware%20acceleration%20can%20only%20be%20used%20with%20a%20single%20UI%20thread.%20Original%20thread%3A%20Thread[main%2C5%2Cmain%5D
Comment 14 Alex Keybl [:akeybl] 2012-10-17 15:37:14 PDT
If we think this is a device/OS-specific startup crash that users are seeing if we've resolved after each release, there may be value to continue investigating. Otherwise, a spike only after release isn't worth devoting a ton of resources to (and may have something to do with the update itself?).

Brad - what do you think?
Comment 15 Alex Keybl [:akeybl] 2012-10-17 15:38:20 PDT
(I understand this is JB-specific, so what I really meant to ask is if we think this is device-specific and persistent)
Comment 16 Scoobidiver (away) 2012-10-17 22:46:16 PDT
(In reply to Alex Keybl [:akeybl] from comment #14)
> If we think this is a device/OS-specific startup crash that users are seeing
> if we've resolved after each release, there may be value to continue
It happens on many devices with any GPU. Note that it was higher in 15.0.1 and has been partially fixed in 16.0.1 and above by bug 769894.
Comment 17 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-18 11:31:08 PDT
Indeed, this has 4 times less crash volume on 16.0.1 compared to 15.0.
Will wait for Brad to make the call on questions in comment 14.
Comment 18 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-22 15:18:37 PDT
Does assigning this to Kats mean a fix is being investigated here for Beta?  We have a couple more weeks of Beta to consider something for uplift.
Comment 19 Kartikaya Gupta (email:kats@mozilla.com) 2012-10-22 22:06:57 PDT
Created attachment 674116 [details] [diff] [review]
Patch

After looking at all the Dialog instances we create, I found exactly one that is created on a non-UI thread. (Note that Dialog objects must be created on the UI thread so that the Handler object they hold on to is also for the UI thread). Attached patch moves the creation into the main UI thread, but I haven't tested the patch other than making sure it compiles and pushing it to try. [1]

:gcp, perhaps you can provide STR? I'm not sure under exactly what conditions this code runs. That would allow us to verify that the patch fixes the problem.

[1] https://tbpl.mozilla.org/?tree=Try&rev=d57bb536a439

Also pre-emptively requesting approval for aurora and beta since we're on a rushed timeline and I'm fairly sure this is the right dialog to be tinkering with.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 746365, I think
User impact if declined: startup crash under some circumstances when profile migrator runs
Testing completed (on m-c, etc.): none
Risk to taking this patch (and alternatives if risky): unsure, perhaps gcp can provide more info
String or UUID changes made by this patch: none
Comment 20 Gian-Carlo Pascutto [:gcp] 2012-10-23 00:18:56 PDT
STR:

1) Copy a places.sqlite from a desktop profile to your mobile profile dir.
2) Remove shared_prefs/ProfileMigrator.xml in your profile.
3) Quit and relaunch Fennec.
Comment 21 Gian-Carlo Pascutto [:gcp] 2012-10-23 00:20:41 PDT
Comment on attachment 674116 [details] [diff] [review]
Patch

Review of attachment 674116 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok to me.

::: mobile/android/base/GeckoApp.java
@@ +2345,5 @@
> +                        // create it in startCallback and still find a reference to it
> +                        // in stopCallback. (We must create it on the UI thread to fix
> +                        // bug 788216). Note that synchronization is not a problem here
> +                        // since it is only ever touched on the UI thread.
> +                        final SetupScreen[] setupScreenHolder = new SetupScreen[1];

So you need an array here because this must both be final to go into the Runnable, but not final in the sense that you need to change the contents. Neat.
Comment 22 Gian-Carlo Pascutto [:gcp] 2012-10-23 00:26:29 PDT
Actually, alternate STR:

Uninstall Fennec. Install Fennec. Run.

I'm pretty sure that's the STR that we're seeing here; on a fresh profile we need to run the code once to mark that there's nothing to migrate. I was wondering how this code could be an issue on Jelly Bean given that XUL Fennec was deprecated by the time that became available, and especially given that it only runs once and the check to see if it has ran is before the bug. But that's exactly the problem: it needs to run once for *every* user, not just the ones that actually have something to migrate.
Comment 23 Kartikaya Gupta (email:kats@mozilla.com) 2012-10-23 07:53:09 PDT
(In reply to Gian-Carlo Pascutto (:gcp) from comment #20)
> STR:
> 
> 1) Copy a places.sqlite from a desktop profile to your mobile profile dir.
> 2) Remove shared_prefs/ProfileMigrator.xml in your profile.
> 3) Quit and relaunch Fennec.

I was able to repro the crash on a Nightly build using these STR. (Crash report ID at https://crash-stats.mozilla.com/report/index/bp-38c00446-2e13-4bfe-85f3-ff4002121023). The crash occurs once and then does not re-occur when starting Fennec again. This makes sense from the code because the crash happens at the end of the profile migration, so the next time you start up, profile migration is done and doesn't run again.

I installed the build from my try push in comment #19 and verified that the crash does not happen with the same STR on that build.
Comment 24 Kartikaya Gupta (email:kats@mozilla.com) 2012-10-23 07:55:44 PDT
Comment on attachment 674116 [details] [diff] [review]
Patch

I'm happy with one r+ since I've verified that the fix works.

https://hg.mozilla.org/integration/mozilla-inbound/rev/df63d31cb673
Comment 25 Ryan VanderMeulen [:RyanVM] 2012-10-23 19:58:35 PDT
https://hg.mozilla.org/mozilla-central/rev/df63d31cb673
Comment 26 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-24 15:33:58 PDT
"Risk to taking this patch (and alternatives if risky): unsure, perhaps gcp can provide more info"

Gian-Carlo - can you respond to this?  We want to take this, and we still have time to backout if it goes into Beta 4 but please give examples of what regressions might result from this.
Comment 27 Gian-Carlo Pascutto [:gcp] 2012-10-24 23:22:01 PDT
No real risk here, we're moving an operation from the wrong thread to the correct thread. The patch has been tested and verified to work. We want this.
Comment 29 Kevin Brosnan [:kbrosnan] 2013-01-02 13:38:25 PST
Verified fixed by crash stats.

Note You need to log in before you can comment on or make changes to this bug.