Closed Bug 1296298 Opened 3 years ago Closed 3 years ago

Crash in mozilla::AndroidBridge::OpenGraphicsLibraries

Categories

(Firefox for Android :: Toolbar, defect, P1, critical)

48 Branch
Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
Firefox 51
Tracking Status
firefox48 --- wontfix
firefox49 + fixed
firefox50 --- fixed
firefox51 --- fixed

People

(Reporter: marcia, Assigned: snorp)

Details

(Keywords: crash)

Crash Data

Attachments

(2 files, 4 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-486c3823-6ed6-49ed-beaf-3f8cd2160818.
=============================================================

Seen while looking at 48 crash stats - #13 top startup crash: http://bit.ly/2bAiPpF.

ni on snorp since it appears he worked in that area of code.
Flags: needinfo?(snorp)
It looks like this could happen when you are casting your screen to a Chromecast when starting Fennec. I'll try to take a look.
Assignee: nobody → snorp
Flags: needinfo?(snorp)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #1)
> It looks like this could happen when you are casting your screen to a
> Chromecast when starting Fennec. I'll try to take a look.

Corroborated by the comments, where I see "Google Cast breaks Firefox. Crashes when casting. "
FWIW, I am seeing this affect versions back to Fennec 43 however there are some historical spikes.
=========================================================
Mar 12, 2016:  ~79 crashes > ~114 crashes ( 44% increase) 
Apr 15, 2016: ~112 crashes > ~149 crashes ( 33% increase)
Jun 10, 2016: ~158 crashes > ~192 crashes ( 22% increase) 
Jun 30, 2016: ~211 crashes > ~496 crashes (135% increase)
Aug 10, 2016: ~475 crashes > ~682 crashes ( 44% increase)
=========================================================
Overall 763% increase over the last 6 months!

As far as I can tell these dates do not correlate to any internal events (releases, merges, etc) so maybe it's simply due to more people trying to use Chromecast and Fennec? I really don't know.
Comment on attachment 8783706 [details] [diff] [review]
Delay adding presentation surface on Android until we are up and running

Review of attachment 8783706 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/java/org/mozilla/gecko/PresentationMediaPlayerManager.java
@@ +123,4 @@
>          public void surfaceChanged(SurfaceHolder holder, int format, int width,
>                                     int height) {
>              // Surface changed so force a composite
>              invalidateAndScheduleComposite(mGeckoView);

I think you should delay this one too.

@@ +128,5 @@
>  
>          @Override
>          public void surfaceCreated(SurfaceHolder holder) {
> +            queueNativeCallUntil(State.PROFILE_READY, PresentationMediaPlayerManager.class,
> +                                 "addPresentationSurface", mGeckoView, holder.getSurface());

> "addPresentationSurface", GeckoView.class, mGeckoView, Surface.class, holder.getSurface());

@@ +133,5 @@
>          }
>  
>          @Override
>          public void surfaceDestroyed(SurfaceHolder holder) {
>              removePresentationSurface();

And delay this one.
Attachment #8783706 - Flags: review?(nchen) → feedback+
(In reply to Jim Chen [:jchen] [:darchons] from comment #5)
> Comment on attachment 8783706 [details] [diff] [review]
> Delay adding presentation surface on Android until we are up and running
> 
> Review of attachment 8783706 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> :::
> mobile/android/base/java/org/mozilla/gecko/PresentationMediaPlayerManager.
> java
> @@ +123,4 @@
> >          public void surfaceChanged(SurfaceHolder holder, int format, int width,
> >                                     int height) {
> >              // Surface changed so force a composite
> >              invalidateAndScheduleComposite(mGeckoView);
> 
> I think you should delay this one too.
> 

Probably so, yup.

> @@ +128,5 @@
> >  
> >          @Override
> >          public void surfaceCreated(SurfaceHolder holder) {
> > +            queueNativeCallUntil(State.PROFILE_READY, PresentationMediaPlayerManager.class,
> > +                                 "addPresentationSurface", mGeckoView, holder.getSurface());
> 
> > "addPresentationSurface", GeckoView.class, mGeckoView, Surface.class, holder.getSurface());
> 
> @@ +133,5 @@
> >          }
> >  
> >          @Override
> >          public void surfaceDestroyed(SurfaceHolder holder) {
> >              removePresentationSurface();
> 
> And delay this one.

I disagree here. Once this method returns, the surface will be invalid. We probably need *additional* synchronization here, not less.
I guess we should still check if gecko is running, though. If it's not, we should cancel any pending calls from the changed and created events.
Attachment #8783706 - Attachment is obsolete: true
Comment on attachment 8784390 [details] [diff] [review]
Add GeckoThread.cancelQueuedNativeCalls

Review of attachment 8784390 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoThread.java
@@ +340,5 @@
>              queueNativeCallLocked(obj.getClass(), methodName, obj, args, State.RUNNING);
>          }
>      }
>  
> +    public static boolean cancelQueuedNativeCalls(final Object obj, final String methodName) {

There should be two overloads for either Class or Object.

@@ +346,5 @@
> +        synchronized (QUEUED_CALLS) {
> +            Iterator<QueuedCall> it = QUEUED_CALLS.iterator();
> +            while (it.hasNext()) {
> +                QueuedCall call = it.next();
> +                if (call != null && call.target == obj && call.method.getName() == methodName) {

`call.target` is null (and not Foo.class) for static methods. You need to check `call.method.getDeclaringClass()`.
Attachment #8784390 - Flags: review?(nchen)
Comment on attachment 8784391 [details] [diff] [review]
Delay adding presentation surface on Android until we are up and running

Review of attachment 8784391 [details] [diff] [review]:
-----------------------------------------------------------------

Actually I'm not sure this would work too well, because these calls are expected to be called on the UI thread, but when you queue them they will possibly get called on the Gecko thread.

::: mobile/android/base/java/org/mozilla/gecko/PresentationMediaPlayerManager.java
@@ +123,5 @@
>          public void surfaceChanged(SurfaceHolder holder, int format, int width,
>                                     int height) {
>              // Surface changed so force a composite
> +            GeckoThread.queueNativeCallUntil(GeckoThread.State.PROFILE_READY, PresentationMediaPlayerManager.class,
> +                                            "invalidateAndScheduleComposite", mGeckoView);

Second line should be,
> "invalidateAndScheduleComposite", GeckoView.class, mGeckoView);

@@ +129,5 @@
>  
>          @Override
>          public void surfaceCreated(SurfaceHolder holder) {
> +            GeckoThread.queueNativeCallUntil(GeckoThread.State.PROFILE_READY, PresentationMediaPlayerManager.class,
> +                                             "addPresentationSurface", mGeckoView, holder.getSurface());

Second line should be,
> "addPresentationSurface", GeckoView.class, mGeckoView, Surface.class, holder.getSurface());
Attachment #8784391 - Flags: review?(nchen)
This goes a different route and just avoids adding the presentation stuff until Gecko is running
Attachment #8785455 - Flags: review?(nchen)
Attachment #8784390 - Attachment is obsolete: true
Attachment #8784391 - Attachment is obsolete: true
Comment on attachment 8785455 [details] [diff] [review]
Don't add a presentation for MediaRouter until Gecko is running

Nevermind, this doesn't work because you can't register multiple listeners for Gecko:Ready. Ugh.
Attachment #8785455 - Flags: review?(nchen)
OK, now I'm confused as to how we ever end up with a null AndroidBridge, because the presentation stuff is only initialized in Gecko:DelayedStartup[0]. That means Gecko is definitely up and we should have an AndroidBridge. The only thing that makes sense is that we have destroyed the AndroidBridge because we're exiting, but I guess haven't finished the browser activity? Weirdness.

[0] https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/BrowserApp.java#1899
This is very strange. The log here[0] shows that we crash before browser.js does any work ('chrome startup finished' message is after the crash but at the beginning of BrowserApp.startup), which means it was impossible for Gecko:DelayedStartup to have been sent, and therefore PresentationMediaPlayerManager could not have been created. Yet it does, because it's making JNI calls.


[0] https://crash-stats.mozilla.com/report/index/eb7fed95-513e-47cc-b7b0-fb6002160826
We also don't see the log message from BrowserApp ("Found tag") when it's creating the PresentationMediaPlayerManager. So somehow an instance is being created elsewhere, though I don't see how it's possible. It's using Fragment stuff that I don't really understand, so maybe that's somehow involved? Anyway, I think I'm just going to guard using the GeckoThread state and leave it at that.
Attachment #8785455 - Attachment is obsolete: true
Comment on attachment 8785463 [details] [diff] [review]
Don't try to add a presentation surface unless Gecko is running

Review of attachment 8785463 [details] [diff] [review]:
-----------------------------------------------------------------

Can we also disable the casting button before Gecko is ready?
Attachment #8785463 - Flags: review?(nchen) → review+
(In reply to Jim Chen [:jchen] [:darchons] from comment #18)
> Comment on attachment 8785463 [details] [diff] [review]
> Don't try to add a presentation surface unless Gecko is running
> 
> Review of attachment 8785463 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Can we also disable the casting button before Gecko is ready?

This is different from casting a video. This particular crash is from when you cast your entire screen to another device, so we can't do anything about that.
Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1cef2ab65912
Don't try to add a presentation surface unless Gecko is running r=jchen
https://hg.mozilla.org/mozilla-central/rev/1cef2ab65912
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
James, the volume is important on release. Do we want to uplift to aurora & beta?
Flags: needinfo?(snorp)
Comment on attachment 8785463 [details] [diff] [review]
Don't try to add a presentation surface unless Gecko is running

Approval Request Comment
[Feature/regressing bug #]: Unknown
[User impact if declined]: Crashes on startup when casting screen
[Describe test coverage new/current, TreeHerder]: Nightly
[Risks and why]: Low
[String/UUID change made/needed]: None
Flags: needinfo?(snorp)
Attachment #8785463 - Flags: approval-mozilla-beta?
Attachment #8785463 - Flags: approval-mozilla-aurora?
Comment on attachment 8785463 [details] [diff] [review]
Don't try to add a presentation surface unless Gecko is running

Startup crash fix, Aurora50+
Attachment #8785463 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Needs rebasing for Aurora.
Flags: needinfo?(snorp)
Comment on attachment 8785463 [details] [diff] [review]
Don't try to add a presentation surface unless Gecko is running

Avoid a casting startup crash, let's take this for beta as well.
Attachment #8785463 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
fyi, still needs rebasing, and in risk of missing the release i guess
Flags: needinfo?(sledru)
Flags: needinfo?(lhenry)
I'll rebase this today
Flags: needinfo?(snorp)
Flags: needinfo?(lhenry)
This is the rebase patch for aurora/beta (I already pushed to aurora)
Comment on attachment 8785463 [details] [diff] [review]
Don't try to add a presentation surface unless Gecko is running

[Triage Comment]

We should get this onto mozilla-release so as to avoid a big startup crash for fennec in 49.
Attachment #8785463 - Flags: approval-mozilla-release+
We'll need to verify this by checking https://crash-stats.mozilla.com, after the RC build and next Beta build will be released on Play Store.
You need to log in before you can comment on or make changes to this bug.