Closed Bug 1064669 Opened 10 years ago Closed 9 years ago

GooglePlayServicesUtil.isGooglePlayServicesAvailable takes ~212ms during startup

Categories

(Firefox for Android Graveyard :: Screencasting, defect)

All
Android
defect
Not set
normal

Tracking

(firefox32 wontfix, firefox33- wontfix, firefox34+ wontfix, firefox35+ wontfix, firefox36+ wontfix, firefox37 fixed, fennec37+)

RESOLVED FIXED
Firefox 37
Tracking Status
firefox32 --- wontfix
firefox33 - wontfix
firefox34 + wontfix
firefox35 + wontfix
firefox36 + wontfix
firefox37 --- fixed
fennec 37+ ---

People

(Reporter: mfinkle, Assigned: wesj)

References

Details

(Keywords: perf)

Attachments

(1 file, 6 obsolete files)

We call it here:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/ChromeCast.java#143

I don't know if there is a way to lazy start the media player manager or not. The route issue here is that isGooglePlayServicesAvailable ends up taking a lot of time due to certificate handling.
I can upload the trace file somewhere if anyone wants to take a look.
Keywords: perf
OS: Linux → Android
Hardware: x86_64 → All
Looks like this *should* be as simple as not triggering onRouteAdded until later.

I guess we can do that either by calling MPM.init later (in delayedstartup?), or by delaying rout handling.

Right now we init (via reflection) at the end of BrowserApp.onCreate.

Call stack for onRouteAdded:

  isGo…
  ChromeCast (constructor)
  MPM.getMediaPlayerForRoute
  MediaRouter.Callback.onRouteAdded
I wonder if we could delay until we get this message from the JS service:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/modules/SimpleServiceDiscovery.jsm#179
Can we move the init onto the background thread?
Blocks: 1056176
Pretty sure we want to be tracking this…
tracking-fennec: --- → ?
Assignee: nobody → wjohnston
tracking-fennec: ? → 33+
Attached patch Patch (obsolete) — Splinter Review
Moved this to DelayedStartup. I also added an AppConstants check outside it (although the same check is done in getMediaPlayerManager() so its a bit redundant, I was hoping proguard would remove more of this if we have it here).

This seems like the simplest fix to put in and uplift to beta.
Attachment #8491907 - Flags: review?(rnewman)
Comment on attachment 8491907 [details] [diff] [review]
Patch

Review of attachment 8491907 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm!

I'd encourage you to do this prior to the stumbler block, though. Stumbler is the lowest priority thing, even in delayed startup.
Attachment #8491907 - Flags: review?(rnewman) → review+
Attached patch Patch v2 (obsolete) — Splinter Review
I'm running something a bit like this on try.

https://tbpl.mozilla.org/?tree=Try&rev=ac0e4190a7e3

I think the issue is that we're not getting an onResume() call now, because we're setting up in delayed startup. This forces us to call onResume() if we're already unpaused when the listener is added. Does this seem sane rnewman?
Attachment #8491907 - Attachment is obsolete: true
Attachment #8494720 - Flags: review?(rnewman)
Flags: needinfo?(wjohnston)
Trying again. Needed to initialize the MediaRouter earlier:

https://tbpl.mozilla.org/?tree=Try&rev=8b4b62a9fde8
(In reply to Wesley Johnston (:wesj) from comment #12)
> Trying again. Needed to initialize the MediaRouter earlier:
> 
> https://tbpl.mozilla.org/?tree=Try&rev=8b4b62a9fde8

Looks OK, but I retriggered some runs.
Comment on attachment 8494720 [details] [diff] [review]
Patch v2

>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java
>     public void onResume()
>     {

>+            mIsPaused = true;

That seems wrong

This seems like an overall scary change.
Comment on attachment 8494720 [details] [diff] [review]
Patch v2


>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java

>-    public void addAppStateListener(GeckoAppShell.AppStateListener listener) {
>+    public synchronized void addAppStateListener(GeckoAppShell.AppStateListener listener) {
>         mAppStateListeners.add(listener);
>+
>+        // If we're already resumed, make sure the listener gets a notification.
>+        if (!isPaused()) {
>+            listener.onResume();
>+        }

Is this a potential problem for all AppStateListeners?
I think (In reply to Mark Finkle (:mfinkle) from comment #14)
> Comment on attachment 8494720 [details] [diff] [review]
> Patch v2
> 
> >diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java
> >     public void onResume()
> >     {
> 
> >+            mIsPaused = true;
> 
> That seems wrong

Yeah. Typo. Fixed locally. Sorry.

It could be a problem for any AppStateListener that registers itself after startup. Basically, they can't expect every onResume() to be paired with an onPause(). But I agree its a bit scary. I originally just hacked this in for MediaPlayerManager, but moved it to this as I realized this is where the real bug is. These lifecycles are... too fragile. We should probably just make these viewless Fragments or something, so that Android can manage this for us (although I doubt that will actually make anything better).
(In reply to Wesley Johnston (:wesj) from comment #11)

> I think the issue is that we're not getting an onResume() call now, because
> we're setting up in delayed startup. This forces us to call onResume() if
> we're already unpaused when the listener is added.

If this is the case, couldn't you just force a call to MediaPlayerManager.onPause() from where we init it?
We could call onResume() when we init it? I started to do that and then worried that DelayedStartup could happen after we are put in the background. i.e. we could be paused at that point, so we needed some way to know which state we were in. I added a isPaused boolean to track the state in GeckoApp.

For some reason I decided it should be threadsafe here, but all the current queries are on the UI Thread. We could dump all that and this would look a lot simpler.
What if we tried something completely different?

The performance regression is when we create a new ChromeCast object in response to onRouteAdded:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/MediaPlayerManager.java#223

What if we tracked an "enabled" flag in MediaPlayerManager, defaulted to false, but set to true in DelayedStartup?

That way we would keep the lifecycle crap working for MediaPlayerManager, but we would avoid starting the slow code until we were ready.

Thoughts?
Attached patch Patch (obsolete) — Splinter Review
This is the same patch with most of the threading tricks removed. AFAIK, the only method that might not be on the UI thread is addAppStateListener, so I added an assert in there.

I'm not sure what you mean by the enabled trick. You want the callback to ignore mediaRoute's it finds until we're enabled? We can't just throw those away, so we'd have to cache them in a temporary cache in MediaPlayerManager so that when we were enabled we could set them up. That sounds more complicated and risky than this to me.
Attachment #8494720 - Attachment is obsolete: true
Attachment #8494720 - Flags: review?(rnewman)
Attachment #8496300 - Flags: review?(mark.finkle)
Comment on attachment 8496300 [details] [diff] [review]
Patch

OK. That's try it.

In my mind, the queue in MediaPlayerManager is cleaner and more isolated, but let's see how this works out.
Attachment #8496300 - Flags: review?(mark.finkle) → review+
Comment on attachment 8496300 [details] [diff] [review]
Patch

Approval Request Comment
[Feature/regressing bug #]: Chromecast support
[User impact if declined]: Startup delay
[Describe test coverage new/current, TBPL]: Startup tests caught this
[Risks and why]: This is pretty simple. Its a bit of a hammer, but a pretty straightforward simple fix.
[String/UUID change made/needed]: none.
Attachment #8496300 - Flags: approval-mozilla-beta?
Attachment #8496300 - Flags: approval-mozilla-aurora?
We have one beta left. I'm not sure that it's worth risk for the perf gain on beta. I'll let Sylvestre make that call. I think this is a good change for Aurora after it successfully lands on m-c.
https://hg.mozilla.org/mozilla-central/rev/a1d7b16101c7
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 35
Comment on attachment 8496300 [details] [diff] [review]
Patch

Yeh, it is a bit too late for optimization changes. Let it ride the train.
Attachment #8496300 - Flags: approval-mozilla-beta?
Attachment #8496300 - Flags: approval-mozilla-beta-
Attachment #8496300 - Flags: approval-mozilla-aurora?
Attachment #8496300 - Flags: approval-mozilla-aurora+
Depends on: 1076841
The WebRTC camera driver:

a) Needs to add appStateListeners off of the main thread. [1]
b) Needs to do so after the application has already started and is running.
c) Expects onResume to only be called after an onPause call.

This patch breaks every single one of those. I will back it out in 1076841.

[1] If you observe that it likely should only do so in a synchronized method or when listeners is a synchronized object, you'd be right!
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Firefox 35 → ---
Attachment #8496300 - Flags: approval-mozilla-aurora+
(In reply to Gian-Carlo Pascutto [:gcp] from comment #28)
> a) Needs to add appStateListeners off of the main thread. [1]
Sounds like we need to fix the threading here then (like the original patch did), or have you do this on the UIThread.

> b) Needs to do so after the application has already started and is running.
That's what we were fixing here.

> c) Expects onResume to only be called after an onPause call.
This feels broken to me. I assume they just call their onResume code themself when they add the listener? Can we fix this in webrtc?

Do we have bugs for these things? It might have been easier to just fix this (i.e. we can call onResume manually while we fix the broken webrtc code) rather than rush to backout.
(In reply to Wesley Johnston (:wesj) from comment #31)
> (In reply to Gian-Carlo Pascutto [:gcp] from comment #28)
> > a) Needs to add appStateListeners off of the main thread. [1]
> Sounds like we need to fix the threading here then (like the original patch
> did), or have you do this on the UIThread.

addAppStateListener/removeAppStateListener should indeed be synchronized and the fact that they're not is a bug which you found, see comment 28. Note that doing this properly requires synchronizing onPause/onResume as well, but that's perfectly legal.

Doing it on the main thread is problematic, at the very least the Camera driver needs to be sure those handlers are registered before it allocates the camera. Allowing those to be called off the main thread isn't a problem and makes that trivial. Doing it on the UI thread would require additional messaging back and forth.

> > b) Needs to do so after the application has already started and is running.
> That's what we were fixing here.

No, you misunderstand on how the logic works, see below:

> > c) Expects onResume to only be called after an onPause call.
> This feels broken to me. I assume they just call their onResume code
> themself when they add the listener? 

Not at all. They expect onResume to be called when the app is getting an onResume, not at a random point when the app is running fine and there's nothing being resumed.

You're trying to second guess the code here and assuming that it wants to hear about events that happened before it registered it's event handler. That's a very weird way to make event listeners work and the current code most certainly doesn't assume any such thing.

Adding mIsPaused seems like a fine addition, but the code that wants to register AppStateListeners should be the one using it, I don't think it's up the event system itself to second guess that code probably wants to get a phantom onResume event.

On a related note, the Camera code expects to only be running when the activity isn't paused. If that's an invalid assumption it'd need a fix to query mIsPaused as well. But my understanding is that the thing described in your comment 18 can't happen, i.e. no UI thread code can ever be called after onPause finishes.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #33)
> Not at all. They expect onResume to be called when the app is getting an
> onResume, not at a random point when the app is running fine and there's
> nothing being resumed.

I guess I'm using these as lifecycles callbacks on the object instead of "events". i.e. I want to know the app state. Getting an initial callback with the current state when you register doesn't feel wrong to me. But we can treat them differently if you want.
Or move the MediaPlayerManager to be a fragment instead....
[Tracking Requested - why for this release]: Noming to untrack. If we're not going to uplift this, I don't want it tracked.
The wontfix tag was enough ;)
tracking-fennec: 33+ → 34+
Wes - Are you still on this? Do you still intend to fix this bug in 34?
Flags: needinfo?(wjohnston)
Attached patch Patch (obsolete) — Splinter Review
The only difference here is this force calls onResume() if the app isn't paused when the app state listener is initialized. To do that I added an isPaused() method to GeckoApp and removed the onResume call I'd put in addAppStateListener. I also synchronized onResume, onPause, and isPaused on mIsPaused so that if someone queries isPaused while we're in the process of pausing it will wait. I don't think the performance implications there are huge?
Attachment #8496300 - Attachment is obsolete: true
Flags: needinfo?(wjohnston)
Attachment #8515229 - Flags: review?(rnewman)
Comment on attachment 8515229 [details] [diff] [review]
Patch

Review of attachment 8515229 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1889,5 @@
>  
>      @Override
>      public void onResume()
>      {
> +        synchronized (mIsPaused) {

This is actually super super tricksy -- you're capturing a Boolean and using that as a lock, and then swapping out the value on line 1916.

Unfortunately, another thread arriving after that assignment will see a different instance of mIsPaused (a new Boolean(false)), and will synchronize on *that*. Which isn't what you want, otherwise you'd have ended the synchronized block on line 1917.

You probably want an mIsPausedLock = new Object(), and to use that instead.

But I'm not sure why you need to lock here at all.

Because mIsPaused is volatile, and it's only written on the UI thread, you can't get write races.

If you're trying to exclude readers until a new value is ready, synchronization isn't quite the right model; you need to have a third state ("not sure") and a barrier. Synchronization will _often_ work (when onResume starts first), but what happens if the other caller calls isPaused() _before_ onResume? It'll stably and immediately read "true", when I think you want it to wait until it reads "false".

Think about whether it's OK to simply set mIsPaused when you're done resuming or done pausing, or whether we need to have some kind of hook to let MPM know when to start.
Attachment #8515229 - Flags: review?(rnewman) → review-
On IRC you expressed concern about a scenario like this:

 if (!isPaused)    // GeckoThread, mIsPaused = false
                   // mIsPaused is set true in GeckoApp.onPause on UIThread
 MPM.onPause()     // UIThread
 MPM.onResume()    // GeckoThread

The result of this would be that we'd call mediaRouter.addCallback when we don't want to.

Assuming that we care, there are three approaches to fix that.

1. Check isPaused again in onResume. That is: "are we paused? no, build this thing. are we paused? yes? never mind.".

2. Post runnables between threads to ensure that operations are serialized.

3. Use synchronization to tie MPM to GeckoApp. This would require MPM to hold a lock around GeckoApp.mIsPaused for the duration of MPM.onResume; effectively blocking the UI thread to allow the GeckoThread to finish resuming MPM.

#3 is equivalent to, but less efficient than, simply queuing MPM.onPause onto the Gecko thread. In both cases you'd ensure that MPM.onPause runs after an already-triggered MPM.onResume.

#1 is vewwwwy close to perfect; yes, you could *in theory* have GeckoApp pause between the check and the addCallback call, but it's unlikely.
Attached patch Patch v3 (obsolete) — Splinter Review
This likes to crash, but I think its probably a better route. Rather than doing this ourselves, it just uses a viewless-Fragment to manage the lifecycle. That basically gives me what I had before with out dealing with the dumbness of the webrtc code.
Attachment #8515229 - Attachment is obsolete: true
Attachment #8515368 - Flags: feedback?(rnewman)
Attached patch Patch v4 (obsolete) — Splinter Review
Using fragments. This is a lot simpler I think. I also ran into some races with initializing CastingApps and initializing this. i.e. both were initialized in startup/delayed-startup, but because finding devices is an async operation, they can often appear after CastingApps initial poll, meaning it takes another 2 min for them to appear in the UI). We seemed to be lucky before and the initial detection was done before CastingApps.js was even loaded.

This removed the poll (and all the Android specific code) in SimpleServiceDiscovery entirely, and instead has Java notify JS when a new device is seen. I exposed an interface in SimpleServiceDiscovery to allow us to add/remove devices from outside.
Attachment #8515368 - Attachment is obsolete: true
Attachment #8515368 - Flags: feedback?(rnewman)
Attachment #8516324 - Flags: review?(mark.finkle)
ping ping ping?
Flags: needinfo?(mark.finkle)
Comment on attachment 8516324 [details] [diff] [review]
Patch v4

>diff --git a/mobile/android/base/BrowserApp.java b/mobile/android/base/BrowserApp.java
>+    // Tag for media player fragment. This should be in MediaPlayerManager but it isn't always built. To keep
>+    // from having to reflect it out, its stored here.
>+    private static final String MEDIA_PLAYER_TAG = "MediaPlayerManagerFragment";

Grumble. Aren't we already reflecting from the MediaPlayer?

>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java

>     private String mPrivateBrowsingSession;
>-
>     private volatile HealthRecorder mHealthRecorder;

Some housekeeping?

>diff --git a/mobile/android/chrome/content/CastingApps.js b/mobile/android/chrome/content/CastingApps.js

> var mediaPlayerDevice = {

>+  init: function() {
>+    Services.obs.addObserver(this, "MediaPlayer:Added", false);
>+    Services.obs.addObserver(this, "MediaPlayer:Changed", false);
>+    Services.obs.addObserver(this, "MediaPlayer:Removed", false);
>+  },
>+  observe: function(subject, topic, data) {
>+    if (topic === "MediaPlayer:Added") {
>+      let service = this.toService(JSON.parse(data));
>+      SimpleServiceDiscovery.addService(service);
>+    } else if (topic === "MediaPlayer:Changed") {
>+      let service = this.toService(JSON.parse(data));
>+      SimpleServiceDiscovery.updateService(service);
>+    } else if (topic === "MediaPlayer:Removed") {
>+      SimpleServiceDiscovery.removeService(data);
>+    }
>+  },
>+  toService: function(display) {
>+    // Convert the native data into something matching what is created in _processService()
>+    return {
>+      location: display.location,
>+      target: "media:router",
>+      friendlyName: display.friendlyName,
>+      uuid: display.uuid,
>+      manufacturer: display.manufacturer,
>+      modelName: display.modelName,
>+      mirror: display.mirror
>+    };
>+  }

I think this is a good step, moving us toward creating a new refactor/structure later. More like the Network Service Discovery system we have talked about:
http://www.w3.org/TR/discovery-api/

Where SSDP and MediaRoute are two discovery mthods feeding into a NetworkDiscoveryService singleton

>diff --git a/toolkit/modules/secondscreen/SimpleServiceDiscovery.jsm b/toolkit/modules/secondscreen/SimpleServiceDiscovery.jsm

Make sure Desktop and Tests are not broken by these changes

>+  addService: function(service, update = false) {

I don't like this change. It is tempting to make addService handle add and update, I don't think we should. See notes below:

>     // Filter out services that do not match the device filter
>     if (!this._filterService(service)) {
>       return;
>     }

Updating should not need to filter.

>     // Only add and notify if we don't already know about this service
>-    if (!this._services.has(service.uuid)) {
>+    var found = this._services.has(service.uuid);

Updating should only work if the service is found

>+  updateService: function(service) {
>+    this.addService(service, true);

Please add an implementation for updateService which might be similar to addService, but not the same.

>   }
>+
> }

No need for the blank line

r- for the addService / updateService changes. I just want to see the final patch.
Flags: needinfo?(mark.finkle)
Attachment #8516324 - Flags: review?(mark.finkle) → review-
We're too late in 34 to take this perf fix that has already bounced once. Keep pushing. It would be great to see this fixed while 35 is still on Aurora.
tracking-fennec: 34+ → 35+
Next steps?
Flags: needinfo?(wjohnston)
Attached patch Patch v5Splinter Review
I reworked the add/update/removeService code a bit. I also reflected out the tag field out of MediaPlayerManager.
Attachment #8516324 - Attachment is obsolete: true
Flags: needinfo?(wjohnston)
Attachment #8538073 - Flags: review?(mark.finkle)
Comment on attachment 8538073 [details] [diff] [review]
Patch v5

>diff --git a/mobile/android/base/BrowserApp.java b/mobile/android/base/BrowserApp.java

>+                if (AppConstants.MOZ_MEDIA_PLAYER) {

>+                            final String tag = "";
>+                            final Field[] fields = mediaManagerClass.getFields();
>+                            for (Field f : fields) {
>+                                Log.i(LOGTAG, "Found field: " + f.getName());
>+                            }

Is this loop needed?

>diff --git a/mobile/android/base/MediaPlayerManager.java b/mobile/android/base/MediaPlayerManager.java

>+    @JNITarget
>+    public static final String MEDIA_PLAYER_TAG = "MediaPlayerManagerFragment";

Careful with the length. I thought Android clipped long ones.

>diff --git a/mobile/android/gradle/build.gradle b/mobile/android/gradle/build.gradle
>diff --git a/mobile/android/gradle/gradle/wrapper/gradle-wrapper.properties b/mobile/android/gradle/gradle/wrapper/gradle-wrapper.properties

Revert these files? Or use a separate bug?

>diff --git a/toolkit/modules/secondscreen/SimpleServiceDiscovery.jsm b/toolkit/modules/secondscreen/SimpleServiceDiscovery.jsm

>+  addService: function(service) {

>+      if(!this._addService(service)) {
>+        return;

if (
Attachment #8538073 - Flags: review?(mark.finkle) → review+
I feel like this is big enough of a patch to wait and see how it does before requesting uplift. Will do so in a week or so.
https://hg.mozilla.org/mozilla-central/rev/a275c8158e78
Status: REOPENED → RESOLVED
Closed: 10 years ago9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 37
Tracking for 36, wontfix for 35.  If this is successful and stable, it can be uplifted to 36 in early beta cycle.
Leaving a needinfo for Wes as a reminder for January.
Flags: needinfo?(wjohnston)
Please make sure to uplift bug 1116872 as well if you uplift this.
Depends on: 1116872
Wes, it has been a week, could you fill the uplift requests? Thanks (and on bug 1116872 if needed)
Richard, can you help here? thanks
Flags: needinfo?(rnewman)
Just poked Wes; he says he's on it.
Flags: needinfo?(rnewman)
Mark, is it a patch you want for 36? Thanks
Flags: needinfo?(mark.finkle)
tl;dr - Let it ride the trains

With this regression already in a few releases, let's just let it ride the trains. I don't want to take the patch on Beta. If we had uplifted it while on Aurora, that would have been OK. Let's not take the risk.
Flags: needinfo?(wjohnston)
Flags: needinfo?(mark.finkle)
tracking-fennec: 35+ → 37+
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: