Open Bug 1116313 Opened 5 years ago Updated 2 months ago

Tab switches are painfully slow

Categories

(Firefox for Android :: General, defect, P2)

All
Android
defect

Tracking

()

Tracking Status
fennec + ---

People

(Reporter: rnewman, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: perf)

Attachments

(2 files)

Some pages are worse than others, but all are bad.

Here's a video on a brand-new Nexus 9:

https://www.youtube.com/watch?v=5Kh_drsBOGI

On my phone it's just as bad, but the switch to and from the tab tray make it seem less painful.

Chrome's switches, even on this huge screen, are near-instant.

I'll attach a profile with four tab switches.
Attached file Profile.
Attached image The horror
This shows a one-second slice, which corresponds to one tab switch.

You'll see the Gecko thread pumping its message loop about once every 20msec.

You'll see the Android main thread pumping its message loop, likewise.

Then you'll see LayerRenderer taking a shit-ton of locks and COWALs, figuring out metrics, and drawing.

All of this is steady-state stuff. (No wonder our battery perf is so bad!)

The touch event happens at about 3080msec.

We finally send this event to Gecko about 120msec later.

We're done parsing the JSON in the message after 20msec more.

Then the Gecko thread is tied up for 40msec serializing JSON for the reply, while the main thread is tied up doing... stuff for 50msec.

Finally, we do a TON of Android layout. A ton.

All of this is very frightening, considering we have the page in memory and the UI isn't really changing shape. (Bug 1107636 should help with this.)
BrowserToolbar.onTabChanged: 96msec wall, 80msec CPU.

45msec of that is ToolbarEditLayout.setText.

The rest is a mix of ToolbarDisplayLayout.setTitle, some futzing with ImageView, and stuff like updateTabCount (on switch?!).

GeckoApp.invalidateOptionsMenu takes 82msec wall.
(In reply to Richard Newman [:rnewman] from comment #3)
> BrowserToolbar.onTabChanged: 96msec wall, 80msec CPU.
> 
> 45msec of that is ToolbarEditLayout.setText.
> 
> The rest is a mix of ToolbarDisplayLayout.setTitle, some futzing with
> ImageView, and stuff like updateTabCount (on switch?!).

I'm sure we could tweak this a bit to remove unneeded paths, like updateTabCount.

> GeckoApp.invalidateOptionsMenu takes 82msec wall.

This is mostly updating the menu to be "right" for the newly selected tab's state.

That said, this is not a lot of time. If this is slow enough to be painfully obvious, I'd be looking for other smoking guns too.
GeckoMenu.findItem gets called a lot and can take up a non-trivial amount of time. I wonder if we should make a MenuUtils.find(int id) that maintains a hashmap. This will reduce the hiearchy-walking and hasSubMenu / hasActionProvider overhead.

This would assume that menu IDs would not change during a session, which should be true, since MenuUtils is static.
(In reply to Mark Finkle (:mfinkle) from comment #4)

> That said, this is not a lot of time. If this is slow enough to be painfully
> obvious, I'd be looking for other smoking guns too.

Take a look at the profile (or the screenshot).

We do multiple expensive linear steps, each of which should be shrunk or eliminated. The Runnable in Tabs (Tabs$5.run()) that includes each of the calls in Comment 3 takes a total of *240msec* to complete.

This is on a top-end multi-core device (no context switches), so that's pretty much the best case scenario.

A quarter-second main thread stall is painfully obvious, particularly next to Chrome.

I think we need to be walking through the profile for that runnable, knocking it down.

Additionally, the Gecko thread does about 75msec of JSON work to handle each tab switch. 75msec is itself kinda worrying, because that's tying up a core and preventing handling of subsequent messages. On a single-core device that pushes us towards one-second-plus tab switches.
Reading profiles is a lot like reading tea leaves, so I'll use the same approach I use for reading tea leaves: Look for the obvious.

> > That said, this is not a lot of time. If this is slow enough to be painfully
> > obvious, I'd be looking for other smoking guns too.
> 
> Take a look at the profile (or the screenshot).
> 
> We do multiple expensive linear steps, each of which should be shrunk or
> eliminated. The Runnable in Tabs (Tabs$5.run()) that includes each of the
> calls in Comment 3 takes a total of *240msec* to complete.

The profile tells me:
1. Tabs$5.run is called 8 times for four tabs switches, taking an average of 30ms per call
2. Of the two largest children:
  A. BrowserApp.onTabChanged is mainly about updating menus
  B. BrowserToolbar.onTabChanged is about setting text
3. I see a lot of Android layout work happening

In #1, we should try to stop two calls per tab change
In #2A, we should look at making the menu updates faster
For #3, we can hope the RelativeLayout removal helps. I bet it does have a big impact.

> Additionally, the Gecko thread does about 75msec of JSON work to handle each
> tab switch. 75msec is itself kinda worrying, because that's tying up a core
> and preventing handling of subsequent messages. On a single-core device that
> pushes us towards one-second-plus tab switches.

This could be a low ROI path. Unless we see that something is being blocked. The profile also shows Java is doing a lot of "waiting for an event", which means Gecko is doing work, or nothing is happening.

Let's look for some actionable work/hot-spots.
tracking-fennec: --- → ?
tracking-fennec: ? → ---
Summary: Tab switches are painfully slow → [mta] Tab switches are painfully slow
Summary: [mta] Tab switches are painfully slow → [meta] Tab switches are painfully slow
De-metafying since all the bugs are closed now.

But this is an obvious performance issue. 

Ni-ing barbara to get on her radar.
No longer blocks: fennec-polish
Flags: needinfo?(bbermes)
Summary: [meta] Tab switches are painfully slow → Tab switches are painfully slow
Hi Sorina, as mentioned, 
Could you please kindly help us verify the bug ? 
we are not sure if this symptom " Tab switches are painfully slow“ still existed or not. 

Thank you very much !
Flags: needinfo?(sorina.florean)
Hi all,

Here is a video made with Nexus 9 (Android 7.0) on latest Nightly: https://youtu.be/6HruSFWo7Zc.
Comparing with Chrome, switching tabs is still painfully slow. I will test again with other devices to see the behavior.
Thank you very much , Sorina!
I will follow the status on this bug. Thank you again !
Hi Sorina, this video on Youtube is removed...

Could you please share your update with us if any . Thank you !!
tracking-fennec: --- → ?
(I am sorry, I should make a comment here)

This bug is under program mgmt radar so please work on it on the next iteration available.
Priority: -- → P1
Assignee: nobody → cnevinchen
tracking-fennec: ? → +
Hi Jim
Would you please help give me some advice about this bug?

I've reproduced it using Nightly on Samsung Tab 3 (Android 4.2.2)
https://www.youtube.com/watch?v=bXgsK7gCIeM&feature=youtu.be

And the latest central
https://www.youtube.com/watch?v=lExMXioW998

Log is below
12-02 16:31:48.970 20805-20805/org.mozilla.fennec_nechen E/dalvikvm: Could not find class 'android.app.AppOpsManager', referenced from method com.google.android.gms.common.zze.zzb
12-02 16:31:49.030 20805-20805/org.mozilla.fennec_nechen E/GMPM: GoogleService failed to initialize, status: 10, Missing an expected resource: 'R.string.google_app_id' for initializing Google services.  Possible causes are missing google-services.json or com.google.gms.google-services gradle plugin.
12-02 16:31:49.030 20805-20805/org.mozilla.fennec_nechen E/GMPM: Scheduler not set. Not logging error/warn.
12-02 16:31:49.070 20805-20849/org.mozilla.fennec_nechen E/GMPM: Uploading is not possible. App measurement disabled
12-02 16:31:49.185 20805-20805/org.mozilla.fennec_nechen E/dalvikvm: Could not find class 'android.app.Notification$Action$Builder', referenced from method org.mozilla.gecko.media.MediaControlService.createNotificationAction
12-02 16:31:49.190 20805-20805/org.mozilla.fennec_nechen E/dalvikvm: Could not find class 'android.media.session.MediaSession', referenced from method org.mozilla.gecko.media.MediaControlService.initMediaSession
12-02 16:31:49.190 20805-20805/org.mozilla.fennec_nechen E/dalvikvm: Could not find class 'android.app.Notification$MediaStyle', referenced from method org.mozilla.gecko.media.MediaControlService.updateNotification
12-02 16:31:49.405 20805-20864/org.mozilla.fennec_nechen E/GeckoLibLoad: Load sqlite start
12-02 16:31:49.435 20805-20864/org.mozilla.fennec_nechen E/GeckoLibLoad: Load sqlite done
12-02 16:31:49.445 20805-20864/org.mozilla.fennec_nechen E/GeckoLibLoad: Load nss start
12-02 16:31:49.445 20805-20864/org.mozilla.fennec_nechen E/GeckoLibLoad: Load nss done
12-02 16:31:49.685 20805-20864/org.mozilla.fennec_nechen E/GeckoLibLoad: Loaded libs in 240.595792ms total, 20ms(65ms) user, 30ms(45ms) system, 22(29) faults
12-02 16:31:49.720 20805-20805/org.mozilla.fennec_nechen E/dalvikvm: Could not find class 'org.mozilla.gecko.GeckoAccessibility$5', referenced from method org.mozilla.gecko.GeckoAccessibility.setAccessibilityManagerListeners
12-02 16:31:49.735 20805-20809/org.mozilla.fennec_nechen E/dalvikvm: adjustAdaptiveCoef max=8388608, min=2097152, ut=568
12-02 16:31:51.125 20805-20805/org.mozilla.fennec_nechen E/dalvikvm: adjustAdaptiveCoef max=8388608, min=2097152, ut=568
12-02 16:31:52.350 20805-20864/org.mozilla.fennec_nechen E/dalvikvm: Could not find class 'android.view.inputmethod.CursorAnchorInfo$Builder', referenced from method org.mozilla.gecko.GeckoInputConnection.updateCompositionRects
12-02 16:32:09.460 20805-20805/org.mozilla.fennec_nechen E/ViewRootImpl: sendUserActionEvent() mView == null
12-02 16:32:11.505 20805-20864/org.mozilla.fennec_nechen E/GeckoConsole: [JavaScript Error: "uncaught exception: undefined" {file: "https://images-na.ssl-images-amazon.com/images/G/01/AUIClients/ShoppingPortalJQuery-c5d3431201d04a8c2580ea54ad45ae853e5a15ff._V2_.js#AUIClients/ShoppingPortalJQuery.secure.min" line: 173}]
12-02 16:32:13.550 20805-20814/org.mozilla.fennec_nechen E/dalvikvm: adjustAdaptiveCoef max=8388608, min=2097152, ut=568
12-02 16:32:14.465 20805-20805/org.mozilla.fennec_nechen E/ViewRootImpl: sendUserActionEvent() mView == null
12-02 16:32:17.820 20805-20864/org.mozilla.fennec_nechen E/GeckoConsole: [JavaScript Error: "uncaught exception: undefined"]
12-02 16:32:26.335 20805-21547/org.mozilla.fennec_nechen E/ACodec: ## 3d, eNativeColorFormat 0x15
12-02 16:32:26.335 20805-21547/org.mozilla.fennec_nechen E/ACodec: ## 3d, eNativeColorFormat 0x105
12-02 16:32:32.025 20805-21547/org.mozilla.fennec_nechen E/BufferQueue: [unnamed-20805-0] dequeueBuffer: min undequeued buffer count (2) exceeded (dequeued=4 undequeudCount=1)
12-02 16:32:32.025 20805-21547/org.mozilla.fennec_nechen E/ACodec: dequeueBuffer failed.
12-02 16:33:13.740 20805-20864/org.mozilla.fennec_nechen E/dalvikvm: adjustAdaptiveCoef max=8388608, min=2097152, ut=568
12-02 16:33:37.640 20805-20869/org.mozilla.fennec_nechen E/GeckoConsole: [JavaScript Error: "image2.pubmatic.com : server does not support RFC 5746, see CVE-2009-3555"]
12-02 16:36:50.950 20805-20864/org.mozilla.fennec_nechen E/GeckoConsole: [JavaScript Error: "too much recursion" {file: "resource://gre/modules/DataReader.jsm" line: 108}]
12-02 16:37:45.105 20805-20814/org.mozilla.fennec_nechen E/StrictMode: A resource was acquired at attached stack trace but never released. See java.io.Closeable for information on avoiding resource leaks.
                                                                       java.lang.Throwable: Explicit termination method 'release' not called
                                                                           at dalvik.system.CloseGuard.open(CloseGuard.java:184)
                                                                           at android.view.Surface.<init>(Surface.java:375)
                                                                           at dalvik.system.NativeStart.run(Native Method)
Flags: needinfo?(nchen)
(In reply to Rachelle Yang [:ryang][ryang@mozilla.com] from comment #12)
> Hi Sorina, this video on Youtube is removed...
> 
> Could you please share your update with us if any . Thank you !!

Sorry for delay here, the tab switching is still slow, comparing with Chrome. Switching tabs show a blank page and then the page open in tab, like is reloading.
Flags: needinfo?(sorina.florean)
(In reply to Nevin Chen [:nechen] from comment #14)
> Hi Jim
> Would you please help give me some advice about this bug?

I didn't see anything too obvious. I can take a look again when I get access to a tablet.
Flags: needinfo?(nchen)
Assignee: cnevinchen → nobody
Hi Jim
Here's what I've found:

When user click on on a tab, 
https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/tabs/TabsLayout.java#141
Java asked Gecko to update tab state in BrowserApp.
https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/Tabs.java#290

After Gecko reads the URL and it's content, it send the callback to BrowserApp
I can get SELECTED of the new tag and UNSELECTED of the old tag almost imediately.
https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/BrowserApp.java#356

But LayerView still need a little time (< 1sec) to show its content.
Flags: needinfo?(nchen)
tracking-fennec: + → ?
The(In reply to Jim Chen [:jchen] [:darchons] from comment #18)
> I think SELECTED and UNSELECTED are not triggered by Gecko? Once Java asks
> Gecko to switch tabs, the request goes to [1], [2], and [3]. You can log
> from those places and see what the timing is?
The three happens in the same time slot, almost simultaneously with order [1]->[2]->[3]
Flags: needinfo?(nchen)
Sometimes they were instant, sometimes they weren't. And the order were sometimes wired. 

12-21 10:43:27.610 E: =====[1]=======
12-21 10:43:27.745 E: =======[2]=======
12-21 10:43:27.745 E: =======[3]=======

12-21 10:43:28.490 E: =====[1]=======
12-21 10:43:28.575 E: =======[2]=======
12-21 10:43:28.575 E: =======[3]=======

12-21 10:43:39.415 E: =====[1]=======
12-21 10:43:39.415 E: =======[2]=======
12-21 10:43:39.415 E: =======[3]=======

12-21 10:43:40.720 E: =====[1]=======
12-21 10:43:40.775 E: =======[2]=======
12-21 10:43:40.780 E: =======[3]=======

12-21 10:43:42.370 E: =====[1]=======
12-21 10:43:42.370 E: =======[2]=======
12-21 10:43:42.370 E: =======[3]=======

12-21 10:43:46.600 E: =====[1]=======
12-21 10:43:46.600 E: =======[2]=======
12-21 10:43:46.600 E: =======[3]=======

12-21 10:43:47.300 E: =====[1]=======
12-21 10:43:47.310 E: =======[2]=======
12-21 10:43:47.315 E: =======[3]=======

12-21 10:43:47.995 E: =====[1]=======
12-21 10:43:48.055 E: =======[2]=======
12-21 10:43:48.055 E: =======[3]=======

12-21 10:43:55.970 E: =====[1]=======
12-21 10:43:55.975 E: =======[2]=======
12-21 10:43:55.975 E: =======[3]=======

12-21 10:44:04.700 E: =====[1]=======
12-21 10:44:04.745 E: =======[2]=======
12-21 10:44:04.745 E: =======[3]=======

12-21 10:44:05.365 E: =====[1]=======
12-21 10:44:05.380 E: =======[2]=======
12-21 10:44:05.380 E: =======[3]=======

12-21 10:44:07.205 E: =====[1]=======
12-21 10:44:07.210 E: =======[2]=======
12-21 10:44:07.210 E: =======[3]=======

12-21 10:44:07.835 E: =====[1]=======
12-21 10:44:07.890 E: =======[2]=======
12-21 10:44:07.895 E: =======[3]=======

12-21 10:44:08.575 E: =====[1]=======
12-21 10:44:08.590 E: =======[2]=======
12-21 10:44:08.590 E: =======[3]=======

12-21 10:44:09.220 E: =====[1]=======
12-21 10:44:09.245 E: =======[2]=======
12-21 10:44:09.250 E: =======[3]=======

12-21 10:44:09.695 E: =====[1]=======
12-21 10:44:09.935 E: =======[2]=======
12-21 10:44:09.935 E: =======[3]=======

12-21 10:49:12.135 E: =====[1]=======
12-21 10:49:12.640 E: =====[1]=======
12-21 10:49:13.760 E: =====[1]=======
12-21 10:49:13.960 E: =======[2]=======
12-21 10:49:13.960 E: =======[3]=======
12-21 10:49:13.990 E: =======[2]=======
12-21 10:49:13.990 E: =======[3]=======
12-21 10:49:14.130 E: =======[2]=======
12-21 10:49:14.130 E: =======[3]=======

12-21 10:49:14.935 E: =====[1]=======
12-21 10:49:14.955 E: =======[2]=======
12-21 10:49:14.955 E: =======[3]=======

12-21 10:49:16.040 E: =====[1]=======
12-21 10:49:16.045 E: =======[2]=======
12-21 10:49:16.045 E: =======[3]=======

12-21 10:49:16.800 E: =====[1]=======
12-21 10:49:16.820 E: =======[2]=======
12-21 10:49:16.820 E: =======[3]=======

12-21 10:49:17.500 E: =====[1]=======
12-21 10:49:17.505 E: =======[2]=======
12-21 10:49:17.505 E: =======[3]=======

12-21 10:49:18.220 E: =====[1]=======
12-21 10:49:18.220 E: =======[2]=======
12-21 10:49:18.225 E: =======[3]=======

12-21 10:49:18.560 E: =====[1]=======
12-21 10:49:19.070 E: =======[2]=======
12-21 10:49:19.075 E: =======[3]=======
Flags: needinfo?(nchen)
tracking-fennec: ? → +
(In reply to Nevin Chen [:nechen] from comment #21)
> 
> 12-21 10:49:12.135 E: =====[1]=======
> 12-21 10:49:12.640 E: =====[1]=======
> 12-21 10:49:13.760 E: =====[1]=======
> 12-21 10:49:13.960 E: =======[2]=======
> 12-21 10:49:13.960 E: =======[3]=======
> 12-21 10:49:13.990 E: =======[2]=======
> 12-21 10:49:13.990 E: =======[3]=======
> 12-21 10:49:14.130 E: =======[2]=======
> 12-21 10:49:14.130 E: =======[3]=======
> 
> 12-21 10:49:18.560 E: =====[1]=======
> 12-21 10:49:19.070 E: =======[2]=======
> 12-21 10:49:19.075 E: =======[3]=======

These are not good. Looks like Gecko is stuck doing something.
Flags: needinfo?(nchen)

Downgrading priority from P1 to P2

Priority: P1 → P2
You need to log in before you can comment on or make changes to this bug.