Last Comment Bug 888482 - Initialize Gecko thread sooner during cold startup
: Initialize Gecko thread sooner during cold startup
Status: RESOLVED FIXED
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: Trunk
: All Android
: -- normal (vote)
: Firefox 35
Assigned To: Jim Chen [:jchen] [:darchons]
:
Mentors:
: 760152 (view as bug list)
Depends on: 1081768
Blocks: 807322 959776 1075644
  Show dependency treegraph
 
Reported: 2013-06-28 14:14 PDT by Brad Lassey [:blassey] (use needinfo?)
Modified: 2014-10-15 08:24 PDT (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch to add waitForLaunchState (1.93 KB, patch)
2013-06-28 14:14 PDT, Brad Lassey [:blassey] (use needinfo?)
bugmail: review-
Details | Diff | Splinter Review
patch to init gecko sooner (3.06 KB, patch)
2013-06-28 14:15 PDT, Brad Lassey [:blassey] (use needinfo?)
bugmail: review+
Details | Diff | Splinter Review
patch to add waitForLaunchState (3.38 KB, patch)
2013-06-28 14:53 PDT, Brad Lassey [:blassey] (use needinfo?)
bugmail: review+
Details | Diff | Splinter Review
patch to run sooner (3.32 KB, patch)
2013-06-28 15:20 PDT, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Splinter Review
Move GeckoThread creation to onCreate() (8.05 KB, patch)
2013-08-14 18:00 PDT, Brian Nicholson (:bnicholson) (PTO through August 1)
bugmail: review+
Details | Diff | Splinter Review
Logging (1.92 KB, patch)
2013-08-14 18:08 PDT, Brian Nicholson (:bnicholson) (PTO through August 1)
no flags Details | Diff | Splinter Review
Make new event to set layer client in Gecko (v1) (9.70 KB, patch)
2014-09-26 13:34 PDT, Jim Chen [:jchen] [:darchons]
snorp: review+
Details | Diff | Splinter Review
Send event to set layer client (v1) (7.23 KB, patch)
2014-09-26 13:40 PDT, Jim Chen [:jchen] [:darchons]
snorp: review+
Details | Diff | Splinter Review
Start Gecko early (v1) (5.58 KB, patch)
2014-09-26 13:41 PDT, Jim Chen [:jchen] [:darchons]
snorp: review+
Details | Diff | Splinter Review
Make sure default profile exists when starting Gecko (v1) (2.61 KB, patch)
2014-09-30 11:03 PDT, Jim Chen [:jchen] [:darchons]
snorp: review+
Details | Diff | Splinter Review

Description Brad Lassey [:blassey] (use needinfo?) 2013-06-28 14:14:36 PDT
Created attachment 769160 [details] [diff] [review]
patch to add waitForLaunchState
Comment 1 Brad Lassey [:blassey] (use needinfo?) 2013-06-28 14:15:48 PDT
Created attachment 769162 [details] [diff] [review]
patch to init gecko sooner
Comment 2 Kartikaya Gupta (email:kats@mozilla.com) 2013-06-28 14:25:55 PDT
Comment on attachment 769160 [details] [diff] [review]
patch to add waitForLaunchState

Review of attachment 769160 [details] [diff] [review]:
-----------------------------------------------------------------

Hm. I don't know how I never noticed this before, but our use of synchronization in this class is broken. We're synchronizing on sLaunchState and reassigning it inside the synchronized block, which means other functions can enter other synchronized blocks at that point because sLaunchState has changed. That's almost as bad as not having synchronization at all.

::: mobile/android/base/GeckoThread.java
@@ +142,4 @@
>      static void setLaunchState(LaunchState setState) {
>          synchronized (sLaunchState) {
>              sLaunchState = setState;
> +            sLaunchStateQueue.offer(sLaunchState);

This seems like a bad idea, because the code calling setLaunchState will block too, which I don't think is the intent here. I think a better approach is to (1) add a private static Object sLock = new Object() to this class, (2) use sLock as the synchronized() lock, and (3) make setLaunchState() do a sLock.notifyAll() and waitForLaunchState do a sLock.wait() inside a synchronized loop while (!checkLaunchState).
Comment 3 Brad Lassey [:blassey] (use needinfo?) 2013-06-28 14:53:39 PDT
Created attachment 769207 [details] [diff] [review]
patch to add waitForLaunchState

good catch
Comment 4 Brad Lassey [:blassey] (use needinfo?) 2013-06-28 15:20:25 PDT
Created attachment 769218 [details] [diff] [review]
patch to run sooner

Nathan, I'd be curious to know if this speeds up loading a page (so, our eideticker nytimes start up test)
Comment 5 Kartikaya Gupta (email:kats@mozilla.com) 2013-06-28 17:27:27 PDT
Comment on attachment 769207 [details] [diff] [review]
patch to add waitForLaunchState

Review of attachment 769207 [details] [diff] [review]:
-----------------------------------------------------------------

This should work fine, but I think addressing the second comment below can improve the performance in some cases (e.g. if the state changes on another thread just before the wait call).

::: mobile/android/base/GeckoThread.java
@@ +18,5 @@
>  import android.util.Log;
>  import android.app.Activity;
>  
> +import java.util.concurrent.SynchronousQueue;
> +import java.util.concurrent.TimeUnit;

Imports not needed any more

@@ +35,5 @@
>      };
>  
> +    static class LaunchStateHolder {
> +        private LaunchState mLaunchState = LaunchState.Launching;
> +        public void waitForLaunchState(LaunchState checkState) {

Make this function synchronized, and remove the "200" in the wait. This will improve scenarios where thread A calls waitForLaunchState, and just about to go into the wait, but thread B changes the state. With this code thread A will still wait 200ms. Picking up the lock the second time with the call to checkLaunchState should be a no-op in the JVM since the lock will already be acquired, so it's not extra overhead either.
Comment 6 Kartikaya Gupta (email:kats@mozilla.com) 2013-06-29 05:59:55 PDT
Comment on attachment 769162 [details] [diff] [review]
patch to init gecko sooner

Review of attachment 769162 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1221,5 @@
> +            String uri = getURIFromIntent(intent);
> +            String action = intent.getAction();
> +            if (uri != null && uri.length() > 0) {
> +                passedUri = uri;
> +            }

Not a huge fan of the duplicated code here but I can't think of a good way to refactor it :(

@@ +1438,5 @@
>          if (!ACTION_DEBUG.equals(action) &&
>              GeckoThread.checkAndSetLaunchState(GeckoThread.LaunchState.Launching, GeckoThread.LaunchState.Launched)) {
>              sGeckoThread.start();
> +        } else if (!ACTION_DEBUG.equals(action) &&
> +            GeckoThread.checkAndSetLaunchState(GeckoThread.LaunchState.LoadingLibs, GeckoThread.LaunchState.Launched)) {

Maybe add a comment in the if body here saying that the gecko thread was started in onCreate? I feel like the empty body needs an explanation

::: mobile/android/base/GeckoAppShell.java
@@ +295,5 @@
>          // run gecko -- it will spawn its own thread
>          GeckoAppShell.nativeInit();
>  
> +        GeckoThread.waitForLaunchState(GeckoThread.LaunchState.Launched);
> +

This is good to start. I suspect we can move this to happen even later if we change some other pieces of code to enforce a particular timeline.
Comment 7 Nathan Froyd [:froydnj] 2013-07-01 13:28:51 PDT
Comment on attachment 769218 [details] [diff] [review]
patch to run sooner

With all these patches applied, Fennec becomes a lot crashier.  When I do get a decent eideticker run, libraries start loading much earlier, but firstLoadURI actually slows down by several hundred ms.
Comment 8 Nathan Froyd [:froydnj] 2013-07-01 13:44:10 PDT
(In reply to Nathan Froyd (:froydnj) from comment #7)
> With all these patches applied, Fennec becomes a lot crashier.  When I do
> get a decent eideticker run, libraries start loading much earlier, but
> firstLoadURI actually slows down by several hundred ms.

Example timeline (some entries elided for brevity's sake):

168  geckoThreadCreated
198  linkerInitialized
789  librariesLoaded
816  main
1359 AMI_startup_begin
1559 AMI_startup_end
1736 createTopLevelWindow
2266 androidBrowserStartupStart
3069 androidBrowserStartupEnd
3167 sessionRestored
3278 firstLoadURI

which is about 300ms slower than normal.  I'm guessing this is partly from the 800ms that androidBrowserStartup took; it normally takes more like 200ms.  I presume this is because the Java side isn't really doing any work while the C++ side is running, and so browser.js needs to wait on that?
Comment 9 Jim Chen [:jchen] [:darchons] 2013-07-17 08:59:48 PDT
Here are two profiles without and with the patches. They seem to agree with the timeline above.

Without,
http://people.mozilla.com/~bgirard/cleopatra/#report=7ecda08b52daed2195300f11c92afde98f7a8dc4

With,
http://people.mozilla.com/~bgirard/cleopatra/#report=9b94efdd9969646bf40ddcec902221ae7f4dfd00
Comment 10 Brian Nicholson (:bnicholson) (PTO through August 1) 2013-08-14 18:00:23 PDT
Created attachment 790561 [details] [diff] [review]
Move GeckoThread creation to onCreate()

These results were bothering me, so I wanted to see if I could reproduce locally.

Since the patches here no longer apply, I just started from scratch. Like Nathan, I was seeing lots of crashes from simply moving GeckoThread to onCreate(), and logcat showed that it was a null pointer in Compositor. I moved some things around to ensure the layer view exists before firing the Gecko thread.

Over 10 runs with and without the patch, I found that the time to receive a document stop dropped from an average of 5791ms to 5483ms.

Not sure if it's a difference in the patch, testing method, or other variables. Nathan, can you give this patch a try?
Comment 11 Brian Nicholson (:bnicholson) (PTO through August 1) 2013-08-14 18:08:44 PDT
Created attachment 790565 [details] [diff] [review]
Logging

Simple logging patch I'm using. For reference, I was testing this on https://blog.mozilla.org/blog/2013/08/01/telefonica-announces-launches-of-first-firefox-os-devices-in-latin-america/.

STR: In settings, I changed my Tabs pref to "Always restore", then opened the page, waited for the second document stop (first one is for about:blank), pushed home, swiped from recent apps, repeat.

Since I wasn't really clear in the comment above: the 5791ms was without the patch, and 5483ms was with it.
Comment 12 Mike Hommey [:glandium] 2013-08-15 07:02:01 PDT
*** Bug 760152 has been marked as a duplicate of this bug. ***
Comment 13 Nathan Froyd [:froydnj] 2013-08-15 10:18:00 PDT
Running this through Eideticker shows it to be much more stable than previous versions.  So that's a plus.

about:telemetry says that we're getting to firstLoadURI several hundred milliseconds sooner (on a Galaxy Nexus).  That agrees with the speedup Brian's seeing in comment 10.  I say ship it.
Comment 14 Brian Nicholson (:bnicholson) (PTO through August 1) 2013-08-23 17:14:22 PDT
Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=d74a2203cf18
Comment 15 Brian Nicholson (:bnicholson) (PTO through August 1) 2013-08-26 21:52:48 PDT
Second push, since the last one used a bad parent: https://tbpl.mozilla.org/?tree=Try&rev=5421f50ea14d
Comment 16 Brian Nicholson (:bnicholson) (PTO through August 1) 2013-08-26 21:54:42 PDT
Comment on attachment 790561 [details] [diff] [review]
Move GeckoThread creation to onCreate()

Since everything is green, and since Nathan and I have both seen startup time improvements, I guess this is ready for review.
Comment 17 Kartikaya Gupta (email:kats@mozilla.com) 2013-08-27 06:09:08 PDT
Comment on attachment 790561 [details] [diff] [review]
Move GeckoThread creation to onCreate()

Review of attachment 790561 [details] [diff] [review]:
-----------------------------------------------------------------

As long as this works, I don't see any problem with it.
Comment 18 Brian Nicholson (:bnicholson) (PTO through August 1) 2013-08-27 11:39:20 PDT
https://hg.mozilla.org/integration/fx-team/rev/0aed878e34d4
Comment 20 Jim Chen [:jchen] [:darchons] 2014-09-02 12:14:05 PDT
Are you still working on this, Brian?
Comment 21 Brian Nicholson (:bnicholson) (PTO through August 1) 2014-09-02 14:14:34 PDT
I'm not -- I wasn't able to pin down the editor-related failures and haven't looked at this recently.
Comment 22 Jim Chen [:jchen] [:darchons] 2014-09-26 13:34:53 PDT
Created attachment 8496205 [details] [diff] [review]
Make new event to set layer client in Gecko (v1)

One problem I found when moving Gecko startup earlier was that it creates a race condition in setting the layer client. We call GeckoAppShell.setLayerClient in GeckoThread initialization, and that means we must have the layer view before we create the GeckoThread.

However, if we create an event that sets the layer client, we can start GeckoThread before we even have the layer view. Once we have the layer view, we can post the event to Gecko to set the layer client. This also makes sure that we set the layer client on the Gecko thread, whereas we were setting it from multiple different threads before.
Comment 23 Jim Chen [:jchen] [:darchons] 2014-09-26 13:40:18 PDT
Created attachment 8496213 [details] [diff] [review]
Send event to set layer client (v1)

Replace setLayerClient calls with the event.
Comment 24 Jim Chen [:jchen] [:darchons] 2014-09-26 13:41:15 PDT
Created attachment 8496214 [details] [diff] [review]
Start Gecko early (v1)

Move GeckoThread start to as early as we can so we can do initialization in parallel.
Comment 25 James Willcox (:snorp) (jwillcox@mozilla.com) 2014-09-26 13:53:15 PDT
Comment on attachment 8496214 [details] [diff] [review]
Start Gecko early (v1)

Review of attachment 8496214 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me. Lets see if anything breaks? :)
Comment 26 Jim Chen [:jchen] [:darchons] 2014-09-30 11:03:12 PDT
Created attachment 8497668 [details] [diff] [review]
Make sure default profile exists when starting Gecko (v1)

Refactor and (more importantly) call forceCreate() when using the default profile, so that Gecko gets a usable profile when starting up.
Comment 29 Bob Clary [:bc:] 2014-09-30 22:50:16 PDT
Strangely the throbber stop improved for everyone except the nexus s devices and the total throbber time improved for everyone. The worst offenders where the nexus s devices running Android 2.3.
Comment 30 Bob Clary [:bc:] 2014-09-30 23:15:40 PDT
For background, the throbber start and throbber stop are measured relative to a start time that is determined by the time of the first logcat message containing Gecko|fennec. As such this 'regression' in the throbber start is expected considering the goal of initializing the Gecko thread sooner. I'm not sure the regression is not just an artifact of how the start time is determined.
Comment 32 James Willcox (:snorp) (jwillcox@mozilla.com) 2014-10-01 07:03:20 PDT
The throbber stop times are pretty encouraging. The nexus S is a single core device, so we didn't expect to see an improvement there. We were actually suspecting it to regress, but that doesn't really appear to be the case.
Comment 33 Jim Chen [:jchen] [:darchons] 2014-10-01 07:57:07 PDT
Yeah we anticipated the Nexus S throbber start regression because, for a single-core device, the Gecko thread is now competing with the UI thread on showing UI. There are tricks we can use though like lowering Gecko thread priority (right now we don't lower the Gecko thread priority until later in start up).

Note You need to log in before you can comment on or make changes to this bug.