Closed Bug 888482 Opened 11 years ago Closed 10 years ago

Initialize Gecko thread sooner during cold startup

Categories

(Firefox for Android Graveyard :: General, defect)

All
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 35

People

(Reporter: blassey, Assigned: jchen)

References

Details

Attachments

(5 files, 5 obsolete files)

Attached patch patch to add waitForLaunchState (obsolete) — Splinter Review
      No description provided.
Attachment #769160 - Flags: review?(bugmail.mozilla)
Attached patch patch to init gecko sooner (obsolete) — Splinter Review
Assignee: nobody → blassey.bugs
Attachment #769162 - Flags: review?(bugmail.mozilla)
Blocks: 807322
Comment on attachment 769160 [details] [diff] [review]
patch to add waitForLaunchState

Review of attachment 769160 [details] [diff] [review]:
-----------------------------------------------------------------

Hm. I don't know how I never noticed this before, but our use of synchronization in this class is broken. We're synchronizing on sLaunchState and reassigning it inside the synchronized block, which means other functions can enter other synchronized blocks at that point because sLaunchState has changed. That's almost as bad as not having synchronization at all.

::: mobile/android/base/GeckoThread.java
@@ +142,4 @@
>      static void setLaunchState(LaunchState setState) {
>          synchronized (sLaunchState) {
>              sLaunchState = setState;
> +            sLaunchStateQueue.offer(sLaunchState);

This seems like a bad idea, because the code calling setLaunchState will block too, which I don't think is the intent here. I think a better approach is to (1) add a private static Object sLock = new Object() to this class, (2) use sLock as the synchronized() lock, and (3) make setLaunchState() do a sLock.notifyAll() and waitForLaunchState do a sLock.wait() inside a synchronized loop while (!checkLaunchState).
Attachment #769160 - Flags: review?(bugmail.mozilla) → review-
Attached patch patch to add waitForLaunchState (obsolete) — Splinter Review
good catch
Attachment #769160 - Attachment is obsolete: true
Attachment #769207 - Flags: review?(bugmail.mozilla)
Attached patch patch to run sooner (obsolete) — Splinter Review
Nathan, I'd be curious to know if this speeds up loading a page (so, our eideticker nytimes start up test)
Attachment #769218 - Flags: feedback?(nfroyd)
Comment on attachment 769207 [details] [diff] [review]
patch to add waitForLaunchState

Review of attachment 769207 [details] [diff] [review]:
-----------------------------------------------------------------

This should work fine, but I think addressing the second comment below can improve the performance in some cases (e.g. if the state changes on another thread just before the wait call).

::: mobile/android/base/GeckoThread.java
@@ +18,5 @@
>  import android.util.Log;
>  import android.app.Activity;
>  
> +import java.util.concurrent.SynchronousQueue;
> +import java.util.concurrent.TimeUnit;

Imports not needed any more

@@ +35,5 @@
>      };
>  
> +    static class LaunchStateHolder {
> +        private LaunchState mLaunchState = LaunchState.Launching;
> +        public void waitForLaunchState(LaunchState checkState) {

Make this function synchronized, and remove the "200" in the wait. This will improve scenarios where thread A calls waitForLaunchState, and just about to go into the wait, but thread B changes the state. With this code thread A will still wait 200ms. Picking up the lock the second time with the call to checkLaunchState should be a no-op in the JVM since the lock will already be acquired, so it's not extra overhead either.
Attachment #769207 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 769162 [details] [diff] [review]
patch to init gecko sooner

Review of attachment 769162 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1221,5 @@
> +            String uri = getURIFromIntent(intent);
> +            String action = intent.getAction();
> +            if (uri != null && uri.length() > 0) {
> +                passedUri = uri;
> +            }

Not a huge fan of the duplicated code here but I can't think of a good way to refactor it :(

@@ +1438,5 @@
>          if (!ACTION_DEBUG.equals(action) &&
>              GeckoThread.checkAndSetLaunchState(GeckoThread.LaunchState.Launching, GeckoThread.LaunchState.Launched)) {
>              sGeckoThread.start();
> +        } else if (!ACTION_DEBUG.equals(action) &&
> +            GeckoThread.checkAndSetLaunchState(GeckoThread.LaunchState.LoadingLibs, GeckoThread.LaunchState.Launched)) {

Maybe add a comment in the if body here saying that the gecko thread was started in onCreate? I feel like the empty body needs an explanation

::: mobile/android/base/GeckoAppShell.java
@@ +295,5 @@
>          // run gecko -- it will spawn its own thread
>          GeckoAppShell.nativeInit();
>  
> +        GeckoThread.waitForLaunchState(GeckoThread.LaunchState.Launched);
> +

This is good to start. I suspect we can move this to happen even later if we change some other pieces of code to enforce a particular timeline.
Attachment #769162 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 769218 [details] [diff] [review]
patch to run sooner

With all these patches applied, Fennec becomes a lot crashier.  When I do get a decent eideticker run, libraries start loading much earlier, but firstLoadURI actually slows down by several hundred ms.
Attachment #769218 - Flags: feedback?(nfroyd)
(In reply to Nathan Froyd (:froydnj) from comment #7)
> With all these patches applied, Fennec becomes a lot crashier.  When I do
> get a decent eideticker run, libraries start loading much earlier, but
> firstLoadURI actually slows down by several hundred ms.

Example timeline (some entries elided for brevity's sake):

168  geckoThreadCreated
198  linkerInitialized
789  librariesLoaded
816  main
1359 AMI_startup_begin
1559 AMI_startup_end
1736 createTopLevelWindow
2266 androidBrowserStartupStart
3069 androidBrowserStartupEnd
3167 sessionRestored
3278 firstLoadURI

which is about 300ms slower than normal.  I'm guessing this is partly from the 800ms that androidBrowserStartup took; it normally takes more like 200ms.  I presume this is because the Java side isn't really doing any work while the C++ side is running, and so browser.js needs to wait on that?
These results were bothering me, so I wanted to see if I could reproduce locally.

Since the patches here no longer apply, I just started from scratch. Like Nathan, I was seeing lots of crashes from simply moving GeckoThread to onCreate(), and logcat showed that it was a null pointer in Compositor. I moved some things around to ensure the layer view exists before firing the Gecko thread.

Over 10 runs with and without the patch, I found that the time to receive a document stop dropped from an average of 5791ms to 5483ms.

Not sure if it's a difference in the patch, testing method, or other variables. Nathan, can you give this patch a try?
Flags: needinfo?(nfroyd)
Attached patch LoggingSplinter Review
Simple logging patch I'm using. For reference, I was testing this on https://blog.mozilla.org/blog/2013/08/01/telefonica-announces-launches-of-first-firefox-os-devices-in-latin-america/.

STR: In settings, I changed my Tabs pref to "Always restore", then opened the page, waited for the second document stop (first one is for about:blank), pushed home, swiped from recent apps, repeat.

Since I wasn't really clear in the comment above: the 5791ms was without the patch, and 5483ms was with it.
Running this through Eideticker shows it to be much more stable than previous versions.  So that's a plus.

about:telemetry says that we're getting to firstLoadURI several hundred milliseconds sooner (on a Galaxy Nexus).  That agrees with the speedup Brian's seeing in comment 10.  I say ship it.
Flags: needinfo?(nfroyd)
Second push, since the last one used a bad parent: https://tbpl.mozilla.org/?tree=Try&rev=5421f50ea14d
Comment on attachment 790561 [details] [diff] [review]
Move GeckoThread creation to onCreate()

Since everything is green, and since Nathan and I have both seen startup time improvements, I guess this is ready for review.
Attachment #790561 - Flags: review?(bugmail.mozilla)
Assignee: blassey.bugs → bnicholson
Comment on attachment 790561 [details] [diff] [review]
Move GeckoThread creation to onCreate()

Review of attachment 790561 [details] [diff] [review]:
-----------------------------------------------------------------

As long as this works, I don't see any problem with it.
Attachment #790561 - Flags: review?(bugmail.mozilla) → review+
Are you still working on this, Brian?
Flags: needinfo?(bnicholson)
I'm not -- I wasn't able to pin down the editor-related failures and haven't looked at this recently.
Assignee: bnicholson → nobody
Flags: needinfo?(bnicholson)
Assignee: nobody → nchen
Status: NEW → ASSIGNED
One problem I found when moving Gecko startup earlier was that it creates a race condition in setting the layer client. We call GeckoAppShell.setLayerClient in GeckoThread initialization, and that means we must have the layer view before we create the GeckoThread.

However, if we create an event that sets the layer client, we can start GeckoThread before we even have the layer view. Once we have the layer view, we can post the event to Gecko to set the layer client. This also makes sure that we set the layer client on the Gecko thread, whereas we were setting it from multiple different threads before.
Attachment #769162 - Attachment is obsolete: true
Attachment #769207 - Attachment is obsolete: true
Attachment #769218 - Attachment is obsolete: true
Attachment #790561 - Attachment is obsolete: true
Attachment #8496205 - Flags: review?(snorp)
Replace setLayerClient calls with the event.
Attachment #8496213 - Flags: review?(snorp)
Move GeckoThread start to as early as we can so we can do initialization in parallel.
Attachment #8496214 - Flags: review?(snorp)
Attachment #8496205 - Flags: review?(snorp) → review+
Attachment #8496213 - Flags: review?(snorp) → review+
Comment on attachment 8496214 [details] [diff] [review]
Start Gecko early (v1)

Review of attachment 8496214 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me. Lets see if anything breaks? :)
Attachment #8496214 - Flags: review?(snorp) → review+
Blocks: 959776
OS: Mac OS X → Android
Hardware: x86 → All
Summary: init gecko sooner → Initialize Gecko thread sooner during cold startup
Refactor and (more importantly) call forceCreate() when using the default profile, so that Gecko gets a usable profile when starting up.
Attachment #8497668 - Flags: review?(snorp)
Attachment #8497668 - Flags: review?(snorp) → review+
Strangely the throbber stop improved for everyone except the nexus s devices and the total throbber time improved for everyone. The worst offenders where the nexus s devices running Android 2.3.
For background, the throbber start and throbber stop are measured relative to a start time that is determined by the time of the first logcat message containing Gecko|fennec. As such this 'regression' in the throbber start is expected considering the goal of initializing the Gecko thread sooner. I'm not sure the regression is not just an artifact of how the start time is determined.
The throbber stop times are pretty encouraging. The nexus S is a single core device, so we didn't expect to see an improvement there. We were actually suspecting it to regress, but that doesn't really appear to be the case.
Yeah we anticipated the Nexus S throbber start regression because, for a single-core device, the Gecko thread is now competing with the UI thread on showing UI. There are tricks we can use though like lowering Gecko thread priority (right now we don't lower the Gecko thread priority until later in start up).
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: