Closed Bug 636607 Opened 13 years ago Closed 6 years ago

Several seconds of delayed hang on 2nd run due to addon list stuff

Categories

(Firefox for Android Graveyard :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: azakai, Unassigned)

References

Details

On first run, we show the first run page. On second run, we should about:home and search for addons. On third run, we show the cached addons.

On second run, the addon search can take 1-2 seconds, on the main thread, hanging the browser. Pan left and right constantly, and after a few seconds the hang will occur.

It looks like the cause is some sync operations in the XHR calls in AddonRepository.js:

    this._request = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"].
                    createInstance(Ci.nsIXMLHttpRequest);
    this._request.open("GET", aURI, true); // this line takes ~1 second

and 

    this._request.send(null); // this line takes ~0.4 seconds

The XHR is defined as async (the |true| there), however something sync is going on. I haven't looked inside there yet.

I see this on Galaxy S. Might be a disk IO issue, if so, it would be limited just to that device.

This is much more noticeable after bug 634666, which caused a similar hang on almost every run, due to disk IO.
The second stall, in send(), is due to

  mChannel->AsyncOpen(listener, nsnull);

which ends up calling the CookieService. The SQL stuff there (not the oldcookiefile stuff at the end) ends up taking 0.3-0.4 seconds, on the main thread (the XHR is async, but the cookie service is started up on the main thread).

Still investigating the first and bigger stall. So far narrowed it down to

  ioService->NewChannelFromURI(uri, getter_AddRefs(chan))

in NS_NewChannel.
Summary: 1-2 second hang on 2nd run due to addon listing → 1-2 second hang on 2nd run due to addon listing which calls CookieService SQL
(In reply to comment #1)
> which ends up calling the CookieService. The SQL stuff there (not the
> oldcookiefile stuff at the end) ends up taking 0.3-0.4 seconds, on the main
> thread (the XHR is async, but the cookie service is started up on the main
> thread).
If you can init the cookie service earlier, it will do it's work in the background.
The first stall is ~1 second in

  // HACK: make sure PSM gets initialized on the main thread.
  net_EnsurePSMInit();

which is called since the addons XHR uses https.
(In reply to comment #2)
> (In reply to comment #1)
> > which ends up calling the CookieService. The SQL stuff there (not the
> > oldcookiefile stuff at the end) ends up taking 0.3-0.4 seconds, on the main
> > thread (the XHR is async, but the cookie service is started up on the main
> > thread).
> If you can init the cookie service earlier, it will do it's work in the
> background.

Interesting, so the cookieservice stuff ends up being sync because it is started too late?

Looks like we should indeed initialize it earlier. But we'd need to be careful about when, I think.
Summary: 1-2 second hang on 2nd run due to addon listing which calls CookieService SQL → 1-2 second hang on 2nd run due to addon listing calling CookieService SQL, net_EnsurePSMInit
Alon - How early are we trying to do the AddonRepository XHR? We purposefully try to read the cached file, as you note, and if it does not yet exist, we delay several seconds before trying to use the AddonRepository:
http://mxr.mozilla.org/mobile-browser/source/chrome/content/aboutHome.xhtml#352
Yes, we delay several seconds. The addon spinner spins as we wait, then the hang occurs when we do finally do the XHR. (Then shortly afterwards the async part of the XHR returns from the remote server, and the spinner is replaced by the results.)

So this isn't as bad as making the load slower, as it happens later. And it is just on second run, not first or subsequent ones.

*But*, I believe (have not confirmed though) we are suffering these issues later on other runs - we eventually do need the cookie service, and will stall then, and we will stall for the https stuff when we do the first relevant request.
Right, I totally agree with you about the need to try to fix these issues. I was just thinking about what parts of Fennec UX were affected. These slowdowns will definitely hurt us at some point.
To complete the picture, there is another major stall on 2nd run due to the addons code. Here is a summary so far, in order of the hangs:

1. AddonManager.getAddonsByTypes calls LightweightThemeManager's getAddonsByTypes, which takes 1-2 seconds. Still investigating the underlying cause.
2. To find addons, we do an https XHR, which ends up running net_EnsurePSMInit() for 1 second.
3. And that XHR also, through AsyncOpen, ends up starting the CookieService, which takes 0.35 seconds.

All of this happens on the main thread, and only on 2nd run.
Summary: 1-2 second hang on 2nd run due to addon listing calling CookieService SQL, net_EnsurePSMInit → Several seconds of delayed hang on 2nd run due to addon list stuff
To elaborate on 1: My profiling code was confusing by the indirection. The culprit is actually XPIProvider.jsm, specifically

1.a. XPIDatabase.openDatabaseFile() takes the bulk of the time

1.b. Services.prefs.savePrefFile(null) called from XPDatabase.openConnection takes a smaller but still significant amount of time. (The prefs are saved synchronously to disk.)
Closing a Firefox for Android Graveyard bug
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.