Several seconds of delayed hang on 2nd run due to addon list stuff

RESOLVED WONTFIX

Status

RESOLVED WONTFIX
8 years ago
a month ago

People

(Reporter: azakai, Unassigned)

Tracking

Trunk
ARM
Android
Dependency tree / graph

Details

(Reporter)

Description

8 years ago
On first run, we show the first run page. On second run, we should about:home and search for addons. On third run, we show the cached addons.

On second run, the addon search can take 1-2 seconds, on the main thread, hanging the browser. Pan left and right constantly, and after a few seconds the hang will occur.

It looks like the cause is some sync operations in the XHR calls in AddonRepository.js:

    this._request = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"].
                    createInstance(Ci.nsIXMLHttpRequest);
    this._request.open("GET", aURI, true); // this line takes ~1 second

and 

    this._request.send(null); // this line takes ~0.4 seconds

The XHR is defined as async (the |true| there), however something sync is going on. I haven't looked inside there yet.

I see this on Galaxy S. Might be a disk IO issue, if so, it would be limited just to that device.

This is much more noticeable after bug 634666, which caused a similar hang on almost every run, due to disk IO.
(Reporter)

Comment 1

8 years ago
The second stall, in send(), is due to

  mChannel->AsyncOpen(listener, nsnull);

which ends up calling the CookieService. The SQL stuff there (not the oldcookiefile stuff at the end) ends up taking 0.3-0.4 seconds, on the main thread (the XHR is async, but the cookie service is started up on the main thread).

Still investigating the first and bigger stall. So far narrowed it down to

  ioService->NewChannelFromURI(uri, getter_AddRefs(chan))

in NS_NewChannel.
(Reporter)

Updated

8 years ago
Summary: 1-2 second hang on 2nd run due to addon listing → 1-2 second hang on 2nd run due to addon listing which calls CookieService SQL
(In reply to comment #1)
> which ends up calling the CookieService. The SQL stuff there (not the
> oldcookiefile stuff at the end) ends up taking 0.3-0.4 seconds, on the main
> thread (the XHR is async, but the cookie service is started up on the main
> thread).
If you can init the cookie service earlier, it will do it's work in the background.
(Reporter)

Comment 3

8 years ago
The first stall is ~1 second in

  // HACK: make sure PSM gets initialized on the main thread.
  net_EnsurePSMInit();

which is called since the addons XHR uses https.
(Reporter)

Comment 4

8 years ago
(In reply to comment #2)
> (In reply to comment #1)
> > which ends up calling the CookieService. The SQL stuff there (not the
> > oldcookiefile stuff at the end) ends up taking 0.3-0.4 seconds, on the main
> > thread (the XHR is async, but the cookie service is started up on the main
> > thread).
> If you can init the cookie service earlier, it will do it's work in the
> background.

Interesting, so the cookieservice stuff ends up being sync because it is started too late?

Looks like we should indeed initialize it earlier. But we'd need to be careful about when, I think.
(Reporter)

Updated

8 years ago
Summary: 1-2 second hang on 2nd run due to addon listing which calls CookieService SQL → 1-2 second hang on 2nd run due to addon listing calling CookieService SQL, net_EnsurePSMInit
Alon - How early are we trying to do the AddonRepository XHR? We purposefully try to read the cached file, as you note, and if it does not yet exist, we delay several seconds before trying to use the AddonRepository:
http://mxr.mozilla.org/mobile-browser/source/chrome/﷒0
(Reporter)

Comment 6

8 years ago
Yes, we delay several seconds. The addon spinner spins as we wait, then the hang occurs when we do finally do the XHR. (Then shortly afterwards the async part of the XHR returns from the remote server, and the spinner is replaced by the results.)

So this isn't as bad as making the load slower, as it happens later. And it is just on second run, not first or subsequent ones.

*But*, I believe (have not confirmed though) we are suffering these issues later on other runs - we eventually do need the cookie service, and will stall then, and we will stall for the https stuff when we do the first relevant request.
Right, I totally agree with you about the need to try to fix these issues. I was just thinking about what parts of Fennec UX were affected. These slowdowns will definitely hurt us at some point.
(Reporter)

Comment 8

8 years ago
To complete the picture, there is another major stall on 2nd run due to the addons code. Here is a summary so far, in order of the hangs:

1. AddonManager.getAddonsByTypes calls LightweightThemeManager's getAddonsByTypes, which takes 1-2 seconds. Still investigating the underlying cause.
2. To find addons, we do an https XHR, which ends up running net_EnsurePSMInit() for 1 second.
3. And that XHR also, through AsyncOpen, ends up starting the CookieService, which takes 0.35 seconds.

All of this happens on the main thread, and only on 2nd run.
Summary: 1-2 second hang on 2nd run due to addon listing calling CookieService SQL, net_EnsurePSMInit → Several seconds of delayed hang on 2nd run due to addon list stuff
(Reporter)

Comment 9

8 years ago
To elaborate on 1: My profiling code was confusing by the indirection. The culprit is actually XPIProvider.jsm, specifically

1.a. XPIDatabase.openDatabaseFile() takes the bulk of the time

1.b. Services.prefs.savePrefFile(null) called from XPDatabase.openConnection takes a smaller but still significant amount of time. (The prefs are saved synchronously to disk.)
Closing a Firefox for Android Graveyard bug
Status: NEW → RESOLVED
Last Resolved: a month ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.