Closed Bug 1570747 Opened 5 years ago Closed 5 years ago

UrlClassifierFeatureBase::InitializePreferences took a long time during page load

Categories

(Toolkit :: Safe Browsing, defect, P1)

defect

Tracking

()

RESOLVED DUPLICATE of bug 1577218
Performance Impact high

People

(Reporter: sefeng, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf:pageload)

This is the profile for loading m.facebook.com on GVE https://perfht.ml/2KaHNxX

Based on the profile, the parent process uses 97ms in nsUrlClassifierDBService::Classify and this is a lot of time of this page load, we should see if we can improve it.

Whiteboard: [qf]
Component: Networking → Safe Browsing
Product: Core → Toolkit
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Priority: -- → P1
Blocks: 1543783

Related, this is a simpleperf profile of a facebook pageload on android (Moto G5)
It includes the URL Classifier thread.
https://perfht.ml/2ZsTaXt
I don't trust the measured time (256ms), but there are lot of samples coming from nsUrlClassifierDBServiceWorker::DoSingleLocalLookupWithURIFragments()

This will impact the first page load because, if I understand correctly, the network loads of that page load are blocked on URL classifier initialization.
Interestingly, most of the time during this initialization is spent importing JavaScript modules. You can see which modules are getting imported in the marker chart panel: https://perfht.ml/33l2sXP
To me, the modules under resources://services-settings/* sound unrelated to URL classification. Maybe we're importing JS modules for those accidentally when we don't really need it? Unfortunately this profile does not contain JS stacks so we can't see the import chain.

Whiteboard: [qf] → [qf:p1:pageload]

I wanted to see the impact of disabling safebrowsing on raptor pageload tests.
But it's disabled in the test framework.
So I made a comparison in which I enable it by removing the above prefs.
And the strange thing is that pageload is improved in raptor when I re-enable safebrowsing:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=190b8f87707ac2e16c9d94003310fec1700a14ad&newProject=try&newRevision=c5ef97f580f929579670912ee33d27bc020eb4c6&framework=10

e.g. 9% overall on bing, what look to be solid 5% and 6% on amazon and google.

I can get the same results by simply removing the dummy local host urls from the prefs:
https://hg.mozilla.org/try/rev/6676df5f8da550a9f4862e26e9372e08468f7d99
e.g. user_pref("browser.safebrowsing.provider.google.gethashURL", "http://127.0.0.1/safebrowsing-dummy/gethash");

Perhaps with the dummy urls there's work being done retrieving them?
e.g.
https://searchfox.org/mozilla-central/rev/9775cca0a10a9b5c5f4e15c8f7b3eff5bf91bbd0/toolkit/components/url-classifier/UrlClassifierHashCompleter.jsm#236

Andrew, could you file a new bug about these investigations? Let's keep this bug focused on the issue that was identified in the profile: The fact that the first call to nsUrlClassifierDBService::Classify takes some times because it calls UrlClassifierFeatureBase::InitializePreferences which does a surprising amount of work.

Summary: nsUrlClassifierDBService::Classify took a long time during page load → UrlClassifierFeatureBase::InitializePreferences took a long time during page load

Yes, good call Markus.
I think I have to investigate a bit further before I log this because right now all I have is "Enabling Safebrowsing URLs in Raptor improves performance" which is more of an anti-bug.

(In reply to Markus Stange [:mstange] from comment #2)

This will impact the first page load because, if I understand correctly, the network loads of that page load are blocked on URL classifier initialization.
Interestingly, most of the time during this initialization is spent importing JavaScript modules. You can see which modules are getting imported in the marker chart panel: https://perfht.ml/33l2sXP
To me, the modules under resources://services-settings/* sound unrelated to URL classification. Maybe we're importing JS modules for those accidentally when we don't really need it? Unfortunately this profile does not contain JS stacks so we can't see the import chain.

Not sure if there is anything we can do from URL Classifier, even that if we remove importing remote-setting.js from ::InitializePreferences, this may probably still happen somewhere using remote setting.

From the profile in the description(https://perfht.ml/2KaHNxX)
It seems that the js modules involved here includes;
Remote-settings.js -> RemoteSettingsClient.jsm -> kinto-offline-client.js & kinto-http-client.js

A lot of time spent on importing kinto-offline-client.js(29ms) & kinto-http-client.js(35ms)
Maybe we should check if we can do something here?

Hi Mathieu,
Could you help check if there is anything we can do? thanks!

Flags: needinfo?(mathieu)

(In reply to Andrew Creskey from comment #3)

I wanted to see the impact of disabling safebrowsing on raptor pageload tests.
But it's disabled in the test framework.
So I made a comparison in which I enable it by removing the above prefs.
And the strange thing is that pageload is improved in raptor when I re-enable safebrowsing:

Maybe this is because the optimizations we have done by using tracking protection, See
"Tracking annotations" in this link : https://wiki.mozilla.org/Security/Tracking_protection

As Markus suggested, It will be great if you can help file another bug for this and we can discuss this in the bug , thanks!

(In reply to Dimi Lee [:dimi][:dlee] from comment #7)

Maybe this is because the optimizations we have done by using tracking protection, See
"Tracking annotations" in this link : https://wiki.mozilla.org/Security/Tracking_protection

As Markus suggested, It will be great if you can help file another bug for this and we can discuss this in the bug , thanks!

Thank you for the link, that's quite interesting Dimi.
I've logged Bug 1572713 to discuss. Feel free to edit title and we may likely end up moving the component.

Hi Mathieu,
Could you help check if there is anything we can do? thanks!

If out 100ms spent in URL classification, 70ms are spent in loading JS files, then yes we should definitely do something!
If I'm not mistaken, the imports are «cached» and this should only be true for the first call (to be confirmed).

The load of kinto-http-client.js could totally be avoided. It is supposed to be loaded lazily only when needed (ie. not in your context).

But because of this line I suppose that it gets loaded along with the constructor:
https://github.com/Kinto/kinto.js/blob/add721282c69418e3af1651695efc8d249a55116/fx-src/index.js#L54
I'll open a ticket.

Apart from that I cannot see any low-hanging fruits :(

Reducing the size of kinto-offline-client.js may help, especially because we use a tiny part of it, but it'd require lots of efforts...

Flags: needinfo?(mathieu)
See Also: → 1572737

I've been playing a bit with this :)

It's not reviewed/merged upstream yet, but you can try my changes like this:

$ curl https://gist.githubusercontent.com/leplatrem/08d6a93f27b210a503d4f96f643bfb92/raw/aa57c97400839c8086431641d55a8b1ae45e5ebd/kinto-offline-client.js > services/common/kinto-offline-client.js 
$ ./mach build faster

I could verify that kinto-http-client.js is not loaded if no network activity is required.
Let me know if you can seen some improvements on your side too.

Hi Sean,
Do you have to time to help test the patch provided by Mathieu in Comment 10?
If no, can you show me what is your reproduce step?(do you use a real device or emulator)
Thanks!

Flags: needinfo?(sefeng)

The lazy-load patch landed yesterday. You should see better results when loading Remote Settings.

Need info'ing myself - we'll profile and run visual metrics tests before and after the patch.

Flags: needinfo?(acreskey)

Mathieu - I realize now that the lazy-load patch is not to the mozilla central tree.
What would be the best way for me to create two binaries, one with the change and one without?
Something more like Comment 10?

Flags: needinfo?(mathieu)

Well, what makes you think that?

The changes are supposed to have landed in https://bugzilla.mozilla.org/show_bug.cgi?id=1577218

Flags: needinfo?(mathieu)
Flags: needinfo?(sefeng)

Thanks Mathieu -- I was confused with the kinto github repo.
I'll compare the changes from Bug 1577218 overnight.
To be honest, this is mostly to test new tooling, but I am quite interested in whether or not it picks up a difference.

FYI, I wasn't able to measure a difference between Bug 1577218 and its parent commit

SpeedIndex:                                       | baseline       | Bug 1577218| Relative speedup
----------------------------------------------------------------------------------------------------------
http://www.espn.com/nba/story/_/page/allstarw..   |   3391.00      |  3387.00 |   0.12% |
https://aframe.io/examples/showcase/animation/    |   3945.00      |  3913.00 |   0.81% |
https://booking.com                               |   2173.00      |  2203.00 |  -1.38% |
https://cnn.com                                   |   3540.00      |  3490.00 |   1.41% |
https://cnn.com/ampstories/us/why-hurricane-m..   |   2345.00      |  2397.00 |  -2.22% |
https://en.m.wikipedia.org/wiki/Main_Page         |   1926.00      |  1964.00 |  -1.97% |
https://m.ebay-kleinanzeigen.de                   |   2049.00      |  2036.00 |   0.63% |
https://m.facebook.com/Cristiano                  |   2179.00      |  2201.00 |  -1.01% |
https://m.imdb.com/                               |   2262.00      |  2292.00 |  -1.33% |
https://stackoverflow.com/                        |   2177.00      |  2151.00 |   1.19% |
https://support.microsoft.com/en-us               |   2144.00      |  2112.00 |   1.49% |
https://web.de/magazine/politik/politologe-gl..   |   2314.00      |  2305.00 |   0.39% |
https://www.allrecipes.com/                       |   2845.00      |  2861.00 |  -0.56% |
https://www.amazon.com                            |   2151.00      |  2211.00 |  -2.79% |
https://www.amazon.com/s/ref=nb_sb_noss_2/139..   |   2369.00      |  2381.00 |  -0.51% |
https://www.bbc.com/news/business-47245877        |   2176.00      |  2160.00 |   0.74% |
https://www.bing.com/search?q=restaurants         |   2456.00      |  2471.00 |  -0.61% |
https://www.jianshu.com/                          |   1940.00      |  1978.00 |  -1.96% |
https://www.reddit.com                            |   2665.00      |  2757.00 |  -3.45% |
https://www.youtube.com                           |   2143.00      |  2123.00 |   0.93% |
https://www.youtube.com/watch?v=COU5T-Wafa4       |   2194.00      |  2246.00 |  -2.37% |

I thinks it's possible that the test framework behaviour of navigating to a dummy url and waiting could be hiding any performance improvements.

Flags: needinfo?(acreskey)

Hi Andrew, thank you for help test this. Are all the measurements here are first-page load?
Can you also help capture a profile like what we did in the description? I think I can analyze the profile to see if the performance of UrlClassifierFeatureBase::InitializePreferences has improved. Thanks!

Flags: needinfo?(acreskey)

Dimi, yes, those measurements are all what we call cold loads -- the first time the given page was loaded with the user profile.

I captured a profile of the same site with the latest build -- I'm seeing mozilla::net::UrlClassifierFeatureBase::InitializePreferences() at 3 ms.
https://perfht.ml/2MX1H2h

However I also took a profile from August 27th nightly geckoview (the day before Bug 1577218 was merged) and I'm also see a fast InitializePreferences
https://perfht.ml/2MU7mWQ

I may be missing some steps of the original bug -- Sean, are there specific methods to reproduce the delay you found in the description?
I've been loading geckoview_example from a fresh install, pasting in the url, and starting to profile.

Flags: needinfo?(acreskey) → needinfo?(sefeng)

I don't think I didn't any thing specific to produce this result.

Actually, I can't remember if it was a cold load or warm load (I think it may be a warm load), and I also didn't use a fresh profile.

I'll try to reproduce it.

I can't reproduce the long loading InitializePreferences() on the latest nightly.

After doing some bisecting, I figured the long loading InitializePreferences() has been fixed since August 8th nightly geckoview. (Build ID:20190808093310).

Dimi, did we land some patches around August 7th that might fixed it?

Flags: needinfo?(sefeng) → needinfo?(dlee)

(In reply to Sean Feng [:sefeng] from comment #21)

I can't reproduce the long loading InitializePreferences() on the latest nightly.

After doing some bisecting, I figured the long loading InitializePreferences() has been fixed since August 8th nightly geckoview. (Build ID:20190808093310).

Dimi, did we land some patches around August 7th that might fixed it?

I would say no in URL Classifier.

I find the difference is that in those profiles with fast loading InitializePreferences(), Remote-settings.js didn't import RemoteSettingsClient.jsm, so it also didn't import the other slow loading modules mentioned in Comment 6.
If I understand correctly, what Mathieu has done(Comment 10) is to delay loading kinto-http-client.js, it doesn't affect whether RemoteSettingsClient.jsm will be loaded or not.
Maybe something has changed in Remote-settings.js so we don't always import RemoteSettingsClient.jsm now?

Assignee: dlee → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(dlee)

Hi Mathieu
Are you aware of any changes that may cause not loading RemoteSettingsClient.jsm while importing remote-settings.js?

Flags: needinfo?(mathieu)

RemoteSettingsClient.jsm will still be imported when the RemoteSettings() function from remote-settings.js is used.

What was changed in Bug 1577218 is that the kinto-http-client.js is now only imported when some network activity is required (ie. server synchronization).

Flags: needinfo?(mathieu)

(In reply to Sean Feng [:sefeng] [PTO until Oct 6] from comment #21)

I can't reproduce the long loading InitializePreferences() on the latest nightly.

After doing some bisecting, I figured the long loading InitializePreferences() has been fixed since August 8th nightly geckoview. (Build ID:20190808093310).

Dimi, did we land some patches around August 7th that might fixed it?

Hi Sean,
I did some check and find that this may because of Bug 1535049 which imports remote-setting.js prior to UrlClassifierFeatureBase::InitializePreferences.
By checking a profile captured locally, I'll also say this issue is fixed because kinto-http-client.js now is not imported immediately when remote-settings.js is imported.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
Performance Impact: --- → P1
Keywords: perf:pageload
Whiteboard: [qf:p1:pageload]
You need to log in before you can comment on or make changes to this bug.