Last Comment Bug 394979 - Search Service is slow on startup (Ts hit)
: Search Service is slow on startup (Ts hit)
Status: RESOLVED FIXED
: perf
Product: Firefox
Classification: Client Software
Component: Search (show other bugs)
: Trunk
: All All
: P1 normal with 2 votes (vote)
: Firefox 3.1b1
Assigned To: Ryan Flint [:rflint] (ping via IRC for reviews)
:
Mentors:
Depends on: 458810
Blocks:
  Show dependency treegraph
 
Reported: 2007-09-04 19:47 PDT by Robert Sayre
Modified: 2015-09-18 02:52 PDT (History)
24 users (show)
mconnor: blocking‑firefox3-
mconnor: wanted‑firefox3.5+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
cache everything in JSON file and only parse XML if we must (18.39 KB, patch)
2007-09-04 19:48 PDT, Robert Sayre
no flags Details | Diff | Review
JS Ts timings, no patch (52.30 KB, text/plain)
2007-09-04 21:57 PDT, Robert Sayre
no flags Details
JS Ts timings, with patch (159.88 KB, text/plain)
2007-09-04 22:02 PDT, Robert Sayre
no flags Details
ifdef logging (1.19 KB, patch)
2007-09-05 15:08 PDT, Robert Sayre
gavin.sharp: review+
Details | Diff | Review
New timings (1.75 KB, text/plain)
2008-01-31 18:34 PST, Ryan Flint [:rflint] (ping via IRC for reviews)
no flags Details
Patch (23.04 KB, patch)
2008-03-19 20:24 PDT, Ryan Flint [:rflint] (ping via IRC for reviews)
no flags Details | Diff | Review
Patch v2 (39.45 KB, patch)
2008-09-27 00:45 PDT, Ryan Flint [:rflint] (ping via IRC for reviews)
gavin.sharp: review-
Details | Diff | Review
Patch v3 (39.67 KB, patch)
2008-09-30 23:54 PDT, Ryan Flint [:rflint] (ping via IRC for reviews)
gavin.sharp: review+
Details | Diff | Review

Description Robert Sayre 2007-09-04 19:47:16 PDT
DTrace profiles show that the search service expensive on startup. Turns out that it parses each OpenSearch XML file and does a bunch of permissions checking on files and directories to extract a very small amount of data.
Comment 1 Robert Sayre 2007-09-04 19:48:49 PDT
Created attachment 279684 [details] [diff] [review]
cache everything in JSON file and only parse XML if we must
Comment 2 Robert Sayre 2007-09-04 19:50:00 PDT
Comment on attachment 279684 [details] [diff] [review]
cache everything in JSON file and only parse XML if we must

I copied the read/write methods from session store for the moment.
Comment 3 Robert Sayre 2007-09-04 21:57:44 PDT
Created attachment 279700 [details]
JS Ts timings, no patch
Comment 4 Robert Sayre 2007-09-04 22:02:08 PDT
Created attachment 279701 [details]
JS Ts timings, with patch

Attachment 279684 [details] [diff] definitely hits a perf-critical path, since I am able to move lots rows with it. notifyAction shows some outliers in the no-patch version. I think that may be due to gc or other thrashing occurring after a bunch of XML documents are parsed.

That said, is there a reason we have to notify observers of added search engines on startup? Seems like a waste.
Comment 5 Robert Sayre 2007-09-04 22:06:39 PDT
Comment on attachment 279684 [details] [diff] [review]
cache everything in JSON file and only parse XML if we must

I'd like to land this experimentally, and see if it makes a dent in Ts on non-Solaris platforms.

Does the general approach look OK? Before landing it permanently, I'll stop duplicating the read/write stuff from session store, at the very least.
Comment 6 Robert Sayre 2007-09-05 15:08:22 PDT
Created attachment 279799 [details] [diff] [review]
ifdef logging

this is good for a small Ts win, according to Ts and dtrace.
Comment 7 Robert Sayre 2007-09-06 10:19:28 PDT
Comment on attachment 279684 [details] [diff] [review]
cache everything in JSON file and only parse XML if we must

Man, this would be a much bigger win with a native JSON implementation.
Comment 8 Ryan Flint [:rflint] (ping via IRC for reviews) 2008-01-31 18:34:22 PST
Created attachment 300800 [details]
New timings

This is a pretty significant win when used with native JSON - nominating for b4.
Comment 9 Mike Connor [:mconnor] 2008-02-06 13:35:24 PST
So where's the patch?
Comment 10 Mark Finkle (:mfinkle) (use needinfo?) 2008-03-04 12:03:12 PST
I was attempting to convert attachment 279684 [details] [diff] [review] to use nsIJSON instead of JSON.jsm (a somewhat trivial task, I hoped) and found the JSON.encode was crashing. It does not crash if I leave JSON.jsm in place, only nsIJSON.

Stack trace:
 	js3250.dll!js_LookupPropertyWithFlags(JSContext * cx=0x044a60d0, JSObject * obj=0x05a70c60, long id=134342884, unsigned int flags=0, JSObject * * objp=0x0012b5e8, JSProperty * * propp=0x0012b5cc)  Line 3220 + 0x21 bytes	C
 	js3250.dll!js_LookupProperty(JSContext * cx=0x044a60d0, JSObject * obj=0x05a70c60, long id=134342884, JSObject * * objp=0x0012b5e8, JSProperty * * propp=0x0012b5cc)  Line 3191 + 0x1b bytes	C
>	js3250.dll!CallEnumeratorNext(JSContext * cx=0x044a60d0, JSObject * iterobj=0x05a804c0, unsigned int flags=1, long * rval=0x0012b7d8)  Line 542 + 0x21 bytes	C
 	js3250.dll!js_CallIteratorNext(JSContext * cx=0x044a60d0, JSObject * iterobj=0x05a804c0, long * rval=0x0012b7d8)  Line 603 + 0x15 bytes	C
 	gklayout.dll!nsJSON::EncodeObject(JSContext * cx=0x044a60d0, long * vp=0x0012b9bc, nsJSONWriter * writer=0x05722768, JSObject * whitelist=0x00000000, unsigned int depth=3)  Line 286 + 0x12 bytes	C++
 	gklayout.dll!nsJSON::EncodeObject(JSContext * cx=0x044a60d0, long * vp=0x0012bb98, nsJSONWriter * writer=0x05722768, JSObject * whitelist=0x00000000, unsigned int depth=2)  Line 360 + 0x22 bytes	C++
 	gklayout.dll!nsJSON::EncodeObject(JSContext * cx=0x044a60d0, long * vp=0x0012bd74, nsJSONWriter * writer=0x05722768, JSObject * whitelist=0x00000000, unsigned int depth=1)  Line 360 + 0x22 bytes	C++
 	gklayout.dll!nsJSON::EncodeObject(JSContext * cx=0x044a60d0, long * vp=0x04762528, nsJSONWriter * writer=0x05722768, JSObject * whitelist=0x00000000, unsigned int depth=0)  Line 360 + 0x22 bytes	C++
 	gklayout.dll!nsJSON::EncodeInternal(nsJSONWriter * writer=0x05722768)  Line 249 + 0x1a bytes	C++
 	gklayout.dll!nsJSON::Encode(nsAString_internal & aJSON={...})  Line 94 + 0x11 bytes	C++

JSON in the writer at the time of the crash:

"{"e:\\mozilla-trunk\\mozilla\\firefox-objdir\\dist\\bin\\searchplugins":{"lastModifiedTime":1201869145588,"engines":[{"alias":null,"_name":"Amazon.com","_dataType":1,"_readOnly":true,"description":"Amazon.com Search","filePath":"e:\\mozilla-trunk\\mozilla\\firefox-objdir\\dist\\bin\\searchplugins\\amazondotcom.xml","_hasPreferredIcon":true,"hidden":null,"type":1,"queryCharset":"ISO-8859-1","searchForm":"http://www.amazon.com/","_installLocation":1,"_updateInterval":null,"_updateURL":null,"_iconUpdate"
Comment 11 Vladimir Vukicevic [:vlad] [:vladv] 2008-03-14 13:07:59 PDT
What's the status of this?  This IMO either needs a patch very soon or should get marked blocking-
Comment 12 Ryan Flint [:rflint] (ping via IRC for reviews) 2008-03-19 20:24:22 PDT
Created attachment 310671 [details] [diff] [review]
Patch

This includes a test to check the serialized output, but it's currently disabled because the search service's cache file handle doesn't appear to be getting cleared quickly enough on windows :/
Comment 13 Mike Connor [:mconnor] 2008-03-24 18:35:22 PDT
This isn't a blocker, think we should probably sit on this until next cycle unless its absolutely safe.
Comment 14 Mike Beltzner [:beltzner, not reading bugmail] 2008-09-25 10:32:04 PDT
Ryan, I know you were working something through here ... what are the chances that we'll see this in 3.1b1?
Comment 15 Ryan Flint [:rflint] (ping via IRC for reviews) 2008-09-27 00:45:11 PDT
Created attachment 340704 [details] [diff] [review]
Patch v2

I'm having a couple issues with the accompanying test suite so it's not in this version of the patch, but let's get the review ball rolling on this anyway.
Comment 16 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-09-30 16:05:51 PDT
Comment on attachment 340704 [details] [diff] [review]
Patch v2

>diff --git a/browser/app/profile/firefox.js b/browser/app/profile/firefox.js

>+// Search plugin cache version. Used to force cache reloads.
>+pref("browser.search.cache.version", "chrome://browser-region/locale/region.properties");

>diff --git a/browser/locales/en-US/chrome/browser-region/region.properties b/browser/locales/en-US/chrome/browser-region/region.properties

>+browser.search.cache.version=1

I think we should just use nsIXULAppInfo::platformVersion instead - this makes sure we won't forget to update this for whatever reason, and removes the burden on localizers having to remember to update it when they change their plugins. Perhaps even platformBuildID, to make nightly testers exercise this path more often (and make nightly testing of search plugin changes easier).

>diff --git a/toolkit/components/search/nsSearchService.js b/toolkit/components/search/nsSearchService.js

>+function getDir(aKey, aIFace) {

>+  let iface = aIFace || Ci.nsIFile;
>+  return _dirSvc.get(aKey, iface);

nit: just pass aIFace || Ci.nsIFile directly and avoid the temporary.

>+  // We use this to ensure that serializing this to an object after initializing
>+  // from JSON won't copy over mozparams which have already been expanded.
>+  this.mozparams = [];
>+
>+  // Unexpanded pref params
>+  this.rawparams = [];

I think this stuff is overly complicated. You can just keep an array of mozParam definitions (i.e. the object you pass to _addMozParam, but with a "mozparam: true" property) and the regular array of expanded params. When it comes time to serialize:

function swapMozParams(p)
  this.mozparams[p.name] || p;
cache.params = this.params.map(swapMozParams, this);

When reading in cache.params, you can deal what kind you're dealing with by checking for the "mozparam" property.

>+  _initWithJSON: function SRC_EURL__initWithJSON(aJson, aIsDefault) {

>+        if (param.condition == "defaultEngine") {
>+          if (aIsDefault)
>+            this.addParam(param.name, param.trueValue);
>+          else
>+            this.addParam(param.name, param.falseValue);
>+        } else if (param.condition == "pref") {
>+          let value = getMozParamPref(param.pref);
>+          this.addParam(param.name, value);
>+        }
>+        this._addMozParam(param);

Could also split this out into a helper that returns the QueryParameter object, and have the equivalent code in _parseURL call it (and make addParam take the object directly rather than name/value).

>-              value = prefB.getCharPref(BROWSER_SEARCH_PREF + "param." +
>-                                        param.getAttribute("pref"));
>+              value = getMozParamPref(param.getAttribute("name"), value);

Should be "pref", not "name", right?

>+  _isDefaultEngine: function SRCH_ENG__isDefaultEngine() {

>+    try {
>+      gDefaultEngine = defaultPrefB.getComplexValue("defaultenginename", nsIPLS).data;

gDefaultEngine is undefined, and isn't used elsewhere - get rid of the g and use let?

>+  _serializeToJSON: function SRCH_ENG__serializeToJSON(aFilter) {
>+    var json = {
>+      _name: this._name,
>+      description: this.description,
>+      filePath: this._file.path,

You should use nsILocalFile.persistentDescriptor here instead of path (might run into encoding issues otherwise, and potential problems on Mac I think).

I'm somewhat concerned that we're going to end up adding a field or something and forget to update _initWithJSON/_serializeToJSON. It'd be nice if we came up with a test that makes sure that all properties of a toSource()ed engine item are serialized, or something like that (with an exception list for those that don't need to be stored in JSON, of course).

>+  _loadEngines: function SRCH_SVC__loadEngines(aInvalidate) {

>+    let cacheEnabled = getBoolPref(BROWSER_SEARCH_PREF + "cache.enabled", true);
>+    // See if we have a cache file so we don't have to parse a bunch of XML.
>+    let cacheFile = getDir(NS_APP_USER_PROFILE_50_DIR);
>+    cacheFile.append("search.json");
>+    if (cacheFile.exists() && !aInvalidate && cacheEnabled) {
>+      cache = this._readCacheFile(cacheFile);
>+      if (cache)
>+        cacheNeedsWrite = false;
>+    }

Seems like you should be able to rework this to avoid doing any of the cache stuff it the pref is flipped - could put the "cacheEnabled" check outside to avoid creating an nsIFile for it, at least. Could you add the cache pref to firefox.js as well?

>+    while (locations.hasMoreElements()) {

>+      if (!cache[path] || cache[path].lastModifiedTime < dir.lastModifiedTime ||
>+          cache.locale != locale || cache.localizedVersion != localizedVersion ||
>+          cache.version != CACHE_VERSION) {

Would be nice to LOG("skipping cache") here.

>+        let cacheEntry = {}
>+        cacheNeedsWrite = true;
>+        cacheEntry.lastModifiedTime = dir.lastModifiedTime;
>+        cacheEntry.engines = [];
>+        for each(let engine in loadedEngines)
>+          cacheEntry.engines.push(engine._serializeToJSON(true));
>+        cache[path] = cacheEntry;
>+
>+        // Allows us to force a cache refresh should the cache format change.
>+        cache.version = CACHE_VERSION;
>+        // The next two properties assist us in ensuring that the cache will be
>+        // reloaded should the default set of plugins change in any way. This
>+        // is necessary because we don't want to incur the costs of stat()ing each
>+        // plugin on every startup when the only (supported) time they will change is
>+        // during runtime (where we refresh for changes through the API) and app updates.
>+        // Extension-shipped plugins are the only exception to this, but their
>+        // directories are blown away on updates, so we'll detect their changes.
>+        cache.localizedVersion = localizedVersion;
>+        cache.locale = locale;

This should be split out into a _buildCache method for clarity, I think, and called only once. You'll need to check each engine._file.parent to determine which cache entry to put it in, but that shouldn't be a problem.

>+      } else {
>+        // Cache hit. Add engines from JSON.
>+        let engines = cache[path].engines;
>+        LOG("_loadEngines: Loading from cache. " + engines.length + " engines to load.");
>+        for (let i = 0; i < engines.length; ++i) {
>+          let json = engines[i];
>+          let file = Cc["@mozilla.org/file/local;1"].createInstance(Ci.nsILocalFile);
>+          file.initWithPath(json.filePath);
>+          let engine = new Engine(file, json._dataType, json._readOnly);
>+          engine._initWithJSON(json);
>+          this._addEngineToStore(engine);
>+        }

Similarly, this could be split out into _loadEnginesFromCache(aDir).

>+    if (cacheNeedsWrite) {
>+      LOG("_loadEngines: Writing to cache file.");
>+      this._writeCacheFile(cacheFile, cache);
>+    }
>+   },

>+  _batchCacheInvalidation: function SRCH_SVC__batchCacheInvalidation() {
>+    let callback = {
>+      self: this,
>+      notify: function SRCH_SVC_batchTimerNotify(aTimer) {
>+        LOG("_batchCacheInvalidation: Invalidating engine cache");
>+        this.self._loadEngines(true);

As far as I can tell this _loadEngines(true) call and the one in the QUIT_APPLICATION_TOPIC observer are causing us to read all the engines from disk again, and there's no need for that. The only reason this doesn't cause bustage is that addEngineToStore refuses to add an already-existing engine, so it's all essentially a no-op. Splitting out _buildCache and calling it instead would fix this.

>@@ -2852,18 +3109,17 @@ SearchService.prototype = {

>+    var prefB = gPrefSvc.QueryInterface(Ci.nsIPrefBranch);

unneeded QI (it's QIed to nsIPrefBranch when created)

>@@ -2879,37 +3135,42 @@ SearchService.prototype = {

>+          this._batchCacheInvalidation();

I think you need to use SEARCH_ENGINE_ADDED instead of SEARCH_ENGINE_LOADED to catch addEngineWithDetails callers (which don't go through _onLoad).

>+        if (aVerb == SEARCH_ENGINE_REMOVED)
>+          this._batchCacheInvalidation();

I think you need to also invalidate on SEARCH_ENGINE_CHANGED to catch engine updates.

r- because of the _loadEngines thing, but looks good otherwise.
Comment 17 Ryan Flint [:rflint] (ping via IRC for reviews) 2008-09-30 23:54:31 PDT
Created attachment 341234 [details] [diff] [review]
Patch v3
Comment 18 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-10-01 01:32:09 PDT
Comment on attachment 341234 [details] [diff] [review]
Patch v3

>diff --git a/toolkit/components/search/nsSearchService.js b/toolkit/components/search/nsSearchService.js

>+  _serializeToJSON: function SRCH_EURL__serializeToJSON() {

>+    function collapseMozParams(aParam) {
>+      for (let i = 0; i < this.mozparams.length; i++) {
>+        if (aParam.name == this.mozparams[i].name)
>+          return this.mozparams[i];
>+      }
>+      return aParam;
>+    }
>+    json.params = this.params.map(collapseMozParams, this);

Still think this can be simplified by using an object keyed by param name rather than an array.

>+  _serializeToJSON: function SRCH_ENG__serializeToJSON(aFilter) {

>+      filePath: this._file.persistentDescriptor,

Should QI to nsILocalFile to be sure, I think (Engine constructor does this QI, but not good to rely on it).

>+  _buildCache: function SRCH_SVC__buildCache() {
>+    if (!getBoolPref(BROWSER_SEARCH_PREF + "cache.enabled", true))
>+      return;
>+
>+    let cache = {};
>+    let locale = getLocale();

I think locale is redundant now that we're using buildID. Should check/file a followup.

>+  _loadEnginesFromCache: function SRCH_SVC__loadEnginesFromCache(aDir) {

>+    for (let i = 0; i < engines.length; i++) {
>+      let json = engines[i];
>+      let file = Cc["@mozilla.org/file/local;1"].createInstance(Ci.nsILocalFile);
>+      file.initWithPath(json.filePath);

This is a persistentDescriptor now, which is only a path on non-Mac. Need to change to setting nsILocalFile.persistentDescriptor.

>   observe: function SRCH_SVC_observe(aEngine, aTopic, aVerb) {

>+        switch (aVerb) {

>+          case SEARCH_ENGINE_CHANGED:
>+          case SEARCH_ENGINE_REMOVED:
>+            this._batchCacheInvalidation();
>+          break;

nit: indent

r=me with those. Please get the tests done ASAP!
Comment 19 Robert Sayre 2008-10-06 11:46:14 PDT
does this improve Ts?
Comment 20 Ryan Flint [:rflint] (ping via IRC for reviews) 2008-10-06 11:57:20 PDT
Yes, local talos runs and the try server have shown around a 5% improvement.
Comment 21 Ryan Flint [:rflint] (ping via IRC for reviews) 2008-10-06 14:51:00 PDT
http://hg.mozilla.org/mozilla-central/rev/ce56bc915463
Comment 22 Florian Quèze [:florian] [:flo] 2015-05-19 03:59:27 PDT
Does anybody who was involved here remember why the cache.enabled pref was needed? Do we still need it?
Comment 23 Florian Quèze [:florian] [:flo] 2015-09-09 03:54:12 PDT
(In reply to Florian Quèze [:florian] [:flo] from comment #22)
> Does anybody who was involved here remember why the cache.enabled pref was
> needed? Do we still need it?

Adding a needinfo flag.
Comment 24 Florian Quèze [:florian] [:flo] 2015-09-18 02:52:57 PDT
(In reply to Florian Quèze [:florian] [:flo] from comment #22)
> Does anybody who was involved here remember why the cache.enabled pref was
> needed? Do we still need it?

I/we decided it's not, and was likely added at the time just for the sake of being able to pref-off the change on aurora if needed. I removed it in bug 1203161.

Note You need to log in before you can comment on or make changes to this bug.