Closed Bug 398807 Opened 17 years ago Closed 10 years ago

session restore accounts for 2-3% of talos pageload (make it faster)

Categories

(Firefox :: Session Restore, defect, P4)

defect

Tracking

()

RESOLVED FIXED
Firefox 3

People

(Reporter: sayrer, Unassigned)

References

Details

(Keywords: meta, perf, Whiteboard: [not needed for 1.9][meta])

Attachments

(5 files, 5 obsolete files)

We disabled session restore on the trunk for a bit and watched the perf boxes.

This means whatever session store is doing is really slow, since its timeout doesn't fire for every page. The page that happens to load with the session store activity probably gets a much larger regression.
Flags: blocking-firefox3?
Dietrich: any quick thoughts about what the cause of this might be? I'd love to block on improving the perf here, but not at the cost of disabling session restore.
Blocks: 397945
(In reply to comment #1)
> I'd love to
> block on improving the perf here, but not at the cost of disabling session
> restore.

Yes, the bug is to make it faster, not disable it.

Here is a comment from Brendan Gregg on bug 385224 regarding attachment 278904 [details]:

Ahh ok, I do see functions such as sss_saveState (nsSessionStore.js) taking
over 50ms on your run, and over 80 ms on mine (and sometimes higher).

Why sss_saveState() is slow may already be obvious to many on this list;
nevertheless, I did a bit more digging to see how DTrace can explore this.

I started by tracing the JavaScript function sss_saveState(), and all the
system calls that it causes, along with delta times. There were many read()s
and write()s, and a several filesystem syscalls. The time seemed to be mostly
spent in syscalls such as xstat(), open64(), close() and rename(), rather than
in JavaScript.

Digging further, I traced sss_saveState() along with those four syscall types,
and used DTrace to print out their arguments. I've attached the full output,
which shows these syscalls in relation to the JavaScript code, and delta times.
Below is a summary from a different run, with just the syscalls,

# ./js_truss -p `pgrep -n firefox-bin`
C   DELTA FILE                 TYPE      -- FUNC
0       3 nsSessionStore.js    JS         -> sss_saveState
0   17374 "                    syscall      -> xstat (sessionstore.js)
0     711 "                    syscall      <- xstat (0 errno=0)
0      63 "                    syscall      -> open64 (sessionstore.js 0x701)
0     998 "                    syscall      <- open64 (-1 errno=17)
0      40 "                    syscall      -> open64 (sessionstore-1.js 0x701)
0   53059 "                    syscall      <- open64 (41 errno=0)
0      30 "                    syscall      -> close (41)
0     406 "                    syscall      <- close
0      20 "                    syscall      -> open64 (sessionstore-1.js 0x301)
0   22644 "                    syscall      <- open64 (41 errno=0)
0     431 "                    syscall      -> close (41)
0   70284 "                    syscall      <- close
0      54 "                    syscall      -> rename (sessionstore-1.js,
sessionstore.js)
0    9505 "                    syscall      <- rename
0      70 nsSessionStore.js    JS         <- sss_saveState
^C

To start with, xstat() checks for "sessionstore.js" and finds it. Then open64()
attempts to open it with the flags O_WRONLY|O_TRUNC|O_CREAT|O_EXCL (0x701)
which errors with code 17 "File exists". Seems an odd to use O_EXCL next, since
xstat() had already found this file to exist.

Then, "sessionstore-1.js" is opened once with O_EXCL and then again without.
Why open it twice? Finally, "sessionstore-1.js" is renamed to
"sessionstore.js".

Sounds like a few slow syscalls might be dropped with a little different logic.

The syscall times did vary from run to run, I'd guess because my home directory
is on NFS and these syscalls may be causing network I/O - more DTrace can
confirm that. The syscall delta times are also a little higher than they should
be, as by this point I'm using DTrace to pull in syscall arguments and print
them out.

Anyhow, that was just for sss_saveState. It might be interesting to attack
other functions in a similar way.
Summary: session restore accounts for 2-3% of talos pageload → session restore accounts for 2-3% of talos pageload (make it faster)
In case of Windows, can we save sessionstore.js in CSIDL_LOCAL_APPDATA (Local Settings), to avoid I/O over the network ? Just an idea.
On pure page load we really shouldn't be doing much to start with: we call sss_saveStateDelayed which launches a timer (in case it hasn't been launched already) which a few seconds later updates the data required to restore all tabs and saves it to disk. Besides the two proposed optimizations (see below), we could try to choose slightly larger values for reaction time (internal constant of 2 to 3 seconds) and browser.sessionstore.interval (currently defaults to 10s) since those values don't have much empirical basis (except that they seem reasonable and worked quite nicely for myself).

(In reply to comment #2)
That's rather sss_writeFile which is slowed down by the safe-file-output-stream we use (thus an XPCOM I/O bug if any at all), although that function should at most be called every 10s and thus not be too relevant for Tp (should it still be, a background I/O thread might help).

(In reply to comment #3)
That sounds like a suggestion for a new bug (saving sessionstore.js to the profile only on exit and keeping the frequently updated one as local as possible).
Simon is correct, SS doesn't do much synchronously on page load, mostly just adds event handlers.

I did some profiling locally and found that with >10 tabs, the tab load handler can take up to 2ms, but this does not increase past 2ms as the number of tabs increases.

However, the duration of saveState does increase relative to the number of tabs open and I observed a cost of 8-12ms per tab, for data collection. At 20 tabs, saveState took about 300ms, including writing to disk.

But as my previous numbers about tab load show, this should not affect Tp. I've forwarded some questions to Alice to get more information about how the Talos Tp number is formulated, to see if the saveState costs are somehow figuring in.
(In reply to comment #5).
> 
> But as my previous numbers about tab load show, this should not affect Tp. I've
> forwarded some questions to Alice to get more information about how the Talos
> Tp number is formulated, to see if the saveState costs are somehow figuring in.

saveState executes on a timeout. The timeout is firing occasionally. It affects pageload, and should be included in our numbers.
(In reply to comment #6)
> saveState executes on a timeout. The timeout is firing occasionally. It affects
> pageload, and should be included in our numbers.
> 

Agreed, it needs to run faster regardless.

Alice said that Tp cycles the pages in a single window in an iframe. Given the numbers in comment #5, this means that SS's real-world Tp hit for multi-tab browsing sessions (a common real-world occurrence, in all likelihood) is actually far higher than the test indicates. That said, the perf hit does not affect all page loads - only for those that occur while we're saving session data.

(In reply to comment #4)
> we could try to choose slightly larger values for reaction time (internal
> constant of 2 to 3 seconds) and browser.sessionstore.interval (currently
> defaults to 10s) since those values don't have much empirical basis (except
> that they seem reasonable and worked quite nicely for myself).

The longer we wait between saves, the higher the risk of data loss when a crash happens. I'd prefer another solution if at all possible.

Also, the Tp number is showing the median page load time. SS does not affect all page loads; it likely only affects page loads that occur while we're in the middle of saveState(). Increasing the interval between saves might improve our test numbers (fewer saves in the duration of the test), but doesn't affect real-world effect on users, who will still experience a slow page load anytime we're in saveState().

> (In reply to comment #2)
> That's rather sss_writeFile which is slowed down by the safe-file-output-stream
> we use (thus an XPCOM I/O bug if any at all), although that function should at
> most be called every 10s and thus not be too relevant for Tp (should it still
> be, a background I/O thread might help).

writeFile was a minority of the time spent when there were multiple tabs open.
Definitely wanted, but without a strategy or defined performance improvement goal, we're finding it hard to block on this. Let's talk about it in today's Firefox 3 meeting and we can re-nominate ...
Flags: blocking-firefox3? → blocking-firefox3-
Whiteboard: [wanted-firefox3]
I looked at a sessionstore.js file from a Talos Tp run, and for each test page the cycler loads a frame hierarchy like this:

framecycler.html?quit=0&cycles=5
  header.html
  pages/www.whatever.com/index.html

This means that SS is tracking and saving 3 pages for every 1 page tested, and each call to saveState() will take that much longer since we're saving this extraneous data.

This issue might be moot to the extent that all components are subject to this same penalty, the only difference being that SS's performance correlates directly with the amount of data saved, so it is specifically inhibited by the test itself.

But all that is still about the composition of the test, which clearly does not reflect real-world browser usage. An avenue that may bear more fruit is that the Talos sessionstore.js file contained 50 history entries for it's single tab. Given that SS.saveState()'s performance correlates directly to the amount of data to be saved, we may want to cap the saved tab history at a reasonable number (10?). We could add a pref for "deep restore", defaulting to false, which would re/store all tab session history. I'll make a patch for this, if it sounds like a reasonable approach.

My subjective opinion is that saving a shallow(er) tab history is fine, since I rarely go back/forward more than 1-3 levels deep. UX, do we have any scientific/academic/industry data on back/forward history usage?
>>My subjective opinion is that saving a shallow(er) tab history is fine, since I
rarely go back/forward more than 1-3 levels deep. UX, do we have any
scientific/academic/industry data on back/forward history usage?<<

fwiw,

I'm not a typical user, but in the case of back/forward usage, I rarely use forward, but do use back, sometimes trying to reach a page past my (default) back-button limit.  But I am hardly typical I imagine.  I would also not expect sessionstore to save the exact state that far back, (but would love if it would still store url's, etc.)

Though I can easily understand a reduction in "back button" ability after crashes here, if it means better perf.
(In reply to comment #9)
> UX, do we have any
> scientific/academic/industry data on back/forward history usage?

We have data showing that the back button is by far the most used UI control on the browser toolbar, outstripping the next most popular by a factor of 3. 

I did some quick research, and found two studies (http://vsis-www.informatik.uni-hamburg.de/publications/view.php/263 and http://informationr.net/ir/11-2/paper249.html) which imply that the average used back button depth is inversely proportional to the number of open tabs. The reasons for this seem to be twofold: first, increased cognitive overhead means that users find it harder to remember deep back-navigation stacks as they increase the number of open tabs; second, opening a page in a new tab effectively prunes a back-navigation stack.

There's no solid numbers, but it occurs to me that one potential optimization is to start reducing the depth of saved back-navigation stacks on each tab as the number of open tabs increases.
Did anybody profile to see how much time is eaten by saving all the session history (compared to the rest of session store work)? I would hate to lose more of session store completeness if it isn't really necessary -- the less complete session store is, the less the user can rely on it.

Doesn't session store go through all the session history each time it serializes a tab's state? Maybe a long-term solution would be to avoid doing that, relying on the fact that older session history entries do not change over time.

> it occurs to me that one potential optimization
> is to start reducing the depth of saved back-navigation stacks on each tab as
> the number of open tabs increases.
That means I, as a user, can only rely on a single session history entry being saved. That certainly reduces the usefulness of session store.

Another thing to keep in mind is that certain users (probably not too many, but at least some of the power users I saw) have lots of tabs open, but actively interact only with a few of them. The proposed optimization would cap the restored history at a low level in this case.
Attached patch tweaksSplinter Review
some tweaks, improved saveState() performance by up to 7% in my tests.
Assignee: nobody → dietrich
Status: NEW → ASSIGNED
Attachment #284411 - Flags: review?(zeniko)
Comment on attachment 284411 [details] [diff] [review]
tweaks

>-      var disallow = CAPABILITIES.filter(function(aCapability) {
>-        return !browser.docShell["allow" + aCapability];
>-      });

We do use .forEach, .filter and friends with anonymous functions in other places in sss_saveState's code paths. Have you tried replacing them with plain loops as well?

>+      var disallow = "";

IMO it would look cleaner if you used an array instead, appending the disabled capabilities and .join(",") if disallow.length > 0. For Tp this should be pretty much equivalent, since in most cases everything will remain allowed anyway.

>+      if (this.xulAttributes.length != 0) {

Nice one. Maybe the equivalent check could also be used for

>       tabData.extData = tabbrowser.mTabs[i].__SS_extdata || null;

instead of the || null appendage. This would give us another actually unneeded property less to carry around (per tab).
Attachment #284411 - Flags: review?(zeniko) → review+
(In reply to comment #12)
> Doesn't session store go through all the session history each time it
> serializes a tab's state?

It shouldn't; we already cache the relevant bits of session history and update them per tab at "load" events, so that unless you are a highly parallel surfer we should already skip history serialization for most tabs at a time:

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/browser/components/sessionstore/src/nsSessionStore.js&rev=1.78&mark=815-826#810
Dietrich, did the .filter() elimination save measurable runtime? It could have, I would just like to know. We hope to optimize functional programming forms to rival or beat loops, but that work lies in the future. If a loop is not too ugly, and it is noticeably faster, use it (but let us know the performance delta, and if you can, some stats about the loop bound).

/be
(In reply to comment #16)
> Dietrich, did the .filter() elimination save measurable runtime? It could have,

I don't believe it, without more convincing data. Dietrich's code is faster, so switching it up probably didn't hurt.

Attachment #284411 - Flags: approval1.9?
Attachment #284411 - Flags: approval1.9? → approval1.9+
Checking in browser/components/sessionstore/src/nsSessionStore.js;
/cvsroot/mozilla/browser/components/sessionstore/src/nsSessionStore.js,v  <--  nsSessionStore.js
new revision: 1.79; previous revision: 1.78
done

(In reply to comment #14)
> >+      var disallow = "";
> 
> IMO it would look cleaner if you used an array instead, appending the disabled
> capabilities and .join(",") if disallow.length > 0. For Tp this should be
> pretty much equivalent, since in most cases everything will remain allowed
> anyway.

Fixed, as it didn't make a difference at the millisecond level.

Thanks for the pointers to the other potential improvements, I'll test these as well.

(In reply to comment #17)
> (In reply to comment #16)
> > Dietrich, did the .filter() elimination save measurable runtime? It could have,
> 
> I don't believe it, without more convincing data. Dietrich's code is faster, so
> switching it up probably didn't hurt.
> 

Using primeval Date.now() profiling, I saw a 50% improvement at ms granularity (ie: consistently went from 2ms to 1ms).
Attached patch tweaks 2Splinter Review
mostly just not serializing empty or default values.
Attachment #284511 - Flags: review?(zeniko)
Comment on attachment 284511 [details] [diff] [review]
tweaks 2

Do these changes have a measurable performance impact?

>+    if (cacheKey && cacheKey instanceof Ci.nsISupportsPRUint32 && cacheKey != 0) {

That should rather be cacheKey.data != 0 at the end.

>+    if (aEntry.contentType.length != 0)

Why not just |if (aEntry.contentType)| ?

>+    var hidden = [];
>+    WINDOW_HIDEABLE_FEATURES.forEach(function(aItem) {
>+      if (aWindow[aItem] && !aWindow[aItem].visible)
>+        hidden.push(aItem);
>+    });

That's really the same as filtering without joining. I.e. rather do

      var hidden = WINDOW_HIDEABLE_FEATURES.filter(...);
      if (hidden.length != 0) ...

>+    if (aEntry.subframe)
>+      shEntry.setIsSubFrame(aEntry.subframe || false);

If you are sure that shEntry.isSubFrame defaults to false you can drop the || false appendage, otherwise drop this optimization.

>+    if (aEntry.scroll) {
>+      var scrollPos = (aEntry.scroll || "0,0").split(",");

If you are sure that the scrolling position defaults to (0, 0), you can drop the useless || "0,0" appendage, otherwise drop this optimization.

r+ with the above changes.
Attachment #284511 - Flags: review?(zeniko) → review+
Depends on: 399572
Attached patch tweaks 2.1Splinter Review
Attachment #284770 - Flags: approval1.9?
(In reply to comment #20)
> (From update of attachment 284511 [details] [diff] [review])
> Do these changes have a measurable performance impact?

decreases the duration of saveState() by up to 4%.
Attachment #284770 - Flags: approval1.9? → approval1.9+
Comment on attachment 284770 [details] [diff] [review]
tweaks 2.1

Checking in browser/components/sessionstore/src/nsSessionStore.js;
/cvsroot/mozilla/browser/components/sessionstore/src/nsSessionStore.js,v  <--  nsSessionStore.js
new revision: 1.80; previous revision: 1.79
done
per Fx3 meeting: next step is to cap history. i've implemented this, but want to do some more profiling to isolate the perf changes, since we shouldn't be paying penalty for data gathering after the first run. it may be that we're just saving time in IO.
Attached patch history cap v1 (obsolete) — Splinter Review
Capping history at 5 saved time in data collection as well as writing to disk, basically because we just do less of both. saveState went from avg of 120ms to an avg of 30ms in my tests. This was using the sessionstore.js from the Talos Tp test.

Note that a page reload (what Talos Tp does 400 times) causes the tab history data to be re-gathered. This is a huge perf hit (especially when not capping history), and is the next place i'll look at optimizing.
Attachment #285301 - Flags: review?(zeniko)
Comment on attachment 285301 [details] [diff] [review]
history cap v1

(In reply to comment #25)
> Note that a page reload (what Talos Tp does 400 times) causes the tab history
> data to be re-gathered. This is a huge perf hit (especially when not capping
> history), and is the next place i'll look at optimizing.

Shouldn't we try this first? If we can get away with more aggressive caching, we might not have to consider capping tab history at all.

>+// maximum number of pages of history per tab to save

Please add "(-1 = unlimited)" or similar.

>+    var cap = this._prefBranch.getIntPref("sessionstore.max_tab_history");

Since that variable is quite far away from where it's used, please give it a more descriptive name.

>+          for (var j = history.index - cap; j < history.index + 1; j++)

This completely kills all forward history unless cap == -1! Is that intended - and if so: what's the rationale?

I'd either center the current page between as many back and forward items - or just add all forward items in addition to the .max_tab_history (since forward history is usually quite rare anyway).

BTW: Won't this throw or otherwise misbehave when there are less history items than .max_tab_history is set to?
Comment on attachment 284411 [details] [diff] [review]
tweaks

Resetting all approval1.9+ flags on bugs that have not been checked in by Oct 22 11:59 PM PDT.  Please re-request approval if needed.
Attachment #284411 - Flags: approval1.9+
Comment on attachment 284770 [details] [diff] [review]
tweaks 2.1

Resetting all approval1.9+ flags on bugs that have not been checked in by Oct 22 11:59 PM PDT.  Please re-request approval if needed.
Attachment #284770 - Flags: approval1.9+
Comment on attachment 284411 [details] [diff] [review]
tweaks

Landed by dietrich on 	2007-10-11 10:52.
Attachment #284411 - Flags: approval1.9?
Comment on attachment 284770 [details] [diff] [review]
tweaks 2.1

Landed by dietrich on 2007-10-13 10:29.
Attachment #284770 - Flags: approval1.9?
Comment on attachment 284770 [details] [diff] [review]
tweaks 2.1

Reset approval flag to + as it was already checked in.
Attachment #284770 - Flags: approval1.9? → approval1.9+
Comment on attachment 284411 [details] [diff] [review]
tweaks

Reset approval flag to + as it was already checked in.
Attachment #284411 - Flags: approval1.9? → approval1.9+
(In reply to comment #25)
> Capping history at 5 saved time in data collection as well as writing to disk,
> basically because we just do less of both.

BTW: Can you indicate how much time we spend in sss_updateCookieHosts? Looks like another good candidate for further improvements as well (e.g.: using arguments.callee for the recursion or caching its results instead of just throwing them away) since it's called for all of a window's URLs whenever we save state to disk.
Seems that for the first two tweaks, I've missed one potential issue: once one of the values .disallow, .xultab, .hidden, .sidebar, .extdata, .screenX and .screenY has been set, it won't be unset anymore, leading to regressions such as bug 400873.

If both tweaks have indeed led to performance improvements, we'd at least have to add code for unsetting all of the above properties, e.g.:

    if (sidebar)
      ...
+   else if (winData.sidebar)
+     delete winData.sidebar;
Wouldn't it make sense to generalize what we do for cycle collector and avoid running session store during pageload altogether, possibly running it soon after the previous page finishes loading, instead of just running it off a fixed-length timer?

Probably want a followup bug on that if that sounds reasonable.
Moving back to blocking list for perf push for b2.  Can we get the last patch cleaned up and in?
Flags: blocking-firefox3- → blocking-firefox3+
Target Milestone: --- → Firefox 3 M10
Target Milestone: Firefox 3 M10 → Firefox 3 M11
Priority: -- → P3
Ping on this - wondering if we can get the history limit per tab part of this in since that seems like an easy win...
(In reply to comment #37)
Wondering if we shouldn't have some proper plan for what'd be an acceptable loss first - and maybe some ui-review before no longer restoring forward history...

Besides: There are also still unfixed regressions from the first two patches of this bug. Dietrich: Are you aware of those (cf. comment #34)?
(In reply to comment #33)
> BTW: Can you indicate how much time we spend in sss_updateCookieHosts?

i'm very interested in knowing this, too. currently the code grabs an enumerator to the entire cookie list each time it's invoked. that list will be 1000 elements long, under *typical* usage. moving some of this work into the cookieservice (i.e. doing hash lookups for the host of each open tab, and just working with those) might save time here.

dietrich, any chance you could run some quick tests here? you'll need a big cookie file to try (copy one over from your regular use profile, that should suffice); if you don't have 1000 cookies, let me know and i'll send you one.
Depends on: 404225
Attachment #285301 - Flags: review?(zeniko)
Depends on: 407166
Patch for caching tab history, per comments 25/26. Using the sessionstore.js from Talos Tp, saveSate() went from 113ms to 33ms.
Attachment #293176 - Flags: review?(zeniko)
Comment on attachment 293176 [details] [diff] [review]
use cached tab history

>+    if (aPanel.__SS_data)
>+      delete aPanel.__SS_data._tab;

Not completely deleting __SS_data onTabLoad will have the consequence that navigating five pages back and quickly three different ones forward will not correctly update the new intermediary pages nor will it clear the fourth/fifth now obsolete entries. IOW: We'd need a better strategy to manage the cache.

One option would be to clear at least the entries ahead of the current history index (which usually should be none) and always update those when updating tab history (in _collectTabData). This should result in a similar perf win without the issues of the current patch.

>+    delete aPanel.__SS_text

Nit: Missing semicolon;

>+        // update the cache
>+        if (!cache[history.index])
>+          cache[history.index] = { url: browser.currentURI.spec };

This looks too hackish. What if |cache[history.index-1]| is missing as well? Besides: You really should use _serializeHistoryEntry instead of just saving the page's address so that subframes, etc. will be properly restored as well.

>+        var cap = this._prefBranch.getIntPref("sessionhistory.max_entries");
>+        cache.splice(cap)

IMO you should rather cut away the oldest entries than the most recent ones, since users will most of the time be at the very last page (no forward history at all). I was hoping we'd be doing the same for Gecko, but in fact wasn't able to see any change when setting that pref to 5 for the tabs' session history.

So rather do

+        if (cache.length > cap)
+          cache.splice(0, cache.length - cap);

Furthermore this patch somehow still doesn't work as I'd expect it: while normally only max_entries entries are saved, at shutdown it once again saves all of them, so that with max_entries at 5 I still get 10 or 15 entries restored. Haven't been able to quickly figure out why that might be, though.

And finally: Would you mind doing this optimization in a different bug and either close this one or make it a meta-bug? Thanks.
Attachment #293176 - Flags: review?(zeniko) → review-
That's a pretty huge update. I'm somewhat keen to try this for a beta and see if people even notice. What's the perf improvement if we cap to 10 instead of 5, which seems a little less aggressive (several Amazon category trees, f.e., are 3 or 4 levels deep)
Bumping to p1.  I'd really like to see some fix for this for b3.   Rare that we have such a smoking gun for perf so we should really take advantage of it.

As an aside Polvi used spectator with a super small sample set (so use with a pinch of salt):

# Back	Total Hits	% of Cases
1	31	4.90%
2	301	52.45%
3	136	73.93%
4	63	83.89%
5	22	87.36%
6	20	90.52%
7	13	92.58%
8	12	94.47%
9	6	95.42%
10	6	96.37%
11	2	96.68%
12	5	97.47%
13	3	97.95%
14	3	98.42%
15	3	98.89%
17	2	99.21%
18	3	99.68%
31	2	100.00%

So 5 hits 87% of cases, 10 hits 96.37.  Not the distribution drops off quite a bit so even 20 gets us all but .32% of cases.

So I'm with beltzer 10 at Max :-)

(a little data is more dangerous then none - I know :-) )

{1: 3199, 2: 301, 3: 136, 4: 63, 5: 22, 6: 20, 7: 13, 8: 12, 9: 6, 10: 6, 11: 2, 12: 5, 13: 3, 14: 3, 15: 3, 17: 2, 18: 3, 31: 2}

Priority: P3 → P1
s/update/improvement/ in comment #42, and yay for data backing up my wild assertion, though that's probably not teaching me the right lesson ;)
(In reply to comment #43)
> Bumping to p1.  I'd really like to see some fix for this for b3.   Rare that we
> have such a smoking gun for perf so we should really take advantage of it.
> 
> As an aside Polvi used spectator with a super small sample set (so use with a
> pinch of salt):
> 
> # Back  Total Hits      % of Cases
> 1       31      4.90%

Whoops! You missed two 9's there, it should be 3199. 

> {1: 3199, 2: 301, 3: 136, 4: 63, 5: 22, 6: 20, 7: 13, 8: 12, 9: 6, 10: 6, 11:
> 2, 12: 5, 13: 3, 14: 3, 15: 3, 17: 2, 18: 3, 31: 2}

That makes the table look like this:

1       3199    0.841621
2       301     0.920810
3       136     0.956590
4       63      0.973165
5       22      0.978953
6       20      0.984215
7       13      0.987635
8       12      0.990792
9       6       0.992370
10      6       0.993949
11      2       0.994475
12      5       0.995791
13      3       0.996580
14      3       0.997369
15      3       0.998158
17      2       0.998685
18      3       0.999474
31      2       1.000000

using the latest data...

5519 Browser:Back events found
3992 total sequences
min     max     avg     median
1       31      1.3823  1.0

# Back  Total Hits      % of Cases
1       3339    83.27%
2       339     91.72%
3       156     95.61%
4       67      97.28%
5       24      97.88%
6       21      98.40%
7       13      98.73%
8       13      99.05%
9       6       99.20%
10      6       99.35%
11      3       99.43%
12      5       99.55%
13      4       99.65%
14      4       99.75%
15      3       99.83%
17      2       99.88%
18      3       99.95%
31      2       100.00%

{1: 3339, 2: 339, 3: 156, 4: 67, 5: 24, 6: 21, 7: 13, 8: 13, 9: 6, 10: 6, 11: 3, 12: 5, 13: 4, 14: 4, 15: 3, 17: 2, 18: 3, 31: 2}

Meaning 10 gets you over 99%.... should be plenty.
(In reply to comment #45)
> (In reply to comment #43)
> > Bumping to p1.  I'd really like to see some fix for this for b3.   Rare that we
> > have such a smoking gun for perf so we should really take advantage of it.
> > 
> > As an aside Polvi used spectator with a super small sample set (so use with a
> > pinch of salt):
> > 
> > # Back  Total Hits      % of Cases
> > 1       31      4.90%
> 
> Whoops! You missed two 9's there, it should be 3199. 
> 
> > {1: 3199, 2: 301, 3: 136, 4: 63, 5: 22, 6: 20, 7: 13, 8: 12, 9: 6, 10: 6, 11:
> > 2, 12: 5, 13: 3, 14: 3, 15: 3, 17: 2, 18: 3, 31: 2}
> 
> That makes the table look like this:
> 
> 1       3199    0.841621
> 2       301     0.920810
> 3       136     0.956590
> 4       63      0.973165
> 5       22      0.978953
> 6       20      0.984215
> 7       13      0.987635
> 8       12      0.990792
> 9       6       0.992370
> 10      6       0.993949
> 11      2       0.994475
> 12      5       0.995791
> 13      3       0.996580
> 14      3       0.997369
> 15      3       0.998158
> 17      2       0.998685
> 18      3       0.999474
> 31      2       1.000000
> 
> using the latest data...
> 
> 5519 Browser:Back events found
> 3992 total sequences
> min     max     avg     median
> 1       31      1.3823  1.0
> 
> # Back  Total Hits      % of Cases
> 1       3339    83.27%
> 2       339     91.72%
> 3       156     95.61%
> 4       67      97.28%
> 5       24      97.88%
> 6       21      98.40%
> 7       13      98.73%
> 8       13      99.05%
> 9       6       99.20%
> 10      6       99.35%
> 11      3       99.43%
> 12      5       99.55%
> 13      4       99.65%
> 14      4       99.75%
> 15      3       99.83%
> 17      2       99.88%
> 18      3       99.95%
> 31      2       100.00%
> 
> {1: 3339, 2: 339, 3: 156, 4: 67, 5: 24, 6: 21, 7: 13, 8: 13, 9: 6, 10: 6, 11:
> 3, 12: 5, 13: 4, 14: 4, 15: 3, 17: 2, 18: 3, 31: 2}
> 
> Meaning 10 gets you over 99%.... should be plenty.
> 

Helpful - and the difference between 6 and 10 is .95% - depending on the perf difference between 6 and 10 might not be worth it :-).

numbers from capping tab history, 3 runs each, measuring the duration of saveState()'s first and second runs after startup in ms:

unpatched:
119, 115
118, 76
117, 109

patched (cap at 10):
41, 29
45, 38
40, 41

patched (cap at 6):
33, 31
35, 32
36, 31


difference between cap of 6 and cap of 10 is roughly 5-10 ms per pageload.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Priority: P1 → P3
Resolution: --- → FIXED
weird, marked it fixed somehow.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Priority: P3 → P1
Attached patch history cap v2 (cap at 10) (obsolete) — Splinter Review
per the fx meeting today and the numbers from polvi, going to try the cap approach out in nightlies.

patch addresses previous review comments on the cap approach.

i can set cap at 6 prior to checkin if y'all think the extra 5-10ms are worth the tradeoff.
Attachment #285301 - Attachment is obsolete: true
Attachment #296100 - Flags: review?(zeniko)
Comment on attachment 296100 [details] [diff] [review]
history cap v2 (cap at 10)

>+// maximum number of pages of history per tab to save
>+pref("browser.sessionstore.max_tab_history", 10);

Nit: This only applies to _back_ history. Please update the comment and the pref name (max_tab_back_history).

>+      var startIndex = (cap == -1) ? 0 :
>+                       (history.index > cap) ? (history.index - cap) : 0;

Those parentheses aren't really needed. What about
>>      var startIndex = -1 < cap && cap < history.index ? history.index - cap : 0;

>+      tabData.index = (history.index < cap) ? history.index : cap;

For clarity, this should IMO rather read
>>      tabData.index = history.index - startIndex;

r+ with the nits fixed. Hope this is worth it!
Attachment #296100 - Flags: review?(zeniko) → review+
Attached patch history cap v3 (cap at 10) (obsolete) — Splinter Review
addresses issues from comment #50
Attachment #296100 - Attachment is obsolete: true
Attachment #296181 - Flags: review?(zeniko)
Comment on attachment 296181 [details] [diff] [review]
history cap v3 (cap at 10)

Looking good. One last detail (sorry for the oversight):

>-      tabData.index = history.index + 1;
>+      tabData.index = history.index - startIndex;

should actually be (all our indices are 1-based):

>>      tabData.index = history.index - startIndex + 1;

You can fix that when landing, though.
Attachment #296181 - Flags: review?(zeniko) → review+
Attached patch history cap v3.1 (cap at 10) (obsolete) — Splinter Review
as checked in

Checking in browser/app/profile/firefox.js;
/cvsroot/mozilla/browser/app/profile/firefox.js,v  <--  firefox.js
new revision: 1.251; previous revision: 1.250
done
Checking in browser/components/sessionstore/src/nsSessionStore.js;
/cvsroot/mozilla/browser/components/sessionstore/src/nsSessionStore.js,v  <--  nsSessionStore.js
new revision: 1.91; previous revision: 1.90
done
Attachment #296181 - Attachment is obsolete: true
Whiteboard: [wanted-firefox3]
After reviewing the Tp data from old Tp and Talos boxes, the conclusion is that this change had no detectable effect on the results of the page load test.
So we've got no reason to keep that change, or do we?
Are we sure the pageload test is accurately measuring the effect of the cap?
(In reply to comment #55)
> So we've got no reason to keep that change, or do we?
> 

not if it doesn't improve Tp. but that depends on the answer to the question in comment #56.

(In reply to comment #56)
> Are we sure the pageload test is accurately measuring the effect of the cap?
> 

no.
Target Milestone: Firefox 3 beta3 → Firefox 3 beta4
can someone with d-trace run the same test mentioned in comment #2, and see if the cap has improved perf there? sayrer, do you have cycles to do that?
Nope :(
Priority: P1 → P3
Target Milestone: Firefox 3 beta4 → Firefox 3
Priority: P3 → P4
(In reply to comment #39)

quick profiling of the cookie-related functions, using Date.now() with a 750-strong cookie list and about 5 tabs open, on a fast dualcore linux machine:

on startup, sss_restoreCookies takes about 5ms
after pageload, sss_updateCookieHosts takes <= 1ms
after pageload, sss_updateCookies takes 15 - 20ms
on shutdown, sss_updateCookies seems to be called twice.

it's clear sss_updateCookies could be faster. can we fix it getting called twice on shutdown?
(In reply to comment #60)
> can we fix it getting called twice on shutdown?

The first call happens for the "quit-application-request" notification, which is when we last save the state of all currently open windows. Updating cookies here is required to keep our data consistent (in case extensions want to do something with them e.g. at "quit-application-granted".

The second call happens for the "quit-application" notification, after extensions had the chance to do some clean-up (and for the rare case where there's no "quit-application-request" at all).

I guess we could try to bend the code to (usually) not make the second call, but IMO the additional code complexity doesn't make it worth. I'd much rather see if we can speed up sss_updateCookies some more (e.g. by not throwing all cookie data away and starting over for every call).
Attached patch cookie tweaks (obsolete) — Splinter Review
here's an idea i threw together to speed up sss_updateCookies - instead of iterating over the 1000-strong cookie list each time, just ask the cookieservice for the cookies for each domain we're interested in. (just iterating over all 1000 cookies was taking 15ms, with this patch it's << 1ms.)

Simon, the sessionstore changes in this patch don't work (they're just to show the idea) - if you have time, do you want to pick this up and run with it? if not, maybe you can help me out with the sessionstore bits.
This bug is getting far to crowded (many different, only half resolved issues). Please add new ideas to bugs blocking this one...

Dietrich: Any progress in answering comment #55? What about just changing browser.sessionstore.max_tab_back_history back to -1 and see the impact on our perf tests?

(In reply to comment #62)
Would you mind moving your patch to a new bug? I like the way this is going and could offer you some further input for the SessionStore bits.
Depends on: 423132
filed bug 423132 on the cookie stuff.
(In reply to comment #63)
> Dietrich: Any progress in answering comment #55? What about just changing
> browser.sessionstore.max_tab_back_history back to -1 and see the impact on our
> perf tests?
> 

I haven't had time to look into it, and probably won't in the Fx3 cycle, given other higher priority bugs.

Yes, we should revert the change (either fully, or equivalently as you suggest above).
Assignee: dietrich → nobody
Status: REOPENED → NEW
Let's back this out, as it introduces complexity without perf gain.
Attachment #296200 - Attachment is obsolete: true
Attachment #309693 - Flags: review?(dietrich)
Attachment #309632 - Attachment is obsolete: true
I'm sure we can further improve SessionStore's performance, but please in new bugs - one per improvement idea, since I doubt that we're having one obvious flaw, rather several individually minor ones...
Keywords: meta
Comment on attachment 309693 [details] [diff] [review]
history cap back out

r=me
Attachment #309693 - Flags: review?(dietrich) → review+
Comment on attachment 309693 [details] [diff] [review]
history cap back out

Thanks.
Attachment #309693 - Flags: approval1.9?
Comment on attachment 309693 [details] [diff] [review]
history cap back out

a1.9=beltzner
Attachment #309693 - Flags: approval1.9? → approval1.9+
Keywords: checkin-needed
Version: unspecified → Trunk
Checking in browser/app/profile/firefox.js;
/cvsroot/mozilla/browser/app/profile/firefox.js,v  <--  firefox.js
new revision: 1.310; previous revision: 1.309
done
Checking in browser/components/sessionstore/src/nsSessionStore.js;
/cvsroot/mozilla/browser/components/sessionstore/src/nsSessionStore.js,v  <--  nsSessionStore.js
new revision: 1.98; previous revision: 1.97
done
Keywords: checkin-needed
Keywords: checkin-needed
Keywords: checkin-needed
Not blocking on this bug for final ship. Would take a safe enough patch if one comes through.
Flags: wanted-firefox3+
Flags: blocking-firefox3-
Flags: blocking-firefox3+
Whiteboard: [not needed for 1.9][meta]
Wow. This is...just ducky. So the issues I've been having owe to problems with the invisible restore memory? I was lucky even to find a pointer to this.

https://bugzilla.mozilla.org/show_bug.cgi?id=465386
Massive system load with high tabcount may be related to the calls referenced above. If browser is iterating/bookkeeping across so many tabs, the hit would be phenomenal with the methods shown here. But are there enough calls being issued to confuse/destabilize Windows itself in my case?

https://bugzilla.mozilla.org/show_bug.cgi?id=471725
Restore could be related to the history/URL losses I've been seeing.
...though it seems more likely to come from some other module...

Please to delete this comment if it shouldn't be here.
Resolved by the recent change to how we do session restore?
There have been some changes that might change these numbers (tracked in bug 536910). But the other changes to sessionstore shouldn't have made any difference to Talos. But seeing as there are a few perf-related sessionstore bugs, it might be good to get them under one roof.
(In reply to comment #75)
> There have been some changes that might change these numbers (tracked in bug
> 536910). But the other changes to sessionstore shouldn't have made any
> difference to Talos. But seeing as there are a few perf-related sessionstore
> bugs, it might be good to get them under one roof.
Bug 634666 will likely help too in terms of disk I/O.
Session Restore has been almost fully rewritten since the last activity on this bug. Times should be much more reasonable now. Can we close this bug?
Agreed, this bug isn't very specific and we rewrote most code dealing with sessionstore data collection. We also have Telemetry showing us that we improved a lot since this bug was filed.
Status: NEW → RESOLVED
Closed: 16 years ago10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.