Last Comment Bug 705597 - about:blank subframe entries in session restore make browser slow
: about:blank subframe entries in session restore make browser slow
Status: RESOLVED FIXED
[Snappy:P2]
:
Product: Firefox
Classification: Client Software
Component: Session Restore (show other bugs)
: Trunk
: All All
: -- normal with 4 votes (vote)
: Firefox 11
Assigned To: Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
:
Mentors:
: 659567 (view as bug list)
Depends on:
Blocks: sessionRestoreJank 707866
  Show dependency treegraph
 
Reported: 2011-11-27 20:41 PST by Chris AtLee [:catlee]
Modified: 2014-07-07 08:31 PDT (History)
24 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
possible patch (3.14 KB, patch)
2011-12-02 20:51 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
dietrich: feedback+
Details | Diff | Splinter Review
patch v (3.20 KB, text/plain)
2011-12-04 16:07 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
no flags Details
patch v2 (3.20 KB, patch)
2011-12-04 16:12 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
no flags Details | Diff | Splinter Review
patch v3 (2.47 KB, patch)
2011-12-05 18:25 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
no flags Details | Diff | Splinter Review
patch v4 (2.42 KB, patch)
2011-12-05 18:33 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
no flags Details | Diff | Splinter Review
patch v5 (2.52 KB, patch)
2011-12-05 19:29 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
no flags Details | Diff | Splinter Review
patch v6 (2.51 KB, patch)
2011-12-05 22:56 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
dietrich: feedback+
Details | Diff | Splinter Review
patch v7 (with test) (5.09 KB, patch)
2011-12-09 00:14 PST, Tim Taubert [:ttaubert] (on PTO, back Aug 29th)
dietrich: review+
Details | Diff | Splinter Review

Description Chris AtLee [:catlee] 2011-11-27 20:41:41 PST
Firefox has been very slow for me lately. It periodically hangs for a split second, whether I'm inputting text into a text area, navigating the menus, etc.

One suspicion I have is that writing to the sessionstore.js file is very expensive.

Before filing this bug it had grown to almost 6MB. I took a look at it and found that the tab for google reader seemed to be taking quite a bit of memory, so I force-reloaded my google reader tab. That trimmed sessionrestore.js down to 2MB. I took a closer look, and it looks like I have over 65,000 'about:blank' in my sessionstore.js.

python -c 'import json; print json.dumps(json.load(open("sessionstore.js")), indent=2)' | grep 'about:blank' | wc -l
65707
Comment 1 Mardeg 2011-11-27 20:47:34 PST
Is this a dupe of bug 669603?
Comment 2 Mardeg 2011-11-27 20:57:33 PST
(In reply to Mardeg from comment #1)
> Is this a dupe of bug 669603?

It seems that was one of the bugs spun off bug 667250, the other being bug 669034. So now there's a choice of what to dupe it to :)
Comment 3 Chris AtLee [:catlee] 2011-11-27 20:58:24 PST
I figured the massive number of 'about:blank' entries might be a bug in and of itself.
Comment 4 Chris AtLee [:catlee] 2011-11-27 21:28:15 PST
Trimming out any 'about:blank' items from my session brings it down to 880k (from almost 6MB) and results in a much more responsive browser.
Comment 5 Randell Jesup [:jesup] 2011-11-27 23:25:36 PST
You need to pretty-print (see bug 669034 or one of the related bugs for the link to the pretty-print code for JSON).

This will tell you *what* the about:blank entries are, and how they got into the data stored by sessionstore.js.  It's unlikely to be bug 669603.  My guess is that they've valid data, and maybe even required.  There is at least one bug against (for example) the addons manager (I think) that caused it to add hundreds or thousands of about:blank entries.  (Do you have an addons tab open?)
Comment 6 Chris AtLee [:catlee] 2011-11-28 04:28:16 PST
(In reply to Randell Jesup [:jesup] from comment #5)
> You need to pretty-print (see bug 669034 or one of the related bugs for the
> link to the pretty-print code for JSON).
> 
> This will tell you *what* the about:blank entries are, and how they got into
> the data stored by sessionstore.js.  It's unlikely to be bug 669603.  My
> guess is that they've valid data, and maybe even required.  There is at
> least one bug against (for example) the addons manager (I think) that caused
> it to add hundreds or thousands of about:blank entries.  (Do you have an
> addons tab open?)

The majority of them were from my google reader tab.
Comment 7 Randell Jesup [:jesup] 2011-11-28 09:38:56 PST
But what in the tab caused them?  I suspect it's something we need (at least in theory), but perhaps there's a common-enough case that makes sense to optimize for about:blank to save space and/or parsing/JSON-ing time.  Can you post the tab's JSON data?  (pretty-printed if possible)
Comment 8 Chris AtLee [:catlee] 2011-11-28 10:33:31 PST
        {
          "index": 50,
          "extData": {
            "weaveLastUsed": "1322453078",
            "tabview-tab": "null"
          },
          "pinned": true,
          "entries": [
            {
              "docIdentifier": 2211,
              "docshellID": 1197,
              "title": "Google Reader (327)",
              "url": "http://www.google.ca/reader/view/?tab=my#stream/userXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
              "subframe": true,
              "children": [
                {
                  "url": "about:blank",
                  "docIdentifier": 2212,
                  "ID": 3883578269,
                  "docshellID": 0
                },
                {
                  "url": "about:blank",
                  "docIdentifier": 2213,
                  "ID": 3883578270,
                  "docshellID": 0
                },
                {
                  "url": "about:blank",
                  "docIdentifier": 2214,
                  "ID": 3883578271,
                  "docshellID": 0
                },
                {
                  "url": "about:blank",
                  "docIdentifier": 2215,
                  "ID": 3883578272,
                  "docshellID": 0
                },
                {
                  "url": "about:blank",
                  "docIdentifier": 2216,
                  "ID": 3883578273,
                  "docshellID": 0
                },
                {
                  "url": "about:blank",
                  "docIdentifier": 2217,
                  "ID": 3883578274,
                  "docshellID": 0
                },

etc.

I can put the entire file up somewhere if there's a safe place for that. Not sure I want my session restore data on a public bug :)
Comment 9 Randell Jesup [:jesup] 2011-11-28 11:55:09 PST
No, that's good.  So the website creates a ton of about:blank children that it probably has built to have available to stuff real data into (or some such, doesn't really matter why).

First approximation: there's no easy way to remove them - they're part of the saved state of the page.  There are optimizations that could be done that would help this specific site to collapse the representation in the JSON, at least in theory, but would add noticeable complexity and make it pseudo-JSON (allowing parts to be LZ encoded and invoking that on repeated runs of similar sub-documents, for example).

We could refuse to save them, but I imagine that would break some websites, even if Google Reader isn't broken. Or if a website generates "too much" sessionstore data, don't save the full state, just the URL and history/etc.  That might be viable and solve an entire class of problems, albeit with user-visible impact.   (Though hopefully this will engender site owners to fix their sites.)

Added [snappy] since this affects UI performance as with other big-sessionstore bugs.  Perhaps some of the ideas in the previous paragraph will help.
Comment 10 Dietrich Ayala (:dietrich) 2011-11-28 13:19:06 PST
See: http://mxr.mozilla.org/mozilla-central/search?string=insert%20a%20dummy%20entry

For some reason nsISHEntry.children can be a sparse array. I don't remember why this can occur.
Comment 11 Paul O'Shannessy [:zpao] (not reading much bugmail, email directly) 2011-11-28 13:28:45 PST
(In reply to Dietrich Ayala (:dietrich) from comment #10)
> See:
> http://mxr.mozilla.org/mozilla-central/
> search?string=insert%20a%20dummy%20entry
> 
> For some reason nsISHEntry.children can be a sparse array. I don't remember
> why this can occur.

But these all have docidentifiers and docshellIDs, which leads me to believe they aren't dummy entries, at least not ones we created (unless maybe they've gone through a restart and they somehow get turned into real ones).

---

I'm not sure how useful all of those about:blank children entries are? I guess they're valuable for restoring something from bfcache, but beyond that it looks like they have little value.

Justin, is there anything you can say about these entries? What's causing them to even exist? What would happen if we didn't save them?
Comment 12 Chris AtLee [:catlee] 2011-11-28 13:31:16 PST
(In reply to Dietrich Ayala (:dietrich) from comment #10)
> See:
> http://mxr.mozilla.org/mozilla-central/
> search?string=insert%20a%20dummy%20entry
> 
> For some reason nsISHEntry.children can be a sparse array. I don't remember
> why this can occur.

The last item in this particular list of children is also blank.
Comment 13 Justin Lebar (not reading bugmail) 2011-11-28 13:41:14 PST
The biggest problem is that we know this operation is expensive in a variety of cases, and yet we're still doing it synchronously on the main thread.  :-/

I really don't understand what all those about:blank entries are good for, but maybe Smaug can help.
Comment 14 Dietrich Ayala (:dietrich) 2011-11-30 16:27:21 PST
updating summary to better reflect the problem in this case.

note: i found when profiling a similar case (bug 700923) that the writing wasn't the problem, but the synchronous recursive gathering of the subframe data.

this bug is kinda a variation on that bug, but will have a fix more targeted at the about:blank problem specifically, so not going to dupe it.

assuming that we actually *need* to reconstruct the subframes with the indices intact, an option is to store an array of the indices at the session history level, instead of serializing fake entries inline. this would result in a much smaller sessionstore.js, affecting both save and restore procedures.
Comment 15 Andrew McCreight [:mccr8] 2011-12-01 12:40:14 PST
Marking as [Snappy:P2] because it looks bad and may not be too hard to fix, but it isn't clear how common it is, and other more general sessionstore improvements will also help mitigate this issue.
Comment 16 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-02 20:51:57 PST
Created attachment 578808 [details] [diff] [review]
possible patch

Experiencing the same issue, which is actually quite bad for me. I have a 657K sessionstore.js file. SessionStore.saveState() takes 800-900ms on my machine with not a lot of tabs loaded. I did some profiling and turns out that > 95% of the time is spent on serializing history entries of my Twitter app tab (which probably has been open and restored for a long time and is a very common use case).

I took a stab at solving this and replaced the 'about:blank' dummy entries with just 'null'. This involves some extra little checks but that's not too bad. The results of this are:

SessionStore.saveState() takes now 50-80ms for me. The sessionstore.js has shrunk to 295K. I'd say that's a pretty big win.

Regarding the patch: I'm not really experienced with SHEntry children but I think  simply calling SHEntry.AddChild(..., ..., index) should add the entry at the given index and make sure they stay in the correct order, right?
Comment 17 Randell Jesup [:jesup] 2011-12-03 07:39:25 PST
Comment on attachment 578808 [details] [diff] [review]
possible patch

Looks pretty good.

On restore, the sh child list will have unset entries, instead about:blank entries.  This might be ok (I'd have to check more deeply into a bunch of stuff to be sure - zpao?)  If it isn't ok, then we could make a pass over the shEntry array after restoring it replacing null entries with about:blank - that shouldn't be hard.
Comment 18 Dietrich Ayala (:dietrich) 2011-12-03 17:47:57 PST
Comment on attachment 578808 [details] [diff] [review]
possible patch

Review of attachment 578808 [details] [diff] [review]:
-----------------------------------------------------------------

thanks for the patch. r=me w/ comments fixed. if it passes try, let's get this in nightlies and see if it causes any problems.

::: browser/components/sessionstore/src/nsSessionStore.js
@@ +1954,5 @@
>      if (aEntry.childCount > 0) {
>        entry.children = [];
>        for (var i = 0; i < aEntry.childCount; i++) {
>          var child = aEntry.GetChildAt(i);
> +        if (child && "about:blank" != child.URI.spec) {

yoda conditions ftl

@@ +1964,3 @@
>          }
>          // don't try to restore framesets containing wyciwyg URLs (cf. bug 424689 and bug 450595)
> +        if (entry.children[i] && /^wyciwyg:\/\//.test(entry.children[i].url)) {

while you're there, how about indexOf instead of a regex?
Comment 19 IU 2011-12-04 09:10:27 PST
This bug is the same cause of bug 624409, which had over 29,000 about:blank entries triggered by a single online CMS site.

The only difference, in comparison to comment 8, is that: (1) all instances of "docshellID" had varying non-zero values; (2) there were also over 29,000 "owner_b64" entries having having approximately 600 character strings -- several thousand (over 15,000 in one case) having the exact same base64 string; (3) and it included docIdentifier entries also with varying non-zero values.

The sessionstore.js file was over 32 MB, primarily due to all the large base64 entries.

Fixing this should fix bug 624409, so I'll wait and see before marking it dependent on this bug.
Comment 20 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-04 16:07:09 PST
Created attachment 578953 [details]
patch v
Comment 21 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-04 16:12:55 PST
Created attachment 578954 [details] [diff] [review]
patch v2

(In reply to Dietrich Ayala (:dietrich) from comment #18)
> yoda conditions ftl

Fixed.

> while you're there, how about indexOf instead of a regex?

Fixed.

I additionally replaced to null-dummy entry with {url: ""}. A lot of places in the code seem to rely on an actual entry with a url property. Another benefit is that we won't make sessionstore.js incompatible with previous Firefox versions. That would be pretty bad in case we'll have to back this out and would possibly break sessions of lots of testers.
Comment 22 Justin Lebar (not reading bugmail) 2011-12-04 16:37:10 PST
What happens if I have the following session history:

  Top page, A.html contains an iframe with history
    B.html
    about:blank
    C.html

Will the about:blank in the history get thrown out?
Comment 23 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-04 16:40:16 PST
(In reply to Justin Lebar [:jlebar] from comment #22)
> Will the about:blank in the history get thrown out?

They won't be loaded or stored as child entries to/from SHistory. Is that a realistic situation/use case that occurs in the wild?
Comment 24 Justin Lebar (not reading bugmail) 2011-12-04 16:45:45 PST
I don't know if that occurs in the wild, although my guess is that since it can occur, some site out there relies on it.  bz may know better.
Comment 25 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-04 16:50:30 PST
Ok, well, this was just an attempt to fix old sessions but we really can't decide between real 'about:blank' entries and dummy entries. Maybe I should take this out but that would mean that there's still a pretty big perf hit for old sessions unless users closed and re-opened (not undo closed) all their tabs (at least tabs causing this issue, like Twitter).
Comment 26 Justin Lebar (not reading bugmail) 2011-12-04 16:54:59 PST
Maybe you could flush all the about:blanks just once, so things won't break more than once.
Comment 27 Boris Zbarsky [:bz] 2011-12-04 17:08:41 PST
> I don't know if that occurs in the wild

It almost certainly does.

What makes the about:blank case so slow?  Serializing the owner principals?  Something else?  Comment 19 is describing the "lots of copies of the same principal all serialized individually problem", but comment 8 doesn't have any principal info in it....  What were the details of the profile that showed serializing about:blank entries as being the problem?

If the issue is really the "initial" about:blank, we can certainly detect and restrict to that case, right?
Comment 28 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-04 21:03:19 PST
After doing some profiling/debugging I found a place where some time can be saved. I have ~25 tabs and the session contain ~5600 'about:blank' child entries. It takes 880ms to save the session state.

>    try {
> +    if (entry.url.indexOf("about:") != 0)
>        aHostSchemeData.push({ host: aEntry.URI.host, scheme: aEntry.URI.scheme });
>    }
>    catch (ex) {
>      // We just won't attempt to get cookies for this entry.
>    }

So this throws at least 5600x times (because about: has no host) and that's of course very expensive. The fix reduced the time to save the session to 624ms.

Even with returning early in _serializeHistoryEntry() when the url is about:blank we're at 240ms which is still not very ideal. That's just the time it takes to access those >5600 history entries and check their urls...

First after restoring a saved session without loading all these 'about:blank' SHEntries we're down to a reasonable 50-70ms. My memory usage also went down from 227 to 197mb. I didn't measure but there should be a startup/restore win as well when we don't need to restore 5600 history entries.

So, in fact it's partly about 'about:blank' but mostly about the sheer amount of history entries getting saved/restored.
Comment 29 Boris Zbarsky [:bz] 2011-12-04 21:48:51 PST
Would a flag indicating that the docshell is sitting at its initial about:blank entry help short-circuit some of that work?
Comment 30 Randell Jesup [:jesup] 2011-12-04 22:02:29 PST
If we're really concerned with breaking sites, on restore we would fill in 'skipped' entries with about:blank, since the only reason to skip them (I assume) is if they're about:blank.  If there are other reasons for holes, we could replace N about:blank entries with one entry and repeat count.

The save times are less worrying (though still problematic) if we have a dirty bit, or if we can do some of the preparation for save "along the way" and amortize it over other history ops (like adding/removing entries) that don't block sessionstore save.
Comment 31 IU 2011-12-05 07:05:04 PST
I'm having a hard time understanding the purpose of recording all the about:blank child entries.  What useful information do they contain?

Might it be possible to put a limit (possibly about:config configurable) on the number of about:blank child entries that will be written to sessionstore.js for any given URL?  In the case I detailed in comment 19, retaining over 15,000 about:blank entries for a single URL makes no sense to me (and seem more like the product of a poorly coded web site).

And as indicated in comment 0, force reloading the page probably had the effect of jettisoning much of those entries, without breaking the page.  So, are all those entries being really that critical?  What, if anything, would be broken by setting a limit?
Comment 32 Boris Zbarsky [:bz] 2011-12-05 07:22:57 PST
> What useful information do they contain?

"do" or "can"?  For the latter, see comment 22.  But that's the rare case; hence my questions about having a flag to indicate that we're NOT in that case.
Comment 33 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 07:28:53 PST
(In reply to IU from comment #31)
> I'm having a hard time understanding the purpose of recording all the
> about:blank child entries.  What useful information do they contain?

TBH, I'm not really sure what that code is for:

http://mxr.mozilla.org/mozilla-central/source/browser/components/sessionstore/src/nsSessionStore.js#1962

> else { // to maintain the correct frame order, insert a dummy entry 
>   entry.children.push({ url: "about:blank" });
> }

Somehow, if SHistory.GetEntryAtIndex() fails we replace this history entry with 'about:blank'. Maybe there's enough blame info in CVS but I have no clue why we do that. Can someone shed some light on this?

> Might it be possible to put a limit (possibly about:config configurable) on
> the number of about:blank child entries that will be written to
> sessionstore.js for any given URL?  In the case I detailed in comment 19,
> retaining over 15,000 about:blank entries for a single URL makes no sense to
> me (and seem more like the product of a poorly coded web site).

I think this may be the symptom of some bug in session history but I'm not sure. In the case of Twitter we have URLs that contain *only* about:blank children (i.e. GetEntryAtIndex() fails for every SHEntry). I think it might be safe to remove them, they're not in between meaningful entries like in comment #22.
Comment 34 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 07:36:06 PST
(In reply to Boris Zbarsky (:bz) from comment #32)
> > What useful information do they contain?
> 
> "do" or "can"?  For the latter, see comment 22.  But that's the rare case;
> hence my questions about having a flag to indicate that we're NOT in that
> case.

This flag would help us with existing sessions maybe but actually I'd like to not even load those entries in the first place. They're just taking up memory (despite never being used) and getting to the point where we check if they contain information (when saving the session) is too expensive with thousands of entries.
Comment 35 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 07:41:36 PST
FTR, the important point here is that we don't have 'real' about:blank entries that are saved, but session store chooses to inject them when SH.GetEntryAtIndex() fails. Maybe it's just SHEntry.childCount that is too high? In which situations can GetEntryAtIndex() fail?
Comment 36 Boris Zbarsky [:bz] 2011-12-05 07:46:51 PST
"Not even load" in what sense?

> but session store chooses to inject them when SH.GetEntryAtIndex() fails

Uh... that seems bogus.

> In which situations can GetEntryAtIndex() fail?

I can't think of reasonable ones offhand.  Olli?
Comment 37 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 08:04:15 PST
(In reply to Boris Zbarsky (:bz) from comment #36)
> "Not even load" in what sense?

I mean I don't want to call SH.addEntry() that turns them into history entries if they're just the product of some bogus situation. Ideally we should fix this situation...
Comment 38 Chris AtLee [:catlee] 2011-12-05 08:19:21 PST
Is there telemetry data for how long it's taking us to save sessionrestore.js? Is there an easy way for me to find out?
Comment 39 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-12-05 08:45:44 PST
(In reply to Boris Zbarsky (:bz) from comment #36)

> > In which situations can GetEntryAtIndex() fail?
> 
> I can't think of reasonable ones offhand.  Olli?

Indexing out of bounds?
Comment 40 Boris Zbarsky [:bz] 2011-12-05 09:05:16 PST
This code is indexing up to childCount; that shouldn't end up out of bounds, right?

Can we have holes in the child array from removed docshells?
Comment 41 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-12-05 09:10:06 PST
Um, which GetEntryAtIndex are we talking about here? Shistory doesn't have childCount but
count.
Comment 42 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 09:35:40 PST
Sorry, I actually meant SHEntry.GetChildAt() and SHEntry.childCount.
Comment 43 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 09:43:02 PST
I managed to get a pretty strange STR clicking around on Twitter, restarting, clicking around again and reloading the page. On reload a lot of history entries get removed and leave holes. I think these holes shouldn't be filled at all with 'about:blank' but that's what session store does.

If childCount=2 and we remove all entries we're effectively adding two 'about:blank' entries that are really useless because they should've been removed, right?

Sure, SHistory could behave better and set childCount=0 when all entries are 'nsnull' but nonetheless the current SS behavior isn't correct I think.
Comment 44 Dietrich Ayala (:dietrich) 2011-12-05 10:29:12 PST
(In reply to Chris AtLee [:catlee] from comment #38)
> Is there telemetry data for how long it's taking us to save
> sessionrestore.js? Is there an easy way for me to find out?

Bug 671041. Not telemetrized yet.

There are 3 factors in "time to save":

1. data collection
2. JSON serialization
3. writing JSON to disk

The major responsiveness problem, from the testing I've done so far, is due to synchronous code in #1. Depending on user hardware and environment and session, #3 can be a problem as well.
Comment 45 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 18:25:03 PST
Created attachment 579187 [details] [diff] [review]
patch v3

Removed the insertion of dummy entries as they all refer to removed SHEntry children. They've been removed so we shouldn't keep them alive and turn them into 'about:blank' entries when restoring.

Incorporated the fix from comment #28. Fixing up old sessions and removing all 'about:blank' subframes once should be handled in a different bug. I'll file a follow-up for this.
Comment 46 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 18:33:12 PST
Created attachment 579188 [details] [diff] [review]
patch v4

Small fix.
Comment 47 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 18:59:17 PST
(In reply to Tim Taubert [:ttaubert] from comment #43)
> Sure, SHistory could behave better and set childCount=0 when all entries are
> 'nsnull' [...].

Filed bug 707862.
Comment 48 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 19:29:41 PST
Created attachment 579199 [details] [diff] [review]
patch v5

Sorry for the spam :/ This fixes a try failure.
Comment 49 Justin Lebar (not reading bugmail) 2011-12-05 20:53:06 PST
Comment on attachment 579199 [details] [diff] [review]
patch v5

> +      if (entry.url.indexOf("about:") != 0)

This is perf-sensitive code, so can we test whether a regex, a startsWith function, or just checking the scheme would be faster?  This code has to iterate over the entirety of each URI.
Comment 50 Boris Zbarsky [:bz] 2011-12-05 20:56:52 PST
Checking the scheme is way faster than getting the full spec for many urls (e.g. data: URIs).  Note that I said getting the spec; just _that_ part is slow, no matter how you then look for stuff in it.
Comment 51 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 22:56:57 PST
Created attachment 579240 [details] [diff] [review]
patch v6

(In reply to Justin Lebar [:jlebar] from comment #49)
> > +      if (entry.url.indexOf("about:") != 0)
> 
> This is perf-sensitive code, so can we test whether a regex, a startsWith
> function, or just checking the scheme would be faster?  This code has to
> iterate over the entirety of each URI.

'entry.url' is a normal string. That's the part where we're deserializing the history data. A quick test says that using a regexp for this is about 100% slower than indexOf. Everything else I tried is also slower.

(In reply to Boris Zbarsky (:bz) from comment #50)
> Checking the scheme is way faster than getting the full spec for many urls
> (e.g. data: URIs).  Note that I said getting the spec; just _that_ part is
> slow, no matter how you then look for stuff in it.

Didn't know about .schemeIs() - that's way better. I changed to following:

>    // don't try to restore framesets containing wyciwyg URLs
> -  if (child.URI.spec.indexOf("wyciwyg://") == 0) {
> +  if (child.URI.schemeIs("wyciwyg")) {
>      children = [];
>      break;
>    }
Comment 52 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-05 23:10:05 PST
Oops, I missed that we have both the URI as a string and as a nsIURI available. indexOf() is about 22x faster than schemeIs() so I guess we should leave that as it is. Same thing for URI.scheme. We need the whole spec anyway so it makes sense to use that for comparison and save the .host/.scheme accesses for about: pages.
Comment 53 Paul O'Shannessy [:zpao] (not reading much bugmail, email directly) 2011-12-06 12:59:29 PST
(In reply to Tim Taubert [:ttaubert] from comment #51)
> (In reply to Justin Lebar [:jlebar] from comment #49)
> > > +      if (entry.url.indexOf("about:") != 0)
> > 
> > This is perf-sensitive code, so can we test whether a regex, a startsWith
> > function, or just checking the scheme would be faster?  This code has to
> > iterate over the entirety of each URI.
> 
> 'entry.url' is a normal string. That's the part where we're deserializing
> the history data. A quick test says that using a regexp for this is about
> 100% slower than indexOf. Everything else I tried is also slower.

Tangentially relevant data point... v8 & JSC both look faster when using Reg.test vs String.indexOf (http://jsperf.com/regexp-text-vs-string-indexof/2) whereas we are the opposite (but our .indexOf is better than other's .test). I bring it up only for future generations where the numbers may flip.
Comment 54 Dietrich Ayala (:dietrich) 2011-12-06 14:46:03 PST
Comment on attachment 579240 [details] [diff] [review]
patch v6

Review of attachment 579240 [details] [diff] [review]:
-----------------------------------------------------------------

needs a test actually, for r+. but looks fine otherwise.

::: browser/components/sessionstore/src/nsSessionStore.js
@@ +1970,5 @@
>          }
>        }
> +
> +      if (children.length)
> +        entry.children = children;

please confirm that there's no restore code that expects a 'children' property to exist.
Comment 55 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-08 06:30:48 PST
(In reply to Dietrich Ayala (:dietrich) from comment #54)
> please confirm that there's no restore code that expects a 'children'
> property to exist.

The old code did the same when a 'wyciwyg://' entry existed so this should be safe. I double-checked and every other place I found checks for the property first before accessing it.
Comment 56 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-09 00:14:14 PST
Created attachment 580334 [details] [diff] [review]
patch v7 (with test)

Added a test. Ensured that it fails without the patch applied.
Comment 57 Dietrich Ayala (:dietrich) 2011-12-09 09:59:55 PST
Comment on attachment 580334 [details] [diff] [review]
patch v7 (with test)

Review of attachment 580334 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/sessionstore/src/nsSessionStore.js
@@ +1862,5 @@
>  
>      try {
> +      // throwing is expensive, we know that about: pages will throw
> +      if (entry.url.indexOf("about:") != 0)
> +        aHostSchemeData.push({ host: aEntry.URI.host, scheme: aEntry.URI.scheme });

since we're checking for about: pages, do we need the try/catch block here still?
Comment 58 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-09 10:05:19 PST
(In reply to Dietrich Ayala (:dietrich) from comment #57)
> since we're checking for about: pages, do we need the try/catch block here
> still?

Yes, there could be other schemes without a host name. 'about:' seems to be the most common - other schemes would probably be registered by add-ons or the like.
Comment 59 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-12-09 10:08:52 PST
Yeah, any non-nsStandardURL-implemented nsIURI (e.g. nsSimpleURI used for view-source, data:, etc.) can throw when getting .host.
Comment 60 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-09 10:28:58 PST
https://hg.mozilla.org/integration/fx-team/rev/9f98db90b6e1
Comment 61 Tim Taubert [:ttaubert] (on PTO, back Aug 29th) 2011-12-12 06:25:59 PST
https://hg.mozilla.org/mozilla-central/rev/9f98db90b6e1
Comment 62 IU 2011-12-13 07:15:17 PST
Unfortunately, this did not fix bug 624409, as I had hoped.  Does Mozilla have policies and a process for handling potentially sensitive user data?  I'm thinking the only way ahead may be for me to submit the sessionstore.js directly to Mozilla and maybe also mark bug 624409 a security bug.

Please reply in bug 624409.  Thanks.
Comment 63 Jesse Ruderman 2011-12-27 22:59:55 PST
(In reply to Paul O'Shannessy [:zpao] from comment #53)

I filed bug 713813 on getting us a fast check for string starts-with constant. Please file bugs when you notice insane performance characteristics in other components :)
Comment 64 Ed Lee :Mardak 2014-07-07 08:31:40 PDT
*** Bug 659567 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.