Closed Bug 669603 Opened 13 years ago Closed 6 years ago

large sessionStorage data causes session restore to block the UI

Categories

(Firefox :: Session Restore, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: jesup, Unassigned)

References

(Depends on 2 open bugs, Blocks 2 open bugs, )

Details

(Keywords: dev-doc-needed, perf, Whiteboard: [snappy:p1][leave open])

Attachments

(4 files, 10 obsolete files)

360.79 KB, text/plain
Details
652 bytes, patch
ted
: review+
ttaubert
: checkin+
Details | Diff | Splinter Review
39.58 KB, patch
Details | Diff | Splinter Review
3.13 KB, patch
smaug
: review+
mayhemer
: review+
ttaubert
: checkin+
Details | Diff | Splinter Review
Spin-off from bug 669034

Google's "instant" search seems to cause huge amounts of HTML to be stored in the sessionstore.js file.  Loading a fresh profile, closing the first-time mozilla.org tab, going to google.com and typing two words caused a 366K sessionstore.js file to be created.  In my 20MB 500 tab profile, ~10MB of it is just a couple of dozen google tabs or history items.  Each entry can range from 200K to >600K.

Options: take this up with Google and see if they can not do this, or if they can't if they can flag to us what we really should save.  Or we could purposely not save it as a site-specific kludge.  Or we could refuse to save if over some large size.  Or we could add a new mechanism for a site to control what's saved.  Probably more as well.

Attachment 544215 [details] is in bug 669034
Keywords: perf
Hardware: x86 → All
What's the info being saved?  Hidden input values?  Something else?
To pretty-print JSON:  
Python 2.6 - "python -mjson.tool <sessionstore.js >session_pretty.js"
Python 2.5 - install simplejson.py (http://pypi.python.org/pypi/simplejson/)
   "python -msimplejson.tool <sessionstore.js >sessionstore_pretty.js"
Looks like each result causes about 40K of JSONed HTML in sessionstore; plus another block for the text input widget (which had a dropdown visible when I exited).
Ah, interesting.  Are those huge things sessionstorage?
(In reply to comment #4)
> Ah, interesting.  Are those huge things sessionstorage?

yes
Do we not store that in an sqlite db already?   I wonder whether we can store it out-of-band somehow for session restore purposes....
(In reply to comment #6)
> Do we not store that in an sqlite db already?   I wonder whether we can
> store it out-of-band somehow for session restore purposes....

Perhaps I'm thinking of these wrong, but aren't they a lot like session cookies? They're supposed to expire when the session ends, but since we're restoring the session, we don't treat it as expiring. This was brought up during initial ipml (bug 339445) drawing from this part of the spec: "The lifetime of a browsing context can be unrelated to the lifetime of the actual user agent process itself, as the user agent may support resuming sessions after a restart." (http://dev.w3.org/html5/webstorage/#dom-sessionstorage)

That said, we surely _could_ store it out of band. Previously we have kept all session information in sessionstore.js though for backup and portability reasons.
> Perhaps I'm thinking of these wrong, but aren't they a lot like session cookies?

In terms of lifetime, yes.  But in terms of usage and implementation, they're different: they allow very large values and internally I'm pretty sure we just use an in-memory sqlite db for them....   So we might be able to get things for free if we just allow write-through from that in-memory db to disk or something.
...especially if we decide that the long-term plan is for sessionstore.js to move to a sqlite DB for reasons other than localstorage!

Should we consider a short-term fix to suppress saving this data on google.com?
Maybe the short-term fix would be to just not save the sessionstorage data if it's larger than a certain size.
I should note that if google appears anywhere in a tab's sessionhistory, you'll get most if not all the hit.  And if someone makes Google their home page and tends to start all new tabs by hitting Home....

Not to mention users who have no idea what those bars on the top are and think the way you find a site on the web is to type it's name into google.  Like my father-in-law.
This is not a Google-specific problem - could be caused by any site that utilizes sessionStorage, so we need to fix this in our session restore code.

We have a file cache system that is designed for temporary storage of blobs like these, so maybe we can utilize that here.
Summary: Google generates huge sessionstore.js info → large sessionStorage data causes session restore to block the UI
Whiteboard: [snappy]
Whiteboard: [snappy] → [snappy:p1]
Might be able to fix this with minor surgery to store the data in (and restore from) the file cache, saving only the cache key in the session data.
Assignee: nobody → ttaubert
That's not a bad plan, though we're increasing the amount of file accesses.  If we do this, I'd want some type of 'dirty' bit to tell us if the data in the file cache needs to be changed, since 99% of the time it won't.  (99% might not even be an exaggeration for sessions with lots of tabs).

Note that this means that some tricks done today with sessionstore.js (copying it, backing it up and restoring it, etc) will no longer work.  Also, sessionstore-knowledgable extensions (SessionManager) may have to be revved, which is ok.  If there's some upgrade/downgrade risk, we may want to change the filename at the same time (sessionstore2.js) if there's no internal versioning (and I think there isn't).  Overall though the risk is low due to it being JSON; worst might be loss of the sessionstorage data.

Also, since it's filecache, we'll need to make sure there's no chance of it getting the "wrong" data; I don't know how the keys are done, but we'll need to make sure they can't be aliased, *or* include a hash in sessionstore to verify it's the "right" data.
Attached patch patch v1 (WIP) (obsolete) — Splinter Review
(In reply to Randell Jesup [:jesup] from comment #14)
> Note that this means that some tricks done today with sessionstore.js
> (copying it, backing it up and restoring it, etc) will no longer work. 
> Also, sessionstore-knowledgable extensions (SessionManager) may have to be
> revved, which is ok.  If there's some upgrade/downgrade risk, we may want to
> change the filename at the same time (sessionstore2.js) if there's no
> internal versioning (and I think there isn't).  Overall though the risk is
> low due to it being JSON; worst might be loss of the sessionstorage data.

True, if we now store sessionStorage data out of 'sessionstore.js' these tricks will fail. I'm not sure though how many people rely on the DOM.sessionStorage data to be included in that file. IMHO this is not really worth a rename, sessionStorage data is rather volatile.

> Also, since it's filecache, we'll need to make sure there's no chance of it
> getting the "wrong" data; I don't know how the keys are done, but we'll need
> to make sure they can't be aliased, *or* include a hash in sessionstore to
> verify it's the "right" data.

I think we're pretty safe that we'll get the 'correct' data because we use a specific cache session for this data. Data from different cache sessions is stored separately and thus will not be overridden unintentionally by other code parts.

--------------------

This patch does the following:

+ It uses the file cache to store domain-specific sessionStorage data. Keys look like '{07dab289-0ae1-5c49-9b0b-a54f63ff491a}-http://example.com' (I'm using the UUID generator). The UUID is assigned to a tab when it's not already present. The UUID is of course re-used.

+ When closing a tab all its sessionStorage data (from all history entries) is collected and asynchronously written to the cache. We start with the current page's data and continue with all other history entries, sorted by their distance from the current entry.

+ When restoring a tab the current page's sessionStorage data is loaded synchronously from the cache. The data for all other history entries is loaded asynchronously, sorted by their distance from the current entry.

+ When duplicating a tab all sessionStorage data is copied from the source tab to the target tab. Doing this asynchronously shouldn't be necessary as this data is already loaded in memory and doesn't need to be JSON.stringified/parsed.

--------------------

TODO:

+ Hook into 'browser:purge-session-history' to clear all domain-specific data when requested.

+ We could (should?) implement a dirty bit for the sessionStore to determine if it has changed and needs saving on tab close.

+ When the browser is closed all entries need to be written synchronously.
Attachment #576683 - Flags: feedback?(paul)
Attachment #576683 - Flags: feedback?(rjesup)
If you're going to use the file cache for sessionStorage data, it seems to me that bug 105843 is going to become a bigger issue.
(In reply to Ryan VanderMeulen from comment #16)
> If you're going to use the file cache for sessionStorage data, it seems to
> me that bug 105843 is going to become a bigger issue.

Yeah, this is more than bad if we want session restore to mitigate the results of a crash. It's not too hard switching the data store behind that patch. Maybe we should have a look into IndexedDB or LevelDB? I don't have any experience with these storage APIs and can't say if they would be a good fit for our problem. All we'd need is a (fast) DB with sync and async APIs...
Status: NEW → ASSIGNED
Yes, a DB may make more sense - though we aren't *required* to restore the data, there's an advantage in not losing it (and in being consistent about restoring it).

Please check the message I posted in m.d.performance and cross-posted to m.d.platform back in the first week of July, and the responses.

I'd consider the cache solution to be roughly equivalent to a workable version of what Shaver suggested (store the data in files).

My understanding without looking into it too deeply is that LevelDB may be a good match conceptually.
(In reply to Randell Jesup [:jesup] from comment #18)
> Yes, a DB may make more sense - though we aren't *required* to restore the
> data, there's an advantage in not losing it (and in being consistent about
> restoring it).

Yeah, I think sessionStorage data is rather kind of volatile and mainly used for caching and in-work state stuff. Though I'm not sure about that. It's definitely an advantage if take care of it as we do with every other kind of data.

> Please check the message I posted in m.d.performance and cross-posted to
> m.d.platform back in the first week of July, and the responses.

FTR: https://groups.google.com/forum/#!topic/mozilla.dev.performance/Itt-8Pr_aZk/discussion (to save others some time :)

> I'd consider the cache solution to be roughly equivalent to a workable
> version of what Shaver suggested (store the data in files).
> 
> My understanding without looking into it too deeply is that LevelDB may be a
> good match conceptually.

Yeah this is pretty much what Shaver suggested, I think. I'm not really sure when LevelDB (bug 679852) will be available - it might take some time to have it in the product. Should we maybe start with the file cache as a first implementation until LevelDB is available? This is mainly a tradeoff between those kind of 'dataloss' issues we could experience and the bloated sessionstore.js files combined with UI hickups (which is far worse IMHO).
Well, they're used for different things.  Google keeps search results in them (it appears).  Others have much smaller uses, and as mentioned there's no guarantee of restoration - but it might be good to check the other major browsers to make sure if they restore this data or not.

That said, if we determine it won't break too many things for users too often, I'd be good with the cache solution as a stepping stone to a DB.  And in any case, a dirty bit will help HUGELY.  (Anything that avoids rewriting the entire file & data will help a ton - delta journals, etc.)
In bug 705597 I describe how I have a 6MB sessionstore.js file that I reduce down to 880k simply by removing 'about:blank' entries. Is that related?
We should not block resolution of this problem on LevelDB being available. Roll with cache for now to mitigate the immediate hurt, at the risk of dataloss in some scenarios. Please relnote exactly which user scenarios can cause cache destruction so that the support folks can tie those reports back to this change.
(In reply to Chris AtLee [:catlee] from comment #21)
> In bug 705597 I describe how I have a 6MB sessionstore.js file that I reduce
> down to 880k simply by removing 'about:blank' entries. Is that related?

Nope. That is possible a combination of bug 700923 and a yet-unfiled bug related to using about:blank for weird things in session restore (i can't remember offhand exactly why we do that).
(In reply to Dietrich Ayala (:dietrich) from comment #23)
> (In reply to Chris AtLee [:catlee] from comment #21)
> > In bug 705597 I describe how I have a 6MB sessionstore.js file that I reduce
> > down to 880k simply by removing 'about:blank' entries. Is that related?

> ... a yet-unfiled bug related to using about:blank for weird things in session restore

Er, let's use your bug for that :)
Attached patch patch v2 (WIP) (obsolete) — Splinter Review
Implemented synchronous write on browser shutdown.

Cache destruction scenarios:

1) The cache format changes at Firefox update (unlikely, hasn't changed since 4.0).
2) Firefox crashes or is forced to quit.
3) History is cleared - there is now way to clear the cache based on timestamp so the whole cache is blown away.

Questions:

1) Should we write periodically as session store does? At the moment this patch asynchronously writes all DOM SS data only when the tab or windows gets closed and synchronously when the browser gets closed. If Firefox crashes or is forced to quit we'll lose that data anyway (see point #2 above) so I think there's no point in doing that at the moment.

2) Do we need to purge history when requested? All cache data is cleared (see point #3 above) so there's no need to modify the cache. I guess the current DOM SS data is either not touched or cleared by the DOMStorage internally? Seems like there's nothing to do here at the moment as well.

3) How could we implement a dirty bit for the DOM SS data? Do we need that for the initial implementation? Could we do this as a follow-up maybe? Not-restored tabs in the background don't have any data so they're not going to be touched anyway. This is only an optimization for loaded tabs.
Attachment #576683 - Attachment is obsolete: true
Attachment #577871 - Flags: feedback?(paul)
Attachment #577871 - Flags: feedback?(dietrich)
Attachment #576683 - Flags: feedback?(rjesup)
Attachment #576683 - Flags: feedback?(paul)
Well, there are two major uses of sessionstore: restoring tabs/windows/ after a shutdown (frequent for many users, rare for others), and restoring tabs/windows after a crash or hard shutdown (rare for many users, very common for others).

Some users (quite a few) virtually never shut down the browser, so the crash/force-quit is a primary restart scenario.  This doesn't mean we must guarantee save in this case, but we need to think about it.

On history clearing, we can schedule a session store event and re-save the data, leaving a small hole - probably ok.

As for dirty bits - right now all the non-loaded tabs are repeatedly saved including this data, so a dirty bit is not just an optimization for loaded tabs.  The win from a dirty bit (or hash check) would be quite large.
Talked to Tim about this. sessionStorage dirty bit would be nice. better might be events or notifications, to avoid having to check a dirty bit at all. Either one is fine as a followup, doesn't need to block this bug.
(In reply to Randell Jesup [:jesup] from comment #26) 
> Some users (quite a few) virtually never shut down the browser, so the
> crash/force-quit is a primary restart scenario.  This doesn't mean we must
> guarantee save in this case, but we need to think about it.

True. Unfortunately there currently is no way to keep the data after crash/force-quit when using the file cache as the storage backend. I think that tradeoff is not too bad, though.

> On history clearing, we can schedule a session store event and re-save the
> data, leaving a small hole - probably ok.

Yeah, we could totally do that. Does history clearing involve current DOM sessionStorage data or is that kept? If we keep it we should just re-save everything after the history has been cleared. I have no idea what the behavior should be.

> As for dirty bits - right now all the non-loaded tabs are repeatedly saved
> including this data, so a dirty bit is not just an optimization for loaded
> tabs.  The win from a dirty bit (or hash check) would be quite large.

Actually, unrestored tabs have no valid sessionStorage data at all. So all we do is access the storage (on browser quit or window/tab close) and if it's empty we just do nothing. This check isn't really expensive I think, so there is an optimization for not-loaded tabs.

(In reply to Dietrich Ayala (:dietrich) from comment #27)
> Talked to Tim about this. sessionStorage dirty bit would be nice. better
> might be events or notifications, to avoid having to check a dirty bit at
> all. Either one is fine as a followup, doesn't need to block this bug.

Yeah, the one thing that hurts is that we have to synchronously write all sessionStorage data when shutting down the browser (which is what we do now anyway). With a dirty bit we'd now which sessionStorage data has changed and would write only that. I'd suggest to file a follow-up bug about:

1) Implement a dirty bit (somehow) to check if sessionStorage data has been modified.
2) Check this dirty bit whenever writing (a)synchronously to the cache.
3) Write sessionStorage data to the cache periodically (together with SS's save timer). That way we could asynchronously save modifications to the DOM sessionStorage and there is a good chance that there is really few work to do synchronously on shutdown.
(In reply to Tim Taubert [:ttaubert] from comment #28)
> (In reply to Randell Jesup [:jesup] from comment #26) 
> > Some users (quite a few) virtually never shut down the browser, so the
> > crash/force-quit is a primary restart scenario.  This doesn't mean we must
> > guarantee save in this case, but we need to think about it.
> 
> True. Unfortunately there currently is no way to keep the data after
> crash/force-quit when using the file cache as the storage backend. I think
> that tradeoff is not too bad, though.

Well, for users who typically exit the browser and restart it later, that's fine.  For users who typically browse and leave it up, and use session restore to recover from crashes, this loses - though it only loses this part of the data, so it's not so bad.  I'd love to get a read on what the effect on various sites is, and what the other browsers do.
 
> > On history clearing, we can schedule a session store event and re-save the
> > data, leaving a small hole - probably ok.
> 
> Yeah, we could totally do that. Does history clearing involve current DOM
> sessionStorage data or is that kept? If we keep it we should just re-save
> everything after the history has been cleared. I have no idea what the
> behavior should be.

I'd bet strongly we keep current sessionStorage data, but I haven't checked.

> > As for dirty bits - right now all the non-loaded tabs are repeatedly saved
> > including this data, so a dirty bit is not just an optimization for loaded
> > tabs.  The win from a dirty bit (or hash check) would be quite large.
> 
> Actually, unrestored tabs have no valid sessionStorage data at all. So all
> we do is access the storage (on browser quit or window/tab close) and if
> it's empty we just do nothing. This check isn't really expensive I think, so
> there is an optimization for not-loaded tabs.

No data?  Or do you mean the data would still be in the cache and not in memory?  Right now I can assure you that sessionstore.js includes storage data for not-loaded tabs.
(In reply to Randell Jesup [:jesup] from comment #29)
> Well, for users who typically exit the browser and restart it later, that's
> fine.  For users who typically browse and leave it up, and use session
> restore to recover from crashes, this loses - though it only loses this part
> of the data, so it's not so bad.  I'd love to get a read on what the effect
> on various sites is, and what the other browsers do.

From: http://msdn.microsoft.com/en-us/library/cc197062%28v=vs.85%29.aspx#_session

"Note: Although it is allowed by the HTML5 (Working Draft), Internet Explorer 8 does not resume sessionStorage after browser crash recovery."

Chrome does not seem to restore sessionStorage data at all, after the browser was closed or has crashed (I did some quick manual testing).

> I'd bet strongly we keep current sessionStorage data, but I haven't checked.

According to the spec we must not delete it while there's a script that might access it - that does make sense.

> No data?  Or do you mean the data would still be in the cache and not in
> memory?  Right now I can assure you that sessionstore.js includes storage
> data for not-loaded tabs.

Yes, the data would still be in the cache and not loaded into memory (for not yet restored tabs). So we check each tab's session storage when quitting and don't access the cache when we didn't find any data.
Comment on attachment 577871 [details] [diff] [review]
patch v2 (WIP)

Review of attachment 577871 [details] [diff] [review]:
-----------------------------------------------------------------

looking good. the separation of concerns is great, will make it easy to swap out storage for leveldb or whatever we end up with. some overall comments below.


* none of this new code is specific to SS, which is great. but i'm not sure it belongs here. gavin is making a nice home for modules like this in bug 699575, so could move it there eventually. for now, could rename s/pss//, turn it into jsms and keep it here? or maybe the tab-specific stuff (history, domstorage) could be in a tabhelpers module or something?

* please add an overview comment per file saying what the new code does, and per-method docs as well. finally, these new modules should have unit tests specific to them.

* can we plan on deprecating the sync methods in the cache module, for example, and marking them as such? once session restore becomes async, we could switch it over.

* i see lots of synchronous function calls that could do lots of work and end up blocking the UI: History.getDomains(), session history stuff, domStorage reading and writing. what do you think about using this opportunity to think about using some common async patterns - such as function(options, callback(error, result)) like a lot of node.js code uses, or some variation?
Attachment #577871 - Flags: feedback?(dietrich) → feedback+
Comment on attachment 577871 [details] [diff] [review]
patch v2 (WIP)

Review of attachment 577871 [details] [diff] [review]:
-----------------------------------------------------------------

A general question that I may have missed - if somebody clears their cache at shutdown, will that wipe out sessionStorage? It's probably not a big deal, but would be a change.

Dietrich brings up some good points too.

::: browser/components/sessionstore/src/PersistentSessionStorage.jsm
@@ +52,5 @@
> +
> +#include PssCache.js
> +#include PssState.js
> +#include PssHistory.js
> +#include PssDomStorage.js

I'm inclined to say either those should all be one file or they should just be in this file. For a couple reasons.. the license headers is taking up a lot of space (and will be injected into here) for each of them (it's actually > 1/2 of PssCache.js). There are consts & getters in some of them which actually are going to affect all of them. It also adds a layer of misdirection. I do applaud the cleanliness of it all though, I just think it adds maintenance costs (more than a single file).

@@ +57,5 @@
> +
> +let PersistentSessionStorage = {
> +  _timers: [],
> +
> +  store: function PSS_store(aTab, aTabState, aSync) {

aSync is a bit of an unfortunate name :/ It makes sense, just needs to be read twice since it actually means the opposite of async!

@@ +289,5 @@
> +      timers.splice(timers.indexOf(timer), 1);
> +      aCallback();
> +    }, 0, Ci.nsITimer.TYPE_ONE_SHOT);
> +  }
> +};

store, restore, duplicate are the only "public" methods here, but many more methods are being exported since the whole object is going. It might make more sense to restructure a bit so that you're exporting a set of methods (then the imported object is already setup, don't need to get PersistentSessionStorage off the imported object) or have a public object which just forwards to an internal object.

::: browser/components/sessionstore/src/PssDomStorage.js
@@ +48,5 @@
> +  read: function DomStorage_read(aTab, aFullData) {
> +    let docShell = aTab.linkedBrowser && aTab.linkedBrowser.docShell;
> +
> +    if (!(docShell instanceof Ci.nsIDocShell))
> +      return;

Couldn't you just check if !docShell? I think it will be undefined unless aTab.linkedBrowser.docShell gives us something. Though yours is more explicit

@@ +53,5 @@
> +
> +    let self = this;
> +    let storageData = [];
> +
> +    History.forEachEntry(aTab, function (aUri, aDomain) {

if you make History.getAllEntries, then you could use `entries.forEach(fn, this)` and get rid of self. Alternatively, can't you just use DomStorage.___ directly?

@@ +71,5 @@
> +    });
> +
> +    // did we find any storage data?
> +    if (0 < Object.keys(storageData).length)
> +      return storageData;

Part of me wants to make this API consistent & always return an object. The other part of me says that it's internal so it doesn't matter. I'll leave it up to you, but something to think about. Same with _readSessionStorage.

Further:
`if (Object.keys(storageData).length)` should suffice.

Further Further:
If this is going to be hit a lot, it might make sense to just have a flag variable set when we put something into storageData. It's probably ok though given this isn't being hit 100 times a second.

@@ +90,5 @@
> +        let key = storage.key(i);
> +        data[key] = storage.getItem(key);
> +      }
> +      catch (e) {
> +        // XXXzeniko this currently throws for secured items (cf. bug 442048)

You can get rid of the XXXzeniko part. It's not adding value & Simon hasn't been working on this for a while.

@@ +94,5 @@
> +        // XXXzeniko this currently throws for secured items (cf. bug 442048)
> +      }
> +    }
> +
> +    if (0 < Object.keys(data).length)

`if (Object.keys(data).length)`

::: browser/components/sessionstore/src/PssHistory.js
@@ +35,5 @@
> + *
> + * ***** END LICENSE BLOCK ***** */
> +
> +let History = {
> +  forEachEntry: function History_forEachEntry(aTab, aCallback) {

I'm wondering if it might make more sense to just return the entries (and call it getAllEntries of whatever). Not sure if it's necessary now or ever, but it gives a bit more power to the consumer.

@@ +68,5 @@
> +    History.forEachEntry(aTab, function (aUri, aDomain) {
> +      domains.push(aDomain);
> +    });
> +
> +    return domains;

If you did make it getAllEntries, this could just become
> return History.getAllEntries().map(function(entry) entry.domain);
Attachment #577871 - Flags: feedback?(paul) → feedback+
Depends on: 711465
Put a blocker on this bug to decide what the "on-crash" behavior should be, since this patch would be a significant change.  Recovery-on-crash is a desirable behavior, and this patch as written retains the less-desirable behavior of "restore-after-quit retains session storage".
Attached patch patch v3 (WIP) (obsolete) — Splinter Review
WIP patch. Using ZipReader/Writer as storage, so no data loss on crash (yay) and asynchronous reading and writing.
Attachment #577871 - Attachment is obsolete: true
Attached patch patch v3b (WIP) (obsolete) — Splinter Review
Attachment #596688 - Attachment is obsolete: true
Attached patch patch v3c (WIP) (obsolete) — Splinter Review
Attachment #596964 - Attachment is obsolete: true
Attached patch patch v4 (WIP) (obsolete) — Splinter Review
Attachment #597761 - Attachment is obsolete: true
Attached patch patch v4 (WIP) (obsolete) — Splinter Review
Attachment #629293 - Attachment is obsolete: true
Attached patch patch v4 (WIP) (obsolete) — Splinter Review
Attachment #629618 - Attachment is obsolete: true
Attached patch patch v4 (WIP) (obsolete) — Splinter Review
Attachment #629876 - Attachment is obsolete: true
I'm trying to determine how this will impact my Session Manager add-on beyond the obvious likelihood of sessionStorage not being available.  What happens if the UUID does not exist in the cache if a session is being restored?  I'm assuming it simply just doesn't restore that data (i.e. nothing "bad" happens).

Also would it be possible to add an API to pass (inject) the sessionStorage data directly in, in case Session Manager decided to store said data itself for future use?
(In reply to Michael Kraft [:morac] from comment #43)
> I'm trying to determine how this will impact my Session Manager add-on
> beyond the obvious likelihood of sessionStorage not being available.  What
> happens if the UUID does not exist in the cache if a session is being
> restored?  I'm assuming it simply just doesn't restore that data (i.e.
> nothing "bad" happens).

Yes, it doesn't do anything if it doesn't find file.

> Also would it be possible to add an API to pass (inject) the sessionStorage
> data directly in, in case Session Manager decided to store said data itself
> for future use?

Well, if some add-on decides to store session store data by itself then they could just read and write the data themselves using the same APIs that SessionStorage.jsm uses. The JSM will continue to (re)store the data by its own. Not sure we need an API for this because that would effectively be a replacement for the new sessionStorage mechanism. At best we could provide a switch to turn it off.
Comment on attachment 630132 [details] [diff] [review]
Part 1 - Add MODE_RDWR to FileUtils.jsm

Review of attachment 630132 [details] [diff] [review]:
-----------------------------------------------------------------

I don't really know much about this file, but this seems like a simple enough change.
Attachment #630132 - Flags: review?(ted.mielczarek) → review+
Attached patch patch v4 (WIP)Splinter Review
Attachment #630118 - Attachment is obsolete: true
The current session storage implementation hasn't changed for quite some time and does still behave rather like an add-on. There is some functionality regarding session storages that could better be implemented in the platform.

clearSessionStorages() is needed when session store re-used an opened tab. All session storage objects need to be detached before restoring some state into this tab or we could probably run into weird situations - which we didn't so far, obviously.

cloneSessionStoragesFrom() is very useful when duplicating tabs and in situations where we can't access the storage, yet. It precisely clones all storage objects attached to a docShell and so we don't need to copy all this data to JS anymore just to immediately write it back to the platform.
Attachment #631868 - Flags: review?(khuey)
Attachment #631868 - Attachment description: Bug 669603 - Part 2 - Add sessionStorage helpers to nsIDocShell → Part 2 - Add sessionStorage helpers to nsIDocShell
Comment on attachment 631868 [details] [diff] [review]
Part 2 - Add sessionStorage helpers to nsIDocShell

Review of attachment 631868 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not a peer of this code.
Attachment #631868 - Flags: review?(khuey) → review?(bugs)
Comment on attachment 631868 [details] [diff] [review]
Part 2 - Add sessionStorage helpers to nsIDocShell

Honza should look at this.

>diff --git a/docshell/base/nsDocShell.cpp b/docshell/base/nsDocShell.cpp
>--- a/docshell/base/nsDocShell.cpp
>+++ b/docshell/base/nsDocShell.cpp
>@@ -2572,16 +2572,48 @@ nsDocShell::AddSessionStorage(nsIPrincip
>         else {
>             return topDocShell->AddSessionStorage(aPrincipal, aStorage);
>         }
>     }
> 
>     return NS_OK;
> }
> 
>+static PLDHashOperator
>+CloneSessionStorages(nsCStringHashKey::KeyType aKey, nsIDOMStorage* aStorage,
>+                     void* aUserArg)
>+{
>+    nsCOMPtr<nsPIDOMStorage> pistorage = do_QueryInterface(aStorage);
>+    nsCOMPtr<nsIDOMStorage> clone = pistorage->Clone();
>+
>+    nsInterfaceHashtable<nsCStringHashKey, nsIDOMStorage> *storages =
>+        static_cast<nsInterfaceHashtable<nsCStringHashKey, nsIDOMStorage>*>(aUserArg);
>+    storages->Put(aKey, clone);
>+
>+    return PL_DHASH_NEXT;
>+}
>+
>+nsresult
NS_IMETHODIMP

>+nsDocShell::CloneSessionStoragesFrom(nsIDocShell* aDocShell)
>+{
>+    ClearSessionStorages();
>+
>+    nsDocShell *docShell = static_cast<nsDocShell*>(aDocShell);
>+    docShell->mStorages.EnumerateRead(CloneSessionStorages, &mStorages);
>+
>+    return NS_OK;
>+}
>+
>+nsresult
NS_IMETHODIMP

>+++ b/docshell/base/nsIDocShell.idl
>@@ -430,16 +430,29 @@ interface nsIDocShell : nsISupports
update uuid

I'll re-read this after Honza has given feedback.
Attachment #631868 - Flags: review?(honzab.moz)
Comment on attachment 631868 [details] [diff] [review]
Part 2 - Add sessionStorage helpers to nsIDocShell


>+nsresult
>+nsDocShell::CloneSessionStoragesFrom(nsIDocShell* aDocShell)
>+{
>+    ClearSessionStorages();
>+
>+    nsDocShell *docShell = static_cast<nsDocShell*>(aDocShell);
>+    docShell->mStorages.EnumerateRead(CloneSessionStorages, &mStorages);
Also, technically this is unsafe.
If JS passes some dummy nsIDocShell object, we crash here.
Could you make nsIDocShell interface builtinclass. Then the static_cast should be ok - 
or at least less scary :)

This all needs tests.
Attachment #631868 - Flags: review?(bugs) → review-
Comment on attachment 631868 [details] [diff] [review]
Part 2 - Add sessionStorage helpers to nsIDocShell

If I understand what you are trying to do, then have the clone method opposite:

nsIDocShell.cloneSessionStoragesTo(in nsIDocShell targetDocShell) and copy the code at [1] to move storages to the target doc shell.

[1] http://hg.mozilla.org/mozilla-central/annotate/733994f12c53/embedding/components/windowwatcher/src/nsWindowWatcher.cpp#l931
Attachment #631868 - Flags: review?(honzab.moz) → review-
Addressed all comments.
Attachment #631868 - Attachment is obsolete: true
Attachment #632890 - Flags: review?(honzab.moz)
Attachment #632890 - Flags: review?(bugs)
Attachment #632890 - Flags: review?(bugs) → review+
Attachment #632890 - Flags: review?(honzab.moz) → review+
Needs documentation for the nsIDocShell additions when landed.
Keywords: dev-doc-needed
SessionStore object sometimes stores great loads of outdated data which is unrecoverable by the browser, useless to user, and it makes FF crawl. Here are details (on example of Twitter.com site, which had no "recent browsing history" "Back" button active, yet kept megabytes of dead month-old data in SS.json):

http://forums.mozillazine.org/viewtopic.php?p=12088439#p12088439
Depends on: 771277
As an addition to the issue described on Comment 54 (https://bugzilla.mozilla.org/show_bug.cgi?id=669603#c54), it should be noted that when FireFox 14.01 works in a session for quite a time, some sites like Twitter -- which end-up storing its garbage in SessionStore (still not fixed) -- become completely defunct because this JSON object gets corrupted along the way, right when user works in the browser.

When you try to exit FF in this state, it closes interface, but the process FireFox.exe continues running in the system with its full like 2.5 GB memory footprint, which slightly fluctuated for hours and FF never really quits no matter how long you wait.

And when you kill the process, when you relaunch FF it shows you purely blank session. It can not restore actual session, because even its SessionStore.bak is corrupted. Since FF has only one backup file, the only way to recover user's session is to go to archive and exctract the previous working SessionStore.JS. Obviously, this works only if user has set up backup. Otherwise, user's session and all of its data in it lost *forever*.

I see the following issues with this situation:

1) how come FF allows overwite SessionStore.bak without checking integrity of this object -- effectively killing forever user's session (if he/she is not archiving)? It the object is corrupted, ".bak" file should not be overwritten;

2) why only one "Bak" file for session store is created? Why no "Ba1", "Ba2", et cetera, for older backups?

3) why FF engine fails to deal with its SessionStore object when it is still running? Why it ends up in a endless loop, trying to sort out the corrupted garbage in SessionStore object?

4) Should not be there some supportive mechanism that determines if an attempt to prepare SessionStore object for saving takes long time, then it has to do some *different* treatment to the object, and thus avoid endless hang of FireFox.exe process, and actually save SessionStore object, stripping it out of the corrupted garbage that did not allow normal processing?
@User Dderss: All valid points, but they would deserve their own bugs.

@Tim: I see that it has landed. Can we close this bug?
Flags: needinfo?(ttaubert)
The JSON does not already blow up for me -- the issue was majorly solved, thankfully -- but the sessions are still highly critical and unresolved mess thanks to this archaic, lacking architecture.

For example, take this story: my UPS is off for maintenance, and during this time electricity was unstable once, so that my PC got shut down.

Do you guess what has happened?

JSON object was not saved correctly, and the only (!) one "bak" file happened to be corrupt, too, so my "new and shiny" FF 21 opened me purely BLANK session -- with all of my groups and tabs obviously obliterated.

This session management "system" is a SHAME for whoever is chief software architect is. The only way to deal with this disaster of system is to have JSON object archived constantly by a third-party utilities such as Cobian back up.

It is amazing that even to version 21, Mozilla is still completely unable to solve issue that absolutely devastates whole user experience, whole body of user's work (in cases when the use is job-related).
Backups will be handled as part of bug 558425, hopefully soonish.
There are also plans for additional backups as part of bug 876168.
Let's move conversations on session restore backups to these bugs.
(In reply to David Rajchenbach Teller [:Yoric] from comment #58)
> @Tim: I see that it has landed. Can we close this bug?

I only landed part 1 and 2 of this bug which were prerequisites. nsIDocShell.cloneSessionStoragesTo() introduced by part2 seems to be gone - iirc it was removed by a refactoring.

The DOMSessionStorage situation is still quite bad, also mostly because we're lacking a good storage for it.
Flags: needinfo?(ttaubert)
I'm not currently working on it or have been in the last months.
Assignee: ttaubert → nobody
Status: ASSIGNED → NEW
Since the landings for this bug will be spread apart very far, should a new bug be created for the remaining work? Keeping this bug in progress will likely make tracking in the future harder.
Flags: needinfo?(ttaubert)
Flags: needinfo?(dteller)
That makes sense. I vote yay.
Flags: needinfo?(dteller)
Yes, I think that's a very good idea.
Flags: needinfo?(ttaubert)
Setting target-milestone to Fx16 based on m-c date of 2012-07-05.
Target Milestone: --- → Firefox 16
Blocks: 624409
So what's the status of this bug?
Flags: needinfo?(ttaubert)
It's still an issue that I'm not working on right now.
Flags: needinfo?(ttaubert)
Target Milestone: Firefox 16 ???
Depends on: 934934
Target Milestone: Firefox 16 → ---
Whiteboard: [snappy:p1][leave open] → [triage][snappy:p1][leave open]
No longer blocks: fxdesktopbacklog
Whiteboard: [triage][snappy:p1][leave open] → [snappy:p1][leave open]
i found this bug after hours and hours of searching a reason why my sessionstore was growing exponentially, rendering ff sluggish on every save. 
after removing all that pointless data my sessionstore dropped from 20 to 1mb...that's a lot of wasted resources when saving every 15seconds, even more so if you have your profile on network drive or backup your sessionstore regularly.
is a fix of this ever going to happen?
if cleaning up sessionstore all the time would be too difficult to do, how about just moving this data to a different json file that could be safely deleted from time to time? just a (probably bad) idea.
(In reply to de ff from comment #73)
> is a fix of this ever going to happen?

We indeed landed bug 952998 in Firefox 29 that at least mitigates the problem. You can give Firefox Aurora a try for some days and see if that improves your situation.
Flags: firefox-backlog?
(In reply to Tim Taubert [:ttaubert] from comment #75)
> (In reply to de ff from comment #73)
> > is a fix of this ever going to happen?
> 
> We indeed landed bug 952998 in Firefox 29 that at least mitigates the
> problem. You can give Firefox Aurora a try for some days and see if that
> improves your situation.

Does the fix clean up existing sessionstore.js files or only prevent new "junk" from accumulating in the file? I'm running Firefox 29 and it does not appear to be shrinking the file any (mine is over 80 MB), but it is also not updating changes to the file either.
Hi Christopher,

If you'd like to find out which tabs keep all that data in the store (and selective close them), you can use this "about:sessionstore" addon:

https://addons.mozilla.org/en-US/firefox/addon/about-sessionstore/
(In reply to Christopher D. Coleman from comment #76)
> (In reply to Tim Taubert [:ttaubert] from comment #75)
> > (In reply to de ff from comment #73)
> > > is a fix of this ever going to happen?
> > 
> > We indeed landed bug 952998 in Firefox 29 that at least mitigates the
> > problem. You can give Firefox Aurora a try for some days and see if that
> > improves your situation.
> 
> Does the fix clean up existing sessionstore.js files or only prevent new
> "junk" from accumulating in the file? I'm running Firefox 29 and it does not
> appear to be shrinking the file any (mine is over 80 MB), but it is also not
> updating changes to the file either.

The bug referenced should reduce the pauses to collect the data, not reduce the data saved itself.  See other bugs referenced in the dependency tree.  Not that I'm not waiting myself... (having identified this problem)
(In reply to Christopher D. Coleman from comment #76)
> Does the fix clean up existing sessionstore.js files or only prevent new
> "junk" from accumulating in the file? I'm running Firefox 29 and it does not
> appear to be shrinking the file any (mine is over 80 MB), but it is also not
> updating changes to the file either.

It prevents new stuff from accumulating. If you access/select a restored tab so that it loads we will re-collect information and throw away possibly stored DOMSessionStorage data.

(In reply to Randell Jesup [:jesup] from comment #78)
> The bug referenced should reduce the pauses to collect the data, not reduce
> the data saved itself.

It does both actually. We previously saved DOMSessionStorage data for all history entries and now only do it for the active one.
Flags: firefox-backlog? → firefox-backlog+
Would it be possible to add a button to remove all this old junk at once from all tabs into about:sessionstore? 
Having to click on hundreds of tabs one by one isn't very practical.
We could possibly have this in about:support or a preference.
A discussion about changes/improvements to the Session Restore (sessionstore):
https://groups.google.com/forum/#!topic/mozilla.dev.platform/JHrOP3yMgfg
Is this bug still valid?
Yes this bug is still valid and continues at bug 938464
Except the bug is not caused by Instant Search, I get it for *all* Google Searches, even with Instant disabled.
I am also continuing to see this issue and have for a years. I do a log of google searching simultaneously. It has become such a problem I have had to invest in more system memory.
Blocks: ss-perf
Whiteboard: [snappy:p1][leave open] → [snappy:p1][leave open][fxperf]
Untracking this from fxperf in favour of bug 938464.
Whiteboard: [snappy:p1][leave open][fxperf] → [snappy:p1][leave open]
bug 1362058 should have addressed this. If people are still seeing issues on Firefox 59 (out in a week or two) or above, please file new bugs in Firefox :: Session Restore with detailed specifics.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: