User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
I've been getting a number of bug reports against the Session Manager extension detailing issues loading sessions with large numbers of windows and tabs. The following issues have been documented:
2. Not all tabs/windows will load if there are a large number of windows and tabs in a session. This seems to happen haphazardly. May be related to #1 above or the memory issue below.
Something else I've noticed is that if I load a session with a fair number of windows and tabs it may load relatively quickly and only use about 100 MB of memory. If I load it again, it will load slower and Firefox will use more memory. Load it again and it will use even more memory. I managed to get a nearly clean copy of Firefox to use around 1 GB of memory just by repeating this process (even after closing all open windows/tabs). This seems to indicate a memory leak, though it is probably in the browser code and not the SessionStore code.
See also the following:
I do realize that the processing power and memory size will limit Firefox's ability to restore tabs and windows, but I ran tests on a fairly powerful machine and still had issues so it appears the functionality could be optimized.
Steps to Reproduce:
1. Open tons of windows and tabs and then restart Firefox
Actual Results:
Takes a very long time to load, even on a fairly fast machine with lots of memory. Sometimes not all tabs/windows will load.
Expected Results:
All tabs/windows should be restored in a reasonable amount of time.
I ran my tests on 2 different machines:
Pentium 4 3 GHz - 1 GB RAM
Pentium M 1.86 GHz - 2 GB RAM
Test results were similar.
Please file one bug per individual issue and make them block this meta-bug.
What will also help is if you could attach a sessionstore.js which causes trouble as a reference.
Created attachment 279193 [details]
Zipped sessionstore.js with 24 windows and 228 tabs.
Well I had a very good one that someone sent me, but it contained personal data so I ended up erasing it. I put together something on my own that's not as good, but it does have 24 windows and 228 tabs. I'm trying to get someone to send me a better example.
When I tested this in Firefox 184.108.40.206 with no plugins (searching for plugins disabled) or addons installed, it loads (eventually).
After it loaded, I closed all the windows and tabs and cleared all the private data, and Firefox was using 75 MB of memory (63 MB VM). That compares with a starting browser state of 24 MB (13 MB VM) and 222 MB (211 MB VM) with all the windows/tabs open. As I mentioned, I think this is a memory leak in Firefox itself and not the Session Store API, but there are a large number of memory leak bugs already open.
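The figures in this comment can be turned into a rough estimate of how much memory was never released. This is only a back-of-the-envelope sketch using the three numbers quoted above; the function name is made up for illustration.

```python
def retained_memory(start_mb, peak_mb, after_close_mb):
    """Return (MB still held after closing, fraction of the growth not released)."""
    growth = peak_mb - start_mb          # memory added by restoring the session
    retained = after_close_mb - start_mb  # memory still held after closing everything
    return retained, retained / growth


retained, fraction = retained_memory(start_mb=24, peak_mb=222, after_close_mb=75)
print(f"{retained} MB retained, {fraction:.0%} of the growth")
# prints "51 MB retained, 26% of the growth"
```

In other words, roughly a quarter of the memory taken by the restored session was still in use after everything was closed.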
Created attachment 279194 [details]
Starting memory use
Created attachment 279195 [details]
Memory use after 24 window and 228 tabs restored
Created attachment 279196 [details]
Memory use after all but 1 tab/window closed and browser data cleared
Created attachment 279228 [details]
A good example of a sessionstore.js file that causes severe performance issues
Someone sent me this stored session. It only has 10 windows with a total of 96 tabs, but it nearly brings Firefox to its knees because it is about 2.1 MB in size (mostly seems to be cookies).
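To check where the size of such a file comes from, something like the following could be used. It is a rough sketch that assumes the session has already been parsed into a dictionary with a `windows` list whose items carry `tabs` and `cookies` lists; that layout, and the function name, are assumptions for illustration rather than a documented format.

```python
import json
from typing import Dict


def summarize_session(session: dict) -> Dict[str, int]:
    """Count windows and tabs and estimate how much of the data is cookies."""
    windows = session.get("windows", [])
    return {
        "windows": len(windows),
        "tabs": sum(len(w.get("tabs", [])) for w in windows),
        # Serialized size of the cookie lists only (windows without cookies add 0).
        "cookie_bytes": sum(
            len(json.dumps(w["cookies"])) for w in windows if w.get("cookies")
        ),
        # Serialized size of the whole session, for comparison.
        "total_bytes": len(json.dumps(session)),
    }


example = {
    "windows": [
        {"tabs": [{}, {}], "cookies": [{"host": "example.com", "value": "x" * 100}]},
        {"tabs": [{}]},
    ]
}
print(summarize_session(example))
```

Comparing `cookie_bytes` with `total_bytes` would show whether cookies really account for most of the file.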
lots of js tabs can be problematic IMO
Might networking load cause problems, both external (as in causing a server to throttle you) and internal (as in Windows or Gecko not being efficient enough for the load)? I have an example that needs retesting. On big session restarts on a beefy laptop (wireless to cable modem) I generally have no problem. But when it was logged in through a VPN, forcing all traffic through the VPN box and a different DNS server (potentially slowing the pipe, as it were), I had trouble with restore. Many of my tabs failed to load, perhaps because of DNS timeouts; these tabs gave a message to the effect of site or server not found.
another thought - might it be possible, in extreme cases, to prompt the user for how much to restore in order to improve startup time? For example ignore tab history, don't load images (ala the netscape days), etc
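As an illustration of the "ignore tab history" option, a reduced restore could keep only the page each tab was showing. The sketch below assumes a parsed session where each tab has an `entries` list and a 1-based `index` pointing at the current entry; both the layout and the function name are assumptions.

```python
import copy


def trim_tab_history(session: dict) -> dict:
    """Return a copy of the session in which every tab keeps only its current entry."""
    trimmed = copy.deepcopy(session)  # leave the caller's data untouched
    for window in trimmed.get("windows", []):
        for tab in window.get("tabs", []):
            entries = tab.get("entries", [])
            if not entries:
                continue
            # Assumed 1-based index; clamp it in case the stored value is off.
            index = min(max(tab.get("index", len(entries)), 1), len(entries))
            tab["entries"] = [entries[index - 1]]
            tab["index"] = 1
    return trimmed
```

The original session is left intact, so the full history could still be restored later if the user asks for it.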
*** Bug 471089 has been marked as a duplicate of this bug. ***
*** Bug 498179 has been marked as a duplicate of this bug. ***
Would throttling the number of windows/tabs that open per second work here? Currently the restoreWindow function just opens the windows and restores things as fast as it possibly can. If some kind of artificial, user-settable delay were added, I think that would satisfy cases where people either don't have powerful enough machines or fast enough connections to load all windows and tabs nearly simultaneously.
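A minimal sketch of the throttling idea, not the actual restoreWindow code: tabs are opened one at a time with a pause between them. The names `throttled_restore`, `open_tab` and `delay_seconds` are hypothetical, and the delay stands in for the user-settable preference.

```python
import time
from typing import Callable, Iterable


def throttled_restore(
    tabs: Iterable[str],
    open_tab: Callable[[str], None],
    delay_seconds: float = 0.5,  # hypothetical user-settable preference
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Open tabs one at a time, pausing between consecutive opens.

    Returns the number of tabs opened.
    """
    opened = 0
    for tab in tabs:
        if opened > 0:
            sleep(delay_seconds)  # pause before every tab except the first
        open_tab(tab)
        opened += 1
    return opened


# Example: "open" three tabs by printing them, 10 ms apart.
throttled_restore(["tab-1", "tab-2", "tab-3"], print, delay_seconds=0.01)
```

Passing `sleep` as a parameter is only there so the pacing can be checked without actually waiting.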
Another possibility would be to open all the windows and tabs just as it's done now, but instead of actually loading the tabs, simply store the session state in the tab and have a button which when clicked restores the tab's state. Basically it would behave similar to the reload button page that shows up when a tab fails to load, except in this case when the button is pressed it would restore the session data into the tab.
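The deferred approach could look roughly like this: each tab only stores its saved state, and nothing is loaded until the user asks for it. `PendingTab` and its fields are hypothetical names, not part of the SessionStore API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PendingTab:
    """A tab that holds its saved session state but has not loaded anything yet."""

    url: str
    state: Dict[str, object]
    loaded: bool = False

    def restore(self, load: Callable[[str, Dict[str, object]], None]) -> bool:
        """Load the saved state on demand; return False if it was already loaded."""
        if self.loaded:
            return False
        load(self.url, self.state)
        self.loaded = True
        return True


# All tabs are created up front, but only the one the user clicks gets loaded.
tabs = [PendingTab(f"http://example.com/{i}", {"scroll": 0}) for i in range(3)]
tabs[1].restore(lambda url, state: print("loading", url))
```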
It is questionable whether Bug #498179 should be marked as a duplicate of this bug due to (a) time difference (> 1 year); (b) Firefox/Mozilla version changes (2.0 --> 3.0); and (c) different operating systems (Windows vs. Linux).
It is also true that if one uses "old" or "variable" URLs (news pages, etc) in the problematic sessions, it may be difficult or impossible to reproduce the problematic sessions (the load a URL places on the network and/or CPU can vary widely over time).
With all of the above as caveats, I generally agree with Michael on possible solutions. These may include:
(1) Mostly static tab-state gifs (Unaccessed / Partially loaded / Complete) which indicate tab load state. These could be nothing more than a colored square which changes color/brightness as the estimated page load completes. Minimize the X communications and CPU load as much as possible during session restarts -- the network & CPU can be maxed for 5-15 minutes for large sessions -- no need to make additional work (and increase global warming) just for "eye candy".
(2) There should be user-constrainable (prefs.js) settings on *both* max-CPU use and max-network use. I believe that Opera now has user-determined constraints on network use, which is an advantage in my book over Firefox. When I restore a complex session, priority should be last-in, first-out (pages most recently accessed should appear first), except that an active tab (and secondarily the window it is in) should move to the "head" of the pending network & CPU queues. Being able to simply constrain session restores to 1 active network connection (DNS lookups and page I/Os) would go a long way towards fixing this problem until a more permanent solution could be developed (provided that, in addition, the spinners are "static" on all non-active tabs).
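The ordering described in (2) could be expressed as a sort key: the active tab first, then the rest of its window, then everything else, each group most recently accessed first. This is only a sketch; `SavedTab` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SavedTab:
    url: str
    window: int
    last_accessed: float  # larger means more recently accessed
    active: bool = False


def restore_order(tabs: List[SavedTab]) -> List[SavedTab]:
    """Order tabs for restore: active tab, then its window, then the rest (LIFO)."""
    active_windows = {t.window for t in tabs if t.active}

    def priority(t: SavedTab):
        # False sorts before True, so active tabs and active windows come first;
        # the negated timestamp puts the most recently accessed tab first.
        return (not t.active, t.window not in active_windows, -t.last_accessed)

    return sorted(tabs, key=priority)


tabs = [
    SavedTab("a", window=1, last_accessed=10),
    SavedTab("b", window=1, last_accessed=50, active=True),
    SavedTab("c", window=2, last_accessed=90),
    SavedTab("d", window=1, last_accessed=20),
    SavedTab("e", window=2, last_accessed=30),
]
print([t.url for t in restore_order(tabs)])  # prints ['b', 'd', 'a', 'c', 'e']
```

Feeding this order into a queue that allows only one active connection would give the constrained restore suggested above.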
It should be noted (relating to #'s 2 & 3) that Opera seems to be way ahead of Firefox in these areas in that it seems to now have the concept of "effective" browsing for slow connections (with user control over how this works). It should be fairly obvious that "slow" connections can also equate to "congested" connections (3G & 4G phone networks come to mind). If one is browsing google result pages one is largely going to know whether the page is of interest by looking at the first paragraph or two (the same is true for academic abstracts) -- the goal for both active browsing as well as session restores should be to get as much critical information on the screen as fast as possible and leave the bells and whistles as background (as time/bandwidth permits) pursuits.
It would be nice to get some movement on this bug, though if throttling isn't going to be used, then to really fix it properly would probably require threading.
Can someone please use something like Shark or DTrace or CodeAnalyst to figure out where time is *actually* being spent loading the testcase? That's the first step to making progress here.
Unfortunately I don't think any of those will run on my machine which is a Windows machine with an Intel processor.
CodeAnalyst should still work.
It may be worth reading Chrome Issues #32061, #32165 and #30933, which deal with the same problem -- namely that session restores pay no attention to the resources they use and the load they place on the system.
My short way to fix this:
2) Change the "spinner"/"throbber" from an active icon to a staged set of static gifs (display a different image at different stages of the load process).
3) Set a maximum active thread limit which is constrained by the network bandwidth (e.g. dial-up << DSL << cable << FIOS+). One could expand this to include a max-CPU limit (Firefox load <= X% (where X=60-80?) of available CPU).
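A rough sketch of item 3, covering only the network side (the CPU limit is not modelled): the number of simultaneous loads is picked from the connection type and enforced with a bounded thread pool. The table values and all names here are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# Hypothetical limits; real values would have to come from measurement or prefs.
MAX_LOADS_BY_CONNECTION = {"dial-up": 1, "dsl": 2, "cable": 4, "fios": 8}


def restore_with_limit(
    urls: Sequence[str],
    load: Callable[[str], str],
    connection: str,
) -> List[str]:
    """Load every URL, never running more loads at once than the connection allows."""
    limit = MAX_LOADS_BY_CONNECTION.get(connection, 1)  # unknown type: be conservative
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # map() keeps the results in the same order as the input URLs.
        return list(pool.map(load, urls))


print(restore_with_limit(["a", "b", "c"], str.upper, "dial-up"))  # prints ['A', 'B', 'C']
```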
One problem with attempting to reproduce this using "non-local" pages is that one has no control over the connection timeout settings on the web servers. Web servers will time out (and effectively hang up) web connections which are open but not responding (as will be the case with a busy session restore). One can imagine that busy web servers set these connection timeouts to shorter periods than less loaded web servers. But the server managers are free to vary these timeouts on an hourly, daily or monthly basis, so precise reproduction of this problem is very unlikely. Using local pages is unlikely to provide the same symptoms, as the DNS lookups and page downloads are likely to be very fast, and one will saturate the network link for a brief period followed by saturation of the CPU while the pages are redrawn.
There is one interesting difference between Chrome and Firefox in this area -- a Chrome restore is somewhat less likely to saturate the network link, I suspect because the multi-process startup/switching tends to generate some periods where fewer network requests are pending.
1. It is also worth noting that people would care a *lot* less how long a session restore took if there were a concept of a "priority" (active) window/tab which pre-empted all the other windows/tabs/threads. I don't care if a session restore takes an hour as long as the new window I just created behaves like it's in a brand new browser session.
Created attachment 543578 [details]
JProf of startup with the last testcase
jprof of startup with profile from attachment 279228 [details]; trunk build pulled today.
Not sure I'd categorize it as "severe performance problems" with current builds
Since the most recent comment, we've moved to a model where tabs are not actually loaded until accessed. Michael, can you test with a Nightly build?
Taras: I'm removing the "P1" here because the data from Test Pilot shows that test-cases like this are *far* outside the norm.
(In reply to Dietrich Ayala (:dietrich) from comment #19)
> Since the most recent comment, we've moved to a model where tabs are not
> actually loaded until accessed. Michael, can you test with a Nightly build?
> Taras: I'm removing the "P1" here because the data from Test Pilot shows
> that test-cases like this are *far* outside the norm.
Can we get telemetry probes to confirm this? Test pilot is far from representative of overall population, telemetry is slightly better and will get better once we do it by default on nightlies.
(In reply to Taras Glek (:taras) from comment #20)
> (In reply to Dietrich Ayala (:dietrich) from comment #19)
> > Since the most recent comment, we've moved to a model where tabs are not
> > actually loaded until accessed. Michael, can you test with a Nightly build?
> > Taras: I'm removing the "P1" here because the data from Test Pilot shows
> > that test-cases like this are *far* outside the norm.
> Can we get telemetry probes to confirm this? Test pilot is far from
> representative of overall population, telemetry is slightly better and will
> get better once we do it by default on nightlies.
Yea, let's figure out what exactly we need and make that happen in bug 671041 (if it makes sense), though I'm wont to believe the Test Pilot numbers Dietrich is talking about.
Dietrich I'd like to keep this as P2 until we have telemetry data that can show otherwise. Marking as P2 because it shouldn't block other work, but it would be nice to have this.
This is a meta bug that doesn't go into the backlog.
Taras, David, what do you both think about closing this bug? We restore on demand by default for a while now and since this bug was filed sessionstore also started to load at most three tabs concurrently when restoring a multi-window session. I'm not sure what telemetry measurements exactly we were talking about here but I don't know of any cases where single tabs don't load or show timeout dialogs since we have cascaded restore.
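For readers unfamiliar with the cascaded restore mentioned here, the idea of loading at most three tabs at a time can be sketched as follows. This is an illustration only, not the sessionstore implementation; `cascaded_restore` and `load` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def cascaded_restore(
    urls: Iterable[str],
    load: Callable[[str], Awaitable[None]],
    max_concurrent: int = 3,
) -> None:
    """Restore every tab while keeping at most `max_concurrent` loads in flight."""
    slots = asyncio.Semaphore(max_concurrent)

    async def restore_one(url: str) -> None:
        async with slots:  # wait here until one of the slots is free
            await load(url)

    # All tabs are queued at once; the semaphore decides when each may start.
    await asyncio.gather(*(restore_one(url) for url in urls))
```

It would be driven with something like `asyncio.run(cascaded_restore(urls, load))`, where `load` is whatever coroutine actually fetches a tab. As soon as one load finishes, the next waiting tab takes its slot.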
(In reply to Tim Taubert [:ttaubert] from comment #24)
> Taras, David, what do you both think about closing this bug? We restore on
> demand by default for a while now and since this bug was filed sessionstore
> also started to load at most three tabs concurrently when restoring a
> multi-window session. I'm not sure what telemetry measurements exactly we
> were talking about here but I don't know of any cases where single tabs
> don't load or show timeout dialogs since we have cascaded restore.
This is more of a Vladan question.
glandium used to keep very large sessions until recently and he says that he has seen more than 3 or 4 tabs being loaded simultaneously, as well as some tabs not loading at all (with "server not found" errors)
I think this bug is still valid. Can we try reproducing this again?
Ok, let's put this into the backlog and find some time to investigate whether this is still valid and reproducible.