Closed
Bug 1493767
Opened 6 years ago
Closed 6 years ago
Firefox freezes when trying to restore many tabs after a crash
Categories
(Firefox :: Session Restore, defect, P3)
Firefox
Session Restore
Tracking
()
People
(Reporter: ori, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: perf, Whiteboard: [fxperf:p3])
Tested with Nightly on Linux.
I have 1000 tabs open.
When Firefox starts up, the recent performance improvements mean it's responsive very quickly.
However, if Firefox crashes, and the special about:sessionrestore tab appears, restoring the tabs via the button, Firefox becomes unresponsive and uses increasingly more memory, to the point that I close it manually, rather than wait.
The next time I start my browser, it's no longer in "session restore" mode, and automatically loads the tabs, so it's a recoverable problem.
Comment 1•6 years ago
|
||
how many windows, and how many pinned tabs? (content is loaded for visible tab in each window, plus all pinned tabs)
Keywords: perf
Whiteboard: [qf]
Updated•6 years ago
|
Flags: needinfo?(ori)
Would be nice to get a performance profile.
I wonder if bug 1474130 will help here, or is the issue elsewhere.
Updated•6 years ago
|
Whiteboard: [qf] → [qf][fxperf]
Updated•6 years ago
|
Whiteboard: [qf][fxperf] → [qf:p1:f67][fxperf]
Comment 4•6 years ago
|
||
Ori - differential profiles (one with the problem, one without) would help. You can install the Gecko Profiler extension by browsing to perf-html.io
You can start the profiler with ctrl-shift-1. You capture a profile with ctrl-shift-2 (the buffer isn't infinite, so do so soon after what you want to capture). Then save the profile either to a file, or via share (and paste the URL into a bug here). When sharing, you can strip URLs from the profile. Thanks
Flags: needinfo?(ori)
Reporter | ||
Comment 5•6 years ago
|
||
The browser becomes non-responsive while it's loading the tabs, so I have no way to stop the profiling.
I can attempt waiting for it to finish, but I doubt it will end.
Flags: needinfo?(ori)
Ori, did you try increasing profiler's buffer size or reduce interval a bit?
That way the time for profiling can be quite long.
Flags: needinfo?(mstange)
And Markus may have other suggestions for profiling.
Reporter | ||
Comment 8•6 years ago
|
||
In case it wasn't clear, the browser hangs whether or not the profiling is active.
Comment 9•6 years ago
|
||
If "Close it manually" means you hit the close widget on the window (or ctrl-shift-q), and it exits (OS doesn't have to be used to kill it), then setting the env var MOZ_PROFILER_SHUTDOWN to a filename to save the profile in may work. If it locks and won't respond to the close box or ctrl-shift-q, then you may be stuck.
Comment 10•6 years ago
|
||
We don't need Ori to jump through hoops to obtain a profile. We can profile our own builds with fewer tabs that cause shorter hangs in this scenario.
Here's a profile where I terminated a content process, with a session of ~40 tabs, 4 of them pinned: https://perfht.ml/2NdlHcw
This triggered a total of about 1 second of jank on the parent process main thread.
Flags: needinfo?(mstange)
Comment 11•6 years ago
|
||
Oh, but those tabs were all loaded. If I restart the browser and only load one tab (plus the pinned tabs that are loaded automatically on restore), and terminate that process, I get "only" 150ms of jank: https://perfht.ml/2NclGWp
Comment 12•6 years ago
|
||
From spending a few minutes staring at these profiles, I don't see an obvious fix. Time seems to be spread quite thinly (though it's also confusing to me that the profiles don't seem to be using e10s as the session store content restores are all in the same process as the parent's stuff). Most of the time seems to be spent creating content viewers and loading/running JS associated with that. A good 150ms of the 1s hang is spent dealing with the SingleFile add-on (not sure what that is or if the reporter has add-ons, given the profiles aren't from the reporter). Otherwise, I'd be happy to believe that we avoid some work when restoring on startup that we don't when restoring via about:sessionrestore, but it's really not obvious to me exactly why that is or how to fix it. Maybe :mikedeboer has a better idea of how this could be addressed.
Given this primarily affects people with a lot of tabs, and that there's no obvious fix without a bunch of digging, I'd be inclined to mark as fxperf:p2 or p3 - but this is a qf:p1, and I don't really understand if that means it needs to be a fxperf:p1 as well or not. :mconley, wanna weigh in?
Flags: needinfo?(mdeboer)
Flags: needinfo?(mconley)
Whiteboard: [qf:p1:f67][fxperf] → [qf:p1:f67][fxperf:p3]
Comment 13•6 years ago
|
||
This seems strange - restoring the session should be pretty cheap because we should be creating lazy background tabs that restore on demand (like we do doing start-up).
mstange, when you reproduce this, do you see things like:
"[bug 1345098] Lazy browser prematurely inserted via '${name}' property access" in the Browser Console?
Or, if you don't have time to test, can you quickly throw together some STR here? Anytime I crash the browser with some set of tabs, and then restart it, I get an empty session (though I can restore some of the session with a button in the App Menu). This, despite the fact that my testing profile is configured to restore the previous session automatically - though that might be a separate bug. :/
Flags: needinfo?(mconley) → needinfo?(mstange)
Comment 14•6 years ago
|
||
Oops, I think I did something completely different to what comment 0 describes. My STR was:
1. Load many tabs
2. Terminate a content process.
3. Experience jank in the parent process.
No restarting was involved. So it was a completely different bug.
I'll try to reproduce the restarting behavior now.
Flags: needinfo?(mstange)
Comment 15•6 years ago
|
||
I'm having no luck getting the crash reporter to show on demand. Even executing this on the error console doesn't bring up the crash reporter on an official Nightly on macOS for me:
Cu.import("resource://gre/modules/ctypes.jsm");var badnumber = new ctypes.intptr_t(8);var badptr = ctypes.cast(badnumber, ctypes.PointerType(ctypes.int32_t));var crash = badptr.contents;
Comment 16•6 years ago
|
||
Given that the issue here is with what happens *after* the crash, when we restart the browser, presumably we can test this case without actually crashing the browser, by just setting the right prefs / json files to make session store think we're restoring from a crash, and then running a profile? Mike, how would we go about doing that?
Comment 17•6 years ago
|
||
Do you use any (session management) add-ons? Because this kind of smells like bug 1422436 and bug 1422436 comment 16
Flags: needinfo?(ori)
Reporter | ||
Comment 18•6 years ago
|
||
(In reply to :Gijs (he/him) from comment #17)
> Do you use any (session management) add-ons?
I don't use Tab Session Manager. I have OneTab, but I was under impression it only does anything when I ask it to.
I can disable it and try again, but as you and others mentioned, it's hard to make Firefox think it's recovering from a crash. Is there an easy way to achieve that? (other than searching for e.g. pages that crash WebRender)
Flags: needinfo?(ori)
Reporter | ||
Comment 19•6 years ago
|
||
OK, OneTab is the culprit! I tested several times with and without it.
I will pass this on to the developers. This bug can be closed, as far as I'm concerned.
Comment 20•6 years ago
|
||
Thanks Ori!
I'll close this, but I do wonder if there's something we should do to make it easier for tab management add-ons to not be completely swamped by tab creation events if we create tabs after startup from this crash recovery page. Obviously, it's not a super common thing (hopefully!) so it's easy for add-on devs to miss, but unfortunately the effects are also really bad, and of course that's really not what you want when recovering after a crash - the resulting experience shouldn't then involve serious hangs...
Maybe :aswan knows if there's anything we can do to make this story better from our side. I guess we could at a minimum add a warning to our docs for the webextensions `tabs` API? Not sure if there's actual code changes we could make.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(mdeboer) → needinfo?(aswan)
Resolution: --- → WORKSFORME
Comment 21•6 years ago
|
||
This has come up before, specifically bug 1433879. I don't know of an easy solution here, a lot of stuff has been dumped into Tab objects over the years (https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/tabs/Tab) Since all those properties may then be accessed synchronously in the extension process, I don't think we don't have any realistic alternative to fully constructing them and sending them over to the extension process when processing tab events. If we were starting from scratch, I think we would design this API quite differently.
ccing Kris, perhaps he has other ideas.
Flags: needinfo?(aswan)
Updated•3 years ago
|
Performance Impact: --- → P1
Whiteboard: [qf:p1:f67][fxperf:p3] → [fxperf:p3]
You need to log in
before you can comment on or make changes to this bug.
Description
•