Session store corrupted after a shutdown hang crash
Categories
(Firefox :: Session Restore, defect, P1)
Tracking
()
People
(Reporter: glandium, Unassigned)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression)
I don't have large amounts of details yet, and I'll try to get more tomorrow, but with Firefox 67, I get shutdown hang crashes, for some reason, and after that, restarting Firefox leaves me with only half of my tabs, with a session store jsonlz4 file that is 1.8MB instead of >8MB.
Comment 1•5 years ago
|
||
That's bad and worst thing is that I've heard this before - but nobody has been able to give me an STR... that would be the absolute most useful thing here - I need to figure out what's stopping the session data collection. JS errors (like you gave me in bug 1553413 - thank you!!), anything in the console, etc.
You can get more console log spew by setting browser.sessionstore.debug
to true
.
When you can get back to me with more info, I'll bump this to a P1 of course, because then I can take it on.
Reporter | ||
Comment 2•5 years ago
|
||
So far, I've had it happen 3 times, but now that I think I found how I got shutdown hangs (kind of self inflicted from a custom module that apparently creates problems), I'm failing to reproduce the sessionstore corruption :(
You can get more console log spew by setting
browser.sessionstore.debug
totrue
.
Did this, as well as set devtools.console.stdout.chrome
to true
Hopefully, if that happens again, even during shutdown, I should have some (persisted) logs (stdout is supposed to be available in the xsession logs).
Reporter | ||
Comment 3•5 years ago
|
||
(stdout is supposed to be available in the xsession logs).
Except it seems it's not :( stderr is.
Reporter | ||
Comment 4•5 years ago
|
||
I should note that bug 1551366 may of may not be related, but happens on a different OS in different circumstances (but I do have observed shutdown crashes before upgrade on the machine that got bug 1551366)
Reporter | ||
Comment 5•5 years ago
|
||
So it happened again to me, not even involving the self inflicted problem from the custom module, but I got absolutely no log out despite all the settings I switched. Because of comment 3, essentially :(
Reporter | ||
Comment 6•5 years ago
|
||
Relatedly, I saved my entire profile before quitting, in case this would happen, because I didn't want to lose my session, then quit firefox, which triggered this bug, restarted, which left me with less tabs (like, this time, I got 15 tabs back... from > 3000), then quit firefox again, and restored my profile. On the next startup, the restored session was definitely not what I had when I was saving my profile, but something much older, like from > 12 hours ago. The history, however, was fresh.
I subsequently tried to reproduce this "wrong restore" issue, by opening new tabs, saving the profile again, killing firefox, opening firefox, quit firefox, restore profile, restart firefox. And that didn't reproduce it. So I'm starting to wonder if that is a separate symptom of this bug.
I'll also note that in all these extra attempts, I've been running firefox from a terminal, and still wasn't able to see anything related to sessionstore show up on stdout. In fact, it seems devtools.console.stdout.chrome
only redirects console.log
from chrome to stdout, and doesn't cover anything that would use the xpcom interfaces to log.
Comment 7•5 years ago
|
||
So perhaps it'd be useful to share that profile with me at this point - if you still have it? I can only promise that I won't look at the data it contains, but merely at code as it steps through, but I'm not sure if that's enough assurance.
(BTW, from the story you're telling here, it seems more likely that some async process is not being resolved, halting the restore. Not a JS error or another thrown exception.)
Reporter | ||
Comment 8•5 years ago
|
||
Ok, so I had it happen again, and I now have:
- a profile backup before quitting
- a profile backup after quitting
- a profile backup after restarting
(and restoring the first backup had the same effect of taking me back to a tab list of a few hours ago)
I'd rather not share any of those, but I can certainly look at the differences. What specific files should I focus on?
Reporter | ||
Comment 9•5 years ago
|
||
I should note that these two last occurrences were a little different in that when I started Firefox again, I got a about:sessionrestore page (because of crash), which I didn't originally, and that restoring from there gave me a 15-tabs session, instead of > 3000.
Reporter | ||
Comment 10•5 years ago
|
||
The good news is that I can reproduce reliably starting from the second backup, so eventually it should be possible to find out what's wrong about it.
Reporter | ||
Comment 11•5 years ago
|
||
Differing files between the first two backups:
SiteSecurityServiceState.txt | 18 +++++-----
cookies.sqlite |binary
cookies.sqlite-wal |binary
datareporting/aborted-session-ping | 1
datareporting/archived/2019-05/1559121783202.189bcd8e-9510-4152-b682-b8910464074c.main.jsonlz4 |binary
favicons.sqlite |binary
favicons.sqlite-wal |binary
places.sqlite |binary
places.sqlite-wal |binary
prefs.js | 2 -
sessionCheckpoints.json | 2 -
sessionstore-backups/recovery.baklz4 |binary
sessionstore-backups/recovery.jsonlz4 |binary
storage/default/https+++twitter.com/.metadata-v2 |binary
storage/default/https+++www.youtube.com/.metadata-v2 |binary
storage/default/moz-extension+++2f1ef354-2018-4f19-a247-46a7afa99b3c^userContextId=4294967295/.metadata-v2 |binary
storage/default/moz-extension+++c4ffd6fc-8c3f-41c9-a138-1f01d51c04cb^userContextId=4294967295/.metadata-v2 |binary
storage/default/moz-extension+++d67428eb-c0e7-4ce9-90be-d760a6448268^userContextId=4294967295/.metadata-v2 |binary
storage/default/moz-extension+++e0daebab-e175-4d3f-acc4-53b69cd06b71^userContextId=4294967295/.metadata-v2 |binary
storage/permanent/chrome/idb/1657114595AmcateirvtiSty.sqlite |binary
webappsstore.sqlite |binary
webappsstore.sqlite-wal |binary
xulstore.json | 2 -
Missing in the list is the lock
file, because it's not a real file, but it's not there in the second backup, so I don't know why it starts with about:sessionrestore
.
Reporter | ||
Comment 12•5 years ago
|
||
The second backup (so, after quitting), has this in sessionCheckpoints.json
:
{"profile-after-change":true,"final-ui-startup":true,"sessionstore-windows-restored":true,"quit-application-granted":true,"quit-application":true,"profile-change-net-teardown":true,"profile-change-teardown":true,"profile-before-change":true}
Reporter | ||
Comment 13•5 years ago
|
||
Should I maybe try to instrument SessionStoreInternal._needsRestorePage
to figure out why it starts with about:sessionrestore
?
Reporter | ||
Comment 14•5 years ago
|
||
So it's because there's no sessionstore-final-state-write-complete
in sessionCheckpoints.json
that it starts with about:sessionrestore
, but it's irrelevant, really, because the session is already busted when reaching SessionStoreInternal._needsRestorePage
.
As for the contents of the about:sessionrestore
, they're read from sessionstore-backups/recovery.jsonlz4
(because sessionstore.jsonlz4
doesn't exist), which is only 323k in that second backup (after quitting), while it was 8.8M in the first backup (and recovery.baklz4
is that size in the second backup, since that's a copy of that file).
sessionstore-backups/recovery.jsonlz4
is well formed, which is why it's loaded successfully, but only contains data for 72 tabs spread over 4 windows, while sessionstore-backups/recovery.baklz4
has 3006 over 5 windows. So it's a case of something bad happening during shutdown, rather than at startup.
Reporter | ||
Comment 15•5 years ago
|
||
That's also why starting from the first backup gives me a session that is very old: sessionstore-backups/recovery.jsonlz4
in that profile is 8 hours older than the most recent files in the profile.
Reporter | ||
Comment 16•5 years ago
|
||
If I look at my currently running profile (a continuation of that first backup), which I restarted late yesterday, and sessionstore-backups/recovery.jsonlz4
is fresh (and regularly updated).
This suggests that at some point, for some reason the session store stops being updated. Then, whether that's related or not is not clear, but when shutting down, the recovery json is saved incomplete.
Some of this could be related to the other bugs I filed about error messages I was seeing in the browser console (bug 1553412, bug 1553413).
Reporter | ||
Comment 17•5 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #16)
Some of this could be related to the other bugs I filed about error messages I was seeing in the browser console (bug 1553412, bug 1553413).
I've confirmed half of it: I observed both set of errors in the browser console starting a few minutes ago, and the recovery file stopped being updated. I then proceeded to open some new tabs and manually crash Firefox (kill -11) to see how it would go on a restart, and it went on restoring the state from before I opened the new tabs (well, from when the errors started happening).
I don't know whether I would have lost part of the session had I quit the browser rather than kill it. I'm waiting for the next time I see the errors in the browser console. An interesting observation from that last experience is that it seemed to have started occurring after I opened a new tab via a third party application executing firefox url
, but after restarting, doing that didn't trigger anything.
Reporter | ||
Comment 18•5 years ago
|
||
Okay, so I had this happen again. That is, I observed the messages from bug 1553412 and bug 1553413 in the browser console, and double checked that recover.jsonlz4
had stopped being updated (which it had). This time, I quit Firefox normally (Ctrl+Q), after which recovery.jsonlz4
was truncated again.
I restored recovery.jsonlz4
from recovery.baklz4
, then restarted Firefox, which then presented me with about:sessionrestore
again. I didn't check sessionCheckpoints.json
this time, but I assume sessionstore-final-state-write-complete
was missing.
While the fix in bug 1553413 would certainly help my case, it feels something should be done about the fact that apparently, something bad happening shouldn't leave the session store in a bad state. For one, recovery.jsonlz4
should still update during the Firefox session, and it also shouldn't be truncated when exiting.
Reporter | ||
Comment 19•5 years ago
|
||
New data. I was able to reproduce with the latest nightly, by doing the following:
- patched
modules/sessionstore/PrivacyFilter.jsm
in omni.ja with:
@@ -9,6 +9,8 @@
ChromeUtils.defineModuleGetter(this, "PrivacyLevel",
"resource://gre/modules/sessionstore/PrivacyLevel.jsm");
+const {Services} = ChromeUtils.import("resource://gre/modules/Services.jsm");
+
/**
* A module that provides methods to filter various kinds of data collected
* from a tab by the current privacy level as set by the user.
@@ -43,6 +45,10 @@
* @return object
*/
filterFormData(data) {
+ if (Services.prefs.getBoolPref("browser.foo", false)) {
+ throw new Error("foo");
+ }
+
// If the given form data object has an associated URL that we are not
// allowed to store data for, bail out. We explicitly discard data for any
// children as well even if storing data for those frames would be allowed.
- started this modified Firefox with my profile
- Open some tabs, and witness that
recovery.jsonlz4
is updated - Go to
about:config
, create thebrowser.foo
boolean pref and set it to true - Open Browser console
- Open some more tabs
- Wait for "Error: foo" to show up in the Browser console
- See that
recovery.jsonlz4
hasn't changed. - Quit Firefox
- See that
recovery.jsonlz4
is now truncated.
Reporter | ||
Comment 20•5 years ago
|
||
Everything up to the last step is reproducible with a fresh profile. Only the truncation doesn't happen, and might be related to something in my profile. I'll try to figure that one out.
Reporter | ||
Comment 21•5 years ago
|
||
Actually, correction. I did lose a tab in an attempt with a fresh profile, creating a few tabs. Which does feel similar to my last experiences with bug 1551366.
Reporter | ||
Comment 22•5 years ago
•
|
||
Aha. So, the problem only happens when there's an about:config
tab (and possibly some other about:something
tab. EDIT: this may require the page to contain a form), presumably because that's what it takes to trigger the PrivacyFilter.jsm
code. And the tabs that disappear are all the ones after (and including) the about:config
tab.
So what seems to be happening is that under normal circumstances, an exception happening during session store prevents the entire store from happening, but on shutdown, the store still happens, with the state built until the exception happened.
Comment 23•5 years ago
|
||
Alright, that sounds very interesting, Mike, and something that Alphan needs to check out, because this is likely due to his changes!
Alphan, Mike has given quite a number of hints in comment 19 - 22, which you can use to analyze what kind of value for formdata would make PrivacyFilter.jsm choke.
Updated•5 years ago
|
Comment 24•5 years ago
|
||
It could be related to bug 1549973.
I will check it in detail.
Comment 25•5 years ago
•
|
||
(In reply to Alphan Chen [:alchen] from comment #24)
It could be related to bug 1549973.
I will check it in detail.
Right now, I cannot reproduce the symptom.
Since the symptom happens on 67 and after 67, so it is not related to
Bug 1474130(Firefox 68) and Bug 1549973(Firefox 69).
In 67, what I did is rewrite the sessionStore collect() and restore() from JS to C++.
The related patch I have are the followings:
Bug 1497147 - - Rewrite "mapFrameTree" implementation to C++
Bug 1507286 - Convert content-sessionStore.jsm to C++
Bug 1497146 part 3 - Convert FormData.jsm to C++ [restore() part]
Bug 1497146 part 2 - Convert FormData.jsm to C++ [collect() part]
Bug 1497146 part 1 - Add SessionStoreUtils.webidl
Bug 1497144 part 2 - Get rid of ScrollPosition.jsm
Bug 1497144 - Rewrite DocShellCapabilities.jsm and ScrollPosition.jsm into C++
Comment 26•5 years ago
|
||
Update status:
I cannot reproduce the symptom by the steps provided by comment 19.
Then I try to simulate the symptom Mike met.
By adding a "formdata:null" into the TabState purposely, I can see the failed symptom.
After applying the patch of bug 1553413, the symptom disappears.
Hi Mike,
there are somethings that needed your help.
- Could you patch "PrivacyFilter.jsm": in filterFormData(data), do an early return when data is null.
Just the same as the patch of bug 1553413? - Could you share the steps of comment 22?
Look like your behavior could trigger a tabState data with "formdata:null" and the tab is "about:config".
I cannot produce a TabState with "formdata:null".
Comment 27•5 years ago
|
||
(In reply to Alphan Chen [:alchen] from comment #26)
Then I try to simulate the symptom Mike met.
By adding a "formdata:null" into the TabState purposely, I can see the failed symptom.
One more thing, I can see the log as bug 1553412 and bug 1553413 described in this case.
Comment 28•5 years ago
|
||
Also, it is also weird that we will have "formdata:null" in TabState.
In theory, we will delete the entry if the value is null when updating TabState.
https://searchfox.org/mozilla-central/rev/f8b11433159cbc9cc80500b3e579d767473fa539/browser/components/sessionstore/TabStateCache.jsm#165
Reporter | ||
Comment 29•5 years ago
|
||
(In reply to Alphan Chen [:alchen] from comment #26)
Update status:
I cannot reproduce the symptom by the steps provided by comment 19.
Then I try to simulate the symptom Mike met.
By adding a "formdata:null" into the TabState purposely, I can see the failed symptom.
After applying the patch of bug 1553413, the symptom disappears.Hi Mike,
there are somethings that needed your help.
- Could you patch "PrivacyFilter.jsm": in filterFormData(data), do an early return when data is null.
Just the same as the patch of bug 1553413?
I already applied the patch from bug 1553413 and I haven't had the problem since.
- Could you share the steps of comment 22?
Look like your behavior could trigger a tabState data with "formdata:null" and the tab is "about:config".
I cannot produce a TabState with "formdata:null".
comment 22 is a followup to comment 19. The patch to purposefully and pseudo-manually trigger the problem was still involved. I'm still not sure exactly how I ended up with formdata:null in filterFormData, but an about:config tab was how I got in filterFormData at all when I tried to reproduce it.
Now, the problem is that since the session store is never saved when it started happening, the best I have is the last state of my session before bug 1553413 happened, as well as the truncated state after quitting. I suppose the tabs might be in the same order in both stores and maybe that could give me a hint as to what was broken.
In any case, I don't think relying on nothing going awry when getting one tab's state is a good thing, as apparently hitting this or similar issues on two different machines with two very different profiles shows. I guess what bugs me the most is the fact that a truncated state is being saved (and also the discrepancy... why is no state saved at all before quitting?)
Comment 30•5 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #29)
Now, the problem is that since the session store is never saved when it started happening, the best I have is the last state of my session before bug 1553413 happened, as well as the truncated state after quitting. I suppose the tabs might be in the same order in both stores and maybe that could give me a hint as to what was broken.
In any case, I don't think relying on nothing going awry when getting one tab's state is a good thing, as apparently hitting this or similar issues on two different machines with two very different profiles shows. I guess what bugs me the most is the fact that a truncated state is being saved (and also the discrepancy... why is no state saved at all before quitting?)
Hi Mike,
The current symptom we focused on is why a truncated state is being saved when the error happened.
It is more related to error handling.
My patch is about collect() and restore() rewriting.
Could you have some inputs here?
Comment 31•5 years ago
|
||
I don't know what's going on here, but I just lost all my tabs/windows for about the 5th time since FF 67. Extremely frustrating. This time on linux, but it's happened on windows too.
Comment 32•5 years ago
|
||
[Tracking Requested - why for this release]:
Losing user data is bad; would be good to track this issue.
Comment 33•5 years ago
|
||
I agree that this bug isn't great, but it's also not a new issue in 69 nor does it seem to have very solid steps to consistently reproduce. While I'd be open to accepting a patch for uplift, I don't see much value in tracking either.
Updated•5 years ago
|
Comment 34•5 years ago
|
||
(In reply to Jonathan Nicol from comment #31)
I don't know what's going on here, but I just lost all my tabs/windows for about the 5th time since FF 67. Extremely frustrating. This time on linux, but it's happened on windows too.
Can you please file a new bug so with a precise STR so that we can investigate and track your issue properly?
Mike's problem went away after bug 1553413 was fixed, so it's good to consider this one a dupe. Thanks so much for joining us in the investigation, Mike, you discovered and helped resolve an important issue!
Updated•5 years ago
|
Updated•2 years ago
|
Description
•