Closed
Bug 807102
Opened 12 years ago
Closed 12 years ago
Beta and Aurora lock up when using a profile that's recently used Nightly
Categories
(Core :: Security: CAPS, defect)
Tracking
()
People
(Reporter: mconley, Unassigned)
References
Details
Attachments
(2 files)
251.19 KB,
text/plain
|
Details | |
988 bytes,
patch
|
bzbarsky
:
review+
akeybl
:
approval-mozilla-aurora+
akeybl
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
I don't have further information about this, and I'm about to head home, but I want to get it down:
I've been using Nightly primarily for the past few months - though I frequently shift back and forth between Beta and Aurora (using the same profile) to test things, etc.
Starting yesterday, or perhaps the day before, I've noticed that if I shift to Beta or Aurora from Nightly, Firefox locks up on start-up.
I haven't been able to repro on my Linux machine, so I can't attach gdb - and debugging on Windows is not something I know how to do yet, so no dice there.
I know a few others have been experiencing this too (both Windows and OSX) - I've CC'd them on this bug.
Comment 1•12 years ago
|
||
FWIW, here's a sample from Activity Monitor when my local (debug) build from the Aurora tree hangs on startup.
Reporter | ||
Comment 2•12 years ago
|
||
Hey Felipe - I'm Cc'ing Hilary, who's also been hitting this one for the past few days.
Let us know if there's any other data you need from us. Thanks!
Comment 3•12 years ago
|
||
This matches what I saw last week: Stuck somewhere under ReadAnnotationEntry. Interestingly, the function is gone in nightlies. It was removed in bug 789224.
Comment 4•12 years ago
|
||
Looking briefly, I'm guessing the format of nsPrincipal data changed in object streams?
Comment 5•12 years ago
|
||
Since bug 789224's patch 4 appears to be the bug which causes this hang, I'm going to move this to a component where we can maybe get some more visibility to this bug than an untriaged component.
Updated•12 years ago
|
Comment 6•12 years ago
|
||
Hrm, that's bad. Bug 789224 just removed the code that read/wrote principals as prefs. So the Nightly builds just wouldn't be reading/writing in this way at all. I'm surprised that breaks stuff, but I don't know that code much. I just removed it. :-)
Can someone confirm that this bug can in fact be triggered by using the same profile for a build before+after bug 789224?
Keywords: qawanted
Comment 7•12 years ago
|
||
> Can someone confirm that this bug can in fact be triggered by using the same
> profile for a build before+after bug 789224?
Yeah I bisected that. With a profile saved with a current nightly, older builds lock up until http://hg.mozilla.org/mozilla-central/rev/d7c0dc5df119, and after that cset they don't.
Comment 8•12 years ago
|
||
BTW you don't need an entire profile, just the sessionstore.js file. And the hang happens when this is called from SessionStore.jsm: http://mxr.mozilla.org/mozilla-central/source/browser/components/sessionstore/src/SessionStore.jsm#3419
Comment 9•12 years ago
|
||
Oh, I see! This isn't related to reading/writing principals from prefs. This is related to the implementation of nsISerializable, which is used by session store.
Do we have a general story for what happens when the layout of a serializable object changes? I know that Java fixes this with a kSerialVersionID or something of the sort. Do we have a way to indicate that the format has changed?
Comment 10•12 years ago
|
||
No, we do not. If you plan to change the layout of serializables, you have to include some sort of version identifier in the serialization (or out-of-band) yourself. I really wish I had for nsPrincipal. :(
Comment 11•12 years ago
|
||
One option is to change the serialization on trunk in such a way that older build will just fail to deserialize when reading it...
Comment 12•12 years ago
|
||
Actually, I bet we can just rev the CID on nsPrincipal, which should cause ReadObject to abort on the do_CreateInstance call, before any of the sensitive stuff has been read. Let me put together a patch.
Comment 13•12 years ago
|
||
Theoretically, this patch should do the trick. However, it would need to be applied to Nightly, aurora, and beta at this point, since those branches all contain the new format. The incompatibility in data formats only lasts until the next branch.
I'm wondering, though. Could this be causing a much wider degree of corruption in session history? What happens when users upgrade? We might well want to take this patch just to be safe.
Can someone who has reproduced the problem confirm that this patch does the trick?
Updated•12 years ago
|
Attachment #689089 -
Flags: review?(bzbarsky)
Comment 14•12 years ago
|
||
Comment on attachment 689089 [details] [diff] [review]
Rev the CID on nsPrincipal. v1
r=me
Attachment #689089 -
Flags: review?(bzbarsky) → review+
Comment 15•12 years ago
|
||
I've verified that this fixes the problem here. I can't reliably start a new session and end up with corrupt data, but I do have a copy of a sessionstore.js file known to cause the problem, so here's what I did to test:
1 - copy that file to the profile
2 - start nightly built from current m-c tip and close it so that hopefully the same data is serialized and written again
3 - start firefox 17 release: it hangs
and then I redid it replacing 2 with a build from m-c tip + this patch, and firefox 17 doesn't hang.
This means there will be some potential dataloss when going with a session from 18+ to 17, right? Which is probably fine I believe (not the entire session is lost, just some binary-encoded parts)
Comment 16•12 years ago
|
||
> This means there will be some potential dataloss when going with a session from 18+ to 17,
And vice versa, when going from 17 to 18+.
> not the entire session is lost, just some binary-encoded parts
Yes. What's lost is the security context of about:blank, data:, wyciwyg:, and javascript: documents. Basically, it would reintroduce bug 389274 across this move. I suspect this is acceptable, personally, given the other options here.
Comment 17•12 years ago
|
||
Comment on attachment 689089 [details] [diff] [review]
Rev the CID on nsPrincipal. v1
Ok, given the confirmation in comment 15, I'm flagging this for m-a and m-b approval. We'll want to land it everywhere simultaneously to avoid unnnecessary breakage of people moving between m-c, m-a, and m-b builds.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 789224
User impact if declined: Using the same profile for Release and {Nightly,Aurora,Beta} can cause data corruption in both directions. In particular, principals can end up getting corrupted, which is bad. The current user-visible impact of this change is a hang. This has the potential to corrupt data for users moving their profile from 17 to 18, which is exactly what our entire worldwide audience will do when they upgrade after the next release.
Note that this will cause the data loss bz mentions in comment 16. But this kind of well-understood, localized data loss is much preferable to poorly-understood data corruption, which could lead to hangs and security vulnerabilities.
Testing completed (on m-c, etc.): None, unfortunately. The CID gets written out for session-store, so forcing an incompatibility with a CID rev on Nightly (but not on Aurora or Beta) will cause artificial incompatibilities for users using the same profile for Nightly and {Aurora,Beta}, even though the data format is identical. Given that this is a pretty well-understood change, it seems preferable to land this on all 3 branches at once.
Risk to taking this patch (and alternatives if risky): Least risk of any option, IMO.
String or UUID changes made by this patch: None. CID change, but bz says that's ok.
Attachment #689089 -
Flags: approval-mozilla-beta?
Attachment #689089 -
Flags: approval-mozilla-aurora?
Comment 18•12 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #17)
> Note that this will cause the data loss bz mentions in comment 16. But this
> kind of well-understood, localized data loss is much preferable to
> poorly-understood data corruption, which could lead to hangs and security
> vulnerabilities.
(In reply to Boris Zbarsky (:bz) from comment #16)
> Yes. What's lost is the security context of about:blank, data:, wyciwyg:,
> and javascript: documents. Basically, it would reintroduce bug 389274
> across this move. I suspect this is acceptable, personally, given the other
> options here.
In https://bugzilla.mozilla.org/show_bug.cgi?id=389274#c0 I see "The text that was in the editor is NOT restored and you cannot type anything in the editor". Is there any way to prevent that? When will the user be able to type something into an affected editor again?
tracking-firefox18:
--- → +
tracking-firefox19:
--- → +
Comment 19•12 years ago
|
||
(In reply to Alex Keybl [:akeybl] from comment #18)
> In https://bugzilla.mozilla.org/show_bug.cgi?id=389274#c0 I see "The text
> that was in the editor is NOT restored and you cannot type anything in the
> editor". Is there any way to prevent that? When will the user be able to
> type something into an affected editor again?
IIUC, the issue there is that the appropriate permissions for the page to be in design mode aren't restored. The user will probably have to manually retoggle design mode.
But this all assumes that the user has a page in design mode (uncommon) in their session history (even more uncommon).
Comment 20•12 years ago
|
||
> Is there any way to prevent that?
Not really.
> When will the user be able to type something into an affected editor again?
After hitting the reload button on that page.
Comment 21•12 years ago
|
||
Comment on attachment 689089 [details] [diff] [review]
Rev the CID on nsPrincipal. v1
Thanks for the explanation - this risk is manageable (and much better than the alternative as bholley points out).
Attachment #689089 -
Flags: approval-mozilla-beta?
Attachment #689089 -
Flags: approval-mozilla-beta+
Attachment #689089 -
Flags: approval-mozilla-aurora?
Attachment #689089 -
Flags: approval-mozilla-aurora+
Comment 22•12 years ago
|
||
Also assigning juanb as the qacontact here to try to find unexpected/critical regressions when moving between Firefox versions (17->18 specifically) and session restore.
QA Contact: jbecerra
Comment 23•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 24•12 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/1e0ca1b0c656
https://hg.mozilla.org/releases/mozilla-beta/rev/d306f97c1da2
status-firefox18:
--- → fixed
status-firefox19:
--- → fixed
Comment 25•12 years ago
|
||
Started 2012-10-30 Nightly with a new profile.
I made this profile dirty with lots of history, bookmarks, add-ons, tabs, pinned tabs, plugin content loaded. I set "When Firefox starts: Show my windows and tabs from last time". Closed Nightly.
With this dirty profile, I started then Firefox 17.0.1 and it did not respond on loading tabs from previous session. After 1-2 minutes I forced a quit because Firefox was still not responding. Same behavior for Firefox 16.0.2.
I started Firefox 18.0 Beta 3 with the same dirty profile and the tabs were opened quickly and Firefox was working normally.
Verified on Windows 7 x86 and Mac OS X 10.8.
Verified also a dirty profile from Latest Nightly to Firefox 18.0 Beta 3 and worked normally.
You need to log in
before you can comment on or make changes to this bug.
Description
•