Closed
Bug 1261175
Opened 8 years ago
Closed 7 years ago
crash in nsPresContext::GetRootPresContext or nsPresContext::GetParentPresContext (from nsRefreshDriver::IsWaitingForPaint())
Categories
(Core :: Layout, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla59
Tracking | Status | |
---|---|---|
e10s | + | --- |
firefox46 | + | wontfix |
firefox47 | + | wontfix |
firefox48 | + | wontfix |
firefox49 | + | wontfix |
firefox-esr45 | --- | wontfix |
firefox50 | --- | wontfix |
firefox51 | --- | wontfix |
firefox52 | --- | wontfix |
firefox-esr52 | 59+ | fixed |
firefox53 | --- | wontfix |
firefox54 | --- | wontfix |
firefox55 | --- | wontfix |
firefox56 | --- | wontfix |
firefox57 | --- | wontfix |
firefox58 | + | fixed |
firefox59 | + | fixed |
People
(Reporter: bugs, Assigned: mattwoodrow)
References
(Depends on 1 open bug)
Details
(4 keywords, Whiteboard: [adv-esr52.7+][adv-main58+])
Crash Data
Attachments
(3 files, 1 obsolete file)
279.47 KB,
image/png
|
Details | |
1011 bytes,
patch
|
tnikkel
:
review+
ritu
:
approval-mozilla-beta+
ritu
:
approval-mozilla-esr52+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
1.00 KB,
patch
|
bobowen
:
review+
gchang
:
approval-mozilla-beta+
gchang
:
approval-mozilla-release+
lizzard
:
approval-mozilla-esr52+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #1241217 +++ Based on bug 724355 comment 17, we have reason to believe that bug 1241217 has morphed into this one after fixing bug 1241651. This crash is currently #35 in the e10s experiment crash list - https://crash-stats.mozilla.org/search/?ActiveExperiment=e10s-beta46-noapz%40experiments.mozilla.org&ActiveExperimentBranch=experiment-no-addons&_facets=signature&_columns=signature&_columns=product&_columns=build_id&_columns=platform#facet-signature More crash reports can be found on bug 724355.
Reporter | ||
Updated•8 years ago
|
Whiteboard: [adv-main45+][adv-esr38.7+][post-critsmash-triage][fixed by bug 1241651]
Reporter | ||
Comment 1•8 years ago
|
||
[Tracking Requested - why for this release]: Potential UAF. Also e10s blocker.
Crash Signature: [@ nsPresContext::GetParentPresContext] → [@ nsPresContext::GetRootPresContext]
status-firefox46:
--- → affected
status-firefox47:
--- → affected
tracking-e10s:
--- → ?
tracking-firefox46:
--- → ?
Depends on: 1260531
Comment 2•8 years ago
|
||
Bug 1260531 probably not related. mats debunked my theory.
No longer depends on: 1260531
Comment 3•8 years ago
|
||
In the long term I think we should simplify the nsPresContext hierarchy by simply adding a mParent pointer and NOT going through the viewmanager / view tree since it has proven to be rather fragile. I made a patch for this in bug 984690 which seemed to work at the time IIRC. (I'm sorry I didn't follow through on that proposal). That would also support another (neglected) goal which is to remove viewmanager/views. I'm not sure it will fix these crashes, but unless someone has a better idea it might be worth a shot...
Reporter | ||
Comment 4•8 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #3) > In the long term I think we should simplify the nsPresContext hierarchy by > simply > adding a mParent pointer and NOT going through the viewmanager / view tree > since it > has proven to be rather fragile. I made a patch for this in bug 984690 > which seemed > to work at the time IIRC. (I'm sorry I didn't follow through on that > proposal). > > That would also support another (neglected) goal which is to remove > viewmanager/views. > > I'm not sure it will fix these crashes, but unless someone has a better idea > it might > be worth a shot... Mats, you are the chosen one! Please get 984690 rebased and tested. I like the idea of removing the dependency on the View system from the refresh driver.
Depends on: 984690
Comment 5•8 years ago
|
||
Posted a patch in bug 1261230 for another guess. Also, am I crazy, or is nothing keeping the stashed views alive? I'll write a patch for this too (unless I'm missing something).
Updated•8 years ago
|
Group: core-security, core-security-release → layout-core-security
Updated•8 years ago
|
Flags: needinfo?(tnikkel)
Comment 7•8 years ago
|
||
Too late for 46 uplift, but since this is a sec-critical crash we may want to try to fix it before we ship e10s. This crash signature is also showing up on 45.0.1. Brad are you sure this shouldn't block?
Reporter | ||
Comment 8•8 years ago
|
||
We took a fix for bug 1261230 in FF47. Let's see if that affects the frequency of crashes with this signature.
Depends on: 1261230
Comment 9•8 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #7) > Too late for 46 uplift, but since this is a sec-critical crash we may want > to try to fix it before we ship e10s. This crash signature is also showing > up on 45.0.1. Brad are you sure this shouldn't block? Did I say somewhere this shouldn't block? And what are you asking about blocking?
Flags: needinfo?(blassey.bugs)
Comment 10•8 years ago
|
||
We came across this today in the CritSmash triage, where we're trying to make sure sec-critical bugs get quick attention. I was asking because this was marked as e10s:m9+ and then changed to e10s:+ which I interpreted as, "this used to block e10s going to release, but now it doesn't". Rather than just throw it back into e10s:? for triage, it seemed faster to ask directly. While a bug being rated sec-critical doesn't necessarily block e10s on release, we wanted to double check that you considered the rating.
Comment 11•8 years ago
|
||
I don't think this was ever m9+. Jet nominated on 3/31 and I marked it + on 4/4. We're not taking security rating into account for e10s tracking purposes, under the assumption that they'll be handled with the normal security bug prioritization and tracking process.
Comment 13•8 years ago
|
||
Current crash stats report for the content process only crash: https://crash-stats.mozilla.com/signature/?product=Firefox&dom_ipc_enabled=%21__null__&ipc_channel_error=%210x7&process_type=content&version=47.0b1&signature=nsPresContext%3A%3AGetRootPresContext
Comment 14•8 years ago
|
||
Bug 1261230 appears to have landed in 47.0b4, and bug 1265577 in 47.0b5, but we still crash in 47.0b5 with this signature: bp-f6d7dbfb-eb64-49ef-ae99-2435f2160514
Comment 15•8 years ago
|
||
Shows up in 48.0a2 as well, with an e5e5 UAF crash: https://crash-stats.mozilla.com/report/index/f1403679-2fe0-49f3-ac75-ee6482160511 Same call sequence as bug 1241217 as best I can tell, which means I think this is basically the same bug - i.e. per comment 0, i think it was never really fixed, perhaps just knocked down in frequency. (I haven't delved that deeply into the specifics, but I saw a 48 crash (and a 47) in the first page of reports. Most were from older versions (much), which implies the frequency has gone down).
Comment 16•8 years ago
|
||
Critsmash triage: can we get a patch for this? This is a sec-critical that is lurking about.
status-firefox49:
--- → affected
tracking-firefox49:
--- → +
Comment 17•8 years ago
|
||
(In reply to Al Billings [:abillings] from comment #16) > can we get a patch for this? I spent a _HUGE_ amount of time trying to fix this bug recently. My work finally got uplifted last week, which finally gave us the chance to see if my work had any effect. Unfortunately it seems that it did not. I used up all my best guesses already in trying to get this fixed. I'll circle back to this and try again, but with ~1 month turn around time to be able to tell if any patch for this bug has any effect it is a very hard road to walk. Anybody who has ideas for this is very welcome to come forward!
Updated•8 years ago
|
Keywords: sec-critical → sec-high
Updated•8 years ago
|
Comment 18•8 years ago
|
||
Marking fix optional for 49/50, and assuming this still affects 51. Jet, shall we just count this as sec-critical backlog or do you think it needs further investigation?
Reporter | ||
Comment 19•8 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #18) > Jet, shall we just count this as sec-critical backlog or do you think it > needs further investigation? I think we should put it on the sec-critical backlog, _and_ continue to investigate. We want to see this fixed ASAP, though our attempts have not been conclusive. If a reproducible case comes up, this one goes right back on the front burner.
Flags: needinfo?(bugs)
Comment 20•8 years ago
|
||
too late for 48
Assignee | ||
Updated•8 years ago
|
Crash Signature: [@ nsPresContext::GetRootPresContext] → [@ nsPresContext::GetRootPresContext]
[@ nsPresContext::GetRootPresContext()]
Comment 21•8 years ago
|
||
Still lots of crashes, mostly in 50b and 48. (Oddly far fewer in 49, though non-zero)
Comment 22•8 years ago
|
||
Randell, if you still have the recent crash-stat query urls for this, could you paste them here.
Flags: needinfo?(rjesup)
Comment 23•8 years ago
|
||
https://crash-stats.mozilla.com/signature/?signature=nsPresContext%3A%3AGetRootPresContext&date=%3E%3D2016-09-01T00%3A00%3A00.000Z&date=%3C2016-10-05T00%3A00%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#summary For example... All I did was click on a date in the signature graph (https://addons.mozilla.org/en-US/firefox/addon/bugzilla-socorro-lens/?src=api) and then expand the date range
Flags: needinfo?(rjesup)
Updated•8 years ago
|
status-firefox52:
--- → affected
status-firefox53:
--- → affected
Comment 24•8 years ago
|
||
I see a lot of crashes in 50, and a decent number in 51, but only a few in 52.
Comment 25•8 years ago
|
||
I wonder if we have several issues around here and if bug 1308459 helped in some case, but not in all.
Comment 26•8 years ago
|
||
We get a lot less usage in Aurora than beta and release, so lots in 50, fair number in 51, a few in 52 is a pretty typical signature, no?
Updated•8 years ago
|
Crash Signature: [@ nsPresContext::GetRootPresContext]
[@ nsPresContext::GetRootPresContext()] → [@ nsPresContext::GetRootPresContext]
[@ nsPresContext::GetRootPresContext()]
[@ nsPresContext::GetParentPresContext]
Summary: crash in nsPresContext::GetRootPresContext (from nsRefreshDriver::IsWaitingForPaint()) → crash in nsPresContext::GetRootPresContext or nsPresContext::GetParentPresContext (from nsRefreshDriver::IsWaitingForPaint())
Comment 28•8 years ago
|
||
The number or crashes for https://crash-stats.mozilla.com/signature/?signature=nsPresContext%3A%3AGetParentPresContext is low, but the number of crashes for https://crash-stats.mozilla.com/signature/?signature=nsPresContext%3A%3AGetRootPresContext is increasing slightly. Jet, could somebody be assigned to this bug to investigate this as per comment 19?
Flags: needinfo?(bugs)
Comment 29•8 years ago
|
||
I just spent a bunch of time going through the latest crash-stats URLs and comments associated with the nsPresContext::GetRootPresContext crashes, hoping to find a way to reproduce. No luck. I did notice that there's an unusually high number of PDF related URLs. (By no means all - there are crashes of the Google homepage and Facebook homepage, for example.) There also seemed to be quite a high ratio of mentions of printing. Loading various URLs, navigating during/after load, navigating during/after printing to file, no crashes. :/ I'd also note that memory pressure does not seem to be an issue.
Comment 30•8 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #3) > In the long term I think we should simplify the nsPresContext hierarchy by > simply > adding a mParent pointer and NOT going through the viewmanager / view tree The stacks I'm seeing are stacks like: https://crash-stats.mozilla.com/report/index/9f871728-4316-4009-a0f6-cfbde2161001 Those looks as likely to be caused by nsRefreshDriver's mPresContext raw pointer pointing at invalid memory as they are to be caused by the pc->GetParentPresContext() call in nsPresContext::GetRootPresContext() returning something invalid. Was there a reason to suspect the latter?
Flags: needinfo?(mats)
Comment 31•8 years ago
|
||
nsRefreshDriver::mPresContext is a mozilla::WeakPtr<nsPresContext>, not a raw pointer, so when the nsPresContext is deleted this member should become null. The only potential problem I see with that is that mozilla::WeakPtr isn't thread safe: http://searchfox.org/mozilla-central/source/mfbt/WeakPtr.h#17 but, as far as I know, destruction of nsPresContext and nsRefreshDriver::Tick are only supposed to happen on the main thread, so that shouldn't be a problem.
Flags: needinfo?(mats)
Comment 32•8 years ago
|
||
(In reply to Jonathan Watt [:jwatt] from comment #30) > (In reply to Mats Palmgren (:mats) from comment #3) > > In the long term I think we should simplify the nsPresContext hierarchy by > > simply > > adding a mParent pointer and NOT going through the viewmanager / view tree > > The stacks I'm seeing are stacks like: > > https://crash-stats.mozilla.com/report/index/9f871728-4316-4009-a0f6- > cfbde2161001 In addition to what Mats said, the crashing line in that report is inside a for loop, so we may have successfully iterated to a parent prescontext 1 or more times before encountering the crash, which would implicate one of the pointers we follow in GetParentPresContext being invalid.
Comment 33•7 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #31) > nsRefreshDriver::mPresContext is a mozilla::WeakPtr<nsPresContext>, not a > raw pointer Ah, it looks like that's a somewhat recent change. If you follow the nsRefreshDriver.cpp link from the crash report I gave and then switch to the .h file (with the same rev), the pointer is a raw pointer.
Comment 34•7 years ago
|
||
(In reply to Jonathan Watt [:jwatt] from comment #33) > (In reply to Mats Palmgren (:mats) from comment #31) > > nsRefreshDriver::mPresContext is a mozilla::WeakPtr<nsPresContext>, not a > > raw pointer > > Ah, it looks like that's a somewhat recent change. If you follow the > nsRefreshDriver.cpp link from the crash report I gave and then switch to the > .h file (with the same rev), the pointer is a raw pointer. The weak ptr change in 52, which is on beta right now, and we see plenty of crashes there, so it didn't fix it.
Comment 35•7 years ago
|
||
time is very short for beta/52. Any chance of wallpaper, if not a full fix?
Flags: needinfo?(jwatt)
Comment 36•7 years ago
|
||
If I understand correctly, nobody has a solution to this right now that doesn't involve major refactoring.
Flags: needinfo?(jwatt)
Updated•7 years ago
|
Flags: needinfo?(tnikkel)
Flags: needinfo?(bugs)
Comment 37•7 years ago
|
||
(In reply to Jonathan Watt [:jwatt] from comment #36) > If I understand correctly, nobody has a solution to this right now that > doesn't involve major refactoring. Sorry to bug you once again, Jet. If this is truly a sec-critical (and I'm happy to be proven otherwise), we must consider the refactoring as an option. Do you think you could discuss the required work in your team and come up with a rough time frame?
Flags: needinfo?(bugs)
Comment 38•7 years ago
|
||
Please don't cancel my needinfos (which I set and was asking myself for info).
Flags: needinfo?(tnikkel)
Reporter | ||
Comment 39•7 years ago
|
||
(In reply to Frederik Braun [:freddyb] from comment #37) > (In reply to Jonathan Watt [:jwatt] from comment #36) > > If I understand correctly, nobody has a solution to this right now that > > doesn't involve major refactoring. > > Sorry to bug you once again, Jet. > If this is truly a sec-critical (and I'm happy to be proven otherwise), we > must consider the refactoring as an option. > Do you think you could discuss the required work in your team and come up > with a rough time frame? Our biggest issue continues to be a lack of reproducible steps. GetRootPresContext() is a *very* hot code path, and any changes here (refactoring, etc.) come with significant risk. We're willing to take on that risk if there's a reasonable chance that we'll actually fix this bug. That gets us back to reproducible steps :( re: the sec-critical designation, this one should stay on the backlog since the crash signature is a potential UAF, but we can't reproduce. I think bug 984690 is still how we'd refactor this code and we may still do that independent of this bug. I would still love to get repro steps to be sure we actually fixed it.
Comment 40•7 years ago
|
||
Mass wontfix for bugs affecting firefox 52.
Comment 41•7 years ago
|
||
Still happening at volume; 1200+ crashes in the last week.
status-firefox54:
--- → affected
status-firefox55:
--- → affected
status-firefox-esr45:
--- → affected
status-firefox-esr52:
--- → affected
Comment 42•7 years ago
|
||
Would it be useful to get some URLs that have caused the crash? The volume seems quite high indeed.
Flags: needinfo?(rjesup)
Updated•7 years ago
|
Flags: needinfo?(rjesup)
Comment 43•7 years ago
|
||
The (few) comments still seem to involve printing often
Comment 44•7 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #43) > The (few) comments still seem to involve printing often Then bug 1354443 might be quite relevant.
Comment 45•7 years ago
|
||
I'll mark bug 1354443 as a dependency since it shouldn't be too hard to fix bug 1354443 and then we can see how many of these crashes remain after fixing it.
Depends on: 1354443
Updated•7 years ago
|
Keywords: testcase-wanted
Comment 46•7 years ago
|
||
Looks like its dependent (bug 1354443) just got resolved. Does that turn this issue into something more actionable, Timothy?
Comment 47•7 years ago
|
||
Unfortunately it doesn't look like the fixes for bug 1354443 helped. There are lots of crashes still being reported for 55.0b13. (Fixing that bug also doesn't make this bug any more actionable.)
Comment 48•7 years ago
|
||
[Tracking Requested - why for this release]: Because this is a perennial top-UAF crasher that hasn't gotten traction. Poking this bug -- this is the #1 UAF crash (by a bunch) on 57/58: https://crash-stats.mozilla.com/signature/?product=Firefox&version=58.0a1&version=57.0b&version=57.0a1&address=~e5e5&signature=nsPresContext%3A%3AGetRootPresContext&date=%3E%3D2017-10-02T14%3A00%3A00.000Z&date=%3C2017-10-09T14%3A00%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#reports >250 crashes just in the last week on 57 and 58 only 4 of 266 crashes *don't* have IsWaitingForPaint() as the calling function: https://crash-stats.mozilla.com/signature/?product=Firefox&version=58.0a1&version=57.0b&version=57.0a1&address=~e5e5&proto_signature=%21~IsWaitingForPaint&signature=nsPresContext%3A%3AGetRootPresContext&date=%3E%3D2017-10-02T14%3A00%3A00.000Z&date=%3C2017-10-09T14%3A00%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1 Those are called from APZCCallbackHelper::ApplyCallbackTransform() (3) and nsCSSFrameConstructor::BeginUpdate() (1) Just an observation: like most frequent crashes, this has a weekend dip. However, the dip seems much larger than most. Might be a (small) clue.
status-firefox56:
--- → affected
status-firefox57:
--- → affected
status-firefox58:
--- → affected
tracking-firefox58:
--- → ?
Flags: needinfo?(milan)
So, a pres context that the refresh driver has as a weak pointer to, is going away while the refresh driver still tries to use it? Timothy, anything we can do here to get more information or to wallpaper it? The volume is not huge, so if we can get rid of UAF and replace it with a release assert or some such, it would be an improvement.
Flags: needinfo?(tnikkel)
Flags: needinfo?(milan)
Updated•7 years ago
|
Flags: needinfo?(tnikkel)
Comment 50•7 years ago
|
||
So this would say that the #1 and #3 UAFs in 57/58 both probably have the same root cause...
Assignee | ||
Comment 51•7 years ago
|
||
So, I think this is because nsIPresShell::mViewManager is dangling somewhere. The crash address is 0xE5E5E5ED, while we're accessing [eax+8], and eax is 0xE5E5E5E5. eax looks to be 'view' here, so that suggests that 'viewManager' is a valid address, but is deleted, so looking up mRootView (0x14) returned 0xE5E5E5E5. That would suggest that 'shell' is valid, and points to valid memory, but mViewManager (0x14) is dangling.
Assignee | ||
Comment 52•7 years ago
|
||
Some follow-on thoughts: nsViewManager::~nsViewManager - Should we runtime assert that mPresShell is null, rather than just setting it to nullptr. If it's not nullptr, the presshell will have a dangling pointer to us. nsIPresShell::mViewManager - comment says docViewer owns it, but we don’t always have a doc viewer (nsPrintEngine!) nsPrintEngine::ReflowPrintObject - make !mPresContext assertion a runtime one. If we’ve called this twice without destroying the presentation, then we’ll drop the reference to the nsViewManager without calling Destroy on the shell. Possible that the shell then outlives the viewmanager. Should we call DestroyPresentation just to be safe? nsDocumentViewer::InitInternal calls MakeWindow, which writes to mViewManager. Again, if we already had one (and a pres shell), then this can destroy the old view manager without Destroy being called on the old pres shell). Could this be called multiple times? Looks like Init is exposed to script? Tim, Mats, does that all seem sane? Still not sure what the actual problem is, but if that is all correct, some runtime asserts could probably catch it earlier (and in a non-UAF way).
Flags: needinfo?(mats)
Comment 53•7 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #52) > nsIPresShell::mViewManager - comment says docViewer owns it, but we don’t > always have a doc viewer (nsPrintEngine!) The nsPrintObject would own it then, it looks like. > nsDocumentViewer::InitInternal calls MakeWindow, which writes to > mViewManager. Again, if we already had one (and a pres shell), then this can > destroy the old view manager without Destroy being called on the old pres > shell). Could this be called multiple times? Looks like Init is exposed to > script? I don't think Init is callable by script, nsIContentViewer::Init has [noscript] on it. The other ones sound reasonable. My only guess is one of the nsDocumentViewer or relevant functions in our printing code can re-enter and mess things up. We've had that problem before, and fixed the ones we found, but no guarantee we know about them all. I had some hacky patches to detect this that I resurrected. I'll see if they can be made landable.
Assignee | ||
Comment 54•7 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #53) > (In reply to Matt Woodrow (:mattwoodrow) from comment #52) > > nsIPresShell::mViewManager - comment says docViewer owns it, but we don’t > > always have a doc viewer (nsPrintEngine!) > > The nsPrintObject would own it then, it looks like. Indeed, just a confusing comments for something with fairly complex ownership requirements. > > > nsDocumentViewer::InitInternal calls MakeWindow, which writes to > > mViewManager. Again, if we already had one (and a pres shell), then this can > > destroy the old view manager without Destroy being called on the old pres > > shell). Could this be called multiple times? Looks like Init is exposed to > > script? > > I don't think Init is callable by script, nsIContentViewer::Init has > [noscript] on it. Ah right, good catch. There's still quite a few c++ callers to Init, I didn't audit them all to see if re-entry was possible.
Assignee | ||
Comment 55•7 years ago
|
||
I think this should be sufficient (if I have the cause correct) to prevent UAF, and it'll give us a stack to the actual bug.
Attachment #8917584 -
Flags: review?(tnikkel)
Updated•7 years ago
|
Attachment #8917584 -
Flags: review?(tnikkel) → review+
Assignee | ||
Comment 56•7 years ago
|
||
Comment on attachment 8917584 [details] [diff] [review] viewmanager-assert [Security approval request comment] How easily could an exploit be constructed based on the patch? Not easily. It'll probably make it easier to roughly figure out what triggers the bug (destroying the view manager before the pres shell), but you'd still have to audit the code to figure out how to make that happen (which we've failed to do thus far). Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No. Which older supported branches are affected by this flaw? All. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? No, but they should be identical. How likely is this patch to cause regressions; how much testing does it need? Fairly unlikely, just adding a new assertion.
Attachment #8917584 -
Flags: sec-approval?
Comment 57•7 years ago
|
||
Comment on attachment 8917584 [details] [diff] [review] viewmanager-assert sec-approval+ for trunk. We'll want a patch nominated for Beta and ESR52 as well, to hopefully land after this is on trunk.
Attachment #8917584 -
Flags: sec-approval? → sec-approval+
Updated•7 years ago
|
Comment 58•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8281ed36bd49 I'm assuming leave-open was implied since the patch seems diagnostic in nature?
Comment 59•7 years ago
|
||
It's more than a diagnostic - it makes us preemptively crash, safely, to avoid the UAF (per comment 55). Perhaps we should keep this bug focused on the UAF, for saner semantics of tracking flags as we uplift it, and file a followup on investigating & fixing the (anticipated) MOZ_RELEASE_ASSERT crashes? mattwoodrow, what do you think? If you agree, go ahead and close this as FIXED, I think (since the patch has been merged to central).
Flags: needinfo?(matt.woodrow)
Assignee | ||
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(matt.woodrow)
Resolution: --- → FIXED
Assignee | ||
Comment 61•7 years ago
|
||
Comment on attachment 8917584 [details] [diff] [review] viewmanager-assert Approval Request Comment [Feature/Bug causing the regression]: Unknown, long standing sec bug. [User impact if declined]: Potentially exploitable UAF bug. [Is this code covered by automated tests?]: No, we have no reproducible test cases yet. [Has the fix been verified in Nightly?]: No, again, no known way to reproduce. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: No. [Why is the change risky/not risky?]: Just adds a release assert to verify our ownership requirements, should turn the UAF crash into a safe assertion crash (along with the stack that will help us fix the underlying problem). [String changes made/needed]: None.
Attachment #8917584 -
Flags: approval-mozilla-esr52?
Attachment #8917584 -
Flags: approval-mozilla-beta?
Updated•7 years ago
|
Assignee: nobody → matt.woodrow
Updated•7 years ago
|
Target Milestone: --- → mozilla58
Comment on attachment 8917584 [details] [diff] [review] viewmanager-assert Sec-high, Beta57+, ESR52.5+
Attachment #8917584 -
Flags: approval-mozilla-esr52?
Attachment #8917584 -
Flags: approval-mozilla-esr52+
Attachment #8917584 -
Flags: approval-mozilla-beta?
Attachment #8917584 -
Flags: approval-mozilla-beta+
Comment 63•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/04f51b690e8c
Comment 64•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/77a2d4610275
Updated•7 years ago
|
Group: layout-core-security → core-security-release
Comment 65•7 years ago
|
||
FYI, there are already quite a few crashes (77) on trunk hitting this assert, e.g. bp-e757c47e-ee7c-421e-8d3a-b6bc90171018 Should we open a new bug to handle that or reopen this one?
Flags: needinfo?(matt.woodrow)
Assignee | ||
Comment 66•7 years ago
|
||
http://searchfox.org/mozilla-central/source/layout/base/nsDocumentViewer.cpp#1636 That code is wrong isn't it? We call Destroy() from the nsDocumentViewer dtor, but if we take that early return then we don't call DestroyPresShell().
Flags: needinfo?(matt.woodrow)
Comment 67•7 years ago
|
||
the [@ nsViewManager::~nsViewManager] signature is the #6 top content crash in 57.0b10
Comment 68•7 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #66) > http://searchfox.org/mozilla-central/source/layout/base/nsDocumentViewer. > cpp#1636 > > That code is wrong isn't it? > > We call Destroy() from the nsDocumentViewer dtor, but if we take that early > return then we don't call DestroyPresShell(). Yeah, pretty much, same with the next early return after that.
Updated•7 years ago
|
Crash Signature: [@ nsPresContext::GetRootPresContext]
[@ nsPresContext::GetRootPresContext()]
[@ nsPresContext::GetParentPresContext] → [@ nsPresContext::GetRootPresContext]
[@ nsPresContext::GetRootPresContext()]
[@ nsPresContext::GetParentPresContext]
[@ nsViewManager::~nsViewManager]
Assignee | ||
Comment 69•7 years ago
|
||
I still haven't found a way to reproduce this, so somewhat guessing here. I *think* the destroy refcount early return is ok, since it looks like we manually call Destroy in this cases where we increment it (as well as calling Destroy from the dtor). Added a release assert to catch the case where we hit the dtor with it still set though.
Attachment #8923998 -
Flags: review?(tnikkel)
Assignee | ||
Updated•7 years ago
|
Attachment #8923998 -
Flags: review?(jwatt)
Updated•7 years ago
|
Whiteboard: [adv-main57+][adv-esr52.5+]
Comment 70•7 years ago
|
||
Comment on attachment 8923998 [details] [diff] [review] printing-fix Review of attachment 8923998 [details] [diff] [review]: ----------------------------------------------------------------- I'd like Bob to take a look at this too.
Attachment #8923998 -
Flags: review?(jwatt)
Attachment #8923998 -
Flags: review?(bobowencode)
Attachment #8923998 -
Flags: review+
Comment 71•7 years ago
|
||
Comment on attachment 8923998 [details] [diff] [review] printing-fix Review of attachment 8923998 [details] [diff] [review]: ----------------------------------------------------------------- My main concern here is why is that check there in the first place. Obviously the early return is wrong when called from ~nsDocumentViewer, but what about other callers of this? Clearly the setting of mPrt->mDocWasToBeDestroyed isn't doing anything either way. I also don't quite understand how mPrintEngine isn't null when this is called from ~nsDocumentViewer, as it looks like it should always have a strong reference to it.
Assignee | ||
Comment 72•7 years ago
|
||
I don't have good answers to these questions unfortunately. I'm not even entirely sure that this is the bug. All we know is that we're calling the view manager destructor while we still have a pointer to a pres shell (which should mean it has a pointer back, which is now dangling). That would suggest that that we missed calling Destroy() on the pres shell, and this patch fixes the only way I can see that it might be possible. I can't reproduce this crash at all, and really don't know this code, so it's a bit of a guess. Not running Destroy() when called from the dtor seems clearly like a problem, and the destroy ref count should already prevent it running if Destroy is called manually. I'm not sure what CheckBeforeDestroy was trying to do by keeping things alive (without holding a reference).
Comment 73•7 years ago
|
||
CheckBeforeDestroy was there before nsPrintEngine::mDocViewerPrint became a strong reference. Which I guess was not strong originally because of circular dependencies. Also having a strong reference doesn't stop someone else from calling nsDocumentViewer::Destroy. The mDestroyRefCount only seems to cover part of the print process (and looks pretty unreliable as someone else can decrement it via nsDocumentViewer::Destroy who hasn't incremented it, but that's a different issue). Perhaps instead in ~nsDocumentViewer, if mPrintEngine is not null, we should call Destroy on it and null it out, before calling our own Destroy? That would rule out the issue when called from ~nsDocumentViewer, but leave the existing behaviour when called from somewhere else.
Updated•7 years ago
|
Flags: needinfo?(matt.woodrow)
Reporter | ||
Comment 74•7 years ago
|
||
Reopening for now as we expect to take more patches here.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 75•7 years ago
|
||
Flags: needinfo?(matt.woodrow)
Attachment #8927985 -
Flags: review?(bobowencode)
Updated•7 years ago
|
Attachment #8927985 -
Flags: review?(bobowencode) → review+
Assignee | ||
Comment 76•7 years ago
|
||
Comment on attachment 8927985 [details] [diff] [review] printing-fix-v2 [Security approval request comment] How easily could an exploit be constructed based on the patch? Not easily, we still haven't figured out how to reproduce this. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not particularly, same as above. Which older supported branches are affected by this flaw? All. If not all supported branches, which bug introduced the flaw? Unknown, very old bug though. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Not yet, but it's old code so should be the same for all branches. How likely is this patch to cause regressions; how much testing does it need? Fairly low, but we don't have automated testing for this code, so it would be preferable to ride the trains.
Attachment #8927985 -
Flags: sec-approval?
Comment 77•7 years ago
|
||
Sec-approval+ for checkin on November 28, two weeks into the new cycle. Please do not check in before that date. We'll want this on other branches so branch patches should be made and nominated at that time as well.
Whiteboard: [adv-main57+][adv-esr52.5+] → [adv-main57+][adv-esr52.5+][checkin on 11/28]
Updated•7 years ago
|
Attachment #8927985 -
Flags: sec-approval? → sec-approval+
Updated•7 years ago
|
Attachment #8923998 -
Attachment is obsolete: true
Attachment #8923998 -
Flags: review?(tnikkel)
Attachment #8923998 -
Flags: review?(bobowencode)
Comment 78•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/1f2a0a7ae4b37bd0073c6f7e0a63fe9b71a498de I'm resetting the Fx58 status flag under the assumption that we'll eventually want to uplift this to Beta as well. We'll probably want to consider doing the same for ESR52 also.
status-firefox59:
--- → affected
tracking-firefox59:
--- → ?
Whiteboard: [adv-main57+][adv-esr52.5+][checkin on 11/28] → [adv-main57+][adv-esr52.5+]
Target Milestone: mozilla58 → ---
Comment 79•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/1f2a0a7ae4b3
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Updated•7 years ago
|
Reporter | ||
Updated•7 years ago
|
Flags: needinfo?(bugs)
Comment 80•7 years ago
|
||
We've hit that release assert on mDestroyRefCount twice since this landed. Given that nsPagePrintTimer holds a strong reference, the only thing I can think of is that somehow you can get back into printing before the old nsPagePrintTimer has died and you hit the mPrintEngine->CheckBeforeDestroy() as it dies and mDestroyRefCount never gets decremented. Other than try to change entirely how this all works, maybe we should just move the decrement to another function (like the increment), which ~nsPagePrintTimer calls instead of Destroy. This also gets rid of the potential problem of other things calling Destroy and decrementing mDestroyRefCount. jwatt - what do you think?
Flags: needinfo?(tnikkel) → needinfo?(jwatt)
Comment 81•7 years ago
|
||
I'd actually like to do a much bigger rewrite, but for now that sounds good to me.
Flags: needinfo?(jwatt) → needinfo?(bobowencode)
Updated•7 years ago
|
Flags: needinfo?(bobowencode)
Comment 82•7 years ago
|
||
Should we uplift the last patch to Beta for Fx58 still?
Flags: needinfo?(matt.woodrow)
Assignee | ||
Comment 83•7 years ago
|
||
I'm not sure if Bob wanted to land another fix here first or not.
Flags: needinfo?(matt.woodrow) → needinfo?(bobowencode)
Comment 84•7 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #83) > I'm not sure if Bob wanted to land another fix here first or not. I filed bug 1426087 for that, I'm not sure if uplifting highlights the issue here any more, but even if we just did this bug at least it makes the crash safe.
Flags: needinfo?(bobowencode)
Comment 85•7 years ago
|
||
Sounds like we can move forward with the uplift request then.
Flags: needinfo?(matt.woodrow)
Assignee | ||
Comment 86•7 years ago
|
||
Comment on attachment 8927985 [details] [diff] [review] printing-fix-v2 Approval Request Comment [Feature/Bug causing the regression]: Unknown, very old bug. [User impact if declined]: Potential UAF with printing. [Is this code covered by automated tests?]: No. [Has the fix been verified in Nightly?]: No. [Needs manual test from QE? If yes, steps to reproduce]: No known STR, so no. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: No. [Why is the change risky/not risky?]: Very low risk. [String changes made/needed]: None.
Flags: needinfo?(matt.woodrow)
Attachment #8927985 -
Flags: approval-mozilla-beta?
Comment 87•7 years ago
|
||
Comment on attachment 8927985 [details] [diff] [review] printing-fix-v2 Sec-high fix. Beta58+.
Attachment #8927985 -
Flags: approval-mozilla-release+
Attachment #8927985 -
Flags: approval-mozilla-beta?
Attachment #8927985 -
Flags: approval-mozilla-beta+
Comment 88•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/d9c057e41607 (FIREFOX_58b_RELBRANCH) https://hg.mozilla.org/releases/mozilla-release/rev/75912c7725910ed540e53348c4620add73f36000
Comment 89•7 years ago
|
||
i've filed bug 1431009 for a new crash signature with the MOZ_RELEASE_ASSERT added here.
Comment 90•7 years ago
|
||
Comment on attachment 8927985 [details] [diff] [review] printing-fix-v2 This patch appears to be working well on 58+. Let's close the loop and get it on ESR52 as well. See comment 86 for details.
Attachment #8927985 -
Flags: approval-mozilla-esr52?
Updated•7 years ago
|
Comment 91•7 years ago
|
||
Comment on attachment 8927985 [details] [diff] [review] printing-fix-v2 Looks like we landed a diagnostic patch for this crash in the ESR that went out with the 57 release (52.5.0) and now we think this will fix the issue. Let's uplift for 52.7.0esr.
Attachment #8927985 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Updated•7 years ago
|
Comment 92•7 years ago
|
||
Al, just want to make sure you see this in case it needs mentioning in security advisories.
Flags: needinfo?(abillings)
Comment 93•7 years ago
|
||
I hate it when we fix a bug multiple times in the same bug. All the flags say this had advisories back in 57. Then we fixed it (again, sort of) in 58 but it got no advisory there because it was already marked as fixed in 57 so none of the queries brought it up (because they check whether the previous version was affected). Luckily, this is an internal report. Ideally, we would have fixed it on mainline Firefox (58) and ESR at the *same* time...
Comment 94•7 years ago
|
||
Yes, that's why I just pinged you for this, because it didn't seem like the flags were going to show this to you for 59. We should definitely open new bugs for security sensitive issues rather than re-open and land more patches in the same bug. I will mention that in the channel meetings and relman's team meeting as something to watch out for.
Comment 95•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/f0ec180993d2
Updated•6 years ago
|
Whiteboard: [adv-esr52.7+][adv-main58+]
Updated•6 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•