GC destroys live frame / assertion "Unexpected current doc in root content" / crash [@ nsContentIterator::NextNode]

VERIFIED FIXED

Status

()

defect
P1
critical
VERIFIED FIXED
13 years ago
a month ago

People

(Reporter: jruderman, Assigned: sicking)

Tracking

(Blocks 1 bug, 4 keywords)

Trunk
PowerPC
macOS
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +
blocking1.8.1 -
wanted1.8.1.x +
blocking1.8.0.5 -
blocking1.8.0.9 -
wanted1.8.0.x +
in-testsuite ?

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [sg:investigate] null deref? depends on removing nsDocument::Destroy, crash signature)

Attachments

(2 attachments)

(Reporter)

Description

13 years ago
The testcase demonstrates three symptoms, which I'm guessing are all the same bug:

1) GC removes a live frame.  (GC isn't supposed to have any effect, much less a directly visible effect.)

2) It triggers the assertion "Unexpected current doc in root content" in nsDocument::~nsDocument.

3) Tabbing in it crashes [@ nsContentIterator::NextNode] with a null |parent| variable.  (I think this crash can be triggered by JavaScript, but I'm not sure exactly how.)

Security-sensitive because:

1) Bug 321299 and bug 323978 are security-sensitive.

2) I don't know whether this bug is exploitable, but GCing live things scares me.

3) I know there's an exploitable crash lurking somewhere around here, even after the fix for bug 321299, based on stack traces I've seen.  It might be related to this bug.
(Reporter)

Comment 1

13 years ago
Jesse, see also bug 326645.

So the first issue here is that once you call:

  frameOne.parentNode.removeChild(frameOne);

that tears down the document in frame1, and in particular sorta unhooks frameOneContent from the document.  But not correctly.  The reasons are basically bug 326645.  As a result, when the document is GCed it asserts but then proceeds to unhook this node (which is no longer in it!) from the node's now-current document.  Which gets us into a situation where the node thinks it's not in a document, but the document thinks the node is in it, so bad things happen when you tab.

I think we should add a dep on bug 326645; I haven't done it only because that's not security-sensitive at the moment.
Flags: blocking1.8.0.5?
(Reporter)

Comment 3

13 years ago
This has been mentioned in 326645, so adding dependency.
Depends on: 326645
Flags: blocking1.8.0.5? → blocking1.8.0.5+
(Reporter)

Comment 4

13 years ago
The main problem of the frame "2" disappearing is gone, I think because of the fix for bug 326645.  But now I see this on load:

###!!! ASSERTION: Shallow unbind won't clear document and binding parent on kids!: 'aDeep || (!GetCurrentDoc() && !GetBindingParent())', file /Users/admin/trunk/mozilla/content/base/src/nsGenericElement.cpp, line 1821

And this when tabbing:

###!!! ASSERTION: nsRange::IsIncreasing: 'Not Reached', file /Users/admin/trunk/mozilla/content/base/src/nsRange.cpp, line 754

Tabbing again results in a crash similar to the original crash:

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000000

Thread 0 Crashed:
0  nsContentIterator::NextNode(nsIContent*, nsVoidArray*) + 432 (nsContentIterator.cpp:828)
1   libgklayout.dylib         	0x06cdab6c nsContentIterator::Next() + 232 (nsContentIterator.cpp:1005)
2   libgklayout.dylib         	0x06b18d68 nsTypedSelection::selectFrames(nsPresContext*, nsIContentIterator*, nsIContent*, nsIDOMRange*, nsIPresShell*, int) + 784 (nsSelection.cpp:4350)
3   libgklayout.dylib         	0x06b24134 nsTypedSelection::selectFrames(nsPresContext*, nsIDOMRange*, int) + 1032 (nsSelection.cpp:4464)
...
OS: Mac OS X 10.2 → Mac OS X 10.4
Yeah, we're not out of the woods yet.  nsDocument::Destroy unbinds the nodes, but doesn't drop refs to them.  Then you insert them into another document.  Then the destructor cleans up, and does shallow unbinds, which is what asserts...

See bug 326645 for some discussion about this stuff.

The basic problem, I guess, is that we can't remove frameOneContent from its document's child list because it doesn't think it's in that list.  :(  Yay nsDocument::Destroy.
Flags: blocking1.9a2?
The fix for bug 326645 has landed, "qawanted" to retest this and see if that did in fact help any.
Keywords: qawanted
(Reporter)

Comment 7

13 years ago
dveditz, see comment 4 and comment 5.
Keywords: qawanted
Depends on a bug with too many regressions atm, pushing this bug to 1.8.0.6 since no specific fix is at hand
Flags: blocking1.8.0.6?
Flags: blocking1.8.0.5-
Flags: blocking1.8.0.5+
Whiteboard: [sg:critical?]
Blocks: 342942
(Reporter)

Comment 9

13 years ago
 Bryner, can you try to fix this bug?  It involves nsDocument::Destroy, and it might be a security hole.
(Reporter)

Comment 10

13 years ago
*** Bug 342942 has been marked as a duplicate of this bug. ***
Assignee: general → bzbarsky
Flags: blocking1.8.1?
Whiteboard: [sg:critical?] → [sg:critical]
Er.. so what does assigning this to me mean?  I didn't exactly have plans to work on this, and I'm not sure we have a sane way to fix it on the branches...  It _might_ get fixed with graydon's cycle collector on trunk, I hope.
Minusing for blocking1.8.1, but we'd be happy to take a patch.
Flags: blocking1.8.1? → blocking1.8.1-
I'm not sure assigning this to me is useful; it just creates the false impression that I'm working on it...
Assignee: bzbarsky → general
Looks like we're not getting a patch in 1.8.0.7
Flags: blocking1.8.0.7? → blocking1.8.0.8?
Restoring lost blocking flag
Flags: blocking1.8.0.9?
No progress, taking off the 1.8.0.x blocking list. Please put back on the radar when this bug becomes active again.
Flags: blocking1.8.0.9? → blocking1.8.0.9-
Flags: blocking1.9?
Depends on: cycle-collector
Fwiw, the cycle collector builds do not crash on this testcase. I am not sure I understand the case well enough to know if that's a feature or a coincidence.
Flags: blocking1.9a2?
(Reporter)

Comment 18

13 years ago
The existing testcase doesn't work as-is in a build with the patch for bug 47903, which makes Gecko enforce WRONG_DOCUMENT_ERR.  I tried converting the testcase to use adoptNode, but you can't adopt a document node (adoptNode gives you NS_ERROR_DOM_NOT_SUPPORTED_ERR).

If I adopt the <html> element instead, the crash goes away, but the "frame content disappears upon GC" problem returns.  I'll attach a new testcase that demonstrates this.

I don't know whether any crashes remain.
Depends on: 47903
(Reporter)

Comment 19

13 years ago
To get the same behavior as before you want to call document.adoptNode(x) with x being the node you used to move, i.e. the first frames documentElement
(Reporter)

Comment 21

13 years ago
Isn't that what I did?
I think so, I just wanted to point out that calling adoptNode on the document is not what you want to do so the fact that it failed was ok.
(Reporter)

Comment 23

13 years ago
See also bug 361226, "Crash due to too much recursion in *::BindToTree (cycle in DOM tree?)".  I think that bug involves iframes somehow.
(Reporter)

Comment 24

12 years ago
On Mac, you now have to resize the window after the testcase does its thing in order for the "2" to really go away.  If you try to select text, several assertions appear.
Critical security bugs must have owners. If you can't work on this bug help us find another active owner for it.
Assignee: general → jst
Flags: blocking1.9? → blocking1.9+

Updated

12 years ago
Keywords: arch
Whiteboard: [sg:critical] → [sg:critical] depends on cycle collector landing

Comment 26

12 years ago
can we tell yet if the cycle collector landing was able to fix this?
(Reporter)

Comment 27

12 years ago
WFM currently (the "2" doesn't disappear).  But I'm not sure that will last, since recent builds are leaking DOMWindows all over the place.
(Reporter)

Comment 28

12 years ago
Never mind.  It does disappear if I force a repaint by resizing the window or even pressing Tab a few times.
We likely have to make some refs strong and hook up more objects to the cyclecollector for it to actually help. Not sure which ones though since i don't know what actually happens here.
Sicking, see deps.  This is basically nsDocument::Destroy suckage.
(Reporter)

Comment 31

12 years ago
I'm guessing the remaining problems are not sg:critical.
Whiteboard: [sg:critical] depends on cycle collector landing → depends on removing nsDocument::Destroy
The original problem still happens on the 1.8 branch though. sg:critical until it's solved there.
Flags: blocking1.8.1.4?
Flags: blocking1.8.0.12?
Whiteboard: depends on removing nsDocument::Destroy → [sg:critical] depends on removing nsDocument::Destroy
Flags: blocking1.8.1.4?
Flags: blocking1.8.1.4+
Flags: blocking1.8.0.12?
Flags: blocking1.8.0.12+
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Johnny: any hope of a branch band-aide that doesn't require the cycle-collector?
Even _with_ the cycle collector this is not fixed on the trunk, really not realistic to expect a fix in 1.8.1.4
Assignee: jst → jonas
Flags: blocking1.8.1.5?
Flags: blocking1.8.1.4+
Flags: blocking1.8.0.13?
Flags: blocking1.8.0.12+
Whiteboard: [sg:critical] depends on removing nsDocument::Destroy → [sg:critical?] depends on removing nsDocument::Destroy
The worst I'm getting out of the testcase on 1.8.1.4pre is a consistent null dereference. I don't see any deleted frame usage.
Whiteboard: [sg:critical?] depends on removing nsDocument::Destroy → [sg:investigate] null deref? depends on removing nsDocument::Destroy
Target Milestone: --- → mozilla1.9alpha6
clearing branch blocking nominations until we see what a trunk fix looks like or evidence of a branch crash that's not a null deref.
Flags: blocking1.8.1.5?
Flags: blocking1.8.0.13?
Target Milestone: mozilla1.9alpha6 → mozilla1.9beta1
(Reporter)

Comment 37

12 years ago
Now, when I load testcase 2 and tab around, I get:

###!!! ASSERTION: Unexpected current doc in root content: 'mRootContent->GetCurrentDoc() == this', file /Users/jruderman/trunk/mozilla/content/base/src/nsDocument.cpp, line 819

###!!! ASSERTION: disconnected nodes: 'parents1.ElementAt(pos1) == parents2.ElementAt(pos2)', file /Users/jruderman/trunk/mozilla/content/base/src/nsContentUtils.cpp, line 1413

and I often get:

###!!! ASSERTION: JS object but unknown to the JS GC?: 'refcount > 0', file /Users/jruderman/trunk/mozilla/js/src/xpconnect/src/nsXPConnect.cpp, line 678

###!!! ASSERTION: Fault in cycle collector: zero refcount (ptr: 2460b60)
Target Milestone: mozilla1.9 M8 → ---
Priority: -- → P1
Should be fixed by patch in bug 348156
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
verified fixed using  Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9pre) Gecko/2008050621 Firefox/3.0pre and the 2 testcases. No Crash on testcases and i also don't see the assertions

--> Verified fixed
Status: RESOLVED → VERIFIED
Group: core-security
Crash Signature: [@ nsContentIterator::NextNode]
Component: DOM → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.