Closed
Bug 736695
Opened 12 years ago
Closed 12 years ago
crash in nsGenericElement::UnbindFromTree , when I open Customize Toolbar with Video DownloadHelper 4.9.8 installed
Categories
(Core :: DOM: Core & HTML, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla16
Tracking | Status | |
---|---|---|
firefox14 | + | unaffected |
firefox15 | + | unaffected |
People
(Reporter: alice0775, Assigned: bzbarsky)
References
Details
(4 keywords)
Crash Data
Attachments
(3 files, 4 obsolete files)
4.28 KB,
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
5.35 KB,
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
4.93 KB,
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-e8bdcfd6-c965-429f-b50a-605402120317 . ============================================================= Reproducible: Always Steps to Reproduce: 1. Start Nightly with new profile 2. Install Video DownloadHelper 4.9.8 and Restart browser ( https://addons.mozilla.org/ja/firefox/addon/video-downloadhelper/ ) 3. Open Customize Toolbar (Alt V T C) If browser does not crash, carry out the following STR. 4. Drag & drop an Icon of Video DownloadHelper to Toolbar from palette. 5. Click Done 6. Open Customize Toolbar (Alt V T C) Actual Results: Browser crashes Expected Results: Shoud not.
Reporter | ||
Comment 1•12 years ago
|
||
Err I forgot build ID. http://hg.mozilla.org/mozilla-central/rev/e5f6caa40409 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20120316 Firefox/14.0a1 ID:20120316031151
Reporter | ||
Comment 2•12 years ago
|
||
Regression window: No crash: http://hg.mozilla.org/mozilla-central/rev/a888f210af4e Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20120314 Firefox/14.0a1 ID:20120314130626 Crash http://hg.mozilla.org/mozilla-central/rev/5280e98d2d77 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20120314 Firefox/14.0a1 ID:20120314131528 Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a888f210af4e&tochange=5280e98d2d77 Triggered by: 5280e98d2d77 Kyle Huey — Bug 730587: Stash a pointer to the subtree root on DOM nodes. r=smaug, sr=j
Blocks: 730587
Keywords: regression,
reproducible
Updated•12 years ago
|
Crash Signature: [@ nsGenericElement::UnbindFromTree(bool, bool)] → [@ nsGenericElement::UnbindFromTree(bool, bool)]
[@ nsGenericElement::UnbindFromTree ]
Component: General → DOM
Keywords: topcrash
OS: Windows 7 → All
Product: Firefox → Core
QA Contact: general → general
Hardware: x86 → All
Version: Trunk → 14 Branch
Assignee: nobody → khuey
Updated•12 years ago
|
tracking-firefox14:
--- → ?
Comment 3•12 years ago
|
||
Uninstalling this extension fixes the issue. This should probably be resolved INVALID.
Comment 4•12 years ago
|
||
(In reply to Bill Gianopoulos [:WG9s] from comment #3) > Uninstalling this extension fixes the issue. This should probably be > resolved INVALID. I disagree because it's a well identified regression in the DOM component that VDH has only highlighted. If you don't want to fix it, it's at least an Extension Compatibility bug.
This fixes the crash. What else it breaks ... who knows.
Updated•12 years ago
|
Attachment #609501 -
Attachment is obsolete: true
bz is going to take this.
Assignee: khuey → bzbarsky
Assignee | ||
Comment 7•12 years ago
|
||
OK. The thing Kyle and I talked about totally doesn't work, because the destructor rearranges the DOM that UnbindFromTree is iterating. So a scriptrunner but more carefully it is. Here's hoping.
Assignee | ||
Comment 8•12 years ago
|
||
So I tried putting all of the XBL stuff on a scriptrunner, but that fails leads to crashes in tests because we end up trying to reresolve style on elements which have no current doc (and indeed no parent). Furthermore, the content has a null GetPrimaryFrame() (because it's not in the document, so it's using mSubtreeRoot), but it's the mContent for a frame that's still in the DOM tree. The fundamental problem here is that destructors can touch the anonymous content (and in the textbox case apparently do, which is why Kyle's patch failed), but that anonymous content is also supposed to get unbound by binding teardown... and may have destructors of its own. I'd guess that somewhere in here we end up with an event posted that unbinds an element even though some other destructor code somewhere stuck it back in the tree. Or something. I'm not quite sure where to go with this, without breaking compat. :(
Assignee | ||
Comment 9•12 years ago
|
||
Oh, I see the real problem. We're inserting a node that's already in the doc, so we remove it in the old location, then insert in the new one, then end up running the now-async destructor code and then it's all broken. We need to do the remove and insert in separate update batches or something.
Since I was cc'ed: I really think that we need to fix XBL's insanity of running too much code inside traditionally unsafe code paths like UnbindFromTree and nsIMutationObserver callbacks. Moving to script runners really feel like the correct solution. So even if it brings some risk as far as backwards incompatibility goes, I think it's a bullet we'll need to bite sooner or later. Beyond that there's not enough context here for me to understand what's going on, feel free to ping me on irc if you want more specific input.
Assignee | ||
Comment 11•12 years ago
|
||
A try run is at https://tbpl.mozilla.org/?tree=Try&rev=c0b100307ac8
Attachment #613307 -
Flags: review?(jonas)
Assignee | ||
Updated•12 years ago
|
Priority: -- → P2
Whiteboard: [need review]
Comment on attachment 613307 [details] [diff] [review] Proposed scary fix Review of attachment 613307 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/base/src/nsGenericElement.cpp @@ +4208,5 @@ > + if (aRefChild) { > + return aParent->IndexOf(aRefChild); > + } > + > + return aParent->GetChildCount(); I know optimizers used to deal better with ternary operator than multiple return statements. Not sure if that's still the case, but I find them basically as readable so I tend to prefer the ternary version. @@ +4296,5 @@ > + // Reget the insPos > + insPos = GetInsPos(this, aRefChild); > + if (insPos < 0) { > + return NS_ERROR_DOM_NOT_FOUND_ERR; > + } You also need to check IsAllowedAsChild. But I think this is wrong. It's not removing a to-be-replaced child that we want to do in a separate script-blocker. The node here can't ever be inserted into another parent during the scriptblocker. It's where we remove aNewChild from it's old parent that we should be inside a separate scriptblocker.
Attachment #613307 -
Flags: review?(jonas) → review-
Assignee | ||
Comment 13•12 years ago
|
||
> It's not removing a to-be-replaced child that we want to do in a separate script-blocker.
Hmm. Because it's not going to be reinserted? But that's not the important part here. We need to remove the aRefChild _before_ we remove aNewChild, right? And if the latter can't be under the same scripblocker as the insert (because it _is_ being reinserted), then neither can the former one be, since scriptblockers can't be "canceled" temporarily. Hence the need for all of these to be under separate scriptblockers.
So what I think we need to do is to: * Rearrange mutations such that we first remove newChild from its old parent, then (if needed) remove refChild from its current parent, then insert newChild in its new parent. * Create one mutationObserverBatch and one scriptblocker which spans removing newChild from its old parent * Create one mutationObserverBatch+scriptblocker which spans the other mutations * Insert code in between the batches/blockers which re-checks invariants.
Comment 16•12 years ago
|
||
What does the spec say?
Comment 17•12 years ago
|
||
This is spiking again on trunk, can we find a fix before it's uplifted to Aurora and affects even more users?
Assignee | ||
Comment 18•12 years ago
|
||
I really doubt that, esp. with the approval-only business. I'll try to get a patch done this week, but no guarantees.... if someone else has time to deal, please do so.
Comment 21•12 years ago
|
||
Could we please have some progress here? This signature is still a large topcrash on Nightly and now on Aurora as well.
Assignee | ||
Comment 22•12 years ago
|
||
> Could we please have some progress here? If you want things to not regress in the process, then probably not in the next week unless you find someone else to do it. For purposes of Aurora, does backing out bug 730587 make sense?
Assignee | ||
Comment 23•12 years ago
|
||
And to be clear, the changes being discussed earlier in this bug are _really_ scary. Very high web compat risk, very high security regression risk, very high crash regression (worse than this one) risk.
Comment 24•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #22) > For purposes of Aurora, does backing out bug 730587 make sense? Makes sense to me.
I backed bug 730587 out of Aurora.
status-firefox14:
--- → unaffected
status-firefox15:
--- → affected
tracking-firefox15:
--- → ?
Version: 14 Branch → 15 Branch
Updated•12 years ago
|
Comment 27•12 years ago
|
||
Customize toolbar is a instant browser crash here on 15.0a1 (2012-05-19). I'd guess you all know this it is being reported by the auto bug reporting.
Assignee | ||
Comment 28•12 years ago
|
||
> * Rearrange mutations such that we first remove newChild from its old parent Fwiw, that's what the spec at http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#concept-node-replace says right now too: To replace a child with node within a parent, run these steps: ... 8. Adopt node with parent's node document. 9. Remove child from its parent with the suppress observers flag set. The adopt removes "node" from the DOM. I'll try to do that.
Assignee | ||
Comment 29•12 years ago
|
||
Attachment #629448 -
Flags: review?(jonas)
Assignee | ||
Updated•12 years ago
|
Attachment #613307 -
Attachment is obsolete: true
Assignee | ||
Comment 30•12 years ago
|
||
Attachment #629449 -
Flags: review?(jonas)
Assignee | ||
Comment 31•12 years ago
|
||
Attachment #629450 -
Flags: review?(jonas)
Assignee | ||
Updated•12 years ago
|
Attachment #629449 -
Attachment is obsolete: true
Attachment #629449 -
Flags: review?(jonas)
Assignee | ||
Comment 32•12 years ago
|
||
Attachment #629452 -
Flags: review?(jonas)
Comment on attachment 629448 [details] [diff] [review] part 1. Rearrange node insertion/replacement such that we remove the newChild from its parent before removing the refChild, if we need to remove refChild. Review of attachment 629448 [details] [diff] [review]: ----------------------------------------------------------------- Waiting with reviewing the rest until I'm sure I understand this patch correctly. ::: content/base/src/nsGenericElement.cpp @@ +4205,5 @@ > + > + // Record the node to insert before, if any > + nsINode* nodeToInsertBefore; > + if (aReplace) { > + nodeToInsertBefore = aRefChild ? aRefChild->GetNextSibling() : nsnull; If aReplace is then aRefChild will be true too, otherwise we would have bailed up at the top of the function. @@ +4216,5 @@ > + nodeToInsertBefore = nodeToInsertBefore->GetNextSibling(); > + } > + > + mozAutoDocUpdate batch(GetCurrentDoc(), UPDATE_CONTENT_MODEL, true); > + nsAutoMutationBatch mb; Hmm.. I think this mutation-batch should start after removing newchild from its old parent. I.e. we want two "batches", one for removing newchild from its old parent, and one for replacing refchild with newchild (or just inserting newchild) @@ +4245,5 @@ > + PRInt32 insPos; > + if (nodeToInsertBefore) { > + insPos = IndexOf(nodeToInsertBefore); > + if (insPos < 0) { > + // XXXbz How the heck would _that_ happen, exactly? This can happen as a result of XBL, right? @@ +4254,5 @@ > + insPos = GetChildCount(); > + } > + > + // If we're replacing and we haven't removed aRefChild yet, do so now > + if (aReplace && aRefChild != aNewChild) { Why the |aRefChild != aNewChild| check? Shouldn't we treat that as removing the child and then re-inserting it? It seems like that was what the old code did?
Attachment #629448 -
Flags: review?(jonas) → review-
Assignee | ||
Comment 34•12 years ago
|
||
> If aReplace is then aRefChild will be true too Hmm. Yeah, ok. Will fix. > I.e. we want two "batches" That happens in part 2. Part 1 is about changing the order without changing anything about the mutation batching behavior. > This can happen as a result of XBL, right? Hmm. I guess so, if aRefChild is anonymous content. OK, I'll fix the comment accordingly. > Why the |aRefChild != aNewChild| check? Because if aRefChild == aNewChild then we already removed it when we removed aNewChild above. So there is nothing else to remove here (and in fact trying will trigger the asserts in that block, since aRefChild->GetNextSibling() is now null).
Assignee | ||
Updated•12 years ago
|
tracking-firefox16:
--- → ?
Assignee | ||
Comment 36•12 years ago
|
||
Comment on attachment 629448 [details] [diff] [review] part 1. Rearrange node insertion/replacement such that we remove the newChild from its parent before removing the refChild, if we need to remove refChild. Jonas is AWOL....
Attachment #629448 -
Flags: review- → review?(bugs)
Assignee | ||
Updated•12 years ago
|
Attachment #629450 -
Flags: review?(jonas) → review?(bugs)
Assignee | ||
Updated•12 years ago
|
Attachment #629452 -
Flags: review?(jonas) → review?(bugs)
Comment 37•12 years ago
|
||
Comment on attachment 629452 [details] [diff] [review] part 3. Tear down XBL bindings off a scriptrunner when unbinding nodes. >+class RemoveFromBindingManagerRunnable : public nsRunnable { { should be in the next line >+public: >+ RemoveFromBindingManagerRunnable(nsBindingManager* manager, >+ Element* element, >+ nsIDocument* doc, >+ nsIContent* bindingParent): aParameterName, please >+ mManager(manager), mElement(element), mDoc(doc), >+ mBindingParent(bindingParent) >+ {} >+ >+ NS_IMETHOD Run() { { to next line
Attachment #629452 -
Flags: review?(bugs) → review+
Comment 38•12 years ago
|
||
I'll review the difficult parts tomorrow :)
Comment 39•12 years ago
|
||
Comment on attachment 629448 [details] [diff] [review] part 1. Rearrange node insertion/replacement such that we remove the newChild from its parent before removing the refChild, if we need to remove refChild. Apparently sicking had reviewed this already, and I found the same issues before noticing that.
Attachment #629448 -
Flags: review?(bugs) → review-
Updated•12 years ago
|
Attachment #629450 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 40•12 years ago
|
||
What issues? Did I not address them in comment 34? Do you just want to see a patch with the changes from comment 34 applied?
Comment 41•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #40) > Do you just want to see a patch with the changes from comment 34 applied? Yes please :)
Assignee | ||
Updated•12 years ago
|
Attachment #629448 -
Attachment is obsolete: true
Updated•12 years ago
|
Attachment #631925 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 43•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/7e5996b0a171 https://hg.mozilla.org/integration/mozilla-inbound/rev/e48ff34679e3 https://hg.mozilla.org/integration/mozilla-inbound/rev/ca716959ca2b We still need to back out bug 730587 on Aurora 15.
Flags: in-testsuite?
Whiteboard: [need review]
Target Milestone: --- → mozilla16
Comment 44•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/7e5996b0a171 https://hg.mozilla.org/mozilla-central/rev/e48ff34679e3 https://hg.mozilla.org/mozilla-central/rev/ca716959ca2b
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
tracking-firefox16:
? → ---
Comment 47•12 years ago
|
||
Since this is RESOLVED FIXED, when will the "FIX" hit the Aurora channel?
Comment 48•12 years ago
|
||
Next merge on 2012-07-16. See <https://wiki.mozilla.org/RapidRelease/Calendar>.
Comment 49•12 years ago
|
||
Sorry asked the wrong question. When will the "FIX" hit FF15 which is impacted by this bug?
Assignee | ||
Comment 50•12 years ago
|
||
When bug 730587 is backed out from Aurora. Kyle was aiming to do that yesterday, but presumably ran out of time...
Comment 52•12 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #51) > https://hg.mozilla.org/releases/mozilla-aurora/rev/44103bdf65b1 Just updated FF15a2 as of 8:20 AM CDST [6:20 AM PDST] and it still crashes when trying to open Customize Toolbar with Video DownloadHelper 4.9.8 installed.
Comment 53•12 years ago
|
||
The first Aurora build that contains the fix is 15.0a2/20120621042006.
Comment 54•12 years ago
|
||
Thanks for the info. I assume that it is in the update channel.
Comment 55•12 years ago
|
||
This FF15a2 build on OSX 20120621042006 http://hg.mozilla.org/releases/mozilla-aurora/rev/d557426b0c8d is still crashing when trying to open Customize Toolbar with Video DownloadHelper 4.9.8 installed.
haha, thanks for catching that. I only backed out the followup to the bug :-P
https://hg.mozilla.org/releases/mozilla-aurora/rev/0f805ac53831 It should be fixed in tomorrow's Aurora.
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•