Closed Bug 373911 Opened 15 years ago Closed 14 years ago
xbl destructor bound to body causes trouble [@ ns
656 bytes, application/xhtml+xml
293 bytes, text/html
1.04 KB, text/xml
528 bytes, text/html
5.43 KB, patch
|Details | Diff | Splinter Review|
5.02 KB, patch
|Details | Diff | Splinter Review|
Assignee: nobody → general
Severity: normal → critical
Component: General → XBL
Product: Firefox → Core
QA Contact: general → ian
Summary: xbl destructor bound to body causes trouble → xbl destructor bound to body causes trouble [@ nsXBLBinding::AllowScripts]
I think this testcase shows the same crash. It happens when the window gets destroyed in the destructor.
martijn, your testcase seems very similar to these: Bug 332971 and Bug 332807 will attach my testcases in this bug.
Assignee: general → Olli.Pettay
Severity: critical → normal
Component: XBL → DOM: Events
This fixes the crash. Bound elements must stay alive while iterating through xbl destructors. nsBindingList AddRefs/Releases automatically. (Have to write a testcase first, but maybe similar thing is needed for constructors.)
(In reply to comment #6) > > (Have to write a testcase first, but maybe similar thing is needed for > constructors.) > it works from constructors. in "testcase inside iframe" just move the code from the destructor to the constructor.
mean constructors also crash.
martijn testcase crashes 220.127.116.11
The constructor crash doesn't seem to be XBL problem. I'm not too happy with the change I had to do to presshell.
do you expect problems xbl event handler to kill as much content as it can and then try to access dead objects?
I don't expect, but if you find any problems, please CC me to the bug. Event handling is quite often protected because of the fact that event target chain is kept alive until the end of dispatch.
Did you mean to ask for review on that last patch?
(In reply to comment #2) > I think this testcase shows the same crash. It happens when the window gets > destroyed in the destructor. > not quite sure it is the same bug, though definitely quite similar problem. your testcase works on 18.104.22.168 so you may consider filing a different bug (there are similar bugs xbl removing elements, your one may be a duplicate of some of them)
Comment on attachment 258647 [details] [diff] [review] fixes contructor too >Index: content/xbl/src/nsBindingManager.cpp >+ closure->mBindings.AppendElement(aBinding); Say this succeeds.... >+ closure->mBoundElements.AppendObject(aBinding->GetBoundElement()); And this fails with out-of-memory. Won't you crash the same way? More interestingly, should bindings just hold a strong ref to the bound element? >Index: layout/base/nsPresShell.cpp I'm not entirely convinced of this change. Why is it needed, given http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/base/nsPresShell.cpp&rev=3.982&mark=4569-4574#4564 ? >Index: layout/generic/nsGfxScrollFrame.cpp >+ nsCOMPtr<nsIPresShell> presShell = mOuter->GetPresContext()->GetPresShell(); >+ presShell->FlushPendingNotifications(Flush_OnlyReflow); Any other callers of this that should be holding strong refs?
Attachment #258647 - Flags: review?(bzbarsky) → review-
Flags: blocking1.9? → blocking1.9+
More to the point, if the presshell can go away during that flush, why is it safe to access mAsyncScrollPortEvent at that point? That doesn't make sense to me... I'd be happy to just have Flush_OnlyReflow be able to destroy the presshell (in which case I would probably disallow passing it period, in favor of Flush_Layout), but then all consumers need to deal with it.
I think we could just keep viewmanager alive in flushpendingnotifications. I didn't make boundelement strong because we need to iterate through bindings anyway and if the patch is needed in 1.8, there is no cycle collector there. Changes to nsGfxScrollFrameInner::FireScrollPortEvent aren't needed anymore.
I'm not sure I follow this patch. Is the viewmanager destroyed as part of a Flush_OnlyReflow flush? Or as part of a Flush_StyleReresolves flush? The former is a problem. The latter is acceptable. Can you tell me exactly what the crashing sequence is here? That would make it easier to review things... Oh, and you can't remove those XXX comments, because the problem they're talking about is still there even with this patch applied.
hmm, I can't get the constructor (nor destructor) testcase to crash if I have only the fixes for content/xbl. Something has changed...
er, nm, ofc it doesn't crash anymore, but I'll still think about this some more.
(In reply to comment #19) > hmm, I can't get the constructor (nor destructor) testcase to crash if > I have only the fixes for content/xbl. Something has changed... > looks like someone broke inline xbl which broke martijn's xbl testcase (it doesn't load) xulifr3.html in this bug still crashes locally with this stack: #6 0x08732e6a in nsXBLBinding::AllowScripts (this=0x9d684a0) at /opt/joro/firefox-cvs/mozilla/content/xbl/src/nsXBLBinding.cpp:1237 #7 0x08733c47 in nsXBLBinding::ExecuteDetachedHandler (this=0x9d684a0) at /opt/joro/firefox-cvs/mozilla/content/xbl/src/nsXBLBinding.cpp:835 #8 0x0874ac57 in ExecuteDetachedHandler (aBinding=0x9d684a0, aClosure=0x0) at /opt/joro/firefox-cvs/mozilla/content/xbl/src/nsBindingManager.cpp:872 #9 0xb7d3c5dc in nsVoidArray::EnumerateForwards (this=0xbfe90a54, aFunc=0x874ac0e <ExecuteDetachedHandler>, aData=0x0) at nsVoidArray.cpp:678 #10 0x0874ac00 in nsBindingManager::ExecuteDetachedHandlers (this=0x9c50ed8) at /opt/joro/firefox-cvs/mozilla/content/xbl/src/nsBindingManager.cpp:885 #11 0x0879c291 in nsGlobalWindow::PostHandleEvent (this=0x9b5c4c0,
inline xbl works here fine and both destructor testcases crash.
(In reply to comment #22) > inline xbl works here fine and both destructor testcases crash. > hmm, when reloading martijn's attachment "testcase" from bugzilla the stack is scary: xulifr3.html still works. #5 <signal handler called> #6 0x0823969c in nsNodeInfoManager::GetDocument (this=0xd8d8d8d8) at ../../dist/include/content/nsNodeInfoManager.h:113 #7 0x082396b6 in nsINodeInfo::GetDocument (this=0xa0f6e18) at ../../dist/include/content/nsINodeInfo.h:291 #8 0x08239818 in nsINode::GetOwnerDoc (this=0x9b4c198) at ../../dist/include/content/nsINode.h:213 #9 0x08732e4b in nsXBLBinding::AllowScripts (this=0x9b50260) at /opt/joro/firefox-cvs/mozilla/content/xbl/src/nsXBLBinding.cpp:1232 (gdb) frame 6 #6 0x0823969c in nsNodeInfoManager::GetDocument (this=0xd8d8d8d8) at ../../dist/include/content/nsNodeInfoManager.h:113 113 return mDocument; (gdb) x/i $pc 0x823969c <_ZNK17nsNodeInfoManager11GetDocumentEv+6>: mov 0xc(%eax),%eax (gdb) p/x $eax $1 = 0xd8d8d8d8
(In reply to comment #18) > I'm not sure I follow this patch. Is the viewmanager destroyed as part of a > Flush_OnlyReflow flush? Or as part of a Flush_StyleReresolves flush? The > former is a problem. The latter is acceptable. ProcessReflowCommands() may cause PresShell's Destroy to be called. That is because DidDoReflow() calls HandlePostedReflowCallbacks().
Comment on attachment 267140 [details] [diff] [review] Keep viewmanager alive So processing reflow commands may destroy presshell. Need to look at other cases when Flush_OnlyReflow is used. What problem is there still with viewmanager? At least it gets EndUpdateViewBatch if BeginUpdateViewBatch is called.
Which reflow callback can destroy presshells? I'm happy to get rid of FlushOnlyReflow; that's come up in other bugs too. But then we absolutely need to fix the callers -- every single caller is doing it because it can't handle frametree changes. As for viewmanager, batches are stored on the root of the viewmanager tree. So ending the batch after it's been removed from the tree has no effect -- the rest of the tree will never render right again. And the batch counter on that viewmanager will underflow, of course. So basically this patch is wallpaper. I'd rather we fixed the real problem(s).
Some attributes are set in reflow callbacks, for example in xul menu code and scrollbars IIRC.
OK. Sounds like we need to make the assumption that flushing layout can always run arbitrary script, and change all callers accordingly. We'll want this change too, I guess, but it's not nearly enough.
I really really don't want layout to run scripts during reflow. Not only does that make layout code a lot harder, it would also mean that in a pile of places where we can perform reflow, such as when getting layout DOM properties like offset* and during nsIMutationObserver notifications. Aren't those attribute-sets using PR_FALSE for aNotify and should therefore not run scripts? If that's not the case, could we set those attributes off an event instead?
Scripts don't (or shouldn't) run during reflow, but it is possible that scripts run right *after* reflow, when reflow callbacks are called. See comment #24.
Sicking's point is that for everyone but the actual frame tree code and presshell it's the same thing.
Comment on attachment 267140 [details] [diff] [review] Keep viewmanager alive r+ as wallpaper if you restore the comment you shouldn't be removing. But we really need to fix the actual bug. Before we ship 1.9. Otherwise we're in for a world of hurt...
Attachment #267140 - Flags: review?(bzbarsky) → review+
Comment on attachment 267140 [details] [diff] [review] Keep viewmanager alive I'll post a follow-up if this gets sr.
Attachment #267140 - Flags: superreview?(dbaron)
Comment on attachment 267140 [details] [diff] [review] Keep viewmanager alive sr=dbaron
Attachment #267140 - Flags: superreview?(dbaron) → superreview+
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [sg:critical? → [sg:critical?]
Might want this on branch to fix bug 400735, if this patch does so.
The presshell part isn't perfect, but prevents possible crashes. It isn't needed to fix Bug 400735. Comparing to trunk version, the xbl part has manual AddRef/Release
Comment on attachment 285866 [details] [diff] [review] possible patch for 18 Looks good.
Comment on attachment 285866 [details] [diff] [review] possible patch for 18 approved for 22.214.171.124 and 126.96.36.199, a=dveditz for release-drivers For 188.8.131.52 please land on the GECKO181_20071004_RELBRANCH. The regular 1.8 branch will become 184.108.40.206
Checked in to MOZILLA_1_8_BRANCH and GECKO181_20071004_RELBRANCH
verified fixed 220.127.116.11 using Mozilla/5.0 (X11; U; Linux i686; en-US; rv:18.104.22.168pre) Gecko/2007102403 BonEcho/22.214.171.124pre and the testcase from martijn from this bug. No Crash on testcase -> adding verified keyword
Why would this fix bug 400735? The patch here makes us more stable while executing destructors, but bug 400735 is about us crashing while running constructors.
Ah, true. I made the 1.8 patch because of the testcase in bug 400735 and because of https://bugzilla.mozilla.org/show_bug.cgi?id=400735#c11. But it seems that the testcase is possibly wrong.
verified fixed 126.96.36.199 using Mozilla/5.0 (X11; U; Linux i686; en-US; rv:188.8.131.52) Gecko/2007111504 Firefox/184.108.40.206 and the testcases from this bug -> no crash -> adding verified keyword
(In reply to comment #32) > r+ as wallpaper if you restore the comment you shouldn't be removing. But we > really need to fix the actual bug. Before we ship 1.9. Otherwise we're in for > a world of hurt.. (In reply to comment #33)) > I'll post a follow-up if this gets sr. Did a follow-up bug on this ever get filed?
Comment on attachment 285866 [details] [diff] [review] possible patch for 18 a=asac for 220.127.116.11 (same patch as shipped by distros)
Attachment #285866 - Flags: approval18.104.22.168+
please land on 1.8.0 branch
MOZILLA_1_8_0_BRANCH: Checking in content/xbl/src/nsBindingManager.cpp; /cvsroot/mozilla/content/xbl/src/nsBindingManager.cpp,v <-- nsBindingManager.cpp new revision: 22.214.171.124.4.5; previous revision: 126.96.36.199.4.4 done Checking in layout/base/nsPresShell.cpp; /cvsroot/mozilla/layout/base/nsPresShell.cpp,v <-- nsPresShell.cpp new revision: 3.8188.8.131.52.11; previous revision: 3.8184.108.40.206.10 done
You need to log in before you can comment on or make changes to this bug.