New crash on trunk and Firefox 3.6 due to frame poisoning; spin-off from bug 526587.

Ranked #12 in the report from 2009-11-22:
https://bug526587.bugzilla.mozilla.org/attachment.cgi?id=414317&t=ytKa49fedH

12. 92 0xfffffffff0dea817 Windows NT nsIFrame::GetStyleDisplay()

Reports at http://crash-stats.mozilla.com/report/index/244668c7-8d87-40de-b8c1-97bde2091123

Frame  Module   Signature                           Source
0      xul.dll  nsIFrame::GetStyleDisplay           layout/style/nsStyleStructList.h:95
1      xul.dll  xul.dll@0x41cb0b
2      xul.dll  PresShell::HandleEventInternal      layout/base/nsPresShell.cpp:6471
3      xul.dll  PresShell::HandlePositionedEvent    layout/base/nsPresShell.cpp:6296
4      xul.dll  PresShell::HandleEvent              layout/base/nsPresShell.cpp:6160
5      xul.dll  nsViewManager::HandleEvent          view/src/nsViewManager.cpp:1222
6      xul.dll  nsViewManager::DispatchEvent        view/src/nsViewManager.cpp:1201
7      xul.dll  HandleEvent                         view/src/nsView.cpp:167
8      xul.dll  nsWindow::DispatchEvent             widget/src/windows/nsWindow.cpp:2885
9      xul.dll  nsWindow::DispatchWindowEvent       widget/src/windows/nsWindow.cpp:2913
10     xul.dll  nsWindow::DispatchMouseEvent        widget/src/windows/nsWindow.cpp:3288
11     xul.dll  ChildWindow::DispatchMouseEvent     widget/src/windows/nsWindow.cpp:6959

Sort on address for more reports:
http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=nsIFrame::GetStyleDisplay%28%29
There is not much to go on from user comments to produce STR, but if code inspection can turn up a quick fix, it would be good to take that fix in 3.6b/rc/final.
So I'm not quite sure yet, but perhaps this has something to do with PresShell::NotifyDestroyingFrame and PresShell::ClearFrameRefs. The first one is always called, but the latter only when the frame has the NS_FRAME_EXTERNAL_REFERENCE or NS_FRAME_SELECTED_CONTENT bit set. Bug 67752 (ireflow) added mFramesToDirty.RemoveEntry(aFrame) to ClearFrameRefs, but I'm not yet sure whether it adds the right bits to frames. We could probably merge NotifyDestroyingFrame and ClearFrameRefs, and use the frame bits just to optimize weak frame destroy handling.
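To make the concern concrete, here is a minimal toy model of the two cleanup paths. It is not the real PresShell code: `ToyShell`, `DestroyFrame`, `LeavesStaleRef`, and the `kExternalReference`/`kSelectedContent` constants are hypothetical stand-ins, and `std::unordered_set` stands in for the hashtable behind mFramesToDirty. It only illustrates how an unflagged frame can stay referenced when ClearFrameRefs is gated on the state bits, and how a merged (unconditional) cleanup avoids that.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-ins for NS_FRAME_EXTERNAL_REFERENCE and
// NS_FRAME_SELECTED_CONTENT; the real bit values are not used here.
constexpr std::uint32_t kExternalReference = 1u << 0;
constexpr std::uint32_t kSelectedContent = 1u << 1;

struct Frame {
  std::uint32_t stateBits = 0;
};

struct ToyShell {
  Frame* currentEventFrame = nullptr;
  std::unordered_set<Frame*> framesToDirty;
  int destroyNotifications = 0;

  // Always called on destruction; in this toy it drops no references.
  void NotifyDestroyingFrame(Frame*) { ++destroyNotifications; }

  // Drops every reference the shell holds to the frame.
  void ClearFrameRefs(Frame* f) {
    if (currentEventFrame == f) currentEventFrame = nullptr;
    framesToDirty.erase(f);
  }
};

// Models frame destruction. With merged == false, ClearFrameRefs runs only
// for flagged frames; with merged == true it runs for every frame.
void DestroyFrame(ToyShell& shell, Frame* f, bool merged) {
  shell.NotifyDestroyingFrame(f);
  bool flagged = (f->stateBits & (kExternalReference | kSelectedContent)) != 0;
  if (merged || flagged) shell.ClearFrameRefs(f);
}

// Returns true if the shell still refers to the frame after destruction.
bool LeavesStaleRef(bool merged, std::uint32_t bits) {
  Frame f;
  f.stateBits = bits;
  ToyShell shell;
  shell.currentEventFrame = &f;
  shell.framesToDirty.insert(&f);
  DestroyFrame(shell, &f, merged);
  return shell.currentEventFrame == &f || shell.framesToDirty.count(&f) != 0;
}
```

In this model, `LeavesStaleRef(false, 0)` is the problematic combination: the frame carries neither bit, so the gated path never clears the shell's pointers.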
Oops that patch contains some unrelated htmlinput things
Created attachment 414339 [details] [diff] [review] wip
Created attachment 414343 [details] [diff] [review] wip Perhaps it is even safer to clear mFramesToDirty
My analysis of mCurrentEventFrame handling is probably a bit wrong. In practice ESM will have an nsWeakFrame pointing to mCurrentEventFrame, so ClearFrameRefs will be called. mFramesToDirty handling is something to fix, and maybe mCurrentEventFrameStack handling too. But still, I'm not at all sure the patch would fix this bug :/ Jst, could I get a minidump for this? Maybe I could get something out of xul.dll@0x41cb0b
Doh. Good catch on mFramesToDirty!
Comment on attachment 414343 [details] [diff] [review] wip Ok, as far as I can see, we need this. There are cases where mCurrentEventFrame is set but the frame lacks the flags that would cause ClearFrameRefs to be called. (Those flags are set, for example, when ESM takes the reference.) One such case is calling PresShell::HandleDOMEventWithTarget first and then PresShell::GetCurrentEventFrame(). One of the minidumps does show a case where PresShell::HandleDOMEventWithTarget is called, and then the event listener does something which spins the event loop, a new event is dispatched, and bad things happen. I'm still not 100% sure the patch will fix this crash. In addition, the frame must be removed from mFramesToDirty. http://mxr-test.konigsberg.mozilla.org/bonsai/cvsblame.cgi?file=layout/generic/nsFrame.cpp&xrev=70f1501c3088&tree=mozilla-central&mark=441-446#417 I could file a followup to merge ClearFrameRefs and NotifyDestroyingFrame on trunk.
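The nested-dispatch scenario can be sketched with a toy current-event stack. This is a simplified model and not the actual PresShell implementation: `ToyShell`, `Push`, `Pop`, `ForgetFrame`, and `OuterDanglesAfterNestedDispatch` are hypothetical names, and the push/pop rules are reduced to the minimum needed to show how a destroyed frame can come back as the current event frame when the nested dispatch unwinds.

```cpp
#include <cassert>
#include <vector>

struct Frame {
  int id;
};

struct ToyShell {
  Frame* current = nullptr;    // plays the role of mCurrentEventFrame
  std::vector<Frame*> stack;   // plays the role of mCurrentEventFrameStack

  // Entering a dispatch saves the previous current frame on the stack.
  void Push(Frame* f) {
    stack.push_back(current);
    current = f;
  }

  // Leaving a dispatch restores whatever was saved, stale or not.
  void Pop() {
    current = stack.back();
    stack.pop_back();
  }

  // Cleanup that must run when a frame goes away: null out the current
  // pointer and every saved stack slot that refers to the frame.
  void ForgetFrame(Frame* f) {
    if (current == f) current = nullptr;
    for (Frame*& entry : stack) {
      if (entry == f) entry = nullptr;
    }
  }
};

// The outer event targets `outer`; its listener triggers a nested dispatch
// on `inner`, during which `outer` is destroyed. Returns true if the shell
// points at `outer` again once the nested dispatch has unwound.
bool OuterDanglesAfterNestedDispatch(bool cleanupRuns) {
  Frame outer{1};
  Frame inner{2};
  ToyShell shell;
  shell.Push(&outer);
  shell.Push(&inner);  // nested dispatch; `outer` is now saved on the stack
  if (cleanupRuns) shell.ForgetFrame(&outer);  // `outer` destroyed here
  shell.Pop();
  return shell.current == &outer;
}
```

Without the cleanup, `Pop` restores the saved pointer even though the frame it names no longer exists, which is the kind of stale access frame poisoning turns into a crash.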
And note, mCurrentEventFrameStack.Length() is usually 0 when destroying frames.
Comment on attachment 414343 [details] [diff] [review] wip I assume mFramesToDirty will also be usually empty, since we only use it during the unwind for an interrupt?
Ah, but it's a hashtable so that lookup is O(1) anyway.
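As a rough illustration of why the unconditional removal is cheap, here is a sketch that uses `std::unordered_set` as a stand-in for the hashtable behind mFramesToDirty (an assumption; the real container is a Gecko hashtable with a different API). `RemoveFromDirtySet` is a hypothetical helper. Erasing a key that is absent, or erasing from an empty set, is an average constant-time operation that simply removes nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

struct Frame {};

// Erases `f` from the set and reports how many entries were removed (0 or 1).
// Safe to call whether or not `f` is present.
std::size_t RemoveFromDirtySet(std::unordered_set<const Frame*>& dirty,
                               const Frame* f) {
  return dirty.erase(f);
}
```

So calling the removal on every frame destruction costs one hash lookup in the common case where the set is empty or does not contain the frame.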
Comment on attachment 414343 [details] [diff] [review] wip This is not blocking 1.9.2, so can't land even to trunk.
http://hg.mozilla.org/mozilla-central/rev/77136b3d68fc Will push to 1.9.2 once trunk tests have run.
I think we want this to 1.9.1.x
Is this needed on the 1.9.0 branch as well?
I believe yes, we need this on 1.9.0 too.
Created attachment 418666 [details] [diff] [review] 1.9.0/1 This applies to 1.9.0.x and 1.9.1.x. The main difference from the original patch is that 1.9.2/trunk have the mFramesToDirty.RemoveEntry(aFrame) line, which doesn't exist on older branches.
Comment on attachment 418666 [details] [diff] [review] 1.9.0/1 Approved for 18.104.22.168 and 22.214.171.124, a=dveditz for release-drivers
Created attachment 421400 [details] [diff] [review] for 1.9.0/1 Uh, somehow I had uploaded a wrong patch for branches. This one doesn't have any 1.9.2/trunk stuff.
This crash is not fixed on 1.9.2. There are still about a dozen crashes with the identical stack: https://crash-stats.mozilla.com/query/query?product=Firefox&version=Firefox%3A3.6&date=&range_value=1&range_unit=weeks&query_search=signature&query_type=exact&query=nsIFrame%3A%3AGetStyleDisplay%28%29&build_id=&do_query=1 Crashes on 1.9.0 have a different stack. I wasn't able to find a bug which covers those crashes. Shall I file a new one? I believe those are different: 3.0.17: https://crash-stats.mozilla.com/report/index/5ab9f42b-880b-4b74-84b1-4688e2100126 And all those crashes are Windows only, not OS X.
Henrik, could you open a new bug? The crash has dropped from #12 to somewhere significantly lower, I assume. So one cause of this stack trace seems to be fixed, but perhaps there is still something else.
For Firefox 3.6 we have bug 542833 now. I don't think it's worth filing a bug for 3.0.x yet, there are only 5 crashes for 3.0.17.
How did 126.96.36.199 crashes go down when this wasn't fixed in 188.8.131.52, Henrik?
(In reply to comment #27) > How did 184.108.40.206 crashes go down when this wasn't fixed in 220.127.116.11, Henrik? See comment 24. Crashes on 1.9.0 which I have mentioned have another stack. So those are not related to this particular bug.
I assume that there is no particular way to induce this bug on 1.9.1 or 1.9.0 for the purposes of verification (especially with comment 24 and 27)?
Al, we should wait for the release and check crashstats later. If the number drops drastically on both branches we are good.