Closed
Bug 134437
Opened 23 years ago
Closed 23 years ago
[CRASH] setting focus on inner frame a switching stylesheets causes crash [@ nsBlockFrame::ReflowBlockFrame]
Categories
(Core :: Layout, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla1.0
People
(Reporter: jrspm, Assigned: alexsavulov)
References
()
Details
(Keywords: crash, Whiteboard: [adt2 RTM])
Crash Data
Attachments
(4 files, 1 obsolete file)
|
7.18 KB,
text/plain
|
Details | |
|
2.74 KB,
patch
|
dbaron
:
review+
|
Details | Diff | Splinter Review |
|
3.12 KB,
patch
|
Details | Diff | Splinter Review | |
|
1.54 KB,
patch
|
kmcclusk
:
review+
waterson
:
superreview+
jesup
:
approval+
|
Details | Diff | Splinter Review |
using build 2002032916 on win2k (sp2sr1)
To reproduce:
1. Go to URL: http://www.hixie.ch/tests/evil/state/001.html
2. set the focus in the IFrame (just click the mouse once in the IFrame)
3. go to the view/Use Style and change the stylesheet to "Alternate"
Expected:
Should just switch the style and show the same number as was there before
Actual:
crash
Talkback ID: #TB4644175M
I first reported this crash in bug 52334. jst thought it might be related to
his changes...here is what he said:
"...That crash, however, is not due to these changes
directly. The crash that happens with the above testcase (note, you don't need
to select anything in the iframe, just giving it focus is enough) is due to a
re-entrancy problem in the layout frame destruction code that is much easier to
trigger with these changes than w/o them. We end up destroying a frame, and
while doing so we end up destroying a window which passes focus along to one of
the frames we're destroying, and that ends up reflowing that frame and since
that frame is being destroyed we end up getting references to deleted frames n'
other nice things that lead to the crash...."
After investigating this for a bit, jst said this:
"Ok, I just grabbed a build that didn't have my changes in it and did some more
testing with it. Turns out that the crash reported above is just as
reproduceable w/o my changes, I crash in the exact same place due to the exact
same reasons with a trunk build w/o my changes. Because of this fact, I won't be
spending any more time on that crash as part of this bug, we need a new bug for
that crash."
This should definitely be fixed *before* 1.0 is released!
jake
| Reporter | ||
Comment 1•23 years ago
|
||
cc'ing jst and targeting to Mozilla 1.0. Sorry if this isn't kosher, but this
crash needs to be fixed before a 1.0 release for sure!
Jake
Target Milestone: --- → mozilla1.0
Comment 2•23 years ago
|
||
worksforme with linux build 20020329. part of the frame looks a bit scrambled
for a second, but fixes itself. no crash.
Comment 3•23 years ago
|
||
Could someone test if this same crash is reproduceable on a real site that uses
iframes?
Comment 5•23 years ago
|
||
Could be related to bug 133219 and/or bug 134520
Comment 6•23 years ago
|
||
Strangely, NS_ABORT_IF_FALSE dies *NOT* abort if false - it asserts but does not
abort :(
The problem is that we are trying to reflow an empty line, because the empty
line is still marked dirty. I need to spend some time digging into why the line
is emptied without having the dirty bit cleared, but the quick fix is to check
for a null firstChild in ReflowBlockFrame, before dereferencing it.
Status: NEW → ASSIGNED
Comment 7•23 years ago
|
||
Oh, I see - NS_ABORT_IF_FALSE was supposed to abort the program, not the
routine. So, it is just an assertion by another name. Amazing what reading the
header files can do for a guy...
Comment 8•23 years ago
|
||
I have a fix for this: we are marking empty lines dirty in a couple of places,
and that is causing assertions and ultimately the crash because if a line is
marked dirty, we assume that it has at least one child. I'll attach a patch, but
I'm first trying to understand if it is ever legal to have a dirty empty line.
Summary: [CRASH] setting focus on inner frame a switching stylesheets causes crash → [CRASH] setting focus on inner frame a switching stylesheets causes crash [@ nsBlockFrame::ReflowBlockFrame]
I'm attaching Jake's incident info and I have added the stack signature into
the summary simply so that Talkback staff can see this one. There are enough
incidents with this signature that I don't want to accidentally start logging
dupes.
| Assignee | ||
Updated•23 years ago
|
Assignee: attinasi_layout → alexsavulov
Status: ASSIGNED → NEW
| Assignee | ||
Comment 11•23 years ago
|
||
taking from Marc
| Assignee | ||
Comment 12•23 years ago
|
||
here is what happens:
look at the stack below:
- the frame constructor removes content (contentremoved)
- the frame manager instructs the parent frame to remove a frame
- the block frame destroys its child (block frame) before it removes
the line that contained the block
- the removed frame destroys itself but is also destroying a view
that will cause the app event queue to be processed
- there is a reflow command in the queue
- the parent block tries to reflow an empty line that ends with a crash
solution:
in BlockFrame::DoRemoveFrame
remove the line first then call aDeletedFrame->Destroy:
- aDeletedFrame->Destroy(aPresContext);
- aDeletedFrame = nextInFlow;
// If line is empty, remove it now
if (0 == lineChildCount) {
...
++line;
}
}
+ aDeletedFrame->Destroy(aPresContext);
+ aDeletedFrame = nextInFlow;
I will prepare a patch on this
(stack excerpt:)
nsIFrame::GetStyleData(nsStyleStructID eStyleStruct_Display, const nsStyleStruct
* & 0x00126ed8) line 564 + 3 bytes
nsBlockFrame::ReflowBlockFrame(nsBlockReflowState & {...}, nsLineList_iterator
{...}, int * 0x001271b8) line 3102
nsBlockFrame::ReflowLine(nsBlockReflowState & {...}, nsLineList_iterator {...},
int * 0x001271b8, int 1) line 2461 + 27 bytes
nsBlockFrame::ReflowDirtyLines(nsBlockReflowState & {...}) line 2238 + 31 bytes
nsBlockFrame::Reflow(nsBlockFrame * const 0x036f6498, nsIPresContext *
0x035a8048, nsHTMLReflowMetrics & {...}, const nsHTMLReflowState & {...},
unsigned int & 0) line 844 + 15 bytes
.......
PresShell::ProcessReflowCommands(int 0) line 6338
PresShell::FlushPendingNotifications(PresShell * const 0x0351a898, int 0) line 5056
nsDocument::FlushPendingNotifications(nsDocument * const 0x036c0978, int 1, int
0) line 3419
......
nsWindow::ProcessMessage(unsigned int 7, unsigned int 2753886, long 0, long *
0x0012c1d8) line 3733 + 23 bytes
nsWindow::WindowProc(HWND__ * 0x004a0584, unsigned int 7, unsigned int 2753886,
long 0) line 1130 + 27 bytes
USER32! 77e12e98()
USER32! 77e139a3()
USER32! 77e1395f()
NTDLL! 77fa032f()
nsView::~nsView() line 163
nsView::`scalar deleting destructor'(unsigned int 1) + 15 bytes
.....
nsBlockFrame::DoRemoveFrame(nsIPresContext * 0x035a8048, nsIFrame * 0x036f98f0)
line 4946
nsBlockFrame::RemoveFrame(nsBlockFrame * const 0x036f6498, nsIPresContext *
0x035a8048, nsIPresShell & {...}, nsIAtom * 0x00000000, nsIFrame * 0x036f98f0)
line 4808 + 16 bytes
FrameManager::RemoveFrame(FrameManager * const 0x034f6528, nsIPresContext *
0x035a8048, nsIPresShell & {...}, nsIFrame * 0x036f6498, nsIAtom * 0x00000000,
nsIFrame * 0x036f98f0) line 1015
nsCSSFrameConstructor::ContentRemoved(nsCSSFrameConstructor * const 0x0341a0b0,
nsIPresContext * 0x035a8048, nsIContent * 0x03694310, nsIContent * 0x034feaa0,
int 2, int 0) line 9575 + 58 bytes
nsCSSFrameConstructor::RecreateFramesForContent(nsIPresContext * 0x035a8048,
nsIContent * 0x034feaa0, int 0, nsIStyleRule * 0x00000000, nsIStyleContext *
0x00000000) line 11868 + 35 bytes
nsCSSFrameConstructor::ProcessRestyledFrames(nsCSSFrameConstructor * const
0x0341a0b0, nsStyleChangeList & {...}, nsIPresContext * 0x035a8048) line 10020
PresShell::ReconstructStyleData(PresShell * const 0x0351a898, int 0) line 5382
| Assignee | ||
Comment 13•23 years ago
|
||
the promised patch, see explanation in previous comment
i also propose a similar change in the nsLineBox to prevent surprizes
Why in the world do we return to the event queue when deleting a view? That's
windows-only, isn't it? Hawe we always done that? It makes it very hard to
maintain lots of invariants. Under the circumstances, though, the patch seems
reasonable.
| Assignee | ||
Comment 15•23 years ago
|
||
well, i have not debugged yet on Linux, (i'm not that good in debugging the
XFree86 stuff but I'll ask JKeiser to help), however on windows when destroying
the view, we also destroy its window using the native call to the win32 API
::DestroyWindow(HWND) and that brings us back into the windows message loop so
we process all the pending events. I will take a look at the Linux
implementation and see what is going on there.
Comment 16•23 years ago
|
||
Comment on attachment 78805 [details] [diff] [review]
patch preview
Good catch! It is surprising that we can be reentered during the deletion of
the frame, but I think your solution to the problem is a good one. In general
it seems like a good idea to modify the line before modifying the frame that is
in the line...
Currently, there is no invariant that says we cannot be reflowed again during a
reflow (or any other operation on a frame for that matter). Maybe if we get a
real reflow queue instead of using the OS message queue we can assert that kind
of invariant, but right now we have no control over it.
| Assignee | ||
Comment 17•23 years ago
|
||
the same patch with explanatory comments added
Attachment #78805 -
Attachment is obsolete: true
| Assignee | ||
Comment 18•23 years ago
|
||
this is an Win32 issue only (I didn't tested on the 2 Macs yet, but I think that
is really Win32 only)
I tried to find a reason why shall the empty lines exist beyond the existence of
the frames that are going to be deleted). Couldn't find anything to advocate for
that since the destroying of a nsLineBox does not causes more than release the
allocated arena space. (we don't have any other objects inheriting from linelink)
(i also noticed the mem leak in the nsLineBox::DeleteLineList, but i heard
dbaron fixed it already)
Comment on attachment 78966 [details] [diff] [review]
revised patch with comments
r=dbaron, although you should fix the incorrect indentation (tabbing?) of the
comments. It also looks like they might run over 80 characters.
Attachment #78966 -
Flags: review+
Comment 20•23 years ago
|
||
> Currently, there is no invariant that says we cannot be reflowed again during a
> reflow (or any other operation on a frame for that matter).
No. Layout (i.e., reflow) may not be re-entered, period. Can't we just not
process reflows if the window is being destroyed? Or detect that we are
_already_ in reflow, and not try to re-process the queue?
| Assignee | ||
Comment 21•23 years ago
|
||
chris:
i would suggest to fix the crash with the patch and open another bug to see if
we can correct the event queue problem on windows. i think that in this case we
don't have a re-entrancy problem (we're not in the reflow yet - see stack in
comment #12), but that sync message sent from win32's ::DestroyWindow is causing
us to flush the queue (i don't know the internals of ::DestroyWindow but i
strongly suspect that is making use of ::SendMessage that is a sync message
processing). I think that marc's ideea to have reflow 'gates' (bug 137195) to
block reflow when the tree is not in a consistent state, is a good one. I'm not
sure yet if this is implementable without causing other hard to detect problems
like the one we have here in this bug (someone might be tempted to 'close the
gate' when is not recomendable), but is worth to give it a thought.
Comment 22•23 years ago
|
||
While a generic mechanism for forbidding re-entrancy would be nice, it seems
like we ought to be able to fix this specific case without it. I'm generally
opposed to adding complexity to the frame code (which is already impenetrable)
to work around the real problem that lies elsewhere.
I'd appreciate it if you could think about this some more.
| Assignee | ||
Comment 23•23 years ago
|
||
complete crash stack with comments.
** crash
nsIFrame::GetStyleData(nsStyleStructID eStyleStruct_Display, const
nsBlockFrame::ReflowBlockFrame(nsBlockReflowState & {...}, nsLineLi
nsBlockFrame::ReflowLine(nsBlockReflowState & {...}, nsLineList_ite
** here we reflow a line that's empty
nsBlockFrame::ReflowDirtyLines(nsBlockReflowState & {...}) line 223
nsBlockFrame::Reflow(nsBlockFrame * const 0x03b14d58, nsIPresContex
nsContainerFrame::ReflowChild(nsIFrame * 0x03b14d58, nsIPresContext
CanvasFrame::Reflow(CanvasFrame * const 0x03b01190, nsIPresContext
nsBoxToBlockAdaptor::Reflow(nsBoxLayoutState & {...}, nsIPresContex
nsBoxToBlockAdaptor::DoLayout(nsBoxToBlockAdaptor * const 0x03b14cb
nsBox::Layout(nsBox * const 0x03b14cbc, nsBoxLayoutState & {...}) l
nsScrollBoxFrame::DoLayout(nsScrollBoxFrame * const 0x03b0168c, nsB
nsBox::Layout(nsBox * const 0x03b0168c, nsBoxLayoutState & {...}) l
nsContainerBox::LayoutChildAt(nsBoxLayoutState & {...}, nsIBox * 0x
nsGfxScrollFrameInner::LayoutBox(nsBoxLayoutState & {...}, nsIBox *
nsGfxScrollFrameInner::Layout(nsBoxLayoutState & {...}) line 1217
nsGfxScrollFrame::DoLayout(nsGfxScrollFrame * const 0x03b01494, nsB
nsBox::Layout(nsBox * const 0x03b01494, nsBoxLayoutState & {...}) l
nsBoxFrame::Reflow(nsBoxFrame * const 0x03b0145c, nsIPresContext *
nsGfxScrollFrame::Reflow(nsGfxScrollFrame * const 0x03b0145c, nsIPr
nsContainerFrame::ReflowChild(nsIFrame * 0x03b0145c, nsIPresContext
ViewportFrame::Reflow(ViewportFrame * const 0x03b01154, nsIPresCont
nsHTMLReflowCommand::Dispatch(nsIPresContext * 0x039b5028, nsHTMLRe
PresShell::ProcessReflowCommand(nsVoidArray & {...}, int 0, nsHTMLR
PresShell::ProcessReflowCommands(int 0) line 6338
** the press shell flushes th reflow commands
PresShell::FlushPendingNotifications(PresShell * const 0x037cd3a8,
nsDocument::FlushPendingNotifications(nsDocument * const 0x03ad7660
nsHTMLDocument::FlushPendingNotifications(nsHTMLDocument * const 0x
** the window list flushes the pending notifications to retrive a
consistent length (is my guess)
nsDOMWindowList::GetLength(nsDOMWindowList * const 0x039a2e80, unsi
GlobalWindowImpl::GetLength(GlobalWindowImpl * const 0x0399b49c, un
XPTC_InvokeByIndex(nsISupports * 0x0399b49c, unsigned int 59, unsig
XPCWrappedNative::CallMethod(XPCCallContext & {...}, XPCWrappedNati
XPCWrappedNative::GetAttribute(XPCCallContext & {...}) line 1823 +
** i'm not really an expert here but we end up calling a method with
the index 59 on the GlobalWindowImpl: GetLength
XPC_WN_GetterSetter(JSContext * 0x02b69008, JSObject * 0x02d8f108,
js_Invoke(JSContext * 0x02b69008, unsigned int 0, unsigned int 2) l
js_InternalInvoke(JSContext * 0x02b69008, JSObject * 0x02d8f108, lo
js_GetProperty(JSContext * 0x02b69008, JSObject * 0x02d8f108, long
js_Interpret(JSContext * 0x02b69008, long * 0x0012942c) line 2576 +
js_Invoke(JSContext * 0x02b69008, unsigned int 1, unsigned int 2) l
nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJSClass * const 0x02deb
nsXPCWrappedJS::CallMethod(nsXPCWrappedJS * const 0x03aa05a8, unsig
PrepareAndDispatch(nsXPTCStubBase * 0x03aa05a8, unsigned int 3, uns
SharedStub() line 139
nsEventListenerManager::HandleEventSubType(nsListenerStruct * 0x03a
nsEventListenerManager::HandleEvent(nsEventListenerManager * const
nsXULElement::HandleDOMEvent(nsXULElement * const 0x02e0dcf8, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x02e0df78, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x02e11ff8, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x038fab48, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x03897040, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x0392e258, nsIPr
nsXULElement::HandleChromeEvent(nsXULElement * const 0x0392e268, ns
GlobalWindowImpl::HandleDOMEvent(GlobalWindowImpl * const 0x0399b49
nsDocument::HandleDOMEvent(nsDocument * const 0x03ad7660, nsIPresCo
** NS_GOTFOCUS gets translated to NS_FOCUS_CONTENT
nsEventStateManager::PreHandleEvent(nsEventStateManager * const 0x0
PresShell::HandleEventInternal(nsEvent * 0x0012bdfc, nsIView * 0x03
PresShell::HandleEvent(PresShell * const 0x037cd3ac, nsIView * 0x03
nsViewManager::HandleEvent(nsView * 0x03b17a20, nsGUIEvent * 0x0012
nsView::HandleEvent(nsViewManager * 0x037e7cc8, nsGUIEvent * 0x0012
nsViewManager::DispatchEvent(nsViewManager * const 0x037e7cc8, nsGU
HandleEvent(nsGUIEvent * 0x0012bdfc) line 83
nsWindow::DispatchEvent(nsWindow * const 0x03b17aec, nsGUIEvent * 0
nsWindow::DispatchWindowEvent(nsGUIEvent * 0x0012bdfc) line 886
nsWindow::DispatchFocus(unsigned int 105, int 1) line 4907 + 15 byt
** WM_SETFOCUS gets dispatched, DispatchFocus receives NS_GOTFOCUS(105)
nsWindow::ProcessMessage(unsigned int 7, unsigned int 395000, long
nsWindow::WindowProc(HWND__ * 0x001d0708, unsigned int 7, unsigned
USER32! 77e12e98()
USER32! 77e139a3()
USER32! 77e1395f()
NTDLL! 77fa032f()
** the view's window get's destroyed with ::DestroyWindow()
nsView::~nsView() line 163
nsView::`scalar deleting destructor'(unsigned int 1) + 15 bytes
nsView::Destroy(nsView * const 0x03ab1f80) line 260 + 34 bytes
** the frame destroys the view
nsFrame::Destroy(nsFrame * const 0x03aae504, nsIPresContext * 0x039
nsFrameList::DestroyFrames(nsIPresContext * 0x039b5028) line 131
nsContainerFrame::Destroy(nsContainerFrame * const 0x03aae474, nsIP
nsLineBox::DeleteLineList(nsIPresContext * 0x039b5028, nsLineList &
nsBlockFrame::Destroy(nsBlockFrame * const 0x03b181b0, nsIPresConte
** we destroy the frame before removing the line
nsBlockFrame::DoRemoveFrame(nsIPresContext * 0x039b5028, nsIFrame *
nsBlockFrame::RemoveFrame(nsBlockFrame * const 0x03b14d58, nsIPresC
FrameManager::RemoveFrame(FrameManager * const 0x03888b28, nsIPresC
** the old content is removed
nsCSSFrameConstructor::ContentRemoved(nsCSSFrameConstructor * const
nsCSSFrameConstructor::RecreateFramesForContent(nsIPresContext * 0x
nsCSSFrameConstructor::ProcessRestyledFrames(nsCSSFrameConstructor
PresShell::ReconstructStyleData(PresShell * const 0x037cd3a8, int 0
PresShell::StyleSheetDisabledStateChanged(PresShell * const 0x037cd
nsDocument::SetStyleSheetDisabledState(nsIStyleSheet * 0x03959068,
** the style sheet is disabled
CSSStyleSheetImpl::SetDisabled(CSSStyleSheetImpl * const 0x0395906c
XPTC_InvokeByIndex(nsISupports * 0x0395906c, unsigned int 5, unsign
XPCWrappedNative::CallMethod(XPCCallContext & {...}, XPCWrappedNati
XPCWrappedNative::SetAttribute(XPCCallContext & {...}) line 1826 +
XPC_WN_GetterSetter(JSContext * 0x02b69008, JSObject * 0x02d8fe18,
js_Invoke(JSContext * 0x02b69008, unsigned int 1, unsigned int 2) l
js_InternalInvoke(JSContext * 0x02b69008, JSObject * 0x02d8fe18, lo
js_SetProperty(JSContext * 0x02b69008, JSObject * 0x02d8fe18, long
js_Interpret(JSContext * 0x02b69008, long * 0x0012d8f0) line 2587 +
js_Invoke(JSContext * 0x02b69008, unsigned int 1, unsigned int 2) l
js_InternalInvoke(JSContext * 0x02b69008, JSObject * 0x02d5d548, lo
JS_CallFunctionValue(JSContext * 0x02b69008, JSObject * 0x02d5d548,
nsJSContext::CallEventHandler(nsJSContext * const 0x02c1e280, void
nsJSEventListener::HandleEvent(nsJSEventListener * const 0x02e18410
nsEventListenerManager::HandleEventSubType(nsListenerStruct * 0x02e
nsEventListenerManager::HandleEvent(nsEventListenerManager * const
nsXULElement::HandleDOMEvent(nsXULElement * const 0x02e190d8, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x03938880, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x03b88d10, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x03cbce70, nsIPr
nsXULElement::HandleDOMEvent(nsXULElement * const 0x038fe008, nsIPr
PresShell::HandleDOMEventWithTarget(PresShell * const 0x019751c8, n
nsMenuFrame::Execute() line 1684
nsMenuFrame::HandleEvent(nsMenuFrame * const 0x03cbab38, nsIPresCon
PresShell::HandleEventInternal(nsEvent * 0x0012f8d0, nsIView * 0x03
PresShell::HandleEvent(PresShell * const 0x019751cc, nsIView * 0x03
nsViewManager::HandleEvent(nsView * 0x03c41930, nsGUIEvent * 0x0012
nsView::HandleEvent(nsViewManager * 0x02c56958, nsGUIEvent * 0x0012
nsViewManager::DispatchEvent(nsViewManager * const 0x02c56958, nsGU
HandleEvent(nsGUIEvent * 0x0012f8d0) line 83
** the UI event gets handled
nsWindow::DispatchEvent(nsWindow * const 0x03c419fc, nsGUIEvent * 0
nsWindow::DispatchWindowEvent(nsGUIEvent * 0x0012f8d0) line 886
nsWindow::DispatchMouseEvent(unsigned int 301, unsigned int 0, nsPo
ChildWindow::DispatchMouseEvent(unsigned int 301, unsigned int 0, n
nsWindow::ProcessMessage(unsigned int 514, unsigned int 0, long 308
nsWindow::WindowProc(HWND__ * 0x000606ea, unsigned int 514, unsigne
USER32! 77e12e98()
USER32! 77e130e0()
USER32! 77e15824()
nsAppShellService::Run(nsAppShellService * const 0x0183fcf8) line 3
main1(int 2, char * * 0x00277780, nsISupports * 0x00000000) line 14
main(int 2, char * * 0x00277780) line 1763 + 37 bytes
mainCRTStartup() line 338 + 17 bytes
KERNEL32! 77e97d08()
| Assignee | ||
Comment 24•23 years ago
|
||
i'm not in the mechanics of NS_FOCUS_CONTENT yet. could someone explain me why
does the method GlobalWindowImpl::GetLength get's called there and why that
method needs to flush all the pending notifications? (maybe we don't even need
to call that at that point)
Comment 25•23 years ago
|
||
GlobalWindowImpl::GetLength() must flush to ensure that all [i]frame's have been
created since the value returned by GetLength() relies on that. And as for why
that method is called in the first place, it's called from JS, i.e. someone has
what looks like a onfocus handler that touches window.length in JS, why, I don't
know. I don't think any of that can be changed, so we need to fix this elsewhere.
Comment 26•23 years ago
|
||
(Thanks for continuing to look into this, Alex.)
jst: will this change once IFRAMEs are initialized at frame creation time? IOW,
will GetLength() still need to flush?
alex: perhaps it would be possible to modify nsPresShell such that we set
`mIsReflowing' in PresShell::ReconstructStyleData? (If so, we'd probably want to
rename mIsReflowing to mIsSafeToReflow and invert its value. We also may want to
examine other similar entry points to the frame ctor...)
Comment 27•23 years ago
|
||
Well, initially the iframe loading changes will only affect <iframe> elements,
not <frame> elements, so for now GetLength() will need to flush. At some point
we'll change how <frame> elements are loade too, and then we wouldn't really
need to flush in GetLength().
However, GetLength() is only one of many places where we need to flush in the
GlobalWindowImpl code, so even if we'd make GetLength() not flush, we'd still
have this exact same problem if someone causes some of the other code that
flushes to run in a situation similar to this one.
| Assignee | ||
Comment 28•23 years ago
|
||
I will look at the mIsSubmitting issue proposed by Chris. I was thinking about
that too, still I didn't wanted yet to go that way. Somehow the new
mIsSafeToReflow seems to me to do then the same that Marc proposes in his bug
137195. Maybe this way we shoot both bugs from the tree :-)
Thanks Chris and JST for helping. I'll dig into it and come back with a new patch.
Comment 29•23 years ago
|
||
Marc and I talked at length about this on the phone. Just want to make sure we
get that discussion down on paper. (Or whatever.)
The solution I proposed above (and the solution that alex had originally
proposed) would probably cause errors of omission, because we'll simply refuse
to do the operation that's being requested. (E.g., what happens if we refuse to
flush? Will this cause us to fail to properly handle the focus event? Will the
OS's notion of focus be out-of-sync with the event system's?)
So is there something else we can do? The _real_ problem here (no, really! :-))
is that widget destruction is forcing us to re-enter the event queue at an
inopportune time. So why not defer widget destruction until it's safe for us to
do so? Specifically, rather than immediately destroying the widget in
nsView::Destroy, we ought to enqueue the widget for later destruction when we've
re-established our layout invariants.
Marc and I discussed two ways to do this. One would be to simply post a PLEvent
back to the main event queue that would be responsible for destroying the
widget. Another would be to maintain a special queue inside the view manager (or
pres shell, or whatever) that we'd process at `safe' times, but before returning
out to the main event queue.
The advantage of the former solution (PLEvent) is that the implementation is
simple and self-contained within the view/widget system. The down-side of this
solution is that it allows a window of time where other, previously scheduled
events (possibly targeted at the doomed widget) could be processed. This could
lead to disaster (e.g., widget looking for a destroyed view, etc.), although we
could probably mitigate that with some sanity checks in the widget.
The advantage of the latter (special queue) is that we get rid of the `zombie'
widgets more quickly; i.e., before returning out to the main event queue to
process any events intended for the doomed widgets. (Presumably, the OS can
manage cleanup of its event queue properly.) The down-side of this approach is
that we need to make sure that we process the special queue after performing any
widget-destroying operations: it may be difficult to pinpoint and maintain all
of these.
cc'ing roc & kmcclusk for input.
Comment 30•23 years ago
|
||
Marc: noted what we'd discussed above.
I vote for Plan B. We can queue destroyed widgets in the view manager and then
destroy them all later. (My guess is that the presshell should poll the view
manager after it gets its frame/view tree into a consistent state having
restyled the frames.) We can unset the widget's view so that no events from the
widget can get through.
One reason I prefer this is that post-1.0 we're in a position to eliminate the
use of native widgets almost completely (provided someone, me?, can figure out
how to handle plugins). Thus I think the burden of keeping track of where
widgets are destroyed will eventually dissipate on its own.
Comment 32•23 years ago
|
||
Why not suppress the dispatch of the NS_GOTFOCUS event within the widget code?
If a widget is in the process of being destroyed we could set a flag which
prevents us from dispatching any focus events. If we need to dispatch the
NS_GOTFOCUS we can schedule it with a user event so that it will be processed
when we return to the message pump.
I think this would be safer than deferring the destruction of widgets. I think
the real suprise with this bug is that the NS_GOTFOCUS is dispatched
synchronously with the destruction of the widget.
We also need to answer the question of whether the NS_GOTFOCUS event really
needs to be dispatched when the focused widget is destroyed. We should test
Linux and Mac to determine if they synchronously dispatch the NS_GOTFOCUS event
to another widget when the widget with focus is destroyed. Do they even dispatch
the NS_GOTFOCUS event at all when a widget is destroyed? My guess is that they
do something different than WIN32 since they do not crash when loading the URL
in this bug.
If Linux and Mac do not dispatch the NS_GOTFOCUS we should probably suppress the
NS_GOTFOCUS on WIN32. Whenever possible we should strive to make all three
platforms work exactly the same at the widget and gfx level to prevent future
platform specific problems.
Updated•23 years ago
|
| Assignee | ||
Comment 33•23 years ago
|
||
LINUX does not dispatch syncronously the FOCUS event. Actually there is no
GOT_FOCUS dispatching anymore after switching. Here comes the surprize: I
reloaded the page, clicked in the <iframe> and SEGFAULT. I asked Marc to help
testing on Mac.
| Assignee | ||
Comment 34•23 years ago
|
||
The Linux crasher from the previous comment gets fixed if I apply the patch
(attachment 78966 [details] [diff] [review]) proposed in this bug. This makes me think that there are
other issues related to destroying frames before we remove empty lines. I'm not
sure right now but I kindof doubt now that deffering the FOCUS message or view
removal will solve all our problems (since I discovered the sequel crasher on
linux and the fact that is getting fixed by the patch proposed here). Need to
find the crash reason on linux though before i can make a definitive statement.
| Assignee | ||
Comment 35•23 years ago
|
||
sorry, i made a wrong statement. i can get the crash even after applying the
first patch on linux. (i must have made a testing mistake the first time).
debugging....
Comment 37•23 years ago
|
||
Alex: what you're describing in comment 33 sounds like a different issue
altogether. What is the stack trace you're seeing there? If it's different than
attachment 77884 [details], let's file a different bug.
kmcclusk: destroying a widget will typically force us to process several events
synchronously if I'm remembering my Java AWT hacking days correctly -- kin? I
think we really want to wipe out the entire class of problems in one swell foop
by queuing destruction.
Comment 38•23 years ago
|
||
Comment on attachment 78966 [details] [diff] [review]
revised patch with comments
(I don't think this is the right approach.)
Attachment #78966 -
Flags: needs-work+
Updated•23 years ago
|
Whiteboard: [adt2]
Comment 39•23 years ago
|
||
In case anyone is curious, the JS code that's triggering the flush is in
isDocumentFrame() in contentAreaUtils.js:
http://lxr.mozilla.org/seamonkey/source/xpfe/communicator/resources/content/contentAreaUtils.js#43
It's being called from contentAreaFrameFocus() in navigator.js:
http://lxr.mozilla.org/seamonkey/source/xpfe/browser/resources/content/navigator.js#167
From what I recall, Chris is right, if the native widget that's being destroyed
has focus, the focus is passed up to it's parent during the destruction of the
child, which may trigger immediate focus callbacks on the parent, on some window
systems.
Comment 40•23 years ago
|
||
We generate cross platform events for only a small subset of the toolkit
specific events. We also suppress a large number of the cross platform events
because they only need to be dispatched within the cross platform code for
toplevel windows not managed by views. These events including: NS_CREATE,
NS_DESTROY, NS_SETZLEVEL, NS_MOVE. I put this suppression code for these in
several months ago because these events are *not* needed and they were causing
performance issues when child windows were destroyed on large pages. If there
are other cross platform events being dispatched synchronously and they cause
problems we should analyze which cross platform events are being sent and see if
they are really necessary.
In regards to Alex's comments:
Both the WIN32 and LINUX problems point to the same issue. We have not defined
or enforced what should happen with the widget module in regards to dispatching
NS_GOTFOCUS when the focus'ed widget has been destroyed.
Alex has identified that we behave differently at the widget module level when
destroying a focus'ed widget.
WIN dispatches a synchronous NS_GOTFOCUS message.
Linux does not dispatch a NS_GOTFOCUS at all.
For windows the synchronous dispatch causes problems because we are processing
the NS_GOTFOCUS synchronously and we reflow unexpectedly. For Linux its
NS_GOTFOCUS behavior causes problems because the event state manager is never
told that the focus'ed object it is holding a reference has been destroyed.
We should define what the correct behavior for dispatching the NS_GOTFOCUS event
is. All NS_EVENTS should be dispatched in a consistent manner regardless of the
platform. We should not dispatch NS_GOTFOCUS events differently between Linux
and WIN32.
There are two proposed solutions:
solution #1: Make all platforms synchronously dispatch NS_GOTFOCUS
-----------------------------------------------------
WIN32: No change
LINUX: Add code to synchronously dispatch NS_GOTFOCUS when the widget is destroyed.
XP CODE: Defer the widget destruction to a safe time using a PL_event
solution #2: Make all platforms defer the dispatch of the NS_GOTFOCUS.
------------------------------------------------------
WIN32: Don't dispatch a NS_GOTFOCUS if the focus'ed widget is being destroyed,
schedule it to be sent later.
LINUX: Send a deferred NS_GOT_FOCUS
XPCODE: No change.
Both solutions may run into a problem with the event state manager. From the
Linux crash it looks like the event state manager is referencing something that
was destroyed when it processes the NS_GOTFOCUS. If we defer the NS_GOTFOCUS
either by deferring the widget destruction or explicitly scheduling a user event
we may crash in the event state manager later because it is trying to remove
focus from something that has been destroyed. We should see what the event state
manager is referencing when it crashes. Is it a widget, frame or something
else? I would guess its a frame. If it's a frame then we will probably have
to notify the event state manager that the frame is about to be destroyed.
| Assignee | ||
Comment 41•23 years ago
|
||
mozilla/LINUX references a registered zombie listener that was listening for the
focus (after my latest observations). This migth be only a 'normal' bug resulted
from a mistake in the code. maybe we just lack the removal of that listener.
investigating (i still have not opened a boog on that one - shame).
working on it
| Assignee | ||
Comment 42•23 years ago
|
||
ok, here is a little hack that fixes the crashes on both platforms:
in
PresShell::ReconstructStyleData(PRBool aRebuildRuleTree)
{
....
if (frameChange == NS_STYLE_HINT_RECONSTRUCT_ALL)
set->ReconstructDocElementHierarchy(mPresContext);
else {
+
+ nsIView* view;
+ rootFrame->GetView(mPresContext, &view);
+ if(view) {
+ nsIWidget* widget;
+ view->GetWidget(widget);
+ if(widget)
+ widget->SetFocus();
+ }
+
cssFrameConstructor->ProcessRestyledFrames(changeList, mPresContext);
if (aRebuildRuleTree) {
GetRootFrame(&rootFrame);
WalkFramesThroughPlaceholders(mPresContext, rootFrame,
....
sure, the focus is moved from the widget that's going to be destroyed and so
there is no try to move the focus anymore.
now i'm going to try to find a way to fix the problem in the way GOT_FOCUS is
fired/handled on both platforms. (on MacOSaks I and Marc couldn't repro any of
the crashes)
Comment 43•23 years ago
|
||
So this super-sloppy patch is meant as a sketch of how we might do deferred
widget destruction using `Method A' (PLEvent) from comment 29. It's a
windows-only fix, but it seems to work. Thoughts?
| Assignee | ||
Comment 44•23 years ago
|
||
Chris,
don't we want then to hide the window or push it in the background or something
if is doomed to be destroyed anyway (before we VERIFY(::PostMessage...)? hiding
it might cause the same sync focus event so maybe reducing the size to 0x0 would
be a better approach. hmmm. I'm thinking that if you defer the destruction for
every window, there some fancy 'special effects' are possible, particulary if we
have some absolutly pos windows and stuff (afaik, absolutely pos windows have
their own window). but this are just thoughts, I didn't tested anything yet.
what about XUL and UI are they going to be affected (i'm not a specialist with
the UI though)?
Comment 45•23 years ago
|
||
> don't we want then to hide the window or push it in the background or
> something if is doomed to be destroyed anyway (before we
> VERIFY(::PostMessage...)?
That doesn't make any sense to me. It's not like this window is going to be
around for any user-noticable period of time.
> hmmm. I'm thinking that if you defer the destruction for every window, there
> some fancy 'special effects' are possible, particulary if we have some
> absolutly pos windows and stuff
What special effects?
> what about XUL and UI are they going to be affected
Absolutely; for example, menus have widgets.
If your concern is that this will somehow make the app `behave differently in a
visible way', I think that is ill-founded: we're talking about changing event
ordering, that's all. Try the patch, I'm pretty sure you won't be able to notice
any difference in behavior (modulo bugs, of course).
| Assignee | ||
Comment 46•23 years ago
|
||
the patch (attachment 81064 [details] [diff] [review]) by chris waterson works. i tested it with nested
iframes, with absolute/fixed position and couldn't see problems so far. in a
discution with kevin mcckluskey he suggested that we'd better check if the
window has the focus and only then deffer the destruction.
| Assignee | ||
Comment 47•23 years ago
|
||
there is a global flag that's set right before we call ::DestroyWindow and
reset after that so every window that receives the focus will schedule the
event for a later processing until the flag is getting reseted.
this patch works also. chris mentioned in a chat session that we might try to
do both: deferring the focus event and the window destruction.
thoughts?
Comment 48•23 years ago
|
||
Does this by any chance fix bug 110112, too?
This patch seems reasonable to me. Note that when I was cobbling together
attachment 81064 [details] [diff] [review], I noticed that there were cases where destroying one native
widget would (natch) destroy its children. You might want to sanity check this
change to make sure that it doesn't somehow re-enter (and prematurely clear
gIsDestroyingAny):
NS_ASSERTION(!gIsDestroyingAny, "oops, already set gIsDestroyingAny");
gIsDestroyingAny = PR_TRUE;
VERIFY(::DestroyWindow(mWnd));
NS_ASSERTION(gIsDestroyingAny, "oops, already cleared gIsDestroyingAny");
gIsDestroyingAny = PR_FALSE;
(You don't need to check that in; maybe just run with the assertion set for a
while to verify that you don't get into a re-entrancy situation.)
| Assignee | ||
Comment 49•23 years ago
|
||
good ideea. now i have the problem that some checkin caused the crash reported
here to be repaired so i need to see if the 1.0.0 branch is afected too.
investigating. damn...
| Assignee | ||
Comment 50•23 years ago
|
||
the crash is gone from both trunk and 1.0.0 branch? i'm wondering what caused that?
| Assignee | ||
Comment 51•23 years ago
|
||
hehe, i think it was karnaze's fix for bug 136746 that fixed this crash :-)
investigating
| Assignee | ||
Comment 52•23 years ago
|
||
chris:
i found out that karnaze's patch for bug 136746 fixed the crash here so i
removed his patch in a local build and was able to crash again. however, i was
not able to reproduce the crash of bug 110112 using the steps provided in
comment http://bugzilla.mozilla.org/show_bug.cgi?id=110112#c23 on that bug, i
tried other combinations and also failed to reproduce so i can't tell if the fix
proposed in attachment 81791 [details] [diff] [review] fixes the crash on bug 110112 too. now the feeling
i, kevin and karnaze have is to apply the patch proposed here since karnaze's
patch could be removed. what do you think about?
No longer depends on: 137195
Comment 53•23 years ago
|
||
Comment on attachment 81791 [details] [diff] [review]
the focus event deferring approach
sr=waterson.
Attachment #81791 -
Flags: superreview+
Comment 54•23 years ago
|
||
Comment on attachment 81791 [details] [diff] [review]
the focus event deferring approach
Looks good!
r=kmcclusk@netscape.com
Comment 55•23 years ago
|
||
Comment on attachment 81791 [details] [diff] [review]
the focus event deferring approach
Looks good!
r=kmcclusk@netscape.com
Attachment #81791 -
Flags: review+
| Assignee | ||
Comment 56•23 years ago
|
||
FIXED ON TRUNK
(won't fix on branch only if requested by ADT since the crash was fixed by the
fix for bug 136746)
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 58•23 years ago
|
||
Marking this verified fixed. I haven't seen this crash in MozillaTrunk Talkback
data for a while now.
Status: RESOLVED → VERIFIED
| Assignee | ||
Updated•23 years ago
|
Comment 59•23 years ago
|
||
Marking adt1.0.1+ on behalf of the adt for checkin to the 1.0 branch.
Comment 60•23 years ago
|
||
Comment on attachment 81791 [details] [diff] [review]
the focus event deferring approach
Approved for 1.0 branch. Please remove mozilla1.0.1+ when it's fixed and add
fixed1.0.1
Attachment #81791 -
Flags: approval+
Updated•23 years ago
|
| Assignee | ||
Updated•23 years ago
|
Keywords: mozilla1.0.1+ → fixed1.0.1
| Assignee | ||
Comment 61•23 years ago
|
||
fixed on the branch 1.0.1
Comment 62•23 years ago
|
||
Verified on branch (2002-07-16-05 0S X ) and (2002-07-16-08 Windows ME).
Keywords: verified1.0.1
Updated•14 years ago
|
Crash Signature: [@ nsBlockFrame::ReflowBlockFrame]
You need to log in
before you can comment on or make changes to this bug.
Description
•