Last Comment Bug 343457 - Crash when closing window while tooltip is about the be shown or hidden
: Crash when closing window while tooltip is about the be shown or hidden
Status: VERIFIED FIXED
[sg:critical]
: crash, verified1.8.0.7, verified1.8.1
Product: Core
Classification: Components
Component: Layout (show other bugs)
: Trunk
: x86 Linux
: -- normal (vote)
: ---
Assigned To: Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
:
:
Mentors:
Depends on:
Blocks: 343613 346054
  Show dependency treegraph
 
Reported: 2006-07-03 04:05 PDT by Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
Modified: 2007-01-22 02:13 PST (History)
6 users (show)
mconnor: blocking1.8.1+
dveditz: blocking1.8.0.7+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase (2.15 KB, text/html)
2006-07-03 04:12 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details
'minimal' patch to fix crashes in testcase (10.84 KB, patch)
2006-07-03 04:15 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details | Diff | Splinter Review
testcase (the file for iframe) (1.09 KB, application/vnd.mozilla.xul+xml)
2006-07-03 05:44 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details
The testcase (241 bytes, text/html)
2006-07-03 05:45 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details
possible patch (21.95 KB, patch)
2006-07-03 05:53 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details | Diff | Splinter Review
using single linked list (22.27 KB, patch)
2006-07-04 04:20 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
roc: review+
roc: superreview+
Details | Diff | Splinter Review
This was checked in (22.63 KB, patch)
2006-07-11 06:07 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details | Diff | Splinter Review
for 1.8.1 (24.61 KB, patch)
2006-07-28 03:55 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
no flags Details | Diff | Splinter Review
using nsIPresShell_MOZILLA_1_8_BRANCH (25.51 KB, patch)
2006-07-28 05:49 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
roc: review+
roc: superreview+
mtschrep: approval1.8.1+
Details | Diff | Splinter Review
for 1.8.0 (25.55 KB, patch)
2006-08-10 01:17 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
dveditz: approval1.8.0.7+
Details | Diff | Splinter Review

Description Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 04:05:29 PDT
testcase and possible patch coming.
Marking security sensitive, since this is about accessing deleted objects.
Comment 1 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 04:12:48 PDT
Created attachment 227944 [details]
testcase
Comment 2 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 04:15:24 PDT
Created attachment 227945 [details] [diff] [review]
'minimal' patch to fix crashes in testcase
Comment 3 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 05:09:51 PDT
Comment on attachment 227945 [details] [diff] [review]
'minimal' patch to fix crashes in testcase

Wrong patch.
Comment 4 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 05:44:31 PDT
Created attachment 227948 [details]
testcase (the file for iframe)
Comment 5 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 05:45:32 PDT
Created attachment 227950 [details]
The testcase
Comment 6 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 05:53:46 PDT
Created attachment 227951 [details] [diff] [review]
possible patch

Is this too ugly? There are probably similar problems also elsewhere in layout/ 
and using nsFrameDestroyNotifier fixing those should be quite easy.
This shouldn't slow things down too much since mDestroyNotifiers list
is almost always empty and nsVoidArray::Count() is fast operation.
Comment 7 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 06:05:02 PDT
This is ofc related to roc's safe event dispatching bug, but for 1.8
we need to fix the crashes anyway. (and I may not like the way safe dispatching changes the time when events are dispatched)
Comment 8 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-03 06:16:03 PDT
Fortunately the situation is not that bad.
As far as I see, pretty much all code which should be fixed is in layout/xul
Comment 9 Jesse Ruderman 2006-07-03 16:40:06 PDT
What is the stack signature for this crash?  I'm curious whether it's a topcrash and whether it's the same signature/bug as bug 184363.
Comment 10 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-03 18:37:53 PDT
(In reply to comment #7)
> This is ofc related to roc's safe event dispatching bug

That's not going anywhere, because I don't really like the approach.
Comment 11 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-03 18:40:34 PDT
How is this new mechanism different from NS_FRAME_EXTERNAL_REFERENCE?
Comment 12 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-03 18:45:07 PDT
Also, this is related to bug 279703.
Comment 13 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-03 18:47:55 PDT
The patch here might make for a better branch fix than the work in bug 279703, if we can actually fix all the important bugs without the redesign in bug 279703.

A way to go here might be to reimplement nsFrameNotifier using NS_FRAME_EXTERNAL_REFERENCE.
Comment 14 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-04 01:02:37 PDT
(In reply to comment #9)
> What is the stack signature for this crash?  I'm curious whether it's a
> topcrash and whether it's the same signature/bug as bug 184363.

This is not the same bug as 184363.
(I was actually trying to fix that, but couldn't reproduce it. And while reading the code it came quite clear that popups could crash.)


Comment 15 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-04 01:18:08 PDT
I'd actually remove NS_FRAME_EXTERNAL_REFERENCE and just use 
nsFrameDestroyNotifier also in ESM (although no need for that right now). 
The problem here is that if notifier marks 
frame with NS_FRAME_EXTERNAL_REFERENCE bit, and then before notifier is deleted 
ESM wants to mark frame also to have NS_FRAME_EXTERNAL_REFERENCE, then when 
notifier is deleted the bit would be removed although ESM thinks that it is still there.
Comment 16 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-04 02:34:00 PDT
nsFrameDestroyNotifier could just not remove the EXTERNAL bit.

But yeah, this could be a better way. If we restrict nsFrameDestroyNotifier to be used only on the stack of the main thread, then you don't need an array in the prescontext. We can make nsFrameDestroyNotifiers form a singly linked list.
Comment 17 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-04 04:20:19 PDT
Created attachment 228035 [details] [diff] [review]
using single linked list

Using a single linked list, but supporting notifiers in heap.
Though heap usage is not in anyway optimized - if ESM starts using notifiers
easy optimization would be to start marking frames which have notifier with 
NS_FRAME_EXTERNAL_REFERENCE. (In that case I'd rename the macro, perhaps NS_FRAME_MAY_HAVE_NOTIFIER)

Currently when all notifiers are in stack this should be fast, since mDestroyNotifiers is pretty much always nsnull.
Comment 18 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-05 01:33:58 PDT
Comment on attachment 228035 [details] [diff] [review]
using single linked list

nsFrameDestroyNotifier needs a comment explaining what it's for and how it's meant to be used.

+  nsFrameDestroyNotifier(nsIFrame* aFrame)
+  {
+    mPrev = nsnull;
+    mShell = aFrame ? aFrame->GetPresContext()->GetPresShell() : nsnull;
+    mFrame = mShell ? aFrame : nsnull;
+    if (mShell) {
+      mShell->AddDestroyNotifier(this);
+    }
+  }

Just assert that aFrame is non-null on entry and then go ahead and use it.

Also I don't think you should keep mShell around. Reacquire it in Clear().

I would really like to see ESM start using this itself along with the state bit optimization. Having two separate mechanisms is a recipe for confusion.

+  nsFrameDestroyNotifier popupFrameNotifier(entry->mPopupFrame);
+  NS_ENSURE_STATE(popupFrameNotifier.IsAlive());

move the NS_ENSURE_STATE above and test entry->mPopupFrame.

+  if (notifier.IsAlive()) {
+    // Now open the popup.
+    OpenPopup(entry, PR_TRUE);
+    if (notifier.IsAlive()) {
+      // Now fire the popupshown event.
+      OnCreated(aXPos, aYPos, aPopupContent);
+    }
+  }

Rewrite as "if (!...) return NS_OK;"
Comment 19 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-06 02:54:25 PDT
I was thinking to change the name nsFrameDestroyNotifier to nsWeakFrame.
nsWeakFrame as a member of a class (for example ESM) makes more sense than 
nsFrameDestroyNotifer.

I hope the name nsWeakFrame is ok.

(I may not have time to check in this before Monday)
Comment 20 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-06 08:33:47 PDT
That's a good name.

I think you should add comments explaining that usage of this class should be kept to a minimum. Generally speaking, we should avoid having to hold frame references by holding content references instead, and/or moving functionality from frames to content.
Comment 21 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-11 05:55:21 PDT
(In reply to comment #18)
> 
> Just assert that aFrame is non-null on entry and then go ahead and use it.
> 

Supporting null nsFrame is useful, then there is no need for some ifs in popupsetframe for example.
Comment 22 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-11 06:07:02 PDT
Created attachment 228803 [details] [diff] [review]
This was checked in

Using the name "nsWeakFrame", no mShell member, leaving ability to pass
null nsIFrames to nsWeakFrame constructor.
Comment 23 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-11 11:24:27 PDT
This may have affected btek tp. Going to back out.
Comment 24 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-11 16:09:28 PDT
Is it ok if I retry to check in nsWeakFrame patch with the change that in nsIPresShell::AddWeakFrame I add aWeakFrame->GetFrame()->AddStateBits(NS_FRAME_EXTERNAL_REFERENCE);
and move the changes from ::NotifyDestroyingFrame to ::ClearFrameRefs.
This way weak references are cleared only when nsFrame has the external bit set.
This is actually pretty much what comment #13 suggests.

I'm not sure the patch caused much of the tp regression.
The noise on btek has been quite bad :(
Comment 25 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-11 18:28:19 PDT
Yes, go for it.
Comment 26 Boris Zbarsky [:bz] (still a bit busy) 2006-07-14 10:33:09 PDT
Fwiw, at least some of the code being touched here should simply reget the frame after doing the DOM stuff it does, instead of just bailing out if that particular frame has gotten destroyed.  At least imo.
Comment 27 Robert O'Callahan (:roc) (email my personal email if necessary) 2006-07-14 14:57:35 PDT
Yeah, I was pushing that point in another bug...
Comment 28 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-27 02:36:45 PDT
I think this should go into 1.8 too.
If agreed, I'll make a branch version of the patch.
Comment 29 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-28 03:55:39 PDT
Created attachment 231081 [details] [diff] [review]
for 1.8.1

Roc, could you review this one too. The changes to nsPopupSetFrame.* and nsPopupBoxObject.* are a bit different comparing to trunk patch,
especially methods nsPopupBoxObject::HidePopup and nsPopupBoxObject::ShowPopup.
Comment 30 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-28 04:03:16 PDT
btw, what is the policy regarding interfaces in layout/ ?
This changes nsIPresShell. Should I create nsIPresShell_MOZILLA_1_8_BRANCH ?
nsIFrame has been changed...

I'll create a new patch with nsIPresShell_MOZILLA_1_8_BRANCH.
Comment 31 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-07-28 05:49:39 PDT
Created attachment 231090 [details] [diff] [review]
using nsIPresShell_MOZILLA_1_8_BRANCH
Comment 32 Boris Zbarsky [:bz] (still a bit busy) 2006-07-28 09:46:24 PDT
> btw, what is the policy regarding interfaces in layout/ ?

Same as elsewhere.

> Should I create nsIPresShell_MOZILLA_1_8_BRANCH ?

Yes.

> nsIFrame has been changed...

AAARGH.  Not just that, but the IID did not change.  :(  That should NOT have happened.  :(
Comment 33 Mike Schroepfer 2006-07-30 10:02:20 PDT
Comment on attachment 231090 [details] [diff] [review]
using nsIPresShell_MOZILLA_1_8_BRANCH

approved by schrep for drivers
Comment 34 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-08-10 01:17:42 PDT
Created attachment 233055 [details] [diff] [review]
for 1.8.0
Comment 35 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2006-08-10 01:21:39 PDT
Comment on attachment 233055 [details] [diff] [review]
for 1.8.0

This is basically the same patch as for 1.8, just one line change, the one with FireChromeDOMEvent.
This should go in with the patch in Bug 346054.
Comment 36 Daniel Veditz [:dveditz] 2006-08-10 11:14:51 PDT
Comment on attachment 233055 [details] [diff] [review]
for 1.8.0

approved for 1.8.0 branch, a=dveditz for drivers
Comment 37 alice nodelman [:alice] [:anode] 2006-08-22 15:50:11 PDT
https://bugzilla.mozilla.org/attachment.cgi?id=227950&action=view should not crash in either tooltip scenario

Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.7pre) Gecko/20060821 Firefox/1.5.0.7pre

verified 1.8.0.7

Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1b2) Gecko/2006082203 BonEcho/2.0b2

verified 1.8.1b2

Note You need to log in before you can comment on or make changes to this bug.