Closed Bug 417636 Opened 16 years ago Closed 16 years ago

Crash when clicking "Clean Up" on Downloads dialog box

Categories

(Core :: Widget: Cocoa, defect, P2)

PowerPC
macOS
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: steven, Assigned: smichaud)

References

Details

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en; rv:1.9b4pre) Gecko/2008021400 Camino/2.0a1pre (like Firefox/3.0b4pre)
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en; rv:1.9b4pre) Gecko/2008021400 Camino/2.0a1pre (like Firefox/3.0b4pre)

I had two items in my Downloads dialog box and two windows open. I clicked on the Downloads dialog box to bring it to the front and clicked on the "Clean Up" button to clear my downloads list. Camino then crashed and produced a crash log.

Reproducible: Couldn't Reproduce

Steps to Reproduce:
1. Download two different items.
2. Open two windows.
3. Press "Clean Up" on the Download dialog box.
Actual Results:  
Camino crashed and generated a crash log.

Expected Results:  
Downloads to be cleared.
Crash log as a result of this bug.
Steven, could you shed some light on why a window that is 100% Camino code is having events stolen by a ChildView?
I also had two windows open at the time, could one of those windows have something that would have caused this crash around the time I clicked on that button?
Heh, that's neat. Thread 1 of this stack is *exactly* the crashed thread in bug 417485.

I wonder if Crash Reporter is just getting confused as to what crashed. Is that possible?
By the way, the "Steven" referred to in comment 2 is Steven Michaud (aka the JEP guy and general events guru), not the same Steven who filed this bug.

In case anyone else gets confused like I was for a minute there.
Stuart, nothing is "stealing events" from Camino.

Rather, nsCocoaWindow_NSWindow_sendEvent: in nsCocoaWindow.mm is now
(as of the patch for bug 413882) "hooking" all calls to [NSWindow
sendEvent:], in order to correctly route NS_GOTFOCUS and NS_LOSTFOCUS
events to the widgets corresponding to ChildView objects.

This is done using something called "method swizzling", which is an
alternative to [NSObject poseAsClass:] that isn't deprecated on
Leopard.  For more information see my patch for bug 413882 (attachment
302924 [review]) and http://www.cocoadev.com/index.pl?MethodSwizzling.

Notice that this method swizzling is necessary in order to fix bug
413882 on embedders (like Camino).

Prior to my patch for bug 403232, NS_GOTFOCUS and NS_LOSTFOCUS events
were sent from ChildView's becomeFirstResponder: and
resignFirstResponder: methods.  But this meant they were sent "too
often", and caused all kinds of focus bugs.  So my patch for bug
403232 caused them to be sent "less often" -- only in the context of
calls to nsChildView::SetFocus().

But "less often" was actually "too seldom", and caused bug 413882.
So, now (as of the patch for bug 413882) I'm now sending them "more
often", but still not as indiscriminately as they were being sent
before the patch for bug 403232.
However ...

Steven Hollingsworth's crash log looks like it was caused by a call on
the sendFocusEvent: method to a ChildView object that had already been
deleted.

I was very careful (in nsCocoaWindow_NSWindow_sendEvent:) to retain
and release oldFirstResponder around calls to (in effect) [super
sendEvent:].  But it looks like NS_GOTFOCUS and/or NS_LOSTFOCUS events
can cause ChildView objects to get deleted.  So I also need to retain
and release newFirstResponder around these calls (oldFirstResponder is
already being retained and released around them).

After I've had a chance to do some testing, I'll post a new version of
my patch at bug 413882.
Chris, I think the fact that thread 1 is (probably) waiting at a call
to nsSocketTransportService::Poll() (i.e. that it's polling a socket
for traffic) is just a coincidence.

Yes, crash logs are often very unreliable.  But in this case I
think I have a pretty reasonable explanation for Steven
Hollingsworth's crash.
The current approach is subtly changing the lifetime of objects that Cocoa widgets has no legitimate say over. That may change the teardown behavior of view hierarchies in ways that no embeddor would ever expect, and potentially create hard-to-find bugs as a result.

Can't
  if (type == NSLeftMouseDown)
    oldFirstResponder = [[self firstResponder] retain];
instead be
  if (type == NSLeftMouseDown && [[self firstResponder] isKindOfClass:[ChildView class]]
    oldFirstResponder = [[self firstResponder] retain];
giving exactly the behavior you want, without any impact on anything other than a ChildView?
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #6)
> Notice that this method swizzling is necessary in order to fix bug
> 413882 on embedders (like Camino).

I don't see any demonstration of that statement in the bugs you reference, only that one system that was based on becomeFirstResponder:/resignFirstResponder: didn't work. I don't see any explanation of how specifically the semantics of becomeFirstResponder: and resignFirstResponder: are different than this custom-rolled to determine exactly that.

I also don't see how you expect this code to work when the focus is changing for some reason other than a left mouse click. Not everyone uses a mouse for everything.
Assignee: nobody → joshmoz
Component: General → Widget: Cocoa
Flags: blocking1.9?
Product: Camino → Core
QA Contact: general → cocoa
Version: unspecified → Trunk
(In reply to comment #9)

I can't do that, because I have to worry about the case where
oldFirstResponder isn't a ChildView but newFirstResponder is.

The lifetime changes here are _very_ subtle, and only happen on a
mouse down.  I really think they're very unlikely to cause trouble.
But I'll keep your concerns in mind, in case they may be relevant to
some future bug.
(In reply to comment #10)

There's a long discussion of these issues in bug 403232 and bug
413882.

Basically, I know that becomeFirstResponder:/resignFirstResponder:
don't "work", but I don't yet know why.  I have, however, found other
ways to send NS_GOTFOCUS and NS_LOSTFOCUS events that resolve a lot of
bugs.  And I'm pretty sure that a more comprehensive isn't feasible in
the time we have left before the Firefox 3 release.

Short summary:  We have a hackish solution to sending NS_LOSTFOCUS and
NS_GOTFOCUS events when the focus changes between ChildView objects,
but (though hackish) it works very well.  We have _lots_ of other bugs
that are in far worse shape, and not very much time to fix them.

I think it's pretty clear that these other bugs should take
precedence.
(In reply to comment #11)
> I can't do that, because I have to worry about the case where
> oldFirstResponder isn't a ChildView but newFirstResponder is.

I'm aware of that. What line of code specifically do you believe would not work
correctly given my change in that scenario?

(In reply to comment #12)
> There's a long discussion of these issues in bug 403232 and bug
> 413882.

I know, and as I said, I read it. I was objecting to your claim that this solution is "necessary"; I don't agree that you have demonstrated that. I understand that you believe it to be expedient, but that's not the same thing. I accept that you have convinced the powers that be to go with your approach, but I don't have to accept the assertion that there is no solution using the established methods for doing this, rather than trying to reinvent it (and in doing so, create new issues like ignoring the keyboard as a primary input device).
Yes, the solution is expedient and quick, and works well ... which is the whole point.
>> I can't do that, because I have to worry about the case where
>> oldFirstResponder isn't a ChildView but newFirstResponder is.
>
> I'm aware of that. What line of code specifically do you believe would
> not work correctly given my change in that scenario?

How can I know?  It's an edge case.

I think the (very) small potential risk of (slightly) extending the
lifetimes of non-ChildView objects is outweighed by the (unknown and
largely unknowable) risk of ignoring this edge case.
By the way, Stuart, I see you've marked this as NEW.  Does that mean
you've seen the bug yourself?  And if so, do you have any more
information that may be relevant?
(In reply to comment #15)
> >> I can't do that, because I have to worry about the case where
> >> oldFirstResponder isn't a ChildView but newFirstResponder is.
> >
> > I'm aware of that. What line of code specifically do you believe would
> > not work correctly given my change in that scenario?
> 
> How can I know?  It's an edge case.

How is a deterministic path through the method based on the scenario you gave an edge case? Walk through my proposed version of the method in the case where the previous responder is not a ChildView, and tell me which part won't work, per your assertion in comment 11.

(In reply to comment #16)
> By the way, Stuart, I see you've marked this as NEW.  Does that mean
> you've seen the bug yourself?

I marked it new based on your comment 7, which I read as acknowledging circumstances that would cause this crash.
(In reply to comment #14)
> and works well

Could you explain how this approach works for keyboard users?
> Could you explain how this approach works for keyboard users?

It resolves a lot of focus problems.
Oops, I now realize that your suggestion in comment #0 _doesn't_ make
me fail to send an NS_GOTFOCUS event to a newFirstResponder that's a
ChildView object if oldFirstResponder isn't a ChildView object.

You could simply have said that.

I'll incorporate your change.
(In reply to comment #20)
> You could simply have said that.

That was implicit in my statement that it would give you exactly the behavior you want. Since you wouldn't tell me which part you thought wouldn't work when I asked, I don't see how you expected me to clarify for you that it would.

(In reply to comment #19)
> > Could you explain how this approach works for keyboard users?
> 
> It resolves a lot of focus problems.

A system whereby focus transfer is defined solely in terms of NSLeftMouseButton resolves problems for people who transfer focus with keyboard shortcuts?
> A system whereby focus transfer is defined solely in terms of
> NSLeftMouseButton resolves problems for people who transfer focus
> with keyboard shortcuts?

It's _not_ "defined solely in terms of NSLeftMouseButton".

NS_LOSTFOCUS and NS_GOTFOCUS are still also being sent in the context
of calls to nsChildView::SetFocus().
Sorry, I think I've gotten us off into the weeds by asking overly vague questions. Let me ask one concrete question that should help me understand this better.

(In reply to comment #12)
> Basically, I know that becomeFirstResponder:/resignFirstResponder:
> don't "work", but I don't yet know why.

If the swizzle approach works, but the bFR:/rFR: didn't, then clearly there are cases where your swizzle approach detects a focus change but bFR:/rFR: don't, or vice versa, but I can't find a description of a specific instance of how that happens in any of the discussion. Could you give an example of a call stack to the one that is being called where there is a discrepancy?
(In reply to comment #23)

That's a reasonable request ... but I just don't have the time to work
on it, and am unlikely to for quite a while.

In the meantime, if you wish to pursue this on your own, I suggest
logging calls to [ChildView becomeFirstResponder:] and [ChildView
resignFirstResponder:] and doing simple things like clicking on the
desktop and then on a browser window.  Or clicking around in different
places in the browser window (text-input fields and otherwise).

You'll see lots of "extra" calls.  Sometimes a single click will
focus, unfocus, and then refocus the same ChildView object.  This is
what I saw when I started looking into this several months ago, and
suggested that I might get good results if I didn't send NS_GOTFOCUS
and NS_LOSTFOCUS events on every call to becomeFirstResponder: and
resignFirstResponder:
I've posted a proposed patch for this bug (bug 417636) at bug 413882.
Flags: blocking1.9? → blocking1.9+
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → NEW
Priority: -- → P2
Gavin, why was this re-opened again? The patch in bug 413882 included the presumptive fix of comment 7.
Assignee: joshmoz → smichaud
It was reopened because I accidentally resolved it. If it actually should have been resolved, feel free to resolve it again!
Oh, that newer patch hasn't actually landed yet. So you were right the second time ;)

I have seen this a couple of times myself now, by the way.
I just landed a patch for bug 404433 which should also fix this bug.
Please reopen if it keeps happening.
Status: NEW → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
verified fixed using  Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9pre) Gecko/2008050621 Firefox/3.0pre - no crash when i click the "clear list" button.

--> Verified fixed
Status: RESOLVED → VERIFIED
The crash wasn't 100% reproducible, and the steps were for Camino; verification for this should really be based on whether or not crash in the swizzled sendEvent: continue to show up.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: