Closed Bug 453222 Opened 16 years ago Closed 3 years ago

nsCocoaWindow_NSWindow_sendEvent: changes exception behavior in code having nothing to do with core

Categories

(Core :: Widget: Cocoa, defect)

All
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: stuart.morgan+bugzilla, Unassigned)

Details

(Keywords: fixed1.9.0.9, regression)

Attachments

(2 files, 2 obsolete files)

I was testing a patch in Camino today that ended up throwing an exception in one of the preference panes. Instead of it being completely harmless, as this particular exception would have been, I got this crash: 2008-09-01 15:58:04.572 Camino[13554:10b] Mozilla has caught an Obj-C exception [NSInvalidArgumentException: *** -[NSCFString stringValue]: unrecognized selector sent to instance 0xa08b81a0] 2008-09-01 15:58:04.574 Camino[13554:10b] Generating stack trace for Obj-C exception... 2008-09-01 15:58:07.555 Camino[13554:10b] Stack trace: Looking up symbols in process 13554 named: Camino __raiseError (in CoreFoundation) objc_exception_throw (in libobjc.A.dylib) -[NSObject doesNotRecognizeSelector:] (in CoreFoundation) ___forwarding___ (in CoreFoundation) _CF_forwarding_prep_0 (in CoreFoundation) -[NonNegativeIntegerFormatter stringForObjectValue:] (in History) (History.mm:64) -[NSCell setFormatter:] (in AppKit) -[NSControl setFormatter:] (in AppKit) -[OrgMozillaCaminoPreferenceHistory mainViewDidLoad] (in History) (History.mm:91) -[NSPreferencePane loadMainView] (in PreferencePanes) -[MVPreferencesController selectPreferencePaneByIdentifier:] (in Camino) (MVPreferencesController.mm:266) -[MVPreferencesController(MVPreferencesControllerPrivate) selectPreferencePane:] (in Camino) (MVPreferencesController.mm:438) -[NSToolbarButton sendAction:to:] (in AppKit) -[NSToolbarButton sendAction] (in AppKit) -[NSToolbarItemViewer mouseDown:] (in AppKit) -[NSWindow sendEvent:] (in AppKit) -[NSWindow(MethodSwizzling) nsCocoaWindow_NSWindow_sendEvent:] (in libwidget_mac.dylib) (nsCocoaWindow.mm:2178) -[NSApplication sendEvent:] (in AppKit) -[NSApplication run] (in AppKit) NSApplicationMain (in AppKit) main (in Camino) (main.m:76) _start (in Camino) start (in Camino) This code would never have involved any core code, nor unwound past any C++ stacks. It created a crasher out of what would in this case have been a completely harmless exception, and the same will happen in most cases, since almost all UI-level code starts with a window event. I don't think that turning every exception in an embeddor into a crash is an acceptable situation.
Keywords: regression
Come to think of it, this should be a trivial fix; I can't think of any reason there would need to be a NS_OBJC_BEGIN_TRY_ABORT_BLOCK in that method at all, since its purpose is to prevent exceptions from bubbling into C++ callers, but the caller to sendEvent: is ObjC, not C++. If sendEvent: is being called anywhere in core, it would already need to be protected at that level. Marking wanted1.9.0.x? given that.
Flags: wanted1.9.0.x?
Attached patch remove abort handler (obsolete) — Splinter Review
Patch against CVS trunk; I assume it would apply to Hg trunk as well.
Attachment #336405 - Flags: review?(joshmoz)
Also see bug 442245 comment #11 (and attachment 327986 [details] [diff] [review]), which offers a better (more narrowly targeted) solution to this bug's problem.
I disagree with the characterization that changing the behavior of UI-level code (by eating the exception) that doesn't involve core is a better solution than not doing so. Ideally, the swizzling wouldn't happen at all, since it causes side-effects beyond its intended purpose; given that I'm stuck with the swizzling, I would like the side-effects to be minimized as far as possible. That means not interfering with the flow of exceptions that it has no business handling. Unless you can explain a reason I'm missing that it serves a necessary purpose in that method, I would really like it removed in 1.9.0.x
(In reply to comment #2) This patch only "works" in Camino (or other embedders) -- only in Camino does it make Objective-C exceptions non-fatal that occur in (or in code called from) the [NSWindow nsCocoaWindow_NSWindow_sendEvent:] "hook". In Firefox (and other non-embedders) you just crash in a different place (in nsAppShell::Run()). So, to fix this problem in non-embedders, you need to take the approach used by my patch for bug 442245.
(In reply to comment #4) > I disagree with the characterization that changing the behavior of > UI-level code (by eating the exception) that doesn't involve core is > a better solution than not doing so. Why the big fuss? In Camino (where your attachment 336405 [details] [diff] [review] patch does "work", after a fashion), that patch and my patch from bug 442245 comment #11 do (almost) exactly the same thing -- they both stop Objective-C exceptions in code called from [NSWindow nsCocoaWindow_NSWindow_sendEvent:] from being fatal. My patch from bug 442245 comment #11 does this by selectively disabling the crash-on-Objective-C-exception behavior. Your patch (from comment #2) selectively turns it off. The only significant difference between our patches is something you don't discuss: My patch (from bug 442245 comment #11) only stops Objective-C exceptions from being fatal if they occur in code called from the call to "super" in [NSWindow nsCocoaWindow_NSWindow_sendEvent:]. Your patch (from comment #2) stops Objective-C exceptions from being fatal in any code in (or called from) that method. This is why I said (in comment #3) that my patch is more narrowly targeted.
(In reply to comment #5) > In Firefox (and other non-embedders) you just crash in a different > place (in nsAppShell::Run()). > > So, to fix this problem in non-embedders, you need to take the > approach used by my patch for bug 442245. If you want a different behavior in Firefox, I have no objection to your handling the exception at the point where non-embedding code is calling into Cocoa in the first place. (In reply to comment #6) > The only significant difference between our patches is something you > don't discuss: My patch (from bug 442245 comment #11) only stops > Objective-C exceptions from being fatal if they occur in code called > from the call to "super" in [NSWindow > nsCocoaWindow_NSWindow_sendEvent:]. Your patch (from comment #2) > stops Objective-C exceptions from being fatal in any code in (or > called from) that method. My patch makes it so that core does not interfere in the normal flow of exception delivery in code that has, again, *nothing to do with core*. Your patch still changes it. I don't understand why you don't view that as a difference. I'm not sure how I can be more clear about this: I do not view changing the behavior of essentially every single exception in Camino to be an acceptable side effect of a method that is only there to watch focus changes.
(Alternately, you could fix this to my satisfaction by not doing the swizzling at all, which I objected to originally for exactly this kind of reason: it has strange side effects in code unrelated to its intended purpose. I'm still not happy about the fact that it was landed without a clear technical explanation for its necessity--and that I was told that it was up to me to prove that you *didn't* need to use runtime hackery rather than documented focus methods--under the of banner schedule pressure, and had hoped that it would be revisited anyway after the rush to ship 3.0.)
I just realized that perhaps we're talking at cross purposes because of the title of the bug. I used "crash" because that's what currently happens. The fundamental bug is: "NSWindow sendEvent: swizzling changes the behavior of exceptions in code having nothing to do with core" Any patch that fixes that issue is fine with me, but your fix in bug 442245 doesn't.
> swizzling changes the behavior of exceptions ... You appear to have a fundamental misunderstanding of what method swizzling does. It does _not_, in itself, have any side effects at all. Method swizzling (in effect) "hooks" a system call (just as programmers used to "hook" interrupts). There's no practical point in doing this unless the "hook" does change behavior in some way -- possibly entirely replacing the original system call (or interrupt), or possibly intervening before or after the system call (or interrupt) to do other things. But if the hook just calls the previous system call (or interrupt), it has no side effects.
I'm not sure why you routinely talk down to me as if I don't understand the platform despite a lot of evidence to the contrary; I'd really appreciate it if that could stop. If you put back a critical part of the line you are quoting that you edited out, what it says is "NSWindow sendEvent: swizzling changes the behavior of exceptions [...]", which is precisely what the current NSWindow sendEvent: swizzling does (and what it would still do after your proposed patch). If you want me to be more precise, "nsCocoaWindow's NSWindow sendEvent: swizzling changes the behavior of exceptions [...]" but since there is only one place in the codebase where sendEvent: is swizzled that I'm aware of, I thought I was being clear enough.
Summary: NSWindow sendEvent: swizzling creates crashes from exceptions in code having nothing to do with core → nsCocoaWindow_NSWindow_sendEvent: changes exception behavior in code having nothing to do with core (causes crashes)
Attachment #336405 - Flags: review?(joshmoz)
Comment on attachment 336405 [details] [diff] [review] remove abort handler I agree that we should not impede the flow of exceptions through the method we swizzle when our code is not involved. Obviously the ideal solution here is to not swizzle, but since we need to do that for now for focus reasons, swizzling is going to stay until we find another solution. Given that the swizzling is going to stay for now, we need to either not touch exceptions or at least not crash. I'd have to take a much longer look at not touching exceptions, but I'm convinced at this point that we should at least not crash. We should take smichaud's patch on bug 442245, which gets rid of the crash, but leave this bug open for the issue of eating exceptions that might have nothing to do with our code.
Flags: wanted1.9.0.x?
(In reply to comment #12) > I'd have to take a much longer look at not touching exceptions I'm not sure why given comment 1, but if you are concerned about it I'd like to suggest an alternate compromise for 1.9.0.x: remove all exception-handling from nsCocoaWindow_NSWindow_sendEvent:, then make another layer of swizzling that is installed *only* in the non-embedding case (there's already runtime detection of embedding vs. non-embedding in other parts of Cocoa Widgets) that handles the exception however you like. So non-embedding would go through the current method minus exception handling, and Camino gets the behavior that I want. Meanwhile Firefox et al. go through the new method that does nothing but wrap a call to the current method in a handler, getting your bug 442245 comment 11 behavior.
Hardware: PC → All
Blocks: 458205
Josh, could you comment on my suggestion in comment 13?
No longer blocks: 458205
I just crashed in a stack similar to this using 10.6, although the stack is a little bit further down in the report - http://crash-stats.mozilla.com/report/index/f6ae2555-a4de-43c5-8706-d36942090210
Marcia, this crash has nothing to do with this bug. And once again (like bug 474629 and bug 476561) it's in system code -- not in browser code. The crash is in "[self nsCocoaWindow_NSWindow_sendEvent:anEvent];", which is the call to super.
So, since it's now five months later with no response at all, let's try this a different way: does anyone have any objection to my proposal in comment 13? It's really frustrating to me that this regression in Camino remains unfixed simply because I can't get an answer.
Summary: nsCocoaWindow_NSWindow_sendEvent: changes exception behavior in code having nothing to do with core (causes crashes) → nsCocoaWindow_NSWindow_sendEvent: changes exception behavior in code having nothing to do with core
On second thought, I withdraw my question. I'm tired of begging for scraps in Cocoa Widget. When this swizzling code was first going in, I objected strenuously, and was assured that while it was going to go in because of the importance of Firefox 3's schedule, revisiting the code would be a high priority after Firefox 3 shipped. When it caused crashes in Camino the first time (not this bug), I objected again, pointing out that that was exactly the kind of thing that made it a bad idea, and I was once again assured that it was absolutely going to be looked at seriously after Firefox 3 shipped. Then we hit this bug, after Firefox 3 has shipped, which had basically two components: causing crashes (affecting both Firefox and Camino) and interfering with normal exception flow in all windows (desirable in Firefox apparently, a bug for embeddors like Camino). Surprise surprise, only the part that affected Firefox got fixed. The part that affects Camino has been ignored for *five months*, because no one can be bothered to even comment on any of the specific suggestions I've made, much less take up the promised revisiting of the whole section of code (and of course, even if I did get an answer it would be my job to write the fix, my job to prove that it's safe for Firefox--heaven forbid that there be even the slightest risk to Firefox, since regressing Firefox to fix a regression in Camino is just out of the question--and my job to beg for branch approval). All I have to show for this bug is yet again having to put up with Steven finding ways to be as patronizing to me as possible rather than actually acknowledging that there's a bug. It's pretty hard to see how that constitutes a priority on fixing the regressions that were introduced in Camino in the name of shipping Firefox. This is a pattern that has played out with minor variations in a number of Cocoa Widget bugs during the 1.9 timeframe: someone regressed Camino to improve Firefox, often in significant ways, and we got one of two answers when we objected: 1) This is short term, and we will definitely fix it after Firefox 3 ships. 2) If you don't like the regressions we are introducing into Camino, feel free to fix them yourself. Answer 1 has turned out generally not to be true. Answer 2 really, really sucks, as it means that the arbiter of what code goes into the component is not "is this the right thing for the platform" but, "which product can pay more people". A model where people paid primarily to improve one specific product are allowed to sacrifice other products and leave others to clean up the mess means that essentially Cocoa Widgets gets sold to the highest bidder. Unsurprisingly, Camino is not the highest bidder. And of course, 2 ignores the fact that it's harder to get changes accepted when they aren't for Firefox, and it's harder to get changes accepted once the code has become a stable (for Firefox) branch, so it takes way more effort for us to fix them then it took for the regression to be introduced in the first place. After months and months and months of this kind of thing, I'm tired of dealing with it. If someone else wants to try to get this bug fixed, best of luck to you; if not, perhaps it should be closed as WONTFIX/CAMINO REGRESSIONS DON'T MATTER (as essentially happened with the key handling regression). When Firefox started using Cocoa Widgets, the promise was that everyone would benefit. Instead, we benefitted if we happened to need exactly the same behavior Firefox needed, but when there was a conflict, we generally lost. Rather than sharing Cocoa Widgets, it pretty much feels like Firefox stole Cocoa Widgets from the Camino project and then graciously allowed us to use what we could as long as we didn't get in the way.
After fixing the crash reported here via the patch for bug 442245 it was my understanding that remaining issues were not seriously affecting Camino users or developers despite this instance of exception handling in Cocoa widgets clearly being incorrect. I hoped to resolve it when we got a chance to re-examine our focus implementation in general. When it became clear that re-doing focus in Cocoa widgets wasn't going to happen for Gecko 1.9.1 due to the extreme risk involved, I planned to make it a 1.9.2 priority. Around the time I was thinking about 1.9.2 priorities I learned about Neil Deakin's work in bug 178324. The patch for bug 178324 resolves this issue entirely by removing the method swizzling on native windows. Since Neil Deakin's work would not help Camino on the 1.9.0 branch, we should do what Stuart suggested in comment #13 on the 1.9.0 branch if we have a reason to patch there. That solution sounds fine to me. Like I said though, I was under the impression that there were only theoretical consequences for Camino once the crash was fixed, so maybe we don't need a 1.9.0 patch. I apologize for my poor communication here. I knew about the work that Neil was doing after he blogged about it in December but didn't mention it in this bug. Stuart's last question I simply lost in bugmail, I did not see it until this week.
I don't know what "a reason to patch" means. The current behavior is wrong (in practice, any time we do hit an exception, not just in theory), which I thought was the criteria for when something should be fixed. If that just means "does it meet the bar for landing on a branch", I don't know. But see above under "much harder to fix regressions than it was for them to be introduced" and "tired of begging". I've been beating my head against walls for too long to have the energy left to continue with this particular wall. Fix it, or don't. It's your call.
(In reply to comment #20) > If that just means "does it meet the bar for landing on a branch", I don't know. That's all it means.
Attachment #336405 - Flags: superreview?(roc)
Attachment #336405 - Flags: review+
Comment on attachment 336405 [details] [diff] [review] remove abort handler r+ for Gecko 1.9.0 and Gecko 1.9.1, looking at it again this seems like the right way to go. If we generate an obj-c exception here it will bubble into whatever called sendEvent: on an NSWindow, and if that ultimately rises back to Gecko's code we should have guards there. No point in changing this on trunk right now since Neils focus patch should remove this swizzling altogether.
Josh, you've forgotten something: As I say in comment #5, this patch "works" in Camino, but causes problems in non-embedders (like Firefox) -- you'll reopen bug 442245.
I don't think he's forgotten that: > if that ultimately rises back to Gecko's code we should have guards there. which is the argument I made in comment 1, and again at the beginning of comment 7. If you need a guard for bug 442245 it should be at a C++/Obj-C boundary elsewhere in core code.
But the patch as it stands will reopen bug 442245.
In non-embedders.
Steven: since you worked on bug 442245 and I imagine are the most familiar with the issues there, could you help amend attachment 336405 [details] [diff] [review] so it won't regress bug 442245? Is it indeed as simple as adding guards at the boundary of Gecko and Cocoa? (or at the C++/Obj-C boundary? Not sure which would be better)
Attachment #336405 - Flags: superreview?(roc) → superreview+
(In reply to comment #27) I will be blunt: There's no reason to think attachment 336405 [details] [diff] [review] will make any difference to Camino. So it has no point, and should be rejected. Finding other places to add guards will be very difficult. As I say in comment #5, the next set of guard macros "up the stack" (ones that crash on Objective-C errors, and don't just log them) is in nsAppShell::Run(), where they govern a large chunk of all the activity that takes place while the browser is running.
(Following up comment #28) This mess isn't my doing. It's only fair that others should have to clean it up.
We can avoid regressing 442245 on the branch, lets give it a shot. This is a blast from the past but it may be a simple and practical way to resolve this on the 1.9.0 branch. The logic behind not having app-specific ifdefs in this code doesn't realistically matter much in this situation.
Attachment #362634 - Flags: review?(stuart.morgan+bugzilla)
Attachment #362634 - Flags: review?(stuart.morgan+bugzilla) → review+
Comment on attachment 362634 [details] [diff] [review] 1.9.0 branch fix, v1.1 r=smorgan The comment is confusing here, if that matters. It sounds like it's saying "Don't avoid crashing Camino", but there's no crash to avoid at this level.
Attachment #362634 - Flags: superreview?(roc)
Attachment #336405 - Flags: review?(stuart.morgan+bugzilla)
Comment on attachment 336405 [details] [diff] [review] remove abort handler This patch doesn't apply cleanly. The NS_OBJC_BEGIN_TRY_ABORT_BLOCK was replaced a while ago with NS_OBJC_BEGIN_TRY_LOGONLY_BLOCK. Can you attach a new patch that applies cleanly? Assuming you still want to remove the LOGONLY guard, it will regress bug 442245 (except we'll be crashing much higher up in the call stack). I think I'd much prefer your suggested solution which swizzles in the LOGONLY guard for non-embedders so we don't regress them.
Attachment #336405 - Attachment is obsolete: true
Attachment #336405 - Flags: review?(stuart.morgan+bugzilla)
Or, if Steven's hunch in bug 178324 comment 72 is right, we could just handle this in ToolbarWindow.
Peter - your proposals may turn out to be good options for the trunk if Neil does not remove swizzling altogether after all, but for the 1.9.0 branch, which we need to resolve separately and first of all, I think a minimal impact approach is best and I think my patch is as about as minimal as we could make it.
Josh: my swizzling suggestion in comment 32 can be done regardless of what Neil does. And comment 33 would look something like this.
Comment on attachment 362994 [details] [diff] [review] Limit LOGONLY for sendEvent: to users of ToolbarWindow. With that comment changed, of course ;)
Comment on attachment 362634 [details] [diff] [review] 1.9.0 branch fix, v1.1 Gross, but as a branch fix to make the pain go away, OK.
Attachment #362634 - Flags: superreview?(roc) → superreview+
I think Peter's patch is a good solution. But on second thought I'd like to use it on all the nsCocoaWindow NSWindow subclasses (none of which are used by embedders, or at least by Camino).
Attachment #363390 - Flags: review?(joshmoz)
Attachment #363390 - Flags: superreview?(roc)
Attachment #363390 - Flags: review?(joshmoz)
Attachment #363390 - Flags: review+
Attachment #363390 - Flags: superreview?(roc) → superreview+
Attachment #363390 - Flags: approval1.9.0.8?
Attachment #362994 - Attachment is obsolete: true
The above patch is the right fix for the trunk too, no? And it obsoletes attachment 362634 [details] [diff] [review].
(In reply to comment #39) Yes, I think so.
Attachment #363390 - Flags: approval1.9.0.8? → approval1.9.0.8+
Comment on attachment 363390 [details] [diff] [review] Another 1.9.0 branch fix (expand jag's patch to cover all non-embedder NSWindow subclasses) Approved for 1.9.0.8, a=dveditz for release-drivers
checked in for 1.9.0.8 Checking in widget/src/cocoa/nsCocoaWindow.mm; /cvsroot/mozilla/widget/src/cocoa/nsCocoaWindow.mm,v <-- nsCocoaWindow.mm new revision: 1.151; previous revision: 1.150 Leaving the bug open for a trunk/1.9.1 resolution.
Keywords: fixed1.9.0.8
On trunk we just want to do the same thing, right? Steven, can you create that patch? Would you like me to?
Assignee: joshmoz → nobody

Hi Jag, Does this issue still occur ? I know its been a long time ago but can we close this issue ? or does it still happen ?

Flags: needinfo?(jag+mozilla)

The swizzled method in question was removed in https://hg.mozilla.org/mozilla-central/rev/cabb8925dcd3.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(jag+mozilla)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: