Closed
Bug 398499
Opened 17 years ago
Closed 17 years ago
Crash in DispatchEventToHandlers(EventTargetRec*, OpaqueEventRef*, HandlerCallRec*
Categories
(Core Graveyard :: Widget: Mac, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: marcia, Assigned: smichaud)
References
()
Details
(Keywords: crash, regression, verified1.8.1.12, Whiteboard: [radar:5466547] 1.8.1.x-only crash)
Attachments
(5 files, 1 obsolete file)
81.63 KB,
text/plain
|
Details | |
157.42 KB,
text/plain
|
Details | |
30.22 KB,
text/plain
|
Details | |
71.81 KB,
text/plain
|
Details | |
1.29 KB,
patch
|
mark
:
review+
roc
:
superreview+
dveditz
:
approval1.8.1.12+
|
Details | Diff | Splinter Review |
Seen while testing Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.8pre) Gecko/2007100307 BonEcho/2.0.0.8pre.
STR:
1. Load the URL. Crash.
Apple Crash report attached.
Reporter | ||
Comment 1•17 years ago
|
||
Additional STR:
1. Change focus to another application, such as Thunderbird and put that window on top. Then switch back to FF and that should generate the crash.
Reporter | ||
Comment 2•17 years ago
|
||
I checked the trunk and I do not get this crash when following the same steps. So far now this is 1.8 branch only.
Reporter | ||
Comment 3•17 years ago
|
||
nominating, not sure how critical this is.
Severity: normal → critical
Flags: blocking1.8.1.8?
Comment 4•17 years ago
|
||
Not DOM Events issue. Maybe Widget: Mac?
Assignee: nobody → joshmoz
Component: DOM: Events → Widget: Mac
QA Contact: events → mac
Reporter | ||
Comment 5•17 years ago
|
||
Attachment #283490 -
Attachment is obsolete: true
Reporter | ||
Comment 6•17 years ago
|
||
BTW thanks to Jeremy for pointing me to this bug, he should get the credit for finding it.
Updated•17 years ago
|
Flags: blocking1.8.1.8? → blocking1.8.1.9?
Comment 7•17 years ago
|
||
1.8 is widget:mac (Carbon) and trunk is Cocoa. This crashes in a Carbon system library (which I'm guessing is not used from Cocoa apps) and requires interaction with another app so it makes sense that it crashes only 1.8 and makes me think it's triggered in or related to widget:mac. [mid-air with smaug]
(In reply to comment #1)
> Additional STR:
> 1. Change focus to another application, such as Thunderbird and put that window
> on top. Then switch back to FF and that should generate the crash.
No need to switch back. AFAICT this also requires mouse activity over any part of the moz app which is covered by another app.
(In reply to comment #3)
> nominating, not sure how critical this is.
fwiw, I'm 98% sure that a stack in my local CrashReporter (apple) log from march was *not* on a MediaWiki site. And that happens to be the first Firefox crash in the log with HIToolbox 1.4.9 (the previous crash was 1.4.8 and a different stack) So this is probably not a MediaWiki specific issue.
The CrashReporter log from marcia has a truncated stack (as does my own on OS X 10.4.10). I'll attach a my own gdb trace for 10.4.10. I'll walk Marcia through getting a gdb trace if anyone wants it for Leopard.
(In reply to comment #6)
> BTW thanks to Jeremy for pointing me to this bug, he should get the credit for
> finding it.
btw, thanks to Marcia to for reproducing on Leopard! :) It was hard to miss on branch when browsing MediaWiki diffs :-{
<rdar://problem/5466547>
Whiteboard: [radar:5466547]
Can somebody confirm that this still happens reliably on a fully software-updated Mac OS X 10.5 build 9a559? Does it happen reliably on this build? How likely are users to hit this bug?
Comment 9•17 years ago
|
||
(In reply to comment #8)
> Can somebody confirm that this still happens reliably on a fully
> software-updated Mac OS X 10.5 build 9a559? Does it happen reliably on this
> build? How likely are users to hit this bug?
Are you having trouble reproducing?
Marcia repro'd on that build and it was prolly up to date but I'll have to double check with her. (there's prolly enough info in the report to determine the answer if you knew enough about their build changes but I don't :P) See attachment 283577 [details] for a CrashReporter log on Leopard. I see I forgot to attach my gdb trace (10.4.10) after my last comment; coming right up.
This reliably happens on 10.4.10 on many tens of Mediawiki generated pages (mostly but maybe not exclusively diff pages, I forget) and therefore will likely effect the majority of OS X users that are more than just casual editors on any Mediawiki wiki. (Haven't tried to reduce to MW specific versions yet)
FWIW, I'm 95% certain a matching stack in my log from several months back is *not* from Mediawiki.
from a message I just sent to an apple guy that responded to the radar:
> btw, i'd love to get some numbers on how many people are actually crashing
> from this using statistics from talkback but I can't get it to run in rosetta
> (I even tried invoking oah/traslate directly and it segfaulted) ISTR firefox
> once worked in rosetta on this machine and I hear it still works for other
> people so probably a local issue. Later this week or next I'll try again to
> get ahold of a PPC tester / find a public PPC machine / try with rosetta on
> another box. Also, I know CrashReporter generates lots of reports and it
> would be nice to get stats from those too.
Comment 10•17 years ago
|
||
Comment 11•17 years ago
|
||
CrashReporter log of crashing on the same page under Rosetta. Note HIToolbox doesn't even show up in the stack.
Reporter | ||
Comment 12•17 years ago
|
||
I just tested this again on the latest Leopard seed 9a559 on a PPC mac and I crash hitting that URL (it is a bit of a delayed crash). Talkback fires, but I am having trouble submitting the report due to a network error (perhaps a separate bug). I have an Apple report from this PPC crash which I can attach.
Assignee | ||
Comment 13•17 years ago
|
||
I've reproduced this crash on OS X 10.4.10 (on a PPC Mac). I'll test
later on Leopard, but it's already pretty clear that this is a bad
bug, and not Leopard-specific.
By the way, Jeremy, what your gdb stack (attachment 285198 [details], comment
#10) has at it's bottom is truly bizarre -- it looks like you've quit
Firefox! Something similar (and equally bizarre) shows up in a stack
trace posted for a seemingly related bug (in 1.8-branch Firefox,
attachment 285380 [details], bug 400291). I now suspect that the bottom parts
of both stacks are corrupt.
If possible, Jeremy, please do a debug build of 1.8-branch Firefox and
test with that -- because it will have debug symbols, gdb and
crashreporterd will be less likely to generate corrupt stacks from it.
(I'll do the same, of course ... but I probably won't have time until
tomorrow.)
Assignee | ||
Comment 14•17 years ago
|
||
> Something similar (and equally bizarre) shows up in a stack trace
> posted for a seemingly related bug ...
I should have said "seemingly unrelated". Aside from the (apparent)
stack-trace corruption, this bug doesn't seem related to bug 400291.
Assignee | ||
Comment 15•17 years ago
|
||
I see this bug (on OS X 10.4.10) on all the Firefox 2.0.0.X versions
that I tested, but not on any of the Firefox 1.5.0.X versions. So it
seems to have been introduced (a long time ago) on the 1.8 branch (as
opposed to the 1.8.0 branch).
Assignee | ||
Comment 16•17 years ago
|
||
This bug is a regression that took place on the 1.8 branch between the
2006-05-30-07-mozilla1.8 nightly (which doesn't have the bug) and the
2006-05-31-08-mozilla1.8 nightly (which does).
This regression window almost certainly implicates the single patch
that was landed as a fix for bug 106695 and bug 339640:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-05-30+12%3A34&maxdate=2006-05-30+12%3A34&cvsroot=%2Fcvsroot
Keywords: regression
Assignee | ||
Comment 17•17 years ago
|
||
This bug isn't Leopard-specific.
Summary: [10.5] Crash in DispatchEventToHandlers(EventTargetRec*, OpaqueEventRef*, HandlerCallRec* → Crash in DispatchEventToHandlers(EventTargetRec*, OpaqueEventRef*, HandlerCallRec*
Assignee | ||
Comment 18•17 years ago
|
||
Here's a stack trace made using a build with debug symbols (built from
current 1.8-branch source code).
It includes both a gdb trace of the top and bottom of the main thread
stack, and a crashreporterd trace of all the threads' stacks.
Assignee | ||
Comment 19•17 years ago
|
||
> This regression window almost certainly implicates the single patch
> that was landed as a fix for bug 106695 and bug 339640:
I was wrong about this. The actual culprit was the single patch
(landed slightly later) to fix bug 54488, bug 339370 and bug 339482:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-05-30+12%3A43%3A00&maxdate=2006-05-30+12%3A43%3A00&cvsroot=%2Fcvsroot
Assignee | ||
Comment 20•17 years ago
|
||
Here's a patch that fixes this bug.
Contrary to appearances, the bug isn't caused by an infinite recursion
... just by a very large non-infinite one :-)
This bug's URL has a very large number of nsMacControl objects on its
page. For each of these, a handler for kEventWindowActivated and
kEventWindowDeactivated Carbon events is installed on the window to
which the control belongs. Currently (before my patch), this handler
calls CallNextEventHandler() to invoke all previously installed
handlers. But if there are n nsMacControl objects in a page/window,
using CallNextEventHandler() makes the handler recurse n-1 times.
This bug's URL has so many controls that this recursion (apparently)
causes a stack overflow.
Calling CallNextEventHandler() from this handler servers no purpose,
so I've eliminated the call.
Even with my patch, this bug's URL still redraws incredibly slowly,
and sometimes fails to redraw at all. But that's another problem :-(
Attachment #286306 -
Flags: review?(joshmoz)
Assignee | ||
Comment 21•17 years ago
|
||
Needless to say, this isn't an Apple bug.
Jeremy, please let them know that.
Attachment #286306 -
Flags: review?(joshmoz) → review?(mark)
Comment 22•17 years ago
|
||
Comment on attachment 286306 [details] [diff] [review]
Fix
This is OK with me. Make sure that you test that the sheet show/hide stuff in nsMacWindow still works in conjunction with controls, since I know that it uses {kEventClassWindow, kEventWindowActivated} too.
Attachment #286306 -
Flags: review?(mark) → review+
Assignee | ||
Comment 23•17 years ago
|
||
> Make sure that you test that the sheet show/hide stuff in
> nsMacWindow still works in conjunction with controls, since I know
> that it uses {kEventClassWindow, kEventWindowActivated} too.
Please give me some pointers about how to do this -- examples of URLs
(or browser menus/functions) that use this stuff, and so forth.
Comment 24•17 years ago
|
||
It's probably enough if you bring up a Bugzilla page and then try to navigate through all of the prefs sheets in Google Toolbar.
Comment 25•17 years ago
|
||
(In reply to comment #21)
> Jeremy, please let them know that.
Done.
Assignee | ||
Comment 26•17 years ago
|
||
> It's probably enough if you bring up a Bugzilla page and then try to
> navigate through all of the prefs sheets in Google Toolbar.
Thanks, Mark.
I did this (while occasionally switching to another browser window and
back, or to another app and back), and saw no problems.
Assignee | ||
Comment 27•17 years ago
|
||
Comment on attachment 286306 [details] [diff] [review]
Fix
Roc, Jeremy has already pointed out the following error in my comment.
I'll fix it when I land the patch:
+ // actually "consume" an activate or deactive event here, there should be
should be
+ // actually "consume" an activate or deactivate event here, there should be
Attachment #286306 -
Flags: superreview?(roc)
Comment 28•17 years ago
|
||
Comment on attachment 286306 [details] [diff] [review]
Fix
(In reply to comment #27)
As long as we're enumerating errata here, I'll add to the pile:
>+ // for each of them. So CallNextEventHandler() recurses through this
>+ // handler once for each "additional" control, and if there are too many
>+ // controls a stack overflow can result (bmo bug 398499). Since we never
This sentence could probably use an extra comma or to be split into 2 sentences.
>+ // no problem alwaye returning eventNotHandledErr.
s/alwaye/always/
Assignee | ||
Comment 29•17 years ago
|
||
Roc? :-)
Comment on attachment 286306 [details] [diff] [review]
Fix
sorry, I lost track of this one somehow
Attachment #286306 -
Flags: superreview?(roc) → superreview+
Assignee | ||
Comment 31•17 years ago
|
||
Comment on attachment 286306 [details] [diff] [review]
Fix
This patch's risk is minimal (the call to CallNextEventHandler()
didn't serve any useful purpose).
It's benefits are perhaps moderate (very few web pages will have
enough controls to trigger the crash my patch fixes). Or you could
argue that, since Wikipedia is an important website, the patch's
benefits are substantial.
Attachment #286306 -
Flags: approval1.8.1.10?
Personally I don't think we should take this on branch.
Assignee | ||
Comment 33•17 years ago
|
||
Oh? Why not?
(The problem is branch-only, by the way.)
Because we typically only take crash fixes on branch if they have security implications, and AFAIK this doesn't. We might make exceptions for crashes that are regressions from other branch changes, or for crashes that are very frequent, but this doesn't seem to meet those criteria either.
> We might make exceptions for crashes that are regressions from other branch
> changes
What I mean specifically is that we might take a fix for a crash that was introduced since Firefox 2.0.0.0 shipped. But it seems this bug was in Firefox 2.0.0.0.
Assignee | ||
Comment 36•17 years ago
|
||
> Because we typically only take crash fixes on branch if they have
> security implications, and AFAIK this doesn't.
No, I agree this crash doesn't have security implications.
But this policy seems _way_ too restrictive ... at least in my
opinion. Even if you sometimes make exceptions to it.
Though it'd be another matter if this bug didn't have a patch.
Given that a patch exists, I'm not sure it makes sense to
distinguish between "primordial" crash bugs and regressions.
(By the way, one reason this bug got a lot of attention is that at
first it seemed to be Leopard-only.)
We've had some pretty bad regressions in the 2.0.0.x series. You could argue that we're not restrictive *enough*.
> Given that a patch exists, I'm not sure it makes sense to
> distinguish between "primordial" crash bugs and regressions.
It does make sense. Crashes that 2.0.0.x has always incurred get worked around by Web developers and users. New crashes are a lot more painful.
Assignee | ||
Comment 38•17 years ago
|
||
You also need to remember that this patch is close to zero risk --
basically a no-brainer.
But it's out of my hands ... do with it what you will :-)
Now that we have a lot of test automation on trunk, the situation for the Firefox 3 branch should be a lot better, so we may want to relax our policy there. Or not. But anyway, this is not the right place for that discussion.
Assignee | ||
Comment 40•17 years ago
|
||
> We've had some pretty bad regressions in the 2.0.0.x series. You
> could argue that we're not restrictive *enough*.
Regressions caused by narrowly-scoped fixes for crash bugs?
I'd be surprised to hear there were more than one or two ... or even
that many.
I _do_ think your approach is correct for new features ... but I'd bet
that almost all of these regressions you talk about came from adding
new features.
> Regressions caused by narrowly-scoped fixes for crash bugs?
Yes.
> I'd be surprised to hear there were more than one or two ...
Be surprised.
> I'd bet that almost all of these regressions you talk about came from adding
> new features.
None of the ones I'm aware of came from adding new features (not surprising, since we haven't added any new features in the 2.0.0.x series except, arguably, some Vista compatibility stuff).
Note that in many of the 2.0.0.x releases we fixed > 50 security bugs each. It's not surprising that some of these fixes caused regressions, even though we each one as safe as we could.
Assignee | ||
Comment 43•17 years ago
|
||
> Note that in many of the 2.0.0.x releases we fixed > 50 security
> bugs each.
Yes, I can see that security bugs deserve priority. And it's hard to
keep track of the implications of that many different changes. And
they inevitably (and legitimately) draw attention away from other
bugs (even crash bugs).
And I grant that, even without the crash, 1.8-branch Firefox behaves
very badly on this bug's URL.
Comment 44•17 years ago
|
||
Comment on attachment 286306 [details] [diff] [review]
Fix
1.8.1.10 isn't the release to take this in (short, focused). We'll look later.
Attachment #286306 -
Flags: approval1.8.1.10? → approval1.8.1.11?
Updated•17 years ago
|
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x-
Whiteboard: [radar:5466547] → [radar:5466547] 1.8.1.x-only crash
Comment 45•17 years ago
|
||
Comment on attachment 286306 [details] [diff] [review]
Fix
approved for 1.8.1.10, a=dveditz for release-drivers
Attachment #286306 -
Flags: approval1.8.1.12? → approval1.8.1.12+
Comment 46•17 years ago
|
||
I meant 1.8.1.12
Assignee | ||
Comment 47•17 years ago
|
||
Landed on 1.8 branch.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 48•17 years ago
|
||
Steven, please use the fixed1.8.1.* keyword (12, in this case) after checking in on the branch.
Keywords: fixed1.8.1.12
Assignee | ||
Comment 49•17 years ago
|
||
I'll try to remember next time :-)
Comment 50•17 years ago
|
||
Verified in 1.8.1.12 with Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.12pre) Gecko/2008012208 BonEcho/2.0.0.12pre.
Status: RESOLVED → VERIFIED
Keywords: fixed1.8.1.12 → verified1.8.1.12
Updated•17 years ago
|
Flags: blocking1.8.1.12?
You need to log in
before you can comment on or make changes to this bug.
Description
•