Closed Bug 682208 Opened 13 years ago Closed 11 years ago

crash [@ -[ChildView maybeRollup:] ]

Categories

(Core :: Widget: Cocoa, defect)

x86_64
macOS
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
mozilla21
Tracking Status
firefox18 --- affected
firefox19 + wontfix
firefox20 + unaffected
firefox21 --- unaffected

People

(Reporter: bzbarsky, Assigned: smichaud)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(3 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-2fb93a36-2fc7-41c8-81b2-14d042110825 .
============================================================= 

This is a null-deref on this line:

  2759        gRollupListener->ShouldRollupOnMouseWheelEvent(&shouldRollup);

that I hit while two-finger-scrolling around on Google code search.  Could gRollupListener have become null between the line 2751 null-check and this code for some reason?
> Could gRollupListener have become null between the line 2751
> null-check and this code for some reason?

I don't see how this could have happened.  So this crash stack is
probably misleading, and there's probably more going on there than
meets the eye.

There are a small number of other crashes [@ -[ChildView
maybeRollup:]], in different FF versions and on different versions of
OS X.  Most of them are null dereferences, but not all of them.

https://crash-stats.mozilla.com/query/query?product=Firefox&version=ALL%3AALL&platform=mac&range_value=1&range_unit=weeks&date=08%2F31%2F2011+15%3A20%3A37&query_search=signature&query_type=contains&query=maybeRollup&reason=&build_id=&process_type=any&hang_type=any&do_query=1

I'll look into this when I have the chance.
Assignee: nobody → smichaud
Crash Signature: [@ -[ChildView maybeRollup:]] → [@ -[ChildView maybeRollup:] ]
Summary: crash [@ -[ChildView maybeRollup:]] → crash [@ -[ChildView maybeRollup:] ]
(Following up comment #1)

>> Could gRollupListener have become null between the line 2751
>> null-check and this code for some reason?
>
> I don't see how this could have happened.

But unless the crash stack is misleading I think it *must* have
happened.

I still can't figure out why.  But while digging around I discovered
an apparent mistake earlier in -[ChildView maybeRollup:] --
gMenuRollup->GetSubmenuWidgetChain() is called twice.

Maybe fixing that will make these crashes go away.  If it doesn't, the
next thing to try is a second null check on gRollupListener.
Neil, your patch for bug 404766 added a call to GetSubmenuWidgetChain(), instead of just replacing the existing one.  This looks like a mistake.
Attachment #585799 - Flags: review?
Attachment #585799 - Flags: review? → review?(enndeakin)
Comment on attachment 585799 [details] [diff] [review]
Remove extra call to GetSubmenuWidgetChain()

Is gRollupListener null or just invalid? gRollupListener shouldn't be changed until at least Rollup is called near the end of the method.
Attachment #585799 - Flags: review?(enndeakin) → review+
Comment on attachment 585799 [details] [diff] [review]
Remove extra call to GetSubmenuWidgetChain()

Landed on mozilla-inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/95b9cb1a8e2d
> Is gRollupListener null or just invalid?

I can't reproduce this bug, so I don't know for sure.  But most of the Socorro crash logs call them null dereferences.

In a few weeks I'll check again to see if these crashes persist.  If they do I'll reopen this bug and try to decide what to do next.

These crashes aren't high volume, so fixing them isn't particularly urgent.
Whiteboard: [inbound]
https://hg.mozilla.org/mozilla-central/rev/95b9cb1a8e2d
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
I don't think this is fixed.
Assignee: smichaud → joshmoz
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [inbound]
> I don't think this is fixed.

Why not?
Attached patch another null check v1.0 (obsolete) — Splinter Review
I don't know how this is getting to be NULL but since the crashes are still happening and we don't have a better idea I think we should try this.
Attachment #592286 - Flags: review?(bgirard)
There are a couple of reports from nightly builds after your patch in our crash database.

I'm not very confident that my patch will fix the problem, the stacks seem fishy and I can't tell you exactly why my patch would work for sure, but maybe we should consider it in case it helps.
Attachment #592286 - Flags: review?(bgirard) → review+
Attachment #592286 - Attachment is obsolete: true
Attachment #592286 - Flags: review+
Assignee: joshmoz → nobody
I think I just hit this in bp-40d19efd-3f8c-4698-ae9d-e0d502120829. Both myself and the only other report with a comment noticed that the crash was related to a drop-down being open.
I was looking at the crash reports for this again, which caused me to audit usage of gRollupListener. I found two places where we aren't quite as safely null checking as we should be. In both cases we check that gRollupListener != nullptr some time before we actually user gRollupListener - it could have changed between the check and the usage. How likely that is given the calls inbetween, I do not know.
Attachment #672269 - Flags: review?(smichaud)
None of the code inside there would change gRollupListener, except for the Rollup() call.
I would tend to agree with you, but given that this crash still exists this seems like the assumption most likely to be flawed.
Comment on attachment 672269 [details] [diff] [review]
further safety null checks

Yeah Neil - I'm going to drop the request. The more I think about it the more I find it hard to believe this would help. Do you have any theories here?
Attachment #672269 - Flags: review?(smichaud)
It's #9 top browser crasher in 18.0 and #6 in 19.0b1 on Mac OS X.

More reports at:
https://crash-stats.mozilla.com/report/list?signature=-[ChildView+maybeRollup%3A]
Keywords: topcrash
Hardware: x86 → x86_64
(In reply to Steven Michaud from comment #21)
> Lots of URLs at bug 832438 comment #0.

Adding qawanted to test around these URLs on a Mac (especially around dropdown boxes).

Steven - were there any recent (FF18) changes in this area that can be identified through code inspection? Sounds like the crash has spiked significantly.
Assignee: nobody → smichaud
I tried testing some of the URLs on two different machines and so far I haven't been able to reproduce the crash. I focused on some of the shopping sites where people described in the comments changing the view to show more items, show a different alpha order, etc.
> Steven - were there any recent (FF18) changes in this area that can
> be identified through code inspection? Sounds like the crash has
> spiked significantly.

These crashes have been around forever.  I suspect they've only become
more prominent because we've fixed so many other, worse crashes.

For years I've meant to take a closer look at them, and their current
prominence gives me an excuse to do so.  I'll try to start sometime
next week (or the week after).
(In reply to Steven Michaud from comment #24)
> I suspect they've only become more prominent because we've fixed so many other, worse
> crashes.
Because also Mac OS X 10.5 has been desupported which was a cause of many crashes.
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:20.0) Gecko/20130121 Firefox/20.0
Build ID:20130121042018

Firefox crashed for me with this signatures [@ -[ChildView maybeRollup:] ] while I was changing filters in the Sort By dropdown:
Steps to reproduce:
1. Navigated to http://www.yankeecandle.com/yankee-candles
2. From the left side select different candle categories
3. While the candle category is loading,  change the Sort By filter - at some point Firefox crashed 
https://crash-stats.mozilla.com/report/index/bp-4843ea96-b9e1-4766-99aa-3f53e2130122

Tried to reproduce the same crash again using the same steps, but I got a different signature: 
[@ _ZThn184_N22nsComboboxControlFrame15GetRollupWidgetEv ] 
https://crash-stats.mozilla.com/report/index/bp-6efe5c0f-b2f6-4044-8f4c-3b9c62130122
https://crash-stats.mozilla.com/report/index/bp-cfaf2123-a5e5-4e80-8acb-46c132130122
Simona, I tried your STR (from comment #26) and couldn't reproduce the crashes.  Maybe it would help to have a fuller description of exactly what you did.  I find that the "Sort By" drop-down box often disappears before I have a chance to make a selection in it (and presumably trigger the crash).
(In reply to Steven Michaud from comment #27)
> Simona, I tried your STR (from comment #26) and couldn't reproduce the
> crashes.  Maybe it would help to have a fuller description of exactly what
> you did.  I find that the "Sort By" drop-down box often disappears before I
> have a chance to make a selection in it (and presumably trigger the crash).

Repeat steps 2 and 3 from Comment 26 multiple times in order to keep the website continuously loading. That did the trick for me.
> Repeat steps 2 and 3 from Comment 26 multiple times in order to keep
> the website continuously loading. That did the trick for me.

Not for me -- even 20-30 times in a row.

You must be doing something I'm not.  Maybe a very detailed
description of exactly what you do to crash will help.

Failing that, I may still be able to glean something from your crash
logs.  But I'm going to put that off til next week.
We're about to go to build with Beta 4. There's a good chance that this will go unfixed for FF19, unless we find a critical reason to land later in the cycle.
This is now (in the last week) the #16 topcrasher in FF 19b3 (with 9 crashes).  So its frequency seems to be dropping (I can't begin to guess why).

I'll still try to find time this week to pore over this bug's crash logs.  But it sounds like it's getting less urgent.
It's #10 browser crasher on Mac OS X in 18.0.1, #11 in 19.0b3, and #6 in 20.0a2.
If you consider bug 806820 fixed, it moves up two or three positions.
(In reply to Steven Michaud from comment #29)
> > Repeat steps 2 and 3 from Comment 26 multiple times in order to keep
> > the website continuously loading. That did the trick for me.
> 
> Not for me -- even 20-30 times in a row.
> 
> You must be doing something I'm not.  Maybe a very detailed
> description of exactly what you do to crash will help.

Please see the screencast for more details: http://screencast.com/t/zWwOm6jB
See Also: → 836236
Scoobidiver, where do you get your information about what the Mac topcrashers are on a particular branch or "release"?

Here's how I do it:

1) Visit https://crash-stats.mozilla.com/products/Firefox.
2) Click on "Advanced search".
3) Choose one or more versions under "Version".
4) Choose "Mac OS X" as the "Operating System".
5) Click the "Filter crash reports" button.

You get a list of the top 100 crashes, ranked by number of crashes.

I did this again just now for "Firefox 19.0b3", and "-[ChildView maybeRollup:]" is #19 in the list.
My list does include both crashes and "crash hangs".  Did you remove the "crash hangs" from your list?
(In reply to comment #33)

Thanks for the screencast.  It's given me more information, but still not enough.

The most interesting (and puzzling) part is where you go back and forth between clicking on different items in the Candles list and clicking on "Sort By".

Or at least it looks like you're just clicking on "Sort By", and not "changing the Sort By filter".

What's puzzling is that (in your screencast) the "Sort By" drop down box opens and closes multiple times each time you "click" on "Sort By" (or whatever it is you're doing).  When I click on "Sort By" (while I'm going back and forth between it and items in the Candles list) it never opens and closes more than once.

Please describe, in great detail, and very precisely, exactly what you do when you appear (in the screencast) to "click on" "Sort By".  Also tell me exactly what kind of pointing device you're using -- a trackpad? a mouse?  If it's a mouse, what kind?  Does the mouse (if that's what your using) have a scrollwheel, and if so are you using it?
(In reply to Steven Michaud from comment #34)
> Scoobidiver, where do you get your information about what the Mac
> topcrashers are on a particular branch or "release"?
I use that kind of query: https://crash-stats.mozilla.com/topcrasher/byos/Firefox/19.0b3/Mac%20OS%20X/7/browser
Since this is a higher volume topcrash on 20, marking for tracking.
Bug 836236 is clearly closely related to this bug, if not a dup.

Digging around in its crash stacks I discovered a possible fix.  In a bit I'll post a patch and tryserver build.  Since Simona can reproduce this bug, I'll ask her to try it.
Attached patch Possible fixSplinter Review
Here's the possible fix I mentioned in comment #40.

I've started tryserver builds on all platforms.  I'll post the Mac one here when it becomes available (in a few hours).
(Following up comment #41)

This fix is in code that was added by the patch for bug 701760.  So it can't be a fix for the "original" bug.

But the patch for bug 701760 (by itself or in combination with a later patch) might have triggered an increase in frequency in this bug's crashes.  So this patch might bring these crashes' frequency down to their original (low) level.

First, though, let's see if this patch fixes the bug for Simona.
Here's the Mac tryserver build for the patch from comment #41:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-26905392dba7/try-macosx64/firefox-21.0a1.en-US.mac.dmg

Please try this, Simona!  Let us know if this fixes your crashes (or changes their crash stacks in some way).
(In reply to Steven Michaud from comment #43)
> Here's the Mac tryserver build for the patch from comment #41:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/smichaud@pobox.com-
> 26905392dba7/try-macosx64/firefox-21.0a1.en-US.mac.dmg
> 
> Please try this, Simona!  Let us know if this fixes your crashes (or changes
> their crash stacks in some way).

After several tries I got one crash with this build but unfortunately the report can't be opened.
https://crash-stats.mozilla.com/report/index/bp-b1ecb63d-d21b-45ed-a90f-c77632130131

I tried to reproduce the crash afterwards but I couldn't.
Simona, here's another tryserver build for you to test.  (It's from bug 813442, which is currently marked security sensitive, so access to it is restricted.)

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mpalmgren@mozilla.com-9bc209b82cb3/try-macosx64/firefox-21.0a1.en-US.mac.dmg
(In reply to Steven Michaud from comment #45)
> Simona, here's another tryserver build for you to test.  (It's from bug
> 813442, which is currently marked security sensitive, so access to it is
> restricted.)
> 
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mpalmgren@mozilla.
> com-9bc209b82cb3/try-macosx64/firefox-21.0a1.en-US.mac.dmg

I could not crash Firefox on this tryserver build using the same steps as before.
Depends on: 837047
FYI: there is a testcase in bug 837047
(In reply to Steven Michaud from comment #45)
> Simona, here's another tryserver build for you to test.  (It's from bug
> 813442, which is currently marked security sensitive, so access to it is
> restricted.)
> 
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mpalmgren@mozilla.
> com-9bc209b82cb3/try-macosx64/firefox-21.0a1.en-US.mac.dmg

With the latest tryserver build - Firefox doesn't crash when following the STR from bug 837047.
Dropping QAWANTED as our testing seems to confirm the try-server build works. Can we land these changes in time for 19b5?
Keywords: qawanted
QA Contact: simona.marcu
We've made some other changes in the FF20 timeframe that we hope will resolve these crashes. Wontfixing for FF19.
Keywords: steps-wanted
Target Milestone: mozilla12 → mozilla21
Blocks: 837047
No longer depends on: 837047
Any news on landing in m-c?
The latest crashes happened in 20.0a2/20130206 for Aurora and 21.0a1/20130115 for m-c.
Status: REOPENED → RESOLVED
Closed: 13 years ago11 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: