Closed Bug 458961 Opened 15 years ago Closed 15 years ago

[10.4] top crash [@ 0xfffeff20][@ libobjc.A.dylib@0x24c7][@ objc_msgSend][@ nsObjCExceptionLogAbort][@ nsNativeThemeCocoa::DrawCellWithScaling]

Categories

(Core :: Widget: Cocoa, defect)

All
macOS
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.9.1b2

People

(Reporter: samuel.sidler+old, Assigned: smichaud)

References

()

Details

(4 keywords)

Crash Data

Attachments

(2 files, 2 obsolete files)

I'm still digging into this one, but there's a topcrash (currently #7) that only affects PPC Macs. The comments are relatively spread out with a few talking about searching and a few talking about printing. The stack is a bit more informative:

From bp-a3654c07-94b6-11dd-b908-001cc45a2c28

Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 		@0xfffeff20 	
1 	XUL 	nsNativeThemeCocoa::DrawCellWithScaling 	mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm:291
2 	XUL 	nsNativeThemeCocoa::DrawPushButton 	mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm:460
3 	XUL 	nsNativeThemeCocoa::DrawWidgetBackground 	mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm:1115
4 	XUL 	nsCSSRendering::PaintBackgroundWithSC 	mozilla/layout/base/nsCSSRendering.cpp:3491
5 	XUL 	nsCSSRendering::PaintBackground 	mozilla/layout/base/nsCSSRendering.cpp:3343
6 	XUL 	nsButtonFrameRenderer::PaintBorderAndBackground 	mozilla/layout/forms/nsButtonFrameRenderer.cpp:228
7 	XUL 	nsDisplayButtonBorderBackground::Paint 	mozilla/layout/forms/nsButtonFrameRenderer.cpp:145
8 	XUL 	nsDisplayList::Paint const 	mozilla/layout/base/nsDisplayList.cpp:296
9 	XUL 	PresShell::RenderDocument 	mozilla/layout/base/nsPresShell.cpp:5000
10 	XUL 	nsCanvasRenderingContext2D::DrawWindow 	mozilla/content/canvas/src/nsCanvasRenderingContext2D.cpp:2391
11 	libgoogleintercept.dylib 	libgoogleintercept.dylib@0x4340 	
12 	libgoogleintercept.dylib 	libgoogleintercept.dylib@0x3ccc 	
13 	XUL 	nsDocLoader::FireOnStateChange 	mozilla/uriloader/base/nsDocLoader.cpp:1235
14 	XUL 	nsDocLoader::FireOnStateChange 	mozilla/uriloader/base/nsDocLoader.cpp:1242
15 	XUL 	nsDocLoader::FireOnStateChange 	mozilla/uriloader/base/nsDocLoader.cpp:1242
16 	XUL 	nsDocLoader::doStopDocumentLoad 	mozilla/uriloader/base/nsDocLoader.cpp:858
17 	XUL 	nsDocLoader::DocLoaderIsEmpty 	mozilla/uriloader/base/nsDocLoader.cpp:763
18 	XUL 	nsDocLoader::OnStopRequest 	mozilla/uriloader/base/nsDocLoader.cpp:679
19 	XUL 	nsLoadGroup::RemoveRequest 	mozilla/netwerk/base/src/nsLoadGroup.cpp:688
20 	XUL 	nsDocument::DoUnblockOnload 	mozilla/content/base/src/nsDOMSerializer.cpp:5741
21 	XUL 	nsImageLoadingContent::Event::~Event 	mozilla/gfx/cairo/libpixman/src/pixman-access-accessors.c:786
22 	XUL 	nsRunnable::Release 	nsThreadUtils.cpp:51
23 	XUL 	nsThread::ProcessNextEvent 	mozilla/xpcom/threads/nsThread.cpp:542
24 	XUL 	NS_ProcessPendingEvents_P 	nsThreadUtils.cpp:180
25 	XUL 	nsBaseAppShell::NativeEventCallback 	mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:121
26 	XUL 	nsAppShell::ProcessGeckoEvents 	mozilla/widget/src/cocoa/nsAppShell.mm:302
27 	CoreFoundation 	CoreFoundation@0x242fc 	
28 	CoreFoundation 	CoreFoundation@0x2382c 	
29 	CoreFoundation 	CoreFoundation@0x232ac 	
30 	HIToolbox 	HIToolbox@0x8b1c 	
31 	HIToolbox 	HIToolbox@0x81b0 	
32 	HIToolbox 	HIToolbox@0x801c 	
33 	AppKit 	AppKit@0x8730 	
34 	AppKit 	AppKit@0x83f4 	
35 	AppKit 	AppKit@0x4938 	
36 	XUL 	nsAppShell::Run 	mozilla/widget/src/cocoa/nsAppShell.mm:591
37 	XUL 	nsAppStartup::Run 	mozilla/toolkit/components/startup/src/nsAppStartup.cpp:181
38 	XUL 	XRE_main 	mozilla/toolkit/xre/nsAppRunner.cpp:3174
39 	firefox-bin 	main 	mozilla/browser/app/nsBrowserApp.cpp:158
40 	firefox-bin 	start 	crt.c:272
41 	firefox-bin 	start 	
42 		@0xfffffffe

The implication, to me, is that it's happening when drawing buttons, potentially when zoomed in? I could be wrong, but that's a guess based on the stack. Still digging, as I said...
Flags: wanted1.9.0.x+
CVS blame for the first three files points to bug 422248 (vlad), bug 418497 (vlad again), and a backout of bug 54488 in May (Josh did the backing out).
Note also: While the crash report in comment 0 includes libgoogleintercept.dll, other reports don't, but do include the first three lines (after 0xfffeff20).

For example, bp-b3984b00-9477-11dd-942e-001a4bd43ed6 (expanded for reference):

Crashing Thread
Frame 	Module 	Signature 	Source
0 		@0xfffeff20 	
1 	XUL 	nsNativeThemeCocoa::DrawCellWithScaling(NSCell*, CGContext*, CGRect const&, _NSControlSize, float, float, float, float, float const (*) [3][4], int) 	mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm:291
2 	XUL 	nsNativeThemeCocoa::DrawPushButton(CGContext*, CGRect const&, int, int, int) 	mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm:460
3 	XUL 	nsNativeThemeCocoa::DrawWidgetBackground(nsIRenderingContext*, nsIFrame*, unsigned char, nsRect const&, nsRect const&) 	mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm:1115
4 	XUL 	nsCSSRendering::PaintBackgroundWithSC(nsPresContext*, nsIRenderingContext&, nsIFrame*, nsRect const&, nsRect const&, nsStyleBackground const&, nsStyleBorder const&, nsStylePadding const&, int, nsRect*) 	mozilla/layout/base/nsCSSRendering.cpp:3491
Summary: PPC-only top crash [@ 0xfffeff20] → PPC-only top crash [@ 0xfffeff20] [@ nsNativeThemeCocoa::DrawCellWithScaling]
Yesterday I landed a patch (for bug 444864) on the 1.9.0 branch that
changes nsNativeThemeCocoa::DrawCellWithScaling().  Let's see if that
makes any difference here.
Yeah, you would. ;)

Stephen, do you remember what you're doing when it happens? Anything special? Certain types of pages open? Zooming? Printing? If you were using a build from Oct 12, it's clearly not fixed yet even though bug 444864 landed on Sept 17 (on m-c)...
Here's a search for crashes in nsNativeThemeCocoa::DrawCellWithScaling on the 1.9.1 branch (and limited to OS X), over the last two weeks:

http://crash-stats.mozilla.com/?do_query=1&branch=1.9.1&platform=mac&query_search=stack&query_type=contains&query=nsNativeThemeCocoa%3A%3ADrawCellWithScaling&date=&range_value=2&range_unit=weeks

It returns a grand total of 16 crashes :-)

If this is all, then clearly something else is going on (outside of nsNativeThemeCocoa::DrawCellWithScaling) to trigger these crashes.

But could Socorro be miscounting (or missing) 1.9.1-branch crashes?
Those 16 crashes make it a topcrash on 1.9.1 (in the top 20). My guess is also that people who use beta builds are more likely to have Intel processors ("cutting edge") while regular users are still split pretty evenly (or were last I knew, anyway).
(In reply to comment #5)
> Yeah, you would. ;)
> 
> Stephen, do you remember what you're doing when it happens? Anything special?
> Certain types of pages open? Zooming? Printing? If you were using a build from
> Oct 12, it's clearly not fixed yet even though bug 444864 landed on Sept 17 (on
> m-c)...

Sadly, I can't remember anything helpful, but I can say that it's not when printing, although I do zoom a fair amount, so it could be related to that; I'll keep my eyes open.

Also, it would help if we could extract the URL that's automatically submitted with the incident listed in comment 4; I remember that at least one time I crashed, I only had two tabs open...
Ah ha! I was confused for a second, but this isn't a PPC-only crash. The top frame 0xfffeff20 is the PPC frame while libobjc.A.dylib@0x24c7 is the Intel summary.

Combined, that makes this crash #5 overall.
Summary: PPC-only top crash [@ 0xfffeff20] [@ nsNativeThemeCocoa::DrawCellWithScaling] → top crash [@ 0xfffeff20][@ libobjc.A.dylib@0x24c7][@ nsNativeThemeCocoa::DrawCellWithScaling]
Ok, I've hit this thing as well.  

I've got a nearly reproducible setup for it, happens about 90% of the time.
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.1b2pre) Gecko/20081014 Minefield/3.1b2pre

1. Turn on both jit engines
2. go to maps.google.com (seems to help to have several tabs open of other things).
3. do a directions from point a to point b map -> it might crash there, if not keep going
4. quit minefield
5. session restore to your old setup, ensure that the active tab you're coming back to is the google maps tab.

While tracking this down I also encountered another crash inside the javascript engine, not sure if it is somehow related to this one those reports are here:
* http://crash-stats.mozilla.com/report/index/710dd151-9ca5-11dd-aac0-0013211cbf8a?p=1
* http://crash-stats.mozilla.com/report/index/74bace8f-9ca5-11dd-bfd0-001a4bd43ef6?p=1

it does not seem to happen if you turn off either of the jits.

Extensions installed: Dom Inspector, Feedly, Geolocation, Venkman, JS Bridge, Mozmill, QA Companion, Ubiquity

It does not seem to happen on a clean profile.
(In reply to comment #10)
> 1. Turn on both jit engines

Just a word of note: This is a topcrash for Firefox 3.0.3, which doesn't have JIT. It's certainly possible (and likely?) that JIT makes this crash occur more frequently, but whatever fix should *not* be in JIT, but should happen in (probably) widget code.

Your other crashes look like JS crashes, possibly JIT crashes. I'd file them separately.
The libobjc crashes might also show up as objc_msgSend, and these crashes also show up under nsObjCExceptionLogAbort.

http://crash-stats.mozilla.com/?do_query=1&branch=1.9&version=Firefox%3A3.0.3&platform=mac&query_search=stack&query_type=contains&query=nsNativeThemeCocoa%3A%3ADrawCellWithScaling&date=&range_value=1&range_unit=days

shows that in the past *day* there have been over 700 crashes just from Firefox 3.0.3 alone that are very highly likely to be this bug.*

* We can't search by "Frame 1 is foo", alas, so it's possible that there are other stacks there that just happen to contain nsNativeThemeCocoa::DrawCellWithScaling somewhere else in the top 10 frames, but still, probability is strongly in this bug's "favor".
Summary: top crash [@ 0xfffeff20][@ libobjc.A.dylib@0x24c7][@ nsNativeThemeCocoa::DrawCellWithScaling] → top crash [@ 0xfffeff20][@ libobjc.A.dylib@0x24c7][@ objc_msgSend][@ nsObjCExceptionLogAbort][@ nsNativeThemeCocoa::DrawCellWithScaling]
As far as I can see this is a 10.4-only crash. It occurs in this line:
[NSGraphicsContext setCurrentContext:savedContext];
and only when nsCanvasRenderingContext2D::DrawWindow or nsPrintEngine::PrintPage are on the stack - i.e. only when painting into an invisible context.

Over the weekend I hit this crash about 10 times on my Macbook, but when I finally decided to track it down I couldn't reproduce it anymore...
Assignee: joshmoz → mstange
Status: NEW → ASSIGNED
Hardware: Macintosh → All
Blocks: 458205
Using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.1b2pre) Gecko/20081104 Minefield/3.1b2pre, I am crashing quite a bit in this stack. I have been switching in and out of Private Browsing, and when I put focus back on a tab I seem to crash pretty regularly.  Breakpad is http://crash-stats.mozilla.com/report/index/62d4161c-aa99-11dd-bda9-001cc45a2c28. As Markus states in Comment 13, I have only hit this on Tiger and not on Leopard.
Here are a fairly good set of STR to reproduce this bug using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.1b2pre) Gecko/20081104 Minefield/3.1b2pre.

1. Open a set of tabs in regular browsing mode.
2. Switch to PB mode by using the Tools menu item. Open a few more tabs.
3. Switch back to regular browsing mode. Click on a tab. Crash. Ack.
(In reply to comment #16)

I've tried this a couple of times, and didn't see any crash (testing
with today's Minefield nightly on OS X 10.4.11).  More detail, please
:-)

> Click on a tab.

Click where on the tab?
I am having a crashfest today with this stack on Tiger, nominating for blocking so it gets some additional consideration.
Flags: blocking1.9.1?
I just crashed with a different set of STR.  Here they are. Testing with a relatively fresh profile with no extensions, JIT content enabled (as it is now by default).

1. View | Sidebar History (Open the history sidebar)
2. Tools | Private Browsing (Pref it on by selecting the menu item)
3. After the Private browsing session launches, Ctrl+T to open a new tab.
4. Focus the location bar. Start typing some text.

As soon as I hit Step 4, I crashed.

The odd thing about this bug is that the crashes seem to come at different times - when typing in the location bar, once when just focusing a tab that had cnn.com loaded, and the first time I crashed was merely switching in and out of the Private Browsing mode.
(In reply to comment #19)

I assume you mean cmd-t in step 3.

> 4. Focus the location bar. Start typing some text.

The location bar should already be focused (after you did cmd-t in
step 3).  Something else going on?
Attached patch experimental patch (obsolete) — Splinter Review
I just started try server builds with this patch. I haven't had a chance to test it yet.
(In reply to comment #21)

Looks worth a try.
Yes, sorry cmd+t. You are right that the location bar is focused. When I just tested again I then opened another tab and started typing a new URL in the location bar, and it crashed immediately.


> (In reply to comment #19)
> 
> I assume you mean cmd-t in step 3.
> 
> > 4. Focus the location bar. Start typing some text.
> 
> The location bar should already be focused (after you did cmd-t in
> step 3).  Something else going on?
Marcia, looks like I can't reproduce either of your STRs (from comment
#16 and comment #19).  But you can try Markus' tryserver build (of the
patch from comment #21) once it's done.  That may take a while, though
-- the tryservers are very popular today.
I will also try to reproduce this on the Lab machine running Tiger.
Er... somehow the Mac tryserver build has skipped me.
I created my own build:
http://tests.themasta.com/fastscrolling_firefox-3.1b2pre.en-US.mac.dmg

Marcia, could you test if you can reproduce the crash with this build?
I tested on a machine in the lab with a new profile and also can reproduce this crash using  Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.1b2pre) Gecko/20081104 Minefield/3.1b2pre. I will try the build in Comment 29 and report back with my findings.
BTW, I forgot the post a better set of STR:

1. Start with a fresh profile and fire up the browser
2. Tools ->Private Browsing to turn on the feature.
3. After the Private Browsing Session launches, open one new tab. Immediately open a second tab.
4. After Step 3, I crash 99% of the time.

Although Bug 461715 was filed on Camino, it has a similar stack and the trigger point seems to involve two tabs.
Unfortunately the Build noted in Comment 29 gives me a 404 :(
I gave the build in Comment 33 a quick whirl and it looks much better - I have not yet been able to crash following the STR in Comment 31.
(In reply to comment #34)

One question for you, Marcia:  Do you see error messages in the system
console testing with Markus' test build (from comment #33), especially
where you'd normally crash using other builds?

The errors I'm looking for (the ones you might be seeing) are
complaints about failures to restore a graphics state (because a call
was made to [NSGraphicsContext restoreGraphicsState] without a matching
(prior) call to [NSGraphicsContext saveGraphicsState]).
> The errors I'm looking for (the ones you might be seeing) are
> complaints about failures to restore a graphics state (because a
> call was made to [NSGraphicsContext restoreGraphicsState] without a
> matching (prior) call to [NSGraphicsContext saveGraphicsState]).

Or errors that happened because the "current graphics context" (the
result of [NSGraphicsContext currentContext]) changed between a call
to [NSGraphicsContext saveGraphicsState] and the matching call to
[NSGraphicsContext restoreGraphicsState].
Steven: here are the two errors I saw this AM while running the build:

2008-11-05 10:00:59.111 firefox-bin[1527] *** -[NSImage isDrawingToScreen]: selector not recognized [self = 0x117efc70]
2008-11-05 10:00:59.114 firefox-bin[1527] Mozilla has caught an Obj-C exception [NSInvalidArgumentException: *** -[NSImage isDrawingToScreen]: selector not recognized [self = 0x117efc70]]
I'd love to see a fix but I don't think we need to block on this. It is not a regression and it is PPC only. If we don't have a fix by the release we can always fix it in an update.
Flags: wanted1.9.1+
Flags: blocking1.9.1?
Flags: blocking1.9.1-
Josh, see comment 9. This isn't PPC-only. This needs more attention and I really think it should block. This is a topcrash that should get looked at all around.

Re-nominating since this affects all Mac platforms (but might be 10.4-only).
Flags: blocking1.9.1- → blocking1.9.1?
Attachment #346300 - Flags: review?(smichaud)
Yes, this is a bad problem (and yes it also happens on Intel).  But
it's not yet reproducible (despite Marcia's best efforts).  And,
though it may be possible to work around the problem (as Markus' patch
might do), we very likely won't be able to *fix* it until we can
reproduce it (it's unlikely that we'll fully understand this problem
until we can reproduce it).
Comment on attachment 346300 [details] [diff] [review]
experimental patch

I can't r+ this until I better understand the error messages Marcia
reported in comment #37.

I should be able to spend several hours on this today.
(Following up comment #40)

Furthermore, since this seems to be 10.4-only, it's probably caused by
an Apple bug.  So whatever "fix" we come up with will probably just
work around Apple's bug.
blocking1.9.1- again based Steven's info (comment #40 and comment #42). This should be a priority because it is a topcrash in a shipping product - the release of Gecko 1.9.1 is not as relevant as this is not a regression.
Flags: blocking1.9.1? → blocking1.9.1-
Comment on attachment 346300 [details] [diff] [review]
experimental patch

Marcia's errors from comment #37 tell me that something is trying to
call a method on ("send a message to") the address of an object that
was deleted, then allocated as another (different kind of) object.

If that's truly a consequence of Markus' patch (if all Markus' patch
does is to shift this bug's problem elsewhere), I don't think we can
use it.  So I'm r-minusing it.

Markus, barring unforeseen interruptions, I'm going to be spending the
rest of today on this bug.  Between the two of us, we may be able to
come up with something :-)

'isDrawingToScreen' isn't an NSImage method.  But it *is* an
NSGraphicsContext method (and also a method of a bunch of undocumented
classes that seem related to NSGraphicsContext).  So the object that
was deleted and replaced was presumably an NSGraphicsContext object
(or it belonged to one of those other, undocumented classes).
Attachment #346300 - Flags: review?(smichaud) → review-
This patch certainly looks like a good step forward in terms of code correctness though, even if it's not the whole fix; there doesn't appear to be anything in the current code that would guarantee that savedContext is actually still valid by the end of the function.


I'd also suggest making a build that nil-checks cgContext, ctx, and the return values of graphicsContextWithGraphicsPort:flipped: and logs problems to see if that's happening, since the methods that take them may not handle nil arguments (as is often the case for Apple's frameworks).

Steven, can you explain the source of this comment?
    // [NSView focusView] may return nil here, but
    // [NSCell drawWithFrame:inView:] can deal with that.
The docs for the method don't say that; is it just based on testing? If so, I strongly recommend nil-checking that as well.
(In reply to comment #45)

> Steven, can you explain the source of this comment?
>     // [NSView focusView] may return nil here, but
>     // [NSCell drawWithFrame:inView:] can deal with that.
> The docs for the method don't say that; is it just based on testing?

It's based on testing.  See bug 444864 (that's security related, but I
can give you access if you want).

Null-checking wherever we can (in a test build) isn't a bad idea.  I'm
still trying to reproduce this.  If I fail I'll put together some kind
of testing patch for someone (like Marcia) to use who can reproduce
this problem, to gather extra information.

> This patch certainly looks like a good step forward in terms of code
> correctness though, even if it's not the whole fix

As best I can tell, Markus' patch is just a different way of doing the
same thing.  Which isn't a bad idea ... but it doesn't seem to have
avoided the underlying problem.
(In reply to comment #46)
> It's based on testing.  See bug 444864 (that's security related, but I
> can give you access if you want).

I read that bug before I asked, but I didn't see anything about how you came to that conclusion, just that it was true.

Passing nil to a method that doesn't explicitly say it accepts it is asking for trouble. I would very, very strongly urge you not to rely on one of Apple's frameworks to handle a nil argument based solely on empirical testing. The general rule is that they do not, and they can (and do) change nil handling if it's not documented, so even if it works in all current version of 10.4.x and 10.5.x in certain test scenarios, there's no guarantee at all that it will work in general, or in future versions. Even if it's unrelated to this bug, it's not unlikely to be a source of bugs in the future.
(In reply to comment #46)
> As best I can tell, Markus' patch is just a different way of doing the
> same thing.  Which isn't a bad idea ... but it doesn't seem to have
> avoided the underlying problem.

Markus's patch avoid the dangerous assumption that savedContext would remain valid, but there were still some explicit setContext: calls that weren't about saving/restoring the old state. I was suggesting Markus's patch *plus* nil checking of the remaining calls to setContext:
(In reply to comment #47)
> (In reply to comment #46)
> > It's based on testing.  See bug 444864 (that's security related, but I
> > can give you access if you want).
> 
> I read that bug before I asked, but I didn't see anything about how you came to
> that conclusion, just that it was true.

IIRC Steven first added the null check but then removed it because reftests were failing; so there are cases when the focusView is nil but we still want to draw. But maybe that's just another bug.
If there are reftests that end up requiring making OS framework calls with nil arguments (that aren't explicitly documented to accept them), that should absolutely be considered a bug in the tests.
Here's a slight modification to Markus' patch.  I'm not able to test
it (because I'm still not able to reproduce this bug), but Marcia's
promised to try it out.

I re-read Apple's docs on the NSGraphicsContext class, and
particularly on setCurrentContext: and the two sets of
saveGraphicsState/restoreGraphicsState methods (the class and instance
methods).  I also followed the docs' links to Apple sample code that
uses these functions.

I noticed that, whenever an attempt was made to save and restore
"state" around calls to setCurrentContext, the "state" was always
saved just beforehand and restored just afterwards.  In one case
[NSGraphicsContext saveGraphicsState] and [NSGraphicsState
restoreGraphicsState] were used.  But in the other the currentContext
was saved in a variable and restored afterwards (using another call to
setCurrentContext).

Since Markus had trouble wrapping calls to setCurrentContext with
calls to saveGraphicsState and restoreGraphicsState, I took the other
approach (as we do in the original code) -- though I made sure to save
and restore just before and after the calls to setCurrentContext.  I
also wrapped both calls to setCurrentContext -- not just the one
Markus' patch wraps.

Though (as I've just explained) I have reasons for the changes I've
made, I'm only guessing that they'll make any difference.

Here's a tryserver build made with my patch.

https://build.mozilla.org/tryserver-builds/2008-11-07_15:12-smichaud@pobox.com-bugzilla458961-test2/smichaud@pobox.com-bugzilla458961-test2-firefox-try-mac.dmg

Marcia, please bang away at it and let us know your results.  Tell us
whether you see any crashes (and what they are).  Also tell us what
error messages (if any) you see in the system console.

Thanks in advance!
I tested Steve's build in Comment 51. Following the same STR in Comment 31 which crash my Tiger Intel Mac reliably every time, I was not able to crash using this build during my initial test run.  Also, I observed no errors in the Error Console or the System Console. I will report back if I encounter any crashes or issues, but so far this build looks as if it has resolved the issue.
Comment on attachment 346998 [details] [diff] [review]
Another experimental patch

Markus:  On the strength of Marcia's good results, I'm asking you to
review my patch.
Attachment #346998 - Flags: review?(mstange)
Attachment #346998 - Flags: review?(stuart.morgan+bugzilla)
Marcia, your STR in comment 31 are excellent. I *can* reliably reproduce the crash on my 10.4 machine if I add another step after opening the tabs:
4. Press Ctrl+Tab.

Steven, does that work for you, too?

Your patch indeed fixes it for me. And I don't see any messages in the console either.
I can confirm that STR from comment 31 with adjustment from comment 54 work for me 100% too. I can reproduce it in a debugger. The second patch fixes the problem for me.
Attachment #346998 - Flags: review?(stuart.morgan+bugzilla) → review+
Attachment #346998 - Flags: superreview?(vladimir)
Attachment #346998 - Flags: review?(mstange)
Attachment #346998 - Flags: review+
I, too, can now reproduce this problem (using Marcia's STR from
comment #31 plus Markus' addition from comment #54) -- but only with
today's (2008-11-08-02-mozilla-central) and yesterday's
(2008-11-07-02-mozilla-central) Minefield nightlies.  I can't repro
using the 2008-11-06-02-mozilla-central or
2008-11-04-02-mozilla-central nightlies.  And (as it happens)
yesterday I was testing builds made from code pulled around noon of
2008-11-06 (the rough equivalent of the 2008-11-06 nightly).

And I, too, see no problems with my tryserver build (from comment #51,
the rough equivalent of today's nightly) -- no crashes, and no errors
in the system console.

So it looks like this problem's been fixed (or actually that we've
worked around the Apple bug, which may be as simple as not retaining
and releasing an NSGraphicsContext object on 10.4).
(Following up comment #56)

So it looks like, as of the 2008-11-07 Minefield nightly, this problem
got much easier to reproduce -- which probably means that, unless we
do something now, we'll see a lot more of these crashes.

So I'm again requesting blocking.  And I think we need to get this
patch into beta2.
Flags: blocking1.9.1- → blocking1.9.1?
Comment on attachment 346998 [details] [diff] [review]
Another experimental patch

odd, but sure, looks good!  guess something internally was being twiddled that wasn't being reset properly.
Attachment #346998 - Flags: superreview?(vladimir) → superreview+
Approved for 1.9.1 btw -- add it to https://wiki.mozilla.org/1.9.1_beta_2_checkins and try to get it in today ideally, tomorrow morning at the latest.  Thanks!
Flags: blocking1.9.1? → blocking1.9.1+
Landed on mozilla-central (changeset b5dc0e849959).
Assignee: mstange → smichaud
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Nominating for blocking1.9.0.5 since there's now a patch. Note that we won't take this for 1.9.0.5, but will consider it for 1.9.0.6 after it's had sufficient baking (there's just no 1.9.0.6 flags yet).
Flags: blocking1.9.0.5?
Flags: blocking1.9.0.5? → blocking1.9.0.6+
Whiteboard: [needs trunk baking]
Verified fixed using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.1b2pre) Gecko/20081110 Minefield/3.1b2pre. I am no longer able to reproduce this crash, which I was able to reproduce rather easily using the STR from Comment 31.
Status: RESOLVED → VERIFIED
Reminder to myself:  FF3.1b2 was released yesterday, so let's let this patch bake there for a while before seeking 1.9.0-branch approval.

http://crash-stats.mozilla.com/ is still pretty flaky (today it won't give me any results for a range of 1-day or greater).  And it no longer lists nightly build-dates in its reports (only the build dates of alphas or betas).

But if (say) a week passes and there are no reports of this crash for beta2, this patch should land on the 1.9.0 branch.
Comment on attachment 346998 [details] [diff] [review]
Another experimental patch

Josh suggests I should seek approval now.

And in fact this has been on the trunk (and 1.9.1 branch) for a while
without any apparent problems.
Attachment #346998 - Flags: approval1.9.0.6?
Don't we need a new patch for 1.9.0 here? I would think so because bug 394892 didn't land on 1.9.0.
> Don't we need a new patch for 1.9.0 here?

Yes, it'll need to be rewritten a bit.  I'll post a new one soon.
Attached patch 1.9.0-branch version of patch (obsolete) — Splinter Review
How does this look to you, Markus?

It ignores the doSaveCTM parameter.  But that shouldn't matter, right?
Attachment #352194 - Flags: review?(mstange)
Attachment #352194 - Flags: approval1.9.0.6?
Attachment #346998 - Flags: approval1.9.0.6?
(In reply to comment #68)
> How does this look to you, Markus?

Good, except for this:

>+    NSGraphicsContext* savedContext = [NSGraphicsContext currentContext];
>+
>     // Set up the graphics context we've been asked to draw to.
>     savedContext = [NSGraphicsContext currentContext];

One of them should be enough. :)

> It ignores the doSaveCTM parameter.  But that shouldn't matter, right?

Right.
> One of them should be enough. :)

Argh, yes.  Thanks for catching my mistake.
Attachment #352539 - Flags: review?(mstange)
Attachment #352539 - Flags: approval1.9.0.6?
Attachment #352539 - Flags: review?(mstange) → review+
Attachment #352194 - Attachment is obsolete: true
Attachment #352194 - Flags: review?(mstange)
Attachment #352194 - Flags: approval1.9.0.6?
Attachment #346300 - Attachment is obsolete: true
This has been already verfied on 1.9.1 by Marcia a month ago. Setting verified1.9.1 keyword.
Whiteboard: [needs trunk baking]
Comment on attachment 352539 [details] [diff] [review]
1.9.0-branch version of patch, rev1

Approved for 1.9.0.6, a=dveditz for release-drivers.
Attachment #352539 - Flags: approval1.9.0.6? → approval1.9.0.6+
Landed on the 1.9.0 branch:

Checking in widget/src/cocoa/nsNativeThemeCocoa.mm;
/cvsroot/mozilla/widget/src/cocoa/nsNativeThemeCocoa.mm,v  <--  nsNativeThemeCocoa.mm
new revision: 1.95; previous revision: 1.94
done
Keywords: fixed1.9.0.6
I wonder if fix will resolve the reports with the stack signature 

 libobjc.A.dylib@0x15688

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.1b2&query_search=signature&query_type=contains&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=libobjc.A.dylib%400x15688

comments in the crash reports sound similar but the stack traces vary widely from what is in this report
Chris, this landed on mozilla-central prior to the branch for 1.9.1. If it's not fixed now, there's nothing in this bug that can fix it. :)
Target Milestone: --- → mozilla1.9.1b2
(In reply to comment #74)

This bug is 10.4-specific.  All of the crash reports in your list are on 10.5.  They're probably completely unrelated.  Indeed, they probably represent several different (unrelated) bugs.

In general, crashes in a given system library are unlikely to be related (even if the address is available, as in this case).  It's much better to search by something higher "up" the stack.
Summary: top crash [@ 0xfffeff20][@ libobjc.A.dylib@0x24c7][@ objc_msgSend][@ nsObjCExceptionLogAbort][@ nsNativeThemeCocoa::DrawCellWithScaling] → [10.4] top crash [@ 0xfffeff20][@ libobjc.A.dylib@0x24c7][@ objc_msgSend][@ nsObjCExceptionLogAbort][@ nsNativeThemeCocoa::DrawCellWithScaling]
There are no related crashes listed on 1.9.0. Will mark this verfied1.9.0.6 with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.6) Gecko/2009011912 Firefox/3.0.6.
No longer blocks: 458205
Crash Signature: [@ 0xfffeff20] [@ libobjc.A.dylib@0x24c7] [@ objc_msgSend] [@ nsObjCExceptionLogAbort] [@ nsNativeThemeCocoa::DrawCellWithScaling]
You need to log in before you can comment on or make changes to this bug.