Bug 345388 - crash with gmail open [@ CoreGraphics.203.30.0 + 0x13ceb0] [@ nsImageMac::DrawToImage] [@ CGImageGetData] [@ CGImageEPSRepRelease]
Status: VERIFIED FIXED
Keywords: crash, regression, topcrash, verified1.8.1
Product: Core Graveyard
Classification: Graveyard
Component: GFX: Mac
Version: 1.8 Branch
Platform: PowerPC Mac OS X
Importance: -- critical with 2 votes
Target Milestone: mozilla1.8.1
Assigned To: Mark Mentovai
Mentors:
Depends on:
Blocks:
Reported: 2006-07-20 16:23 PDT by Marcia Knous [:marcia - use ni]
Modified: 2011-06-09 14:58 PDT
dbaron: blocking1.8.1+
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Apple stack of recent crash (23.22 KB, application/rtf)
2006-08-04 11:09 PDT, Marcia Knous [:marcia - use ni]
Apple stack of recent crash (plain text) (22.65 KB, text/plain)
2006-08-18 08:00 PDT, Josh Aas
Don't fade out pop-ups to hide them on 10.3 (1.89 KB, patch)
2006-08-28 13:13 PDT, Mark Mentovai
jaas: review+
mconnor: approval1.8.1+

Description Marcia Knous [:marcia - use ni] 2006-07-20 16:23:07 PDT
Seen using Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1b1) Gecko/20060720 BonEcho/2.0b1.

I got this stack today in two separate incidents, and thought I would get a bug on file.  Common threads when I crashed:

1. Good number of tabs open
2. In both instances gmail.google.com was either open or I was trying to open it from the URL bar.
3. Running some Flash sites and ajaxy sites at the time of the crash.

I have not been able to get exact STR but am working on seeing if I can get exact steps. Talkback IDs for one of these two crashes: 21174898 and 21178067
Comment 1 Adam Guthrie 2006-07-20 18:00:05 PDT
Incident ID: 21174898
Stack Signature	CoreGraphics.203.30.0 + 0x13ceb0 (0x97006eb0) 84bf4243
Product ID	Firefox2
Build ID	2006072006
Trigger Time	2006-07-20 13:11:49.0
Platform	MacOSX
Operating System	Darwin 7.9.0
Module	CoreGraphics.203.30.0 + (0013ceb0)
URL visited	
User Comments	Was accessing gmail account, the site had not fully loaded yet.
Since Last Crash	6215 sec
Total Uptime	9147 sec
Trigger Reason	SIGSEGV: Segmentation Violation: (signal 11)
Source File, Line No.	N/A
Stack Trace 	
CoreGraphics.203.30.0 + 0x13ceb0 (0x97006eb0)
libRIP.A.dylib.203.27.0 + 0x3168 (0x91b83168)
nsImageMac::DrawToImage()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/extensions/transformiix/source/xslt/util//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/db/sqlite3/src/pragma.c, line 438]
gfxImageFrame::DrawTo()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/gfx/src/shared//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/gfx/src/shared/gfxImageFrame.cpp, line 842]
imgContainerGIF::DoComposite()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/modules/libpr0n/decoders/gif//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/modules/libpr0n/decoders/gif/imgContainerGIF.cpp, line 842]
imgContainerGIF::Notify()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/modules/libpr0n/decoders/gif//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/modules/libpr0n/decoders/gif/imgContainerGIF.cpp, line 446]
nsTimerImpl::Fire()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/xpcom/threads//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/xpcom/threads/nsTimerImpl.cpp, line 398]
handleTimerEvent()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/xpcom/threads//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/xpcom/threads/nsTimerImpl.cpp, line 462]
PL_HandleEvent()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/xpcom/threads//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/xpcom/threads/plevent.c, line 689]
PL_ProcessPendingEvents()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/xpcom/threads//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/xpcom/threads/plevent.c, line 623]
CoreFoundation.299.37.0 + 0x4800 (0x901c4800)
CoreFoundation.299.37.0 + 0x20b8 (0x901c20b8)
CoreFoundation.299.37.0 + 0x69e4 (0x901c69e4)
RunCurrentEventLoopInMode()
ReceiveNextEventCommon()
AcquireNextEventInMode()
RunApplicationEventLoop()
nsAppShell::Run()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/widget/src/mac//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/widget/src/mac/nsAppShell.cpp, line 94]
nsAppStartup::Run()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/extensions/transformiix/source/xslt/util//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/db/sqlite3/src/pragma.c, line 152]
XRE_main()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/toolkit/xre//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/toolkit/xre/nsAppRunner.cpp, line 2396]
_start()   start()
Comment 2 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-02 13:39:24 PDT
This is the top mac crash in current 1.8 branch data.  There are two slight variations at the top of the stack, though, and I'm not sure if they're different bugs or the same bug.  All of the crashes in CoreGraphics that I looked at have this bit a little below the top of the stack:

imgContainerGIF::DoComposite()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/modules/libpr0n/decoders/gif//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/modules/libpr0n/decoders/gif/imgContainerGIF.cpp, line 842]
imgContainerGIF::Notify()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/modules/libpr0n/decoders/gif//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/modules/libpr0n/decoders/gif/imgContainerGIF.cpp, line 446]
nsTimerImpl::Fire()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/xpcom/threads//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/xpcom/threads/nsTimerImpl.cpp, line 398]

but then the part of the stack above that is either this:

CoreGraphics.203.30.0 + 0x3b38c (0x964a838c)
CoreGraphics.203.30.0 + 0xf0b8 (0x9647c0b8)
CoreFoundation.299.37.0 + 0x1848 (0x901c1848)
nsImageMac::EnsureCachedImage()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/extensions/transformiix/source/xslt/util//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/db/sqlite3/src/pragma.c, line 227]
nsImageMac::DrawToImage()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/extensions/transformiix/source/xslt/util//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/db/sqlite3/src/pragma.c, line 406]
gfxImageFrame::DrawTo()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/gfx/src/shared//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/gfx/src/shared/gfxImageFrame.cpp, line 842]

or this:

CoreGraphics.203.30.0 + 0x13ceb0 (0x965a9eb0)
libRIP.A.dylib.203.27.0 + 0x3168 (0x91b83168)
nsImageMac::DrawToImage()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/extensions/transformiix/source/xslt/util//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/db/sqlite3/src/pragma.c, line 438]
gfxImageFrame::DrawTo()  [/builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/build/unifox/ppc/gfx/src/shared//builds/tinderbox/Fx-Mozilla1.8/Darwin_8.7.0_Depend/mozilla/gfx/src/shared/gfxImageFrame.cpp, line 842]

Let me know if you'd like me to file the first of those as a separate bug.  They look similar enough that they may well be the same bug.
Comment 3 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-03 14:32:35 PDT
Josh, any status on this?
Comment 4 Mark Mentovai 2006-08-03 14:57:49 PDT
Missing symbols:

CoreGraphics.203.30.0 + 0x3b38c  = CGImageEPSRepRelease
CoreGraphics.203.30.0 + 0xf0b8   = imageFinalize
CoreFoundation.299.37.0 + 0x1848 = CFRelease

CoreGraphics.203.30.0 + 0x13ceb0 = CGImageGetData
libRIP.A.dylib.203.27.0 + 0x3168 = ripc_DrawImage

I'm going to add CG.203.30.0 (from 10.3.9) to the talkback server.  I already added CG.258.33.0 (from 10.4.7) earlier.
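
For anyone matching frames by hand, here is a minimal sketch of the arithmetic behind these frames. It assumes the "+ 0x..." value Talkback prints is the offset from the module's load address; the struct and function names are made up for illustration. Subtracting the offset from the absolute address gives the load address, and frames that agree on it can be resolved against the same copy of the module.

```c
#include <assert.h>
#include <stdint.h>

/* One Talkback frame: absolute address plus the module-relative offset. */
struct frame {
    uint32_t address; /* e.g. 0x964a838c */
    uint32_t offset;  /* e.g. 0x3b38c    */
};

/* Load address of the module = absolute address - module-relative offset. */
static uint32_t module_base(struct frame f)
{
    assert(f.address >= f.offset); /* the offset cannot exceed the address */
    return f.address - f.offset;
}

/* Returns 1 when every frame implies the same load address as frames[0]. */
static int same_module_base(const struct frame *frames, int count)
{
    for (int i = 1; i < count; i++) {
        if (module_base(frames[i]) != module_base(frames[0]))
            return 0;
    }
    return 1;
}
```

With the three CoreGraphics frames from comment 2 (0x964a838c, 0x9647c0b8 and 0x965a9eb0), all three work out to a load address of 0x9646d000, while the frame in comment 1 (0x97006eb0) works out to 0x96eca000. The offsets, not the absolute addresses, are what identify the function.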
Comment 5 Marcia Knous [:marcia - use ni] 2006-08-04 11:09:16 PDT
Created attachment 232159 [details]
Apple stack of recent crash

Attaching this in case it can be useful.
Comment 6 Josh Aas 2006-08-04 11:10:58 PDT
I will look into this today.
Comment 7 Marcia Knous [:marcia - use ni] 2006-08-04 14:54:22 PDT
I am trying to narrow down the regression range on this one.  The first Talkback stack I got was on 7-20-06. I am running the 7-19 build right now, and I have crashed once but not in the same stack. One difficult issue is that I don't have exact STR. Will try playing around with the 7-20 build to get exact steps.
Comment 8 Marcia Knous [:marcia - use ni] 2006-08-04 15:10:31 PDT
On the 7-20 build, I can usually crash by pulling the URL dropdown, highlighting one of the pre-existing URL entries, then hitting Enter on the keyboard. It might not always happen the first time, but if I follow those steps enough times I can usually get a crash. On the 7-19 build I was not able to see a crash when I did this repeatedly.
Comment 9 Josh Aas 2006-08-14 23:18:53 PDT
I can't reproduce this. By "URL dropdown" you mean the one that drops down from the URL bar and suggests things based on what you already typed, right? I am using the 7/24 build on PPC (1.8 branch) to try to reproduce this.
Comment 10 Mike Beltzner [:beltzner, not reading bugmail] 2006-08-15 13:17:38 PDT
--> Firefox2, any way we can get more data on this?
Comment 11 Josh Aas 2006-08-18 08:00:24 PDT
Created attachment 234406 [details]
Apple stack of recent crash (plain text)

This is the same as the last attachment but posted in plain text.
Comment 12 Josh Aas 2006-08-18 08:27:27 PDT
Looks like this happens when the timer for animating a GIF image fires. There are two calls to ::CGContextDrawImage in nsImageMac::DrawToImage:

437 ::CGContextDrawImage(bitmapContext, destRect, dest->mImage);
438 ::CGContextDrawImage(bitmapContext, drawRect, mImage);

The problem is probably that mImage is null or invalid in either the current nsImageMac or the dest nsImageMac (the last argument in either call). Knowing which one of the two calls is crashing would help narrow this down a bit.
Comment 13 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-19 22:53:12 PDT
These don't seem to have shown up in talkback for a few days.  Have they stopped happening in recent builds?

It should be possible to figure out which call it was if you have the build associated with the crash log above -- look at the disassembly and figure out which instruction the return address from the call is pointing to.
Comment 14 (not reading, please use seth@sspitzer.org instead) 2006-08-19 23:52:24 PDT
> These don't seem to have shown up in talkback for a few days.  Have they
> stopped happening in recent builds?

Marcia seemed able to reproduce this crasher on her Mac OS X 10.3 PPC machine at will.

Marcia, could you try again with a fresh build?
Comment 15 Jean-Yves Perrier [:teoli] 2006-08-21 04:15:24 PDT
(In reply to comment #14)
> > These don't seem to have shown up in talkback for a few days.  Have they
> > stopped happening in recent builds?
> 
> Marcia seemed able to reproduce this crasher on her Mac OS X 10.3 PPC machine
> at will.
I'm also able to reproduce this bug consistently since the 21.7.2006 branch build (the previous one I tested was 14.7.2006, without the problem). I didn't test the branch builds between 14.7 and 21.7; I can if needed for a precise regression window.

Mean time between two crashes is less than 15 minutes with Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1b2) Gecko/20060820 BonEcho/2.0b2 ID:2006082003
With a clean profile, about 45 minutes.

I have about 45 TB ids with that same build. And I'm also using Mac OS X 10.3.

If I can help, I'm ready to conduct specific tests.

Comment 16 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-21 08:57:52 PDT
(In reply to comment #13)
> These don't seem to have shown up in talkback for a few days.  Have they
> stopped happening in recent builds?

Never mind that.  Showed up again over the weekend quite a bit.
Comment 17 Marcia Knous [:marcia - use ni] 2006-08-22 15:57:41 PDT
I just crashed while testing and got a variant of this stack. In both instances I was loading an RSS feed in tabs, and then stopped the feed from loading. I got some weird "ghosting" in the menu where the dropdown is that is attached to the back button.

I should note this crash was a bit delayed - first the browser hung and then I got the crash.
Comment 18 Larry 2006-08-22 22:25:43 PDT
Over the past few days, using any recent nightly Mac/PPC build on
OS X 10.3.9, I am seeing EXC_BAD_ACCESS / KERN_INVALID_ADDRESS in
nsImageMac::DrawToImage on every browsing session ... latest and seemingly worst when
running 1.8.1b2_2006082203.  It seems to occur on any website that I visit.  The
executable throws the exception but seems to go into some crazy exit loop, taking
over the system CPU time.  Will try to get more stack info in debug mode; Larry.
Comment 19 Larry 2006-08-22 23:40:38 PDT
The exception that I've been seeing a lot today is ... EXC_BAD_ACCESS in
 #0  0x94d93eb0 in CGImageEPSRepGetAlternateImage ()
 #1  0x91b83168 in ripc_DrawImage ()
 #2  0x94c75e7c in CGContextDrawImage ()

and then for example,
 #3  0x0011dc5c in nsImageMac::DrawToImage(nsIImage*, int, int, int, int) ()
or
 #3  0x0011d9d0 in nsImageMac::Draw(nsIRenderingContext&, nsIDrawingSurface*, int, int, int, int, int, int, int, int) ()

That's when it barfs while I have tried to access a cascading menu, such
as in the Bookmarks Toolbar; or, sometimes on a drop-down form element,
or even when the page content is being manually scrolled.

I hope that is the same issue in this bug ... sorry, I have not debugged this before
and I did not try to use a debug executable yet; Larry.
Comment 20 Larry 2006-08-22 23:45:31 PDT
My crash characteristic appears to match the attached file, i.e. 2.0b1 stack info from 2006-08-18.
Comment 21 Mark Mentovai 2006-08-23 13:10:28 PDT
Comment on attachment 234406 [details]
Apple stack of recent crash (plain text)

>Exception:  EXC_BAD_ACCESS (0x0001)
>Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x3f800008
>
>Thread 0 Crashed:
>0   com.apple.CoreGraphics         	0x97006eb0 CGImageEPSRepGetAlternateImage + 0xc

Here's a quick disassembly of that function.

0x937bd3a0 <CGImageEPSRepGetAlternateImage+0>:  mr.     r3,r3
0x937bd3a4 <CGImageEPSRepGetAlternateImage+4>:  li      r0,0
0x937bd3a8 <CGImageEPSRepGetAlternateImage+8>:  beq-    0x937bd3b0 <CGImageEPSRepGetAlternateImage+16>
0x937bd3ac <CGImageEPSRepGetAlternateImage+12>: lwz     r0,8(r3)
0x937bd3b0 <CGImageEPSRepGetAlternateImage+16>: mr      r3,r0
0x937bd3b4 <CGImageEPSRepGetAlternateImage+20>: blr
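
Read as C, those six instructions amount to a null check followed by a single load at offset 8. A rough sketch follows; the struct layout and all names are hypothetical, and only "a 32-bit word at byte offset 8" comes from the disassembly. The fields are uint32_t so the offsets match 32-bit PowerPC on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the word at offset 8 is known from the listing. */
struct eps_rep {
    uint32_t unknown0;  /* offset 0 */
    uint32_t unknown4;  /* offset 4 */
    uint32_t alternate; /* offset 8, the word loaded by lwz r0,8(r3) */
};

/* mr. r3,r3 / beq- : return 0 when the argument is null.
   lwz r0,8(r3)     : otherwise return the word at offset 8. */
static uint32_t get_alternate_image(const struct eps_rep *rep)
{
    if (rep == NULL)
        return 0;
    return rep->alternate;
}

/* Address the lwz touches for a given r3, in 32-bit arithmetic. */
static uint32_t lwz_effective_address(uint32_t r3)
{
    assert(offsetof(struct eps_rep, alternate) == 8);
    return r3 + (uint32_t)offsetof(struct eps_rep, alternate);
}
```

lwz_effective_address(0x3f800000) gives 0x3f800008, the faulting address in the crash report quoted above. The null check only protects against r3 == 0, so any other bogus value goes straight to the load.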

Looks OK.  We're crashing in the |lwz| because we're trying to read a page that's not mapped.  We're reading 8(r3) and attempting to access 0x3f800008, so $r3 is 0x3f800000.  The register dump confirms this:

>PPC Thread State:
>  srr0: 0x97006eb0 srr1: 0x0200f930                vrsave: 0x00000000
>    cr: 0x44022244  xer: 0x00000000   lr: 0x91b83168  ctr: 0x97006ea4
>    r0: 0x00000000   r1: 0xbfffec70   r2: 0x43545854   r3: 0x3f800000

0x3f800000 is the representation of floating-point 1.0:

  int i = 0x3f800000;
  printf("%f\n", *(float*)&i);

prints 1.000000.
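
The pointer cast above is fine for a quick check, but it technically violates C's aliasing rules. A small sketch of the same check using memcpy instead, assuming a 32-bit IEEE 754 float (the function name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a single-precision float.
   memcpy copies the bytes without casting int* to float*. */
static float bits_to_float(uint32_t bits)
{
    float value;
    assert(sizeof value == sizeof bits); /* requires a 32-bit float */
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

printf("%f\n", bits_to_float(0x3f800000u)) prints 1.000000, same as the snippet above: sign bit 0, biased exponent 0x7f (127, so 2^0), mantissa 0.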

I was able to reproduce this crash using my own debug build on 10.3.9/ppc, without a debugger attached.  When I attached a debugger, I couldn't reproduce this exact crash, but experienced very frequent crashes elsewhere, probably due to timing changes.  Since I noticed 0x3f800000 and its significance, every crash I've experienced has been an attempt to dereference 0x3f800000 (plus some small offset), in each instance because that value found its way into $r3.

The one crash I have up now is at js_GetGCThingFlags + 0x20, due to an attempt to dereference 0x3f800000.  Partial disassembly:

0x01054890 <js_GetGCThingFlags+0>:      stmw    r30,-8(r1)
0x01054894 <js_GetGCThingFlags+4>:      stwu    r1,-64(r1)
0x01054898 <js_GetGCThingFlags+8>:      mr      r30,r1
0x0105489c <js_GetGCThingFlags+12>:     stw     r3,88(r30)   ; 88(r30) <= r3
0x010548a0 <js_GetGCThingFlags+16>:     lwz     r0,88(r30)   ; r0 <= 88(r30) (= r3)
0x010548a4 <js_GetGCThingFlags+20>:     rlwinm  r0,r0,0,0,21
0x010548a8 <js_GetGCThingFlags+24>:     stw     r0,32(r30)   ; 32(r30) <= r0 (= r3)
0x010548ac <js_GetGCThingFlags+28>:     lwz     r2,32(r30)   ; r2 <= 32(r30) (= r0 = r3)
0x010548b0 <js_GetGCThingFlags+32>:     lwz     r0,0(r2)     ; deref r2 (= r0 = r3)

$r3 is supposed to contain the first argument, and is clearly supposed to be a pointer in this case.  Instead, it's 0x3f800000.
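
One detail the annotations above gloss over is the rlwinm at +20, which masks r0 before it is stored and dereferenced. A small sketch of what that instruction computes, assuming the standard PowerPC definition (rotate left by SH, then AND with a mask covering bits MB through ME, where bit 0 is the most significant bit); the helper names are made up:

```c
#include <assert.h>
#include <stdint.h>

/* Mask used by rlwinm for MB <= ME, in IBM bit numbering (bit 0 = MSB). */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    assert(mb <= me && me <= 31);
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* bits mb..31 set */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me); /* bits 0..me set  */
    return from_mb & to_me;
}

/* rlwinm rA,rS,SH,MB,ME: rotate rS left by SH, then AND with the mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    sh &= 31;
    uint32_t rotated = sh ? (rs << sh) | (rs >> (32 - sh)) : rs;
    return rotated & ppc_mask(mb, me);
}
```

rlwinm r0,r0,0,0,21 therefore keeps the top 22 bits and clears the low 10 (mask 0xFFFFFC00). 0x3f800000 passes through unchanged, so the lwz at +32 reads from 0x3f800000 itself. Any r3 from 0x3f800000 to 0x3f8003ff would fault at that same address.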
Comment 22 Mark Mentovai 2006-08-23 14:01:21 PDT
I did manage to crash CGImageEPSRepRelease (on nsImageMac::EnsureCachedImage) in the debugger and found that 0x3f800000 is ultimately coming from memory, as it was in the js_GetGCThingFlags crash.  The question now is: how is it getting into memory?
Comment 23 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-24 10:55:56 PDT
This query might be of interest:
http://talkback-public.mozilla.org/search/start.jsp?search=1&searchby=uuid&match=exact&searchfor=U3529AF57-12B0-4B7E-ABB13A04-BF487260&vendor=MozillaOrg&product=All&platform=All&buildid=&sdate=&stime=&edate=&etime=&sortby=bbid&rlimit=500
It's the crashes from a single user who has reported a lot of these crashes.

Looking at the talkback data summaries, it's also interesting to note that most of the js_GetGCThingFlags crashes are on Mac:
http://talkback-public.mozilla.org/reports/firefox/FF2x/FF2x-topcrashers.html
Comment 24 Larry 2006-08-24 11:37:56 PDT
A bit off-topic and maybe is a FAQ but -- any idea why the process
goes crazy and takes over the system, after it hits this exception?
Is the exception handler stuck in a loop?  I have a feeling that once
I forcibly kill  firefox-bin, no Talkback incident will be submitted
for that doomed session.  I'll see if I can look at the call stack, next
time mine goes into that runaway situation.
Comment 25 Marcia Knous [:marcia - use ni] 2006-08-24 17:38:08 PDT
It might be helpful to do a query for my email address (marcia@mozilla.org) and see my crashes. I have been logging which OS I was using when I crashed.

I sent this in email as well, but I have seen hardly any crashes running on 10.4.7. I only crashed once so far running the FFT - you can see the stack in my Talkback submissions.

Finally, I have for sure hit the js_GetGCThingFlags crash multiple times.
Comment 26 Mark Mentovai 2006-08-24 17:48:15 PDT
Comment 24: you're experiencing a delay while the system's own CrashReporter walks the stacks and associates code addresses with function names.  The app has already crashed at this point, but there's a delay inherent in the data collection process.

Those of you experiencing this crash, if you wait for the CrashReporter window, you'll see something like this near the top:

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x3f800008

If what you see looks like EXC_BAD_ACCESS/KERN_INVALID_ADDRESS and the address is near 0x3f800000, you're most likely experiencing this bug.  If you see something else, it's most likely another bug, and you should search Bugzilla, and open a new bug report if none is filed yet.
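
If you are sorting through a pile of crash logs, the check above can be written down roughly like this. It is only a sketch: how close counts as "near" is an assumption (the addresses seen so far are 0x3f800000 plus a small structure offset), so the window is left as a parameter, and the names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define FLOAT_ONE_BITS 0x3f800000u /* bit pattern of floating-point 1.0 */

/* Returns 1 when fault_address lies in [FLOAT_ONE_BITS, FLOAT_ONE_BITS + window).
   window is the caller's guess at the largest plausible field offset. */
static int looks_like_this_bug(uint32_t fault_address, uint32_t window)
{
    assert(window <= UINT32_MAX - FLOAT_ONE_BITS); /* range must not wrap */
    return fault_address >= FLOAT_ONE_BITS &&
           fault_address - FLOAT_ONE_BITS < window;
}
```

With a window of 0x100, an address such as 0x3f800008 matches, and a plain null-pointer style address such as 0x00000008 does not. The exception type and code still need to be checked separately.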
Comment 27 Mark Mentovai 2006-08-25 08:13:11 PDT
Marcia, in comments 7 and 8, you indicated that the regression range was 0719 - 0720.  Is that trunk or branch?
Comment 28 Marcia Knous [:marcia - use ni] 2006-08-25 09:51:15 PDT
That would be branch. Also, regarding the CGImageEPSRepRelease crash, I crashed in that stack this morning. The operation I was performing at the time of the crash was clicking on a link in Yahoo Mail. Talkback incident ID on that one is: Incident ID: 22501746 or I can paste the Apple crash report if that is useful.

(In reply to comment #27)
> Marcia, in comments 7 and 8, you indicated that the regression range was 0719 -
> 0720.  Is that trunk or branch?
> 

Comment 29 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-25 14:40:14 PDT
Queries for js_GetGCThingFlags and XPCJSStackFrame::~XPCJSStackFrame in talkback seem to agree with the regression range in comment 8 (although there are not enough incidents to confirm it exactly).

That regression range gives the following checkin query:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-07-19+03%3A15&maxdate=2006-07-20+06%3A29&cvsroot=%2Fcvsroot
Comment 30 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2006-08-25 14:50:30 PDT
It seems like it might be worth reducing that range by either (a) producing builds between those two builds or (b) if somebody can build and reproduce the bug, just doing a binary search without publishing the builds.
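
For option (b), the search itself is simple. A sketch, assuming the builds are sorted oldest to newest, the first is known good, the last is known bad, and the crash stays in every build after it first appears. The callback stands in for a person running the build and trying to crash it; the build IDs and the example callback are placeholders, not real builds.

```c
#include <assert.h>
#include <stddef.h>

/* Callback that tests one build and reports whether it crashes. */
typedef int (*crashes_fn)(long build_id);

/* Returns the index of the first bad build. builds[0] must be known good
   and builds[count - 1] known bad; neither endpoint is re-tested. */
static size_t first_bad_build(const long *builds, size_t count, crashes_fn crashes)
{
    assert(count >= 2);
    size_t good = 0;
    size_t bad = count - 1;
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (crashes(builds[mid]))
            bad = mid;  /* crash already present at mid */
        else
            good = mid; /* mid is still clean */
    }
    return bad;
}

/* Placeholder predicate: pretends the crash first appears in build 103. */
static int example_crashes(long build_id)
{
    return build_id >= 103;
}
```

With five placeholder builds 100 to 104, this tests builds 102 and 103 and reports index 3. In general it needs about log2(n) test runs, which matters when each run means browsing for a while to see whether the crash shows up.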
Comment 31 Stefan [:stefanh] (away until May 28) 2006-08-25 17:46:05 PDT
I believe I get the same crash on seamonkey 2006071814-1.8 on 10.3.9. At least the second stack looks familiar - even though there are symbols missing.
-------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x3f800014

Thread 0 Crashed:
0   libgklayout.dylib     	0x010f9fdc NSGetModule + 0xeceac
-------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x3f800008

Thread 0 Crashed:
0   com.apple.CoreGraphics   	0x96422eb0 CGImageEPSRepGetAlternateImage + 0xc
1   libRIP.A.dylib           	0x927cd168 ripc_DrawImage + 0x48
2   com.apple.CoreGraphics   	0x96304e7c CGContextDrawImage + 0x17c
3   libgfx_mac.dylib         	0x01610988 NSGetModule + 0x14b0
4   libgfx_mac.dylib         	0x016221c8 NSGetModule + 0x12cf0
5   libimglib2.dylib         	0x00411b98 NSGetModule + 0xa890
6   libimglib2.dylib         	0x004115f4 NSGetModule + 0xa2ec
7   libxpcom_core.dylib      	0x2c04adec nsTimerImpl::Fire() + 0xc4
8   libxpcom_core.dylib      	0x2c04aef8 handleTimerEvent(TimerEventType*) + 0x8c
-------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x3f800008

Thread 0 Crashed:
0   libxpconnect.dylib       	0x002dada4 0x29a000 + 0x40da4
1   libxpconnect.dylib       	0x002c736c NSGetModule + 0x148a8
2   libmozjs.dylib           	0x230184ec JS_DHashTableEnumerate + 0x78
3   libxpconnect.dylib       	0x002c73c0 NSGetModule + 0x148fc
4   libxpconnect.dylib       	0x002b0780 0x29a000 + 0x16780
5   libgklayout.dylib        	0x012bfe64 NSGetModule + 0x2b2d34
6   libmozjs.dylib           	0x2302dd2c js_GC + 0x890
7   libmozjs.dylib           	0x2302d23c js_ForceGC + 0x40
8   libgklayout.dylib        	0x012bfd0c NSGetModule + 0x2b2bdc
9   libxpcom_core.dylib      	0x2c04adec nsTimerImpl::Fire() + 0xc4
10  libxpcom_core.dylib      	0x2c04aef8 handleTimerEvent(TimerEventType*) + 0x8c
Comment 32 Stefan [:stefanh] (away until May 28) 2006-08-25 18:01:36 PDT
> I belive I get the same crash on seamonkey 2006071814-1.8 on 10.3.9.

I haven't yet tested further back.
Comment 33 Jean-Yves Perrier [:teoli] 2006-08-26 01:53:54 PDT
(In reply to comment #23)
> This query might be of interest:
> http://talkback-public.mozilla.org/search/start.jsp?search=1&searchby=uuid&match=exact&searchfor=U3529AF57-12B0-4B7E-ABB13A04-BF487260&vendor=MozillaOrg&product=All&platform=All&buildid=&sdate=&stime=&edate=&etime=&sortby=bbid&rlimit=500
> It's the crashes from a single user who has reported a lot of these crashes.
> 

It's me: I can get 20-30 crashes a day since the regression, even with a clean profile. OS is X.3.9 exclusively. I'm available for any testing, though I do not have a setup to compile Fx myself (if you think it may be useful, I can set up the thing to do it).
Comment 34 Marcia Knous [:marcia - use ni] 2006-08-26 11:55:20 PDT
When you get the crashes, are you doing the same kind of thing, like typing in the URL bar, or are the crashes fairly random? In my case, click events seem to more or less be the trigger for a fair number of the crashes (clicking a link, clicking on a URL in the location bar).

Also, please post what extensions you are using, if any. Thanks.

(In reply to comment #33)
> (In reply to comment #23)
> > This query might be of interest:
> > http://talkback-public.mozilla.org/search/start.jsp?search=1&searchby=uuid&match=exact&searchfor=U3529AF57-12B0-4B7E-ABB13A04-BF487260&vendor=MozillaOrg&product=All&platform=All&buildid=&sdate=&stime=&edate=&etime=&sortby=bbid&rlimit=500
> > It's the crashes from a single user who has reported a lot of these crashes.
> > 
> 
> It's me: I can get 20-30 crashes a day since the regression, even with a clean
> profile. OS is X.3.9 exclusively. I'm available for any testing, though Ido not
> have a setup to compile Fx myself (if you think it may be useful, I can set up
> the thing to do it).
> 

Comment 35 Adam Guthrie 2006-08-26 12:35:01 PDT
teoli2003@jyp.ch, could you help us figure out exactly when this regressed? Older builds are available on archive.m.o [0]. I don't know if you're familiar with how to find a regression range. If not, you may find this guide [1] helpful.

[0] http://archive.mozilla.org/pub/firefox/nightly/
[1] http://wiki.mozilla.org/MozillaQualityAssurance:Triage#How_to_Help_with_Regressions_--_Finding_Regression_Windows
Comment 36 Mark Mentovai 2006-08-26 21:02:53 PDT
So far, the comments indicate that the range is 0719 - 0720 on the branch, and prior to 0718 on the trunk - might be helpful to know exactly when.  We really need to compare some well-timed builds (see comment 30).  I'll spit out a few tomorrow or Monday, even if I won't have the time to test them myself.
Comment 37 Jean-Yves Perrier [:teoli] 2006-08-28 01:36:00 PDT
(In reply to comment #36)
> So far, the comments indicate that the range is 0719 - 0720 on the branch, and
> prior to 0718 on the trunk - might be helpful to know exactly when. 

I have started narrowing down a regression window. For the branch builds, I was unable to get a crash on 2006071803, but can easily get crashes on 2006071903 (four in half an hour), with stacks similar to those reported here. This is one day earlier than reported so far.

I'm trying to find a regression window for the trunk right now.

Comment 38 Jean-Yves Perrier [:teoli] 2006-08-28 03:50:47 PDT
Regression window for trunk:
I'm not able to get a single crash with 2006071405; I'm able to get many crashes with the next one, 2006071622.
(There is no trunk build for 20060715.)

I wasn't able to get a Talkback ID, so I can't guarantee that the stacks are similar, but the symptoms are the same, so I'm quite confident.
Comment 39 :Gavin Sharp [email: gavin@gavinsharp.com] 2006-08-28 05:49:07 PDT
Those two ranges both contain the patches for bug 344238 and bug 344570.
Comment 41 Mark Mentovai 2006-08-28 06:55:24 PDT
Given the problems reported with the autocomplete dropdown, I'm inclined to suspect the popup fade-out portion of bug 344570.
Comment 42 Mark Mentovai 2006-08-28 13:13:32 PDT
Created attachment 235778 [details] [diff] [review]
Don't fade out pop-ups to hide them on 10.3

I've been using this patch for about 20 minutes on 10.3 with no crashes.  I'll make a test build available shortly.
Comment 43 Mark Mentovai 2006-08-28 13:56:21 PDT
A test build with the patch above is available here:

http://jackassofalltrades.com/tmp/bonecho-200608281615-345388v1.dmg

This build is ppc-only (because we only see the bug on 10.3, which only runs on ppc.)
Comment 44 Marcia Knous [:marcia - use ni] 2006-08-28 16:58:39 PDT
I ran this build for a good portion of today and tried all the various manipulations to get it to crash, and it did not crash. I particularly focused on drop down menus and the URL bar, which is where the largest concentration of crashes occurred. I will keep running it and report back if I see any crashes.

(In reply to comment #43)
> A test build with the patch above is available here:
> 
> http://jackassofalltrades.com/tmp/bonecho-200608281615-345388v1.dmg
> 
> This build is ppc-only (because we only see the bug on 10.3, which only runs on
> ppc.)
> 

Comment 45 Jean-Yves Perrier [:teoli] 2006-08-29 00:12:57 PDT
(In reply to comment #43)
> A test build with the patch above is available here:

I didn't get a single crash for the moment (more than 1 hour) with that test build. Usually during this amount of time I get 4 to 6 crashes. I continue to test it, but it seems to have corrected the problem.

Well done Mark and kudos to all the others that helped with the issue! :-)
Comment 46 Mark Mentovai 2006-08-29 06:39:22 PDT
Comment on attachment 235778 [details] [diff] [review]
Don't fade out pop-ups to hide them on 10.3

With two positive reports, let's get this in the tree.
Comment 47 Mark Mentovai 2006-08-29 08:28:54 PDT
Checked in on the trunk.  Thanks to everyone who reported and helped test.
Comment 48 Mike Connor [:mconnor] 2006-08-29 10:31:20 PDT
Comment on attachment 235778 [details] [diff] [review]
Don't fade out pop-ups to hide them on 10.3

a=mconnor on behalf of drivers, we have a very small number of Mac trunk testers, and far fewer on 10.3, so getting on trunk ASAP to bake there.
Comment 49 Mark Mentovai 2006-08-29 11:03:45 PDT
Checked in on MOZILLA_1_8_BRANCH before 1.8.1rc1.
Comment 50 Larry 2006-08-29 11:56:40 PDT
The nsMacWindow patch from 20060828 on 10.3.9/ppc looks good here too.
Is this a good time to add some info into bug 344570 to say that there was a
problem in 10.3.x with its fix?  Larry.
Comment 51 Jean-Yves Perrier [:teoli] 2006-09-01 00:23:33 PDT
I have now done a couple of days of testing.

I got only two crashes, one with the test build (no Talkback), one with a nightly that led to a FadeMenuWindows stack. That's far far far better than before (well done again!).

I'm wondering if this is a mere coincidence or if, maybe, another piece of code, used far less often, is calling the same problematic function?
Comment 52 Adam Guthrie 2006-09-03 10:58:47 PDT
Marking as VERIFIED per Talkback data. No crashes [@ CGImageGetData] or [@ CGImageEPSRepRelease] have been reported since 2006-08-29 (when Mark's patch went in).
