Closed Bug 815543 Opened 12 years ago Closed 8 years ago

[10.8] Flash crash in OpenGL@0x3317

Categories

(Core Graveyard :: Plug-ins, defect)

x86_64
macOS
defect
Not set
critical

Tracking

(firefox20-)

RESOLVED WORKSFORME
Tracking Status
firefox20 - ---

People

(Reporter: scoobidiver, Unassigned)

Details

(Keywords: crash)

Attachments

(1 file)

It's #7 top crasher in 18.0a2, #6 in 19.0a1 and #2 in 20.0a1 on Mac OS X.

Signature OpenGL@0x3317
UUID 6e409894-2610-4c12-9a65-c0dc62121126
Date Processed 2012-11-26 18:27:05
Process Type plugin Version: Filename: Flash Player.plugin
Uptime 6317
Install Age 1.8 hours since version was first installed.
Install Time 2012-11-26 16:32:47
Product Firefox
Version 20.0a1
Build ID 20121126030823
Release Channel nightly
OS Mac OS X
OS Version 10.8.2 12C3006
Build Architecture amd64
Build Architecture Info family 6 model 58 stepping 9
Crash Reason EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Crash Address 0x26887f00
App Notes AdapterVendorID: 0x8086, AdapterDeviceID: 0x 166
          GL Context? GL Context+ GL Layers? GL Layers+
EMCheckCompatibility True
Adapter Vendor ID 0x8086
Adapter Device ID 0x 166

Frame  Module             Signature                  Source
0      OpenGL             OpenGL@0x3317
1      OpenGL             OpenGL@0x7ffe
2      OpenGL             OpenGL@0x8018
3      libmozglue.dylib   je_malloc                  jemalloc.c:4219
4      CoreGraphics       CoreGraphics@0x204d70
5      CoreGraphics       CoreGraphics@0x1432cc
6      QuartzCore         QuartzCore@0x52721
7      QuartzCore         QuartzCore@0x4c2f2
8      libobjc.A.dylib    libobjc.A.dylib@0x9a34
9      libmozglue.dylib   arena_dalloc               jemalloc.c:1679
10     CoreGraphics       CoreGraphics@0x143533
11     libsystem_c.dylib  libsystem_c.dylib@0x1a134
12     CoreGraphics       CoreGraphics@0x67026
13     CoreGraphics       CoreGraphics@0x143309
14     CoreGraphics       CoreGraphics@0x3a9cdd
15     GLEngine           GLEngine@0x17a2e
16     QuartzCore         QuartzCore@0x43567

More reports at: https://crash-stats.mozilla.com/report/list?signature=OpenGL%400x3317
Note the jemalloc stuff in the stacks. Could this be a jemalloc bug, possibly one specific to OS X 10.8?
The displayed stack is almost certainly incorrect, and it's likely that jemalloc is not on the callstack at all. If it were, that probably just means there is heap corruption. It is extremely unlikely that there is actually an allocator bug here.
If some plugin used malloc() or friends, would that cause jemalloc code to run?
I don't know, on mac. In any case, I have confirmation from Ted that since we're missing symbols for the OS libs, the stack after the first frame is complete guesswork. He's going to look at uploading the symbols from his mac which is matching (or mostly matching) and we can reprocess one or more of these to see what the stack actually is.
Flags: needinfo?(justin.lebar+bug)
I put the symbols I could get up on the symbol server. I didn't have the exact same versions of all libraries, but I had a number of them.
(In reply to Steven Michaud from comment #3)
> If some plugin used malloc() or friends, would that cause jemalloc code to
> run?

If the plugin runs in the same process as Gecko, I would expect so. If the plugin runs OOP, I'm not sure. Maybe glandium knows.
Flags: needinfo?(justin.lebar+bug)
I also wonder if system libraries/frameworks end up running jemalloc code when they use malloc() and friends.
(In reply to Steven Michaud from comment #7)
> I also wonder if system libraries/frameworks end up running jemalloc code
> when they use malloc() and friends.

They must; otherwise, strdup() would be an allocator mismatch waiting to happen.
(In reply to comment #8) Makes sense. So on to the second question :-) Do you think the references to jemalloc code in the stack from comment #0 are spurious? (Give me 10 minutes and I'll use atos to translate the addresses to symbols by hand.)
(Following up comment #10) Parts of this stack make no sense to me (for example all the stuff that happens under pthread_mutex_unlock, if that's indeed being called). So it may indeed be corrupt. What do you think, Benoit?
Comment on attachment 685807
Stack from comment #0 with symbols

It makes no sense for arena_dalloc to be called in any stack which involves je_malloc, so that part is at least wrong.
Yeah, that stack makes no sense. As for malloc, on OSX, in both main process and subprocesses, jemalloc is registered as the default zone allocator, which means je_malloc is always used. For realloc and free, however, je_free is only used if the pointer it is given is identified as having been allocated by jemalloc. If not, the free and realloc from the zone that allocated that particular memory are used.
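To make the dispatch rule above concrete, here is a toy model in Python. It is only a sketch of the behavior described in the comment, not Mozilla's or Apple's code: the `Zone` and `Process` classes and all of their methods are hypothetical names invented for illustration, and "pointers" are plain tuples rather than addresses.

```python
class Zone:
    """Toy allocator zone that remembers which pointers it handed out."""

    def __init__(self, name):
        self.name = name
        self.live = set()
        self._counter = 0

    def malloc(self, size):
        self._counter += 1
        ptr = (self.name, self._counter)  # stand-in for a real address
        self.live.add(ptr)
        return ptr

    def owns(self, ptr):
        return ptr in self.live

    def free(self, ptr):
        self.live.remove(ptr)


class Process:
    """Toy process in which jemalloc is registered as the default zone."""

    def __init__(self):
        self.system_zone = Zone("system")
        self.jemalloc_zone = Zone("jemalloc")
        self.default_zone = self.jemalloc_zone
        self.zones = [self.jemalloc_zone, self.system_zone]

    def malloc(self, size):
        # New allocations always go to the default zone (jemalloc).
        return self.default_zone.malloc(size)

    def free(self, ptr):
        # free() is routed to whichever zone owns the pointer.
        for zone in self.zones:
            if zone.owns(ptr):
                zone.free(ptr)
                return zone.name
        raise ValueError("pointer is not owned by any registered zone")


proc = Process()
early = proc.system_zone.malloc(16)  # hypothetical allocation made by another zone
late = proc.malloc(16)               # ordinary malloc after jemalloc is the default
print(proc.free(early))  # system
print(proc.free(late))   # jemalloc
```

The point of the model is that seeing je_malloc under a system framework is expected on OS X, while a free of memory that jemalloc did not allocate should never reach jemalloc's own deallocation path.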
That stack is completely bogus. I have a slightly-better stack I symbolized by dumping symbols from my local 10.8.2 install, but it's still not perfect (I don't have the matching version of CoreGraphics). The top two frames are:

Thread 0 (crashed)
 0  OpenGL!glcGetIOAccelService + 0xacb
    rbx = 0x000000011bad1000   r12 = 0x00000000000000f6
    r13 = 0x0000000000000f68   r14 = 0x000000011bad1000
    r15 = 0x00007fff5fbf7f78   rip = 0x00007fff8c5e8317
    rsp = 0x00007fff5fbf7e60   rbp = 0x00007fff5fbf7e70
    Found by: given as instruction pointer in context
 1  OpenGL!CGLUpdateContext + 0x19
    rbx = 0x000000000000211c   r12 = 0x00000000000000f6
    r13 = 0x0000000000000f68   r14 = 0x000000011bad1000
    r15 = 0x00007fff5fbf7f78   rip = 0x00007fff8c5ed019
    rsp = 0x00007fff5fbf7e80   rbp = 0x00007fff5fbf7f00
    Found by: call frame info

It wanders off into some CoreGraphics code after that. It doesn't seem terribly likely that this is allocator related, but I could be wrong. If we can find symbols for that version of CoreGraphics we can probably get a decent stack here. (The crash report says 10.8.2, which I'm on, but perhaps I'm missing one update or something?)
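The two symbolized frames can be cross-checked against the crash signature with a little arithmetic. The sketch below assumes that the 0x3317 in the signature OpenGL@0x3317 is an offset from the OpenGL module's load address; the base address it prints is inferred from that assumption and is not stated anywhere in the report. The function names are made up for the example.

```python
# Values copied from the crash report and the two symbolized frames.
CRASH_RIP = 0x00007FFF8C5E8317   # rip of frame 0
FRAME1_RIP = 0x00007FFF8C5ED019  # rip of frame 1
SIGNATURE_OFFSET = 0x3317        # from the signature OpenGL@0x3317


def module_base(address, offset_in_module):
    """Infer a module's load address from an address and its module-relative offset."""
    return address - offset_in_module


def function_start(offset_in_module, offset_in_function):
    """Module-relative offset at which the enclosing function begins."""
    return offset_in_module - offset_in_function


base = module_base(CRASH_RIP, SIGNATURE_OFFSET)
frame1_offset = FRAME1_RIP - base

print(hex(base))                                     # 0x7fff8c5e5000
print(hex(frame1_offset))                            # 0x8019
print(hex(function_start(SIGNATURE_OFFSET, 0xACB)))  # 0x284c, glcGetIOAccelService
print(hex(function_start(frame1_offset, 0x19)))      # 0x8000, CGLUpdateContext
```

Under that assumption, frame 1 sits at OpenGL + 0x8019, one byte past the OpenGL@0x8018 entry in the stack from comment #0. That would fit a stack walker that looks up the return address minus one, though I have not confirmed that this is what happened here.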
bug 790390 is likely a dupe of this bug.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #14)
> That stack is completely bogus. I have a slightly-better stack I symbolized
> by dumping symbols from my local 10.8.2 install, but it's still not perfect
> (I don't have the matching version of CoreGraphics). The top two frames are:
> Thread 0 (crashed)
>  0  OpenGL!glcGetIOAccelService + 0xacb
>     rbx = 0x000000011bad1000   r12 = 0x00000000000000f6
>     r13 = 0x0000000000000f68   r14 = 0x000000011bad1000
>     r15 = 0x00007fff5fbf7f78   rip = 0x00007fff8c5e8317
>     rsp = 0x00007fff5fbf7e60   rbp = 0x00007fff5fbf7e70
>     Found by: given as instruction pointer in context
>  1  OpenGL!CGLUpdateContext + 0x19
>     rbx = 0x000000000000211c   r12 = 0x00000000000000f6
>     r13 = 0x0000000000000f68   r14 = 0x000000011bad1000
>     r15 = 0x00007fff5fbf7f78   rip = 0x00007fff8c5ed019
>     rsp = 0x00007fff5fbf7e80   rbp = 0x00007fff5fbf7f00
>     Found by: call frame info
>
> It wanders off into some CoreGraphics code after that. It doesn't seem
> terribly likely that this is allocator related, but I could be wrong. If we
> can find symbols for that version of CoreGraphics we can probably get a
> decent stack here. (The crash report says 10.8.2, which I'm on, but perhaps
> I'm missing one update or something?)

Perhaps something regressed in Apple's 10.8 OGL driver. It's also possible that we're hitting hardware-accelerated CoreGraphics; getting the rest of the CoreGraphics symbols could confirm that. My other guess is that it could be triggered by using HiDPI, or caused by the plugin changes that landed to better support HiDPI. Let's take a look at the URLs to see if we can get a hint at reproducing this issue (does it happen all over the web or on a particular site?).
Keywords: needURLs
(In reply to comment #14)
> That stack is completely bogus. I have a slightly-better stack I
> symbolized by dumping symbols from my local 10.8.2 install, but it's
> still not perfect (I don't have the matching version of
> CoreGraphics).

I translated the (non-Mozilla) symbols in that stack using atos on my own (fully current) 10.8.2 install. (And yes, I specified '-arch x86_64'.) How do you know you don't have the matching version of CoreGraphics?
Translating the symbols listed doesn't help much, because the stackwalk itself is broken. Ted knows that they don't match because the debug ID of his local versions doesn't match the debug ID listed in the crash report.
Right, I ran dump_syms on my local binaries and checked the Debug ID vs. what's in the crash report. Debug IDs on mac are either the UUID of the binary (from LC_UUID), or if that's not present a hash of the first page of the text section. If you want to try dumping your CoreGraphics binary and see if yours matches it's pretty simple, just run

    $objdir/dist/host/bin/dump_syms /path/to/CoreGraphics

and look at the first line of output. If it matches 9A1324EFC9CB30E4AE9F0AEF69052FAE0, you have the right version.
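For anyone who wants to script that comparison, here is a minimal Python sketch. It assumes the first line of dump_syms output is a Breakpad MODULE record of the form "MODULE <os> <arch> <id> <name>"; the helper names are hypothetical, and the second sample line uses a made-up all-zero ID purely to show a mismatch.

```python
# Debug ID for CoreGraphics as quoted from the crash report.
EXPECTED_ID = "9A1324EFC9CB30E4AE9F0AEF69052FAE0"


def parse_module_line(line):
    """Split a 'MODULE <os> <arch> <id> <name>' record into its fields."""
    fields = line.strip().split(" ", 4)
    if len(fields) != 5 or fields[0] != "MODULE":
        raise ValueError("not a MODULE record: %r" % line)
    _, os_name, arch, debug_id, name = fields
    return {"os": os_name, "arch": arch, "debug_id": debug_id, "name": name}


def matches_report(first_line, expected_id=EXPECTED_ID):
    """True if the dumped binary has the debug ID the crash report expects."""
    return parse_module_line(first_line)["debug_id"].upper() == expected_id.upper()


# Sample first lines; the second ID is invented for illustration only.
matching = "MODULE mac x86_64 9A1324EFC9CB30E4AE9F0AEF69052FAE0 CoreGraphics"
other = "MODULE mac x86_64 " + "0" * 33 + " CoreGraphics"

print(matches_report(matching))  # True
print(matches_report(other))     # False
```

In practice you would feed it the first line captured from the dump_syms run rather than a hard-coded string.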
> Translating the symbols listed doesn't help much, because the
> stackwalk itself is broken.

Yes. But I'm mainly interested in finding out how Ted knows his local CoreGraphics is the "wrong" version.

> Ted knows that they don't match because the debug ID of his local
> versions doesn't match the debug ID listed in the crash report.

So my question now becomes "how do you generate the debug ID"? Ted has answered it.
CoreGraphics debug ID on my 10.8.2 system (x86_64 architecture) is DCC70C6EAB6D3457A8237569CB29B1070. So I don't have the "right" CoreGraphics, either.
The stacks at https://crash-stats.mozilla.com actually show two different build IDs for OS X 10.8.2 -- 12C3006 and 12C60. 12C60 is the one I have, and seems to be the latest. The debug IDs for CoreGraphics in the few 12C60 examples I looked at are the same as mine (and presumably yours). But the various stacks, though they all have jemalloc stuff in them, look quite different aside from the top few lines. Another sign of corruption.
Apple has released several different "updates to OS X 10.8.2". That presumably explains the different build IDs.

http://support.apple.com/kb/DL1580 (Update)
http://support.apple.com/kb/DL1581 (Combo Update)
http://support.apple.com/kb/DL1600 (Supplement)
http://support.apple.com/kb/DL1611 (Supplement 2.0)
Tracking for now to keep on our radar as 20 moves through the channels, but we should confirm if this is a dupe of bug 790390 and keep an eye on volume - if this drops off topcrash list please update.
I just noticed that I have a fat 32/64 binary of Flash, and our plugin code will prefer the matching architecture. Do we know when Adobe started shipping fat binaries? It could explain new instability if we just started loading the 64-bit version.
> Do we know when Adobe started shipping fat binaries?

Version 11 and up (see bug 804606 comment #27). I don't know when the first 11 release took place, but you may be able to find out at http://helpx.adobe.com/flash-player/kb/archived-flash-player-versions.html.
Flash Player 11.0.1.152 was released 10/3/2011, so that's not it.
There have been only two crashes for the last week.
Not tracking for release, given comment #29. Please renominate if the crash landscape changes.
I'm marking this bug as WORKSFORME because this crash signature hasn't appeared for a long time (over half a year).
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Product: Core → Core Graveyard