Closed Bug 460518 Opened 11 years ago Closed 10 years ago
OOM crash using regress-271716-n
Found this crash during the JS Tests with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b2pre) Gecko/20081017 Firefox/3.1b2pre

Firefox crash when using the regress-271716-n.js Test

From the Testlog:

*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
firefox-bin(33550,0xa0555fa0) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
firefox-bin(33550,0xa0555fa0) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
firefox-bin(33550,0xa0555fa0) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
firefox-bin(33550,0xa0555fa0) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
firefox-bin(33550,0xa0555fa0) malloc: *** mmap(size=2097152) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
js1_5/Regress/regress-271716-n.js: EXIT STATUS: CRASHED signal 10 SIGBUS (280.538384 seconds)

From the Bisect Script from Bob:

checking that the test fails in the bad revision 20585:tip
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
test failure js1_5/Regress/regress-271716-n.js.*SIGBUS found, bad revision 20585:tip confirmed
...
regression changeset: 20434:77f265c8fb3e
parent: 20433:2f932da81b37
parent: 20388:81c6cab2bac6
user: Blake Kaplan <email@example.com>
date: Mon Oct 13 11:24:23 2008 -0700
summary: Re merge to pick up bz's backout
*** revision 77f265c8fb3e found ***
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0xffff08b7 in __memcpy ()
(gdb) bt
#0 0xffff08b7 in __memcpy ()
#1 0x93c90f3a in fixupSelectorsInMethodList ()
#2 0x93c92222 in _class_getMethodNoSuper ()
#3 0x93c8829e in _class_lookupMethodAndLoadCache ()
#4 0x93c986d6 in objc_msgSend ()
#5 0x96d86f23 in +[NSException raise:format:arguments:] ()
#6 0x96d86f6a in +[NSException raise:format:] ()
#7 0x9089f4ad in NSAllocateObject ()
#8 0x908ada2d in +[NSMapTable alloc] ()
#9 0x91d75db8 in -[_NSDisplayOperation initWithWindow:windowRegion:] ()
#10 0x91d75cd8 in -[_NSDisplayOperationStack enterDisplayOperationForWindow:windowRegion:] ()
#11 0x923824ab in -[_NSDisplayOperationStack enterViewWillDrawOperationForWindow:windowRegion:] ()
#12 0x91d756dc in -[NSView _sendViewWillDrawInRect:] ()
#13 0x91cb7b37 in -[NSView displayIfNeeded] ()
#14 0x91cb7725 in -[NSWindow displayIfNeeded] ()
#15 0x91cb7548 in _handleWindowNeedsDisplay ()
#16 0x96d0b9c2 in __CFRunLoopDoObservers ()
#17 0x96d0cd1c in CFRunLoopRunSpecific ()
#18 0x96d0dcf8 in CFRunLoopRunInMode ()
#19 0x91861480 in RunCurrentEventLoopInMode ()
#20 0x918611d2 in ReceiveNextEventCommon ()
#21 0x9186110d in BlockUntilNextEventMatchingListInMode ()
#22 0x91cb53ed in _DPSNextEvent ()
#23 0x91cb4ca0 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#24 0x91cadcdb in -[NSApplication run] ()
#25 0x14551630 in nsAppShell::Run (this=0x14bf55f0) at /work/mozilla/builds/1.9.1/mozilla/widget/src/cocoa/nsAppShell.mm:692
#26 0x1424475a in nsAppStartup::Run (this=0x845a20) at /work/mozilla/builds/1.9.1/mozilla/toolkit/components/startup/src/nsAppStartup.cpp:192
#27 0x000c229e in XRE_main (argc=4, argv=0xbfffea10, aAppData=0x80e6b0) at /work/mozilla/builds/1.9.1/mozilla/toolkit/xre/nsAppRunner.cpp:3264
#28 0x000026e3 in main (argc=4, argv=0xbfffea10) at /work/mozilla/builds/1.9.1/mozilla/browser/app/nsBrowserApp.cpp:156
(gdb)
This looks like an OOM. I think it was caused by the GC not running, because that mechanism wasn't always present in TM. It seems to be working now, though. Can you rerun this? Even if it works, I'd watch out for it again, because that mechanism for TM has been unstable lately. Also, for my own edification, can you tell me how to run JS tests in the browser? I did it by extracting the test source into a new HTML file because I don't know the official way.
Depends on: 450000
I ran this again yesterday and it ran for a really long time and then my laptop ran out of memory. What exactly is supposed to happen? Should it not run out of memory, or just not crash the browser?
It should typically run you out of memory and, depending on your OS, machine, etc., will do one of the following: TIME OUT (not terminate in 480 seconds), or get killed via SIGKILL or SIGABRT. It should not terminate with a SIGBUS or SIGSEGV though. How much RAM do you have? I can get it in gdb if it will help.
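The expected outcomes listed in this comment can be sketched as a small classifier. This is only an illustration, not the test harness's actual logic: the function name `classify_run` and the result strings are hypothetical, and it assumes the POSIX convention used by Python's `subprocess`, where a return code of -N means the process was killed by signal N and `None` means it had not terminated when the timeout expired.

```python
import signal

# Outcomes the comment describes as tolerable for this out-of-memory test.
ACCEPTABLE_SIGNALS = {signal.SIGKILL, signal.SIGABRT}
# Outcomes the comment says should not happen.
CRASH_SIGNALS = {signal.SIGBUS, signal.SIGSEGV}


def classify_run(returncode):
    """Classify a finished (or timed-out) test run.

    returncode: None if the process was still running when the timeout
    expired, a negative number -N if it was killed by signal N, or a
    non-negative exit status otherwise.
    """
    if returncode is None:
        return "expected: timed out"
    if returncode < 0:
        sig = -returncode
        if sig in CRASH_SIGNALS:
            return "bug: crashed"
        if sig in ACCEPTABLE_SIGNALS:
            return "expected: killed"
        return "unknown signal"
    return "exited with status %d" % returncode
```

Under these assumptions, the SIGBUS run reported in this bug would be classified as "bug: crashed", while a run that is killed with SIGKILL or never finishes would count as expected.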
Hmmm, I was trying to retest because I think it may have been fixed in TM. I think it ran for longer than 480 seconds, but is that timeout in the browser or just the shell tests? But yes, can you retry it, since it is apparently not reproducing for me right now, either due to local weirdness or because it has been fixed. I have 2 GB.
Crash @ MakeDeletedClusters. gfxAtsuiFontGroup::InitTextRun in gfx/thebes is in the stack, so maybe this is just an OOM crash elsewhere in the browser and not JS-specific. I'll try to bisect, but since this is a browser-only issue it will take some time.
-> GFX: Thebes
Assignee: general → nobody
QA Contact: general → thebes
Well, one thing that confused me is that with garbage collection it seems like the script *shouldn't* crash, that it should just run forever using a small amount of memory. If that's the case then the problem is on the JS side. That's why I kept asking what you expected to see. But it seemed like you did expect it to run out of memory or crash nicely.
If it's crashing there, it looks like we're going OOM at a time when something inside ATSUI tries to allocate memory. It doesn't check the result for NULL and blindly tries to use it, and things blow up. Not sure what we can do if this is the first allocation that fails.
I just noticed that I was mistaken in comment 10 in thinking that the GC should prevent this from running out of memory. The program does in fact try to allocate an infinite amount of memory. From Bob's stack traces it looks like we just fail at arbitrary points due to OOM. If so, I think this is unfixable. The overhaul of application OOM handling might be able to fix it but I don't think general OOMs can reasonably be handled on their own.
Flags: blocking1.9.1+ → blocking1.9.1-
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Summary: CRASHED signal 10 SIGBUS using regress-271716-n.js Test → OOM crash using regress-271716-n.js