Closed Bug 706442 Opened 14 years ago Closed 13 years ago

Firefox 10.0a2 Crash Report [@ js::LifoAlloc::getOrCreateChunk(unsigned int) ]

Categories

(Core :: JavaScript Engine, defect)

10 Branch
x86
Windows 7
defect
Not set
critical

Tracking

RESOLVED FIXED
mozilla11
Tracking Status
firefox8 --- unaffected
firefox9 - unaffected
firefox10 + affected
firefox11 + ---
firefox13 - ---
firefox14 + ---
firefox-esr10 --- wontfix
status1.9.2 --- unaffected

People

(Reporter: cbook, Assigned: cdleary)

Details

(Keywords: crash, regression, sec-critical, Whiteboard: [js:waitingforinfo][qa?])

Attachments

(1 file)

noticing on the top-changer list on crash-stats.

Example Crash-Report -> https://crash-stats.mozilla.com/report/index/72527b8c-3a12-4e7f-8a27-087c22111129
general overview: https://crash-stats.mozilla.com/report/list?range_value=3&range_unit=days&signature=js%3A%3ALifoAlloc%3A%3AgetOrCreateChunk%28unsigned%20int%29&version=Firefox%3A10.0a2

all windows 7/xp crashes

Crashing Thread
Frame Module Signature Source
0 mozjs.dll js::LifoAlloc::getOrCreateChunk js/src/ds/LifoAlloc.cpp:180
1 mozjs.dll js::analyze::ScriptAnalysis::addJump js/src/jsanalyze.cpp:81
2 mozjs.dll js::analyze::ScriptAnalysis::analyzeBytecode js/src/jsanalyze.cpp:593
3 mozjs.dll JSScript::makeAnalysis js/src/jsinfer.cpp:5507
4 mozjs.dll JSScript::ensureRanAnalysis js/src/jsinferinlines.h:1270
5 mozjs.dll js::types::TypeMonitorCall js/src/jsinferinlines.h:327
6 mozjs.dll js::Interpret js/src/jsinterp.cpp:3959
7 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4972
8 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
9 mozjs.dll js_fun_call js/src/jsfun.cpp:1761
10 mozjs.dll js::types::TypeSet::addType js/src/jsinferinlines.h:1028
11 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
12 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4972
13 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
14 mozjs.dll js_fun_call js/src/jsfun.cpp:1761
15 mozjs.dll js::types::TypeSet::addType js/src/jsinferinlines.h:1028
16 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
17 mozjs.dll js::CallObject::create js/src/vm/CallObject.cpp:78
18 mozjs.dll js::CreateFunCallObject js/src/jsfun.cpp:743
19 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:647
20 mozjs.dll js_fun_call js/src/jsfun.cpp:1761
21 mozjs.dll js::types::TypeSet::addType js/src/jsinferinlines.h:1028
22 mozjs.dll js::Interpret js/src/jsinterp.cpp:3541
23 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4972
24 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
25 mozjs.dll js_fun_call js/src/jsfun.cpp:1761
26 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:629
27 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
28 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4972
29 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
30 mozjs.dll js::Invoke js/src/jsinterp.h:148
31 mozjs.dll js_fun_apply js/src/jsfun.cpp:1817
32 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:629
33 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
34 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4963
35 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
36 mozjs.dll js::Invoke js/src/jsinterp.h:148
37 mozjs.dll js_fun_apply js/src/jsfun.cpp:1817
38 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:629
39 mozjs.dll js::Interpret js/src/jsinterp.cpp:3948
40 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4963
41 mozjs.dll js::RunScript js/src/jsinterp.cpp:584
42 mozjs.dll js_fun_call js/src/jsfun.cpp:1761
43 mozjs.dll js::ContextStack::currentScript js/src/vm/Stack-inl.h:619
44 mozjs.dll js::Interpret js/src/jsinterp.cpp:4049
45 mozjs.dll js::ContextStack::pushInvokeFrame js/src/vm/Stack.cpp:691
46 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:647
47 mozjs.dll JS_CallFunctionValue js/src/jsapi.cpp:5199
48 xul.dll nsXPCWrappedJSClass::CallMethod js/xpconnect/src/XPCWrappedJSClass.cpp:1530
49 @0xffffff81
50 mozjs.dll JS_WrapObject js/src/jsapi.cpp:1438
51 xul.dll XPCConvert::NativeInterface2JSObject js/xpconnect/src/XPCConvert.cpp:1276
52 @0x72c94ff
53 mozjs.dll js::types::TypeMonitorCallSlow js/src/jsinfer.cpp:4972
54 mozjs.dll js::ContextStack::pushInvokeFrame js/src/vm/Stack.cpp:691
55 xul.dll nsHttpTransaction::LocateHttpStart netwerk/protocol/http/nsHttpTransaction.cpp:734
56 xul.dll SelectorMatches layout/style/nsCSSRuleProcessor.cpp:2146
57 xul.dll SelectorMatches layout/style/nsCSSRuleProcessor.cpp:2146
58 xul.dll nsEventDispatcher::Dispatch content/events/src/nsEventDispatcher.cpp:677
Assignee: general → cdleary
Group: core-security
This bug is on aurora, but not beta.
Attachment #578039 - Flags: review?(luke)
Attachment #578039 - Flags: approval-mozilla-aurora?
Status: NEW → ASSIGNED
Attachment #578039 - Flags: review?(luke) → review+
Fixed on trunk, but still waiting for aurora approval decision.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment on attachment 578039: Clear the chunk's next field after releasing further chain chunks.

There was no risk assessment, but we believe this is a low risk fix.
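For illustration only, the patch title can be read as the pattern below. This is a minimal sketch using a toy `Chunk` type and hypothetical helpers (`append_chunk`, `release_after`, `chain_length`); it is not the LifoAlloc code and not the attached patch. It only shows why a `next` pointer should be cleared once the chunks after it have been freed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Toy stand-in for an allocator chunk. Hypothetical; not js::BumpChunk.
struct Chunk {
    Chunk* next;
    std::size_t size;
};

// Allocates a chunk and links it after `tail` (if any). Returns nullptr on OOM.
Chunk* append_chunk(Chunk* tail, std::size_t size) {
    Chunk* c = new (std::nothrow) Chunk{nullptr, size};
    if (c && tail) tail->next = c;
    return c;
}

// Frees every chunk that follows `keep`, then clears keep->next so the
// chain no longer points at freed memory.
void release_after(Chunk* keep) {
    assert(keep != nullptr);
    Chunk* c = keep->next;
    while (c) {
        Chunk* following = c->next;
        delete c;
        c = following;
    }
    keep->next = nullptr;  // without this, keep->next would dangle
}

// Counts the chunks reachable from `head`.
std::size_t chain_length(const Chunk* head) {
    std::size_t n = 0;
    for (; head; head = head->next) ++n;
    return n;
}
```

If the final assignment in `release_after` were missing, a later walk such as `chain_length` would follow a pointer into freed memory.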
Attachment #578039 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/1b3ce7846516 Sorry about the risk assessment -- promise I'll remember to include it next time!
Is this something QA can verify?
Whiteboard: [sg:critical] → [sg:critical][qa?]
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #7)
> Is this something QA can verify?

bump
Group: core-security
Version: Trunk → 10 Branch
this crash seems to be still around in current versions...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Seems to be most crashy with Firefox 12 on Windows XP. One of the comments mentions CastleVille on Facebook. Firefox 14, 15, and 16 seem to be pretty low, though I'm not sure if that's because of ADUs or if this has been mitigated by a patch.
Group: core-security
Do we have a test case here that we can verify? Does Comment 10 imply that we still have an sg:crit issue in our code?
If Firefox 11 was fixed when in mozilla-central (as comment 3 and the status flag indicate), this should (in theory) be fixed for Firefox 12 and onward by nature of the train model. If Firefox 10 was fixed when in mozilla-aurora (as comment 6 and the status flag indicate), this should be fixed in the first Firefox 10.0 ESR. Though, since QA was never able to reproduce this bug, we were also never able to truly verify the fix. I'm happy to test if there is something we can try.
It's #169 top browser crasher in 10.0.2, #136 in 11.0 over the last 4 weeks, #116 in 12.0, #110 in 13.0, #99 in 14.0b6, #91 in 15.0a2, and #200 in 16.0a1 over the last week. 10.0.2, 10.0.3, 10.0.4 and 10.0.5 ESR are affected.

Based on 140 comments over the last 4 weeks, some people were playing a game on Facebook, were using Facebook, were playing videos, were signing into their email account, or were opening an email.

Here are interesting comments:
"Virtual memory warning box appeared just before crash."
"Everytime we tries to log off any program this happens."
"since it was updated it crashes all the time" (Fx 13)
"suddenly at facebook - play castle ville canceled and the desktop was visible. reloaded Firefox, facebook, then again the game loaded, was originally built and then grey screen with an AIDS in a circle. "reload page" pressed, again of the game and then again demolition and desktop and this page beginning of visible. https://apps.facebook.com/playcastleville/?fb_source=bookmark_apps&ref=bookmarks&count=0&1_0.1.1=fb_bmpos" (translated from German by Bing Translator)
The crashstats don't make this sound very fixed.
This appears to be some sort of OOM bug, but it's hard to tell. It's crashing on this line of code:

  BumpChunk *newChunk = BumpChunk::new_(chunkSize);

which calls js_malloc and then initializes the returned memory. The crash addresses are also always page starts, which makes me think somehow the allocator is returning an unmapped page. I thought Windows wasn't supposed to do that, though. It could be a bug in jemalloc.

I did notice that type inference was what was calling into the LIFO allocator on the crash reports that I clicked, so it's possible sometimes type inference is allocating more than it should.
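As a rough illustration of the "allocate, then initialize the returned memory" pattern described above, here is a minimal sketch. `ToyChunk` and its `create`/`destroy` helpers are hypothetical names, not the real `BumpChunk::new_`, and `std::malloc` stands in for js_malloc. Note that the null check only covers the allocator failing outright; it would not help if the allocator handed back a non-null pointer to an unmapped page, which is the scenario suspected in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical chunk header placed at the start of a malloc'ed block.
struct ToyChunk {
    std::size_t capacity;  // payload bytes that follow the header
    ToyChunk* next;

    explicit ToyChunk(std::size_t cap) : capacity(cap), next(nullptr) {}

    // Allocates chunkSize bytes and constructs the header in place.
    // Returns nullptr if the size is too small or the allocation fails.
    static ToyChunk* create(std::size_t chunkSize) {
        if (chunkSize < sizeof(ToyChunk)) return nullptr;
        void* mem = std::malloc(chunkSize);
        if (!mem) return nullptr;  // plain OOM: report failure, do not write
        assert(reinterpret_cast<std::uintptr_t>(mem) % alignof(ToyChunk) == 0);
        // First write to the new block happens here, in the constructor.
        return new (mem) ToyChunk(chunkSize - sizeof(ToyChunk));
    }

    // Destroys the header and returns the block to the allocator.
    static void destroy(ToyChunk* chunk) {
        if (!chunk) return;
        chunk->~ToyChunk();
        std::free(chunk);
    }
};
```

The placement-new line is the first store into the fresh block, so a bad page from the allocator would fault at its first byte, which would be consistent with crash addresses at page starts.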
Whiteboard: [sg:critical][qa?] → [js:waitingforinfo][sg:critical][qa?]
A user using Sync with a corrupt profile has a Firefox that crashes sometimes with this signature: see bug 769556.
(In reply to David Mandelin from comment #16)
> This appears to be some sort of OOM bug, but it's hard to tell.

Windows crash reports now have information about memory usage. For instance, 4 random reports with this signature that I selected have system memory usage percentage at 96%, 39%, 98% and 98%.
(In reply to Andrew McCreight [:mccr8] from comment #18)
> (In reply to David Mandelin from comment #16)
> > This appears to be some sort of OOM bug, but it's hard to tell.
>
> Windows crash reports now have information about memory usage. For
> instance, 4 random reports with this signature that I selected have system
> memory usage percentage at 96%, 39%, 98% and 98%.

OK, so reasonably likely :-) to be mostly OOM-related, possibly from runaway alloc in TI but I don't think that's likely.

What can we do now? Who knows the real story about what happens on Windows if you call jemalloc and the system is very low on memory?
(In reply to David Mandelin from comment #19)
> What can we do now? Who knows the real story about what happens on Windows
> if you call jemalloc and the system is very low on memory?

jlebar may know.
> Who knows the real story about what happens on Windows if you call jemalloc and the system is very
> low on memory?

The common thread I see in all the crash reports I looked at is low "available page file". That number is ullAvailPageFile from MEMORYSTATUSEX, "The maximum amount of memory the current process can commit, in bytes." [1]

It's pretty weird in some cases [2, 3, 4] that we have less than 5MB of available page file and 100+ MB of "available physical memory" (that's ullAvailPhys: "amount of physical memory currently available, in bytes. This is the amount of physical memory that can be immediately reused without having to write its contents to disk first."). Looking at some random, unrelated crash reports, there's always much more available page file than available physical memory, so I think this is an anomalous situation.

I'm not sure, but I think this may mean that the system has run out of space in its pagefile -- that is, the pagefile is too small, and Windows can't grow it, perhaps because the system is out of disk space.

Windows doesn't overcommit -- I understand that this means that if a process allocates a bunch of MEM_COMMIT virtual memory but doesn't touch it, that space is reserved and must fit either in core or the page file. So if something (perhaps Firefox) on the user's machine is eating up a lot of MEM_COMMIT vmem, it's possible we could get into this state where there's a lot of physical memory available (because the pages haven't /actually/ been committed yet), but no pagefile available.

Exactly what this means for jemalloc, I'm not sure yet. But if anyone cares to correct me on the above (glandium?) that might help. :)

[1] http://msdn.microsoft.com/en-us/library/windows/desktop/aa366770%28v=vs.85%29.aspx
[2] https://crash-stats.mozilla.com/report/index/d41a602f-015b-4dbb-9263-f687e2120709
[3] https://crash-stats.mozilla.com/report/index/90fcf4e0-db21-4d0b-abf1-8c6cb2120709
[4] https://crash-stats.mozilla.com/report/index/82cce1b7-2ede-42dc-b2f5-6c1312120709
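To make the "low available page file, plenty of physical memory" pattern concrete, here is a small sketch. `MemorySnapshot`, `looks_commit_exhausted`, and `read_snapshot` are hypothetical names, and the default thresholds (5 MiB and 100 MiB) are assumptions lifted from the figures quoted above, not values used by crash-stats. On Windows the snapshot is filled from `GlobalMemoryStatusEx`; elsewhere only the classification function is compiled.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical holder for the two MEMORYSTATUSEX fields discussed above.
struct MemorySnapshot {
    std::uint64_t availPageFile;  // mirrors ullAvailPageFile, in bytes
    std::uint64_t availPhys;      // mirrors ullAvailPhys, in bytes
};

constexpr std::uint64_t kMiB = 1024ull * 1024ull;

// Returns true when commit space is nearly gone while physical memory is
// still plentiful. Thresholds are illustrative assumptions.
bool looks_commit_exhausted(const MemorySnapshot& s,
                            std::uint64_t lowPageFile = 5 * kMiB,
                            std::uint64_t amplePhys = 100 * kMiB) {
    return s.availPageFile < lowPageFile && s.availPhys >= amplePhys;
}

#ifdef _WIN32
// Fills `out` from the OS. Returns false if the call fails.
bool read_snapshot(MemorySnapshot& out) {
    MEMORYSTATUSEX st;
    st.dwLength = sizeof(st);  // required before calling GlobalMemoryStatusEx
    if (!GlobalMemoryStatusEx(&st)) return false;
    out.availPageFile = st.ullAvailPageFile;
    out.availPhys = st.ullAvailPhys;
    return true;
}
#endif
```

A report with 4 MiB of available page file and 150 MiB of available physical memory would be flagged; one with gigabytes of available page file would not.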
We never check the return value in jemalloc's pages_commit (VirtualAlloc(MEM_COMMIT)). I wouldn't be surprised if that's what's failing here.
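A sketch of what checking the commit result could look like follows. The helper names (`reserve_pages`, `commit_pages`, `release_pages`) are hypothetical and this is not jemalloc's pages_commit. On Windows it uses `VirtualAlloc` with `MEM_COMMIT`, which returns NULL on failure; the POSIX branch uses `mmap`/`mprotect` only as a rough stand-in so the sketch builds elsewhere, and it does not model Windows commit accounting.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

// Reserves address space without committing it. Returns nullptr on failure.
void* reserve_pages(std::size_t size) {
#ifdef _WIN32
    return VirtualAlloc(nullptr, size, MEM_RESERVE, PAGE_NOACCESS);
#else
    void* p = mmap(nullptr, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
#endif
}

// Commits previously reserved pages and reports whether that succeeded,
// instead of assuming success.
bool commit_pages(void* addr, std::size_t size) {
    assert(addr != nullptr && size > 0);
#ifdef _WIN32
    return VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE) != nullptr;
#else
    return mprotect(addr, size, PROT_READ | PROT_WRITE) == 0;
#endif
}

// Releases a region obtained from reserve_pages.
void release_pages(void* addr, std::size_t size) {
#ifdef _WIN32
    (void)size;  // MEM_RELEASE requires a size of 0
    VirtualFree(addr, 0, MEM_RELEASE);
#else
    munmap(addr, size);
#endif
}
```

A caller would treat a `false` return from `commit_pages` as an allocation failure rather than handing the uncommitted range to its own caller.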
In addition to comment 17, that user hit this crash with a new profile, suspicious software uninstalled and a disk check done. Certain other applications (Chrome, Safari, TB) also crash.
I'm not sure this really needs to be a security bug. It seems like we have a new crash that just happens to have the same signature as an existing fixed sg:crit. But maybe it is too early to tell...
> It seems like we have a new crash that just happens to have the same signature as an existing fixed
> sg:crit.

Indeed, that's what appears to be happening. We should be able to tell when I land the abort in bug 772338.
Great analysis. Thanks, Justin. By the way, how do you read the MEMORYSTATUSEX out of a crashdump? Do you just call GlobalMemoryStatusEx in the debugger? I had no idea such things were possible...
> By the way, how do you read the MEMORYSTATUSEX out of a crashdump? Do you just call
> GlobalMemoryStatusEx in the debugger? I had no idea such things were possible...

No, it's not so magical. :) We just make the syscall while we're building the crash report; the data we collect shows up on the crash-stats page.
(In reply to Justin Lebar [:jlebar] from comment #27)
> > By the way, how do you read the MEMORYSTATUSEX out of a crashdump? Do you just call
> > GlobalMemoryStatusEx in the debugger? I had no idea such things were possible...
>
> No, it's not so magical. :) We just make the syscall while we're building
> the crash report; the data we collect shows up on the crash-stats page.

Oh. :-) Well, I never saw those fields before, so I still learned something!
This particular flavor of the crash seems unlikely to be sec-critical, so removing the whiteboard tag to prevent confusion.
Whiteboard: [js:waitingforinfo][sg:critical][qa?] → [js:waitingforinfo][qa?]
This really needs a new bug filed for it if we want to track the new regression, as it is unrelated to the original issue.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 13 years ago
Resolution: --- → FIXED
Triage comment: Since this appears to have been classified as non-exploitable we're wontfixing for ESR10 and will get it in ESR17.
Group: core-security