Closed
Bug 1234147
Opened 8 years ago
Closed 8 years ago
crash in js::jit::JitcodeGlobalTable::allocateTower & js::GetOwnPropertyDescriptor
Categories
(Core :: JavaScript Engine: JIT, defect)
Core
JavaScript Engine: JIT
Tracking
()
People
(Reporter: Sylvestre, Assigned: shu)
References
Details
(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main47+][adv-esr45.2+][post-critsmash-triage])
Crash Data
Attachments
(2 files, 2 obsolete files)
77.74 KB,
text/plain
|
Details | |
1.21 KB,
patch
|
shu
:
review+
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
ritu
:
approval-mozilla-esr45+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-43b8d14d-1e00-4699-a388-4d1852151216 and report bp-55fff57e-07a8-4f8d-9e78-a839f2151221 ============================================================= Leaving my Firefox during the week end, it is quite common that it is crashed when I am coming back. Even the signatures are different, I am grouping then as the context in which they appears are similar. In the terminal, I can see: console.error: Could not write session state file out of memory undefined console.error: _bugzillajs: out of memory console.error: _bugzillajs: out of memory console.error: Could not write session state file out of memory undefined console.error: console.error: console.error: console.error: console.error: console.error: console.error: console.error: console.error: out of memory console.error: out of memory console.error: console.error: out of memory console.error: console.error: _bugzillajs: out of memory console.error: _bugzillajs: out of memory console.error: _bugzillajs: out of memory console.error: _bugzillajs: out of memory
Comment 1•8 years ago
|
||
Does disabling ion prevent this from occurring?
Reporter | ||
Comment 2•8 years ago
|
||
I don't know, I didn't look if this is still happening.
Reporter | ||
Updated•8 years ago
|
status-firefox45:
--- → wontfix
status-firefox46:
--- → affected
status-firefox47:
--- → affected
status-firefox48:
--- → affected
Comment 3•8 years ago
|
||
Has this issue been officially confirmed? I'm encountering it on Firefox 43.0.1 and attempting to check 45.
Comment 4•8 years ago
|
||
I can confirm its presence in 45.0 (windows). The issue is repeatable and encountered when Firefox is open for a prolonged amount of time several days / week. Am I correct in my interpretation that this issue is related to an elevation in the JavaScript optimization level? I am attempting to test by disabling basline, ion, and asmjs. Any ideas on causes or mitigating factors shy of restarting Firefox?
Comment 5•8 years ago
|
||
Randell Jesup [:jesup] caught this crash after "Running central for a week with 2000+ tabs on linux64 (maybe 100 or 150 loaded)". #0 0x00007f27d1365222 in js::jit::JitcodeGlobalTable::allocateTower(unsigned int) (freeList=0x7f27be43ea48) at ../../../js/src/jit/JitcodeMap.h:101 #1 0x00007f27d1365222 in js::jit::JitcodeGlobalTable::allocateTower(unsigned int) (this=this@entry=0x7f27be43e800, height=33) at /home/jesup/src/mozilla/head/js/src/jit/JitcodeMap.cpp:685 It's crashing because height=33 in frame #1, and JitcodeSkiplistTower::MAX_HEIGHT is 32. I'll attach more info from Jesup as an attachment.
Flags: needinfo?(shu)
Comment 6•8 years ago
|
||
Comment 7•8 years ago
|
||
I'm restricting this as a sec bug since my crash shows an 0xe5e5e5e5 signature (UAF). Likely either sec-crit or sec-high; I tend to a sec-crit for a UAF in the jit code with unclear origins. Note: linux64, opt build. I get occasional crashes of the Content process, and with a local build it's not useful to look at crash-stats/about:crashes, so I tend to leave GDB running on the content process, which is how I caught it. I'm attaching a more-complete log of my gdb session, with the changeset ID it was built from. NI dveditz for sec level assessment.
Comment 8•8 years ago
|
||
Attachment #8745013 -
Attachment is obsolete: true
Comment 9•8 years ago
|
||
Since this seems to take a long time to happen and we don't know of a particular way to trigger it at will it's not sec-critical, but sec-high seems like a reasonable worry level.
Flags: needinfo?(dveditz)
Keywords: sec-high
Reporter | ||
Updated•8 years ago
|
Comment 10•8 years ago
|
||
FYI: ~450 crashes in the last month with this signature. A number of them show e5e5 addresses. Crashes go back to at least 42.0. Some have slightly different calling stacks. Note looking at only maybe 10 I saw an uptime of 1140, so it's likely hittable fairly fast.
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(shu)
Assignee | ||
Comment 11•8 years ago
|
||
Attachment #8745528 -
Flags: review?(kvijayan)
Updated•8 years ago
|
Attachment #8745528 -
Flags: review?(kvijayan) → review+
Assignee | ||
Comment 12•8 years ago
|
||
Comment on attachment 8745528 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. [Security approval request comment] > How easily could an exploit be constructed based on the patch? It depends on a PRNG returning 0. Seems pretty hard. > Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Yeah, it makes it pretty obvious that it'll generate an out-of-bounds index. > Which older supported branches are affected by this flaw? I heard someone say 42. > If not all supported branches, which bug introduced the flaw? > Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The same patch should apply. > How likely is this patch to cause regressions; how much testing does it need? Not likely.
Attachment #8745528 -
Flags: sec-approval?
Comment 13•8 years ago
|
||
(In reply to Shu-yu Guo [:shu] from comment #12) > Comment on attachment 8745528 [details] [diff] [review] > Fix height computation for JitcodeSkiplistTower. > > [Security approval request comment] > > How easily could an exploit be constructed based on the patch? > > It depends on a PRNG returning 0. Seems pretty hard. So I believe this would cause an out-of-bounds index; however this is happening way too many times for me to think it's a PRNG returning 0 causing it unless it's a) a 16-bit PRNG, or b) really, really badly seeded, or c) being called a truly stupendous number of times (possible, but I hope we're not). If it's none of those, then there's another source of the problem as well. (I hope not!)
Assignee | ||
Comment 14•8 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #13) > (In reply to Shu-yu Guo [:shu] from comment #12) > > Comment on attachment 8745528 [details] [diff] [review] > > Fix height computation for JitcodeSkiplistTower. > > > > [Security approval request comment] > > > How easily could an exploit be constructed based on the patch? > > > > It depends on a PRNG returning 0. Seems pretty hard. > > So I believe this would cause an out-of-bounds index; however this is > happening way too many times for me to think it's a PRNG returning 0 causing > it unless it's a) a 16-bit PRNG, or b) really, really badly seeded, or c) > being called a truly stupendous number of times (possible, but I hope we're > not). > Good points. The PRNG itself is some xorshift-looking thing with a citation: https://dxr.mozilla.org/mozilla-central/source/js/src/jit/JitcodeMap.cpp#665-670 a) rand_ is a uint32_t b) It's seeded as 0, this seems independently bad c) I suppose it could be. I'll compile a local thing and leave it running a bit and see how many iterations until it becomes 0
Assignee | ||
Comment 15•8 years ago
|
||
shu@pikashu ~ % cat test.cpp #include <stdio.h> #include <stdint.h> #include <limits.h> template<typename T> inline T RotateLeft(const T aValue, uint_fast8_t aShift) { return (aValue << aShift) | (aValue >> (sizeof(T) * CHAR_BIT - aShift)); } int main() { uint32_t rand_ = 0; uint64_t iters = 0; do { rand_ ^= RotateLeft(rand_, 5) ^ RotateLeft(rand_, 24); rand_ += 0x37798849; iters++; } while (rand_ != 0); fprintf(stdout, "\ntook %zu iters\n", iters); return 0; } shu@pikashu ~ % ./a.out took 574295818 iters Welp.
Comment 16•8 years ago
|
||
So that leaves "we're doing this millions of times per run", or it's something else. Perhaps drop a quick static counter and dump a printf every 100,000 hits, then browse? If you don't get any/many, it's not hitting 0 by luck (or there's another cause of this failure).
Comment 17•8 years ago
|
||
Wouldn't that patch always return 0 or 1 from generateTowerHeight? I'm new to the codebase, but it looks like to me that generateTowerHeight should ultimately return a value between 1 and 32? Wouldn't the solution be: return std::min(JitcodeSkiplistTower::MAX_HEIGHT, std::max(1U, result+1)
Assignee | ||
Comment 18•8 years ago
|
||
(In reply to Chad Dotson from comment #17) > Wouldn't that patch always return 0 or 1 from generateTowerHeight? > > I'm new to the codebase, but it looks like to me that generateTowerHeight > should ultimately return a value between 1 and 32? > > Wouldn't the solution be: > > return std::min(JitcodeSkiplistTower::MAX_HEIGHT, std::max(1U, result+1) Oh oops, yes you're right, not being very bright today. It should be a max(1, result), not min(1, result). The clipping at MAX_HEIGHT is controlled by the bit-counting loop itself, though that's not very clear.
Assignee | ||
Comment 19•8 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #16) > So that leaves "we're doing this millions of times per run", or it's > something else. Perhaps drop a quick static counter and dump a printf every > 100,000 hits, then browse? If you don't get any/many, it's not hitting 0 by > luck (or there's another cause of this failure). Yes, I'll try that. Chad, do you have Gecko Profiler Addon installed?
Flags: needinfo?(chad)
Comment 20•8 years ago
|
||
Comment on attachment 8745528 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. Review of attachment 8745528 [details] [diff] [review]: ----------------------------------------------------------------- Wouldn't this always return 0 or 1 from generateTowerHeight? What about this instead from generateTowerHeight: return std::min(JitcodeSkiplistTower::MAX_HEIGHT, std::max(1U, result+1)
Comment 21•8 years ago
|
||
(In reply to Shu-yu Guo [:shu] from comment #19) > (In reply to Randell Jesup [:jesup] from comment #16) > > So that leaves "we're doing this millions of times per run", or it's > > something else. Perhaps drop a quick static counter and dump a printf every > > 100,000 hits, then browse? If you don't get any/many, it's not hitting 0 by > > luck (or there's another cause of this failure). > > Yes, I'll try that. Chad, do you have Gecko Profiler Addon installed? No unfortunately.
Comment 22•8 years ago
|
||
I threw the printf in. Loading huffingtonpost into a tab, and letting it sit there (debug build) hits the generator 100000 times every few minutes - with one, idle (though fancy) page loaded. Wow..... Still not sure, but this may mean it could hit 0's occasionally. This also says "this is a potential perf hotspot".
Assignee | ||
Comment 23•8 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #22) > I threw the printf in. Loading huffingtonpost into a tab, and letting it > sit there (debug build) hits the generator 100000 times every few minutes - > with one, idle (though fancy) page loaded. Wow..... Still not sure, but > this may mean it could hit 0's occasionally. > > This also says "this is a potential perf hotspot". Yeah this is very surprising to me, since that code, IIRC, shouldn't be hot at all if profiling is turned off. Let me try to page this stuff back in...
Assignee | ||
Comment 24•8 years ago
|
||
NIing Kannan since he designed this. Any ideas on how to make this less hot? The comment suggests that we allocate a JitcodeGlobalEntry for every baseline compile because we can't do this post-hoc for on-stack scripts. I'm actually not sure why that is. When we flip on profiling, can't we just take a list of all the on-stack scripts and make JitcodeGlobalEntries for them then?
Flags: needinfo?(chad) → needinfo?(kvijayan)
Comment 25•8 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #22) > I threw the printf in. Loading huffingtonpost into a tab, and letting it > sit there (debug build) hits the generator 100000 times every few minutes - > with one, idle (though fancy) page loaded. Wow..... Still not sure, but > this may mean it could hit 0's occasionally. > > This also says "this is a potential perf hotspot". The tests I've run have included little human interaction after initial page load. Though the content does continue to update.
Updated•8 years ago
|
status-firefox-esr45:
--- → affected
Comment 26•8 years ago
|
||
Comment on attachment 8745528 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. Clearing sec-approval? while y'all debate what's going on. Please set it to ? again when you have a final patch for sure.
Attachment #8745528 -
Flags: sec-approval?
Comment 27•8 years ago
|
||
(In reply to Shu-yu Guo [:shu] from comment #24) > NIing Kannan since he designed this. > > Any ideas on how to make this less hot? The comment suggests that we > allocate a JitcodeGlobalEntry for every baseline compile because we can't do > this post-hoc for on-stack scripts. > > I'm actually not sure why that is. When we flip on profiling, can't we just > take a list of all the on-stack scripts and make JitcodeGlobalEntries for > them then? This is a reasonable alternative approach. I didn't think it would be necessary, though since we don't expect script compilation to happen a ton. Is it Ion IC generation that's causing it? Or baseline compilation? Fundamentally, compilation should be "rare", and this skiplist generation code should be very cheap compared to compiling ANYTHING. So the reasoning was that it would be a small delta compared to the cost of compilation. If that reasoning is off, it makes sense to implement the alternative you have suggested. The key requirement is: there should exist entries in the map for every piece of jitcode before the first sample is taken.
Flags: needinfo?(kvijayan)
Comment 28•8 years ago
|
||
Guys - this is a sec-high UAF - we need to land a fix ASAP and uplift to aurora/beta and probably to ESR45. While I really want to understand/fix if it's an indicator we're doing something too often (perf is important!) that should NOT block landing a fix for the crash/security fix. I keep my local build sitting in GDB (just in case, since crash-stats reports are worthless for local builds), and all my local crashes (every few days-ish) are this. The answer to the open question of "did we analyze what caused this correctly" seems to be yes, and it's being called a lot more often than we anticipated.
Flags: needinfo?(shu)
Comment 29•8 years ago
|
||
BTW, I verified that when I crash with this, JtcodeGlobalTable::rand_ is 0
Assignee | ||
Comment 30•8 years ago
|
||
Oops, sorry I forgot to land this! I will land a fix now and worry about the performance later.
Flags: needinfo?(shu)
Assignee | ||
Comment 31•8 years ago
|
||
Or rather, I will re-request sec-approval for the fix now and worry about performance in the meantime.
Assignee | ||
Comment 32•8 years ago
|
||
Carrying r=djvj with min/max typo fixed.
Attachment #8745528 -
Attachment is obsolete: true
Attachment #8747597 -
Flags: review+
Assignee | ||
Comment 33•8 years ago
|
||
Comment on attachment 8747597 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. We need to land a security fix; perf problems will be looked at later. [Security approval request comment] How easily could an exploit be constructed based on the patch? It's clear there's a bounds issue with returning 0. The patch itself doesn't make it clear how to get the PRNG to return 0---which we know happens with running it enough times---but isn't mentioned by the code. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? On the nature of the problem, but again, not on the fact that the PRNG can return 0 if run for long enough. Which older supported branches are affected by this flaw? All branches I think, this landed a while ago. If not all supported branches, which bug introduced the flaw? Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Should not need different backport. How likely is this patch to cause regressions; how much testing does it need? Will not cause regressions.
Attachment #8747597 -
Flags: sec-approval?
Updated•8 years ago
|
Group: core-security → javascript-core-security
Comment 34•8 years ago
|
||
Comment on attachment 8747597 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. sec-approval+. We should backport this to branches, including ESR45.
Attachment #8747597 -
Flags: sec-approval? → sec-approval+
Updated•8 years ago
|
status-firefox49:
--- → affected
tracking-firefox47:
--- → ?
tracking-firefox48:
--- → ?
tracking-firefox49:
--- → +
tracking-firefox-esr45:
--- → ?
Comment 35•8 years ago
|
||
Assuming fix is accepted, what is the general timeline for it appearing in the different builds?
Assignee | ||
Comment 36•8 years ago
|
||
Setting checkin-needed until my SSH key is updated.
Keywords: checkin-needed
Comment 37•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/e968a9db2350
Assignee: nobody → shu
Keywords: checkin-needed
Comment 38•8 years ago
|
||
this caused a bustage and had to be backed out -> https://treeherder.mozilla.org/logviewer.html#?job_id=27439259&repo=mozilla-inbound
Flags: needinfo?(shu)
Comment 39•8 years ago
|
||
Looks like a min/max name collision from windows.h.
Comment 40•8 years ago
|
||
just wrap std::max with parens. return (std::max)(1U, result);
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(shu)
Assignee | ||
Comment 41•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/6a30c1484e91
Comment 42•8 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/6a30c1484e91
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Updated•8 years ago
|
Group: javascript-core-security → core-security-release
Hello Shu, could you please nominate patches for uplift to Beta47, Aurora48, ESR45? Thanks!
Flags: needinfo?(shu)
Comment 44•8 years ago
|
||
FYI, I've had no crashes on mozilla-central (with 2000+ tabs, 100+ loaded) in ~ two weeks with this patch
Assignee | ||
Comment 45•8 years ago
|
||
Comment on attachment 8747597 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: User impact if declined: UAF crash when leaving browser open for a long time Fix Landed on Version: 49? Risk to taking this patch (and alternatives if risky): I don't know String or UUID changes made by this patch: None See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info. Approval Request Comment [Feature/regressing bug #]: 1004831 [User impact if declined]: UAF crash if leaving browser open for a long time [Describe test coverage new/current, TreeHerder]: on m-c [Risks and why]: I don't know [String/UUID change made/needed]: None
Flags: needinfo?(shu)
Attachment #8747597 -
Flags: approval-mozilla-esr45?
Attachment #8747597 -
Flags: approval-mozilla-beta?
Attachment #8747597 -
Flags: approval-mozilla-aurora?
Updated•8 years ago
|
Comment on attachment 8747597 [details] [diff] [review] Fix height computation for JitcodeSkiplistTower. Sec-high, Jesup cannot repro crashes after the fix (thanks for testing!), Aurora48+, Beta47+, ESR45+
Attachment #8747597 -
Flags: approval-mozilla-esr45?
Attachment #8747597 -
Flags: approval-mozilla-esr45+
Attachment #8747597 -
Flags: approval-mozilla-beta?
Attachment #8747597 -
Flags: approval-mozilla-beta+
Attachment #8747597 -
Flags: approval-mozilla-aurora?
Attachment #8747597 -
Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/ef7602300e89 https://hg.mozilla.org/releases/mozilla-beta/rev/37ce493d5746 https://hg.mozilla.org/releases/mozilla-esr45/rev/d30748143c21
Updated•8 years ago
|
Whiteboard: [adv-main47+][adv-esr45.2+]
Comment 48•8 years ago
|
||
(In reply to Kannan Vijayan [:djvj] from comment #27) > (In reply to Shu-yu Guo [:shu] from comment #24) > > NIing Kannan since he designed this. > > > > Any ideas on how to make this less hot? The comment suggests that we > > allocate a JitcodeGlobalEntry for every baseline compile because we can't do > > this post-hoc for on-stack scripts. > > > > I'm actually not sure why that is. When we flip on profiling, can't we just > > take a list of all the on-stack scripts and make JitcodeGlobalEntries for > > them then? > > This is a reasonable alternative approach. I didn't think it would be > necessary, though since we don't expect script compilation to happen a ton. > Is it Ion IC generation that's causing it? Or baseline compilation? > > Fundamentally, compilation should be "rare", and this skiplist generation > code should be very cheap compared to compiling ANYTHING. So the reasoning > was that it would be a small delta compared to the cost of compilation. If > that reasoning is off, it makes sense to implement the alternative you have > suggested. > > The key requirement is: there should exist entries in the map for every > piece of jitcode before the first sample is taken. I suspect you should file a followup to find out if this is being called too much, and if so why.
Flags: needinfo?(kvijayan)
Updated•8 years ago
|
Whiteboard: [adv-main47+][adv-esr45.2+] → [adv-main47+][adv-esr45.2+][post-critsmash-triage]
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•