Closed
Bug 1150337
Opened 8 years ago
Closed 8 years ago
Odin: crash that appears to be related to typed array access
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox38 | --- | unaffected |
firefox39 | + | fixed |
firefox40 | + | fixed |
firefox-esr31 | --- | unaffected |
firefox-esr38 | --- | unaffected |
b2g-v1.4 | --- | unaffected |
b2g-v2.0 | --- | unaffected |
b2g-v2.0M | --- | unaffected |
b2g-v2.1 | --- | unaffected |
b2g-v2.1S | --- | unaffected |
b2g-v2.2 | --- | unaffected |
b2g-master | --- | fixed |
People
(Reporter: dougc, Assigned: sunfish)
Details
(Keywords: sec-critical)
Attachments
(3 files, 2 obsolete files)
297 bytes,
application/javascript
|
Details | |
14.43 KB,
patch
|
sunfish
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
19.09 KB,
patch
|
abillings
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Seeing some unexpected behaviour testing asm.js pointer-masking code. While trying to narrow it down with a differential analysis, found some code that crashes the browser. It looks like an int8 typed array access with a dehoisted index offset, and the index adds up to be beyond the 32 bit boundary. Perhaps some issue with the typed system, or an edge case in this optimization. Are you aware of any issues here? 0x00007fffab0cb50e: mov %r10b,(%r15,%r8,1) 0x00007fffab0cb512: mov %esi,%r8d 0x00007fffab0cb515: imul $0xffffa7e7,%r8d,%r8d 0x00007fffab0cb51c: add %edi,%r8d 0x00007fffab0cb51f: imul $0xffff492e,%eax,%eax 0x00007fffab0cb525: add %eax,%r8d 0x00007fffab0cb528: sar $0x10,%r8d => 0x00007fffab0cb52c: movsbl 0x97a8(%r15,%r8,1),%eax (gdb) info reg r8 r8 0xfffffffb 4294967291
Assignee | ||
Comment 1•8 years ago
|
||
Marking s-s. If possible, please attach a testcase.
Group: javascript-core-security
Reporter | ||
Comment 2•8 years ago
|
||
No simple test case yet, but it's from this asm.js code from poppler which might give some clues. function __ZN9DCTStream10readMCURowEv(i1) { ... // This is the bad HEAP8 access. i13 = HEAP8[(i16 + (Math_imul(i18, -22553) | 0) + (Math_imul(i17, -46802) | 0) >> 16) + 38824 >> 0] | 0; This appears to have been translated from: // color space conversion if (colorXform) { // convert YCbCr to RGB if (numComps == 3) { for (y2 = 0; y2 < mcuHeight; ++y2) { for (x2 = 0; x2 < mcuWidth; ++x2) { pY = rowBuf[0][y2][x1+x2]; pCb = rowBuf[1][y2][x1+x2] - 128; pCr = rowBuf[2][y2][x1+x2] - 128; pR = ((pY << 16) + dctCrToR * pCr + 32768) >> 16; rowBuf[0][y2][x1+x2] = dctClip[dctClipOffset + pR]; pG = ((pY << 16) + dctCbToG * pCb + dctCrToG * pCr + 32768) >> 16; rowBuf[1][y2][x1+x2] = dctClip[dctClipOffset + pG]; pB = ((pY << 16) + dctCbToB * pCb + 32768) >> 16; rowBuf[2][y2][x1+x2] = dctClip[dctClipOffset + pB]; } } // convert YCbCrK to CMYK (K is passed through unchanged)
Reporter | ||
Comment 3•8 years ago
|
||
Might have overlooked the sign extensions of the index register, %rd8 to %r8 above. If %rd8 were signed extended to %r8, rather than being zero extended, then perhaps the address calculation would be valid. Perhaps this optimization should only be applied when the lhs of the addition is known to be positive, not just when the resulting index is known to be within bounds. This code pattern is also interesting as it computes a signed index into a table, and then adds 32768 to make it a non-negative index. This pattern probably needs to be optimized well too, so perhaps skipping the offset dehoisting optimization in this case is not a good long term fix. Also this code pattern challenges an inline bounds checking strategy that assumes small negative pre-offset indexes are not expected and can be handled OOL. fwiw I see that LLVM has a lot of code to dealing with sign extension optimization for 64 bit processors, and it is still not complete. Related issues are blocking V8 and JSC performance - the 32 bit index to 64 bit addressing mode optimization. This might be an area needing some development work.
![]() |
||
Comment 4•8 years ago
|
||
dougc: by any chance does the problem you're seeing go away when you create a fresh profile? I ask b/c it's possible you are getting a cache hit from asm.js-cached code from a previous build. While the asm.js cache is keyed on FF buildid (which changes on every build), the buildid only changes when the 'firefox' executable is rebuilt (but not, e.g., when you rebuild just libxul.so). I ask b/c the displacement 0x97a8 is > AsmJSCheckedImmediateRange which shouldn't be possible when !MAsmJSHeapAccess::needsBoundsCheck in tryAddDisplacement (which should only be true for constant or masked indices).
Reporter | ||
Comment 5•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #4) > dougc: by any chance does the problem you're seeing go away when you create > a fresh profile? Yes, it still crashes with a fresh profile. It reports compiling the asm.js module, not just loading from the cache. Massive modifies the asm.js code to frustrate caching. > I ask b/c the displacement 0x97a8 is > AsmJSCheckedImmediateRange which > shouldn't be possible when !MAsmJSHeapAccess::needsBoundsCheck in > tryAddDisplacement (which should only be true for constant or masked > indices). Guessing that it's derived that the bounds check is not needed. It should be possible to write a simple test for this. Note the value of r8 which as an int32 would be negative and adding the offset would make the 32 bit index positive. The problem might be that r8d was not sign extended when used as r8 in the 64 bit address mode calculation so adding the offset gives an unexpected value.
![]() |
||
Comment 6•8 years ago
|
||
Well, we actually *want* all int32's that flow into loads/stores to be zero in the high uint32; this guarantees they hit w/in the 4gb+AsmJSCheckedImmediateRange when interpreted as a signed 64-bit index in the load/store. Could you try commenting out the removeBoundsCheck() calls in jit/RangeAnalysis.cpp for asm.js to see if that is the culprit? I forgot this is the third case where we can have !needsBoundsCheck.
Reporter | ||
Comment 7•8 years ago
|
||
This looks like an example, but is does not crash the js shell here. Range analysis can derive that the bounds check is not necessary so the optimization is applied. When %rax is 0xffffffff this would appear to access memory well beyond the array buffer, and beyond the reserved area? movsbl 0xd000(%r15,%rax,1), %eax
![]() |
||
Comment 8•8 years ago
|
||
Ah, is the issue here that RangeAnalysis needs to take into account MAsmJSHeapAccess::offset()?
Reporter | ||
Comment 9•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #6) > Well, we actually *want* all int32's that flow into loads/stores to be zero > in the high uint32; this guarantees they hit w/in the > 4gb+AsmJSCheckedImmediateRange when interpreted as a signed 64-bit index in > the load/store. Yes, but then the assumption that '4gb+AsmJSCheckedImmediateRange' is adequate might not be valid, and it might need 8gb? > Could you try commenting out the removeBoundsCheck() calls in > jit/RangeAnalysis.cpp for asm.js to see if that is the culprit? I forgot > this is the third case where we can have !needsBoundsCheck. Disabling the dehoisting optimization avoids the crash. The needsBoundsCheck optimization does not look like the cause, rather it tickles the problem. It appears necessary to avoid the dehoisting optimization on 64-bit when the pre-offset index can be negative, and perhaps this could be flagged during range analysis too, but this will hurt performance on this code pattern. This is also the solution proposed for V8 which does not use the memory protection scheme, see 'Check that the argument index is positive to ensure that the loss of truncate is not an issue.' in https://codereview.chromium.org/860283004/ If the allocation can be expanded to 8gb then this would be better for performance? Note that this code pattern frustrates pointer-masking too, and requires the access be emitted as a[(i+c)&m] which will result in less optimal code. It's a challenging pattern and it's importance was missed.
![]() |
||
Comment 10•8 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #9) > Yes, but then the assumption that '4gb+AsmJSCheckedImmediateRange' is > adequate might not be valid, and it might need 8gb? No, the intention was that we only allow immediates < AsmJSCheckedImmediateRange. (We could of course have AsmJSCheckedImmediateRange = 4gb.) > > Could you try commenting out the removeBoundsCheck() calls in > > jit/RangeAnalysis.cpp for asm.js to see if that is the culprit? I forgot > > this is the third case where we can have !needsBoundsCheck. > > The needsBoundsCheck > optimization does not look like the cause, rather it tickles the problem. Stepping through the example in comment 7, this does appear to be the cause: RangeAnalysis::analyze is calling removeBoundsCheck() when it shouldn't. minHeapLength is 65536 (due to the constant store) and, despite the fact that the result can clearly be negative, range->lower() == 53120 and range->upper() == 53375. I don't know enough about the invariants of range analysis to know if this is a bug in range analysis or the interpretation of the results, but given that the test includes (range->lower() >= 0), it seems like the former. On a side note, instrumenting right now and running over all the humble bundle and Unity demos, this range analysis only seems to call removeBoundsCheck() (when needsBoundCheck() was previously true) on <1% of total AsmJSHeapAccesses.
Reporter | ||
Comment 11•8 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #10) ... > > > Could you try commenting out the removeBoundsCheck() calls in > > > jit/RangeAnalysis.cpp for asm.js to see if that is the culprit? I forgot > > > this is the third case where we can have !needsBoundsCheck. > > > > The needsBoundsCheck > > optimization does not look like the cause, rather it tickles the problem. > > Stepping through the example in comment 7, this does appear to be the cause: > RangeAnalysis::analyze is calling removeBoundsCheck() when it shouldn't. The index is always within bounds in the test example, so it is correct to remove the bounds check. The naming of 'needsBoundsCheck' leads to endless confusion and I really wish it were renamed to provenInBounds or something similar. > minHeapLength is 65536 (due to the constant store) and, despite the fact > that the result can clearly be negative, range->lower() == 53120 and > range->upper() == 53375. I don't know enough about the invariants of range > analysis to know if this is a bug in range analysis or the interpretation of > the results, but given that the test includes (range->lower() >= 0), it > seems like the former. It's only the pre-offset index that can be negative, the 'i' in 'i+c'. The bounds check tests the index 'i+c', not 'i'. I don't see an issue with range analysis. If the protected area is not extended to 8gb then the suggestion is to check the range of 'i' in 'i+c' and skip the addressing mode matching when 'i' can be negative. There might have been a test for this in the past, in the x64 lowering code, but it has been reworked. > On a side note, instrumenting right now and running over all the humble > bundle and Unity demos, this range analysis only seems to call > removeBoundsCheck() (when needsBoundCheck() was previously true) on <1% of > total AsmJSHeapAccesses. The loop analysis has never been extended to deal with a stride other than one and it might not be working with the asm.js patterns. Lots of work to do here. Try code generated by Emscripten with the POINTER_MASKING option and you'll see most bounds checking is removed.
![]() |
||
Comment 12•8 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #11) > The naming of 'needsBoundsCheck' leads to endless > confusion and I really wish it were renamed to provenInBounds or something > similar. Patches welcome. I don't think the name isn't the source of the confusion here. Rather, I think the root bug is the mistaken assumption that if 'i+c' is in-bounds, then so is 'i' (and thus we can use any immediate displacement that fits (which is false in the case where i is negative or large positive with large c)) combined with the fact that 64-bit effective addressing uses 64-bit arith so out-of-bounds i+c just goes farther out-of-bounds instead of going back in-bounds as it would on 32-bit. I agree that the most direct solution is to map 8gb and I'm inclined to do that. I think we could also remove AsmJSCheckedImmediateRange since it'd no longer be necessary. > It's only the pre-offset index that can be negative, the 'i' in 'i+c'. The > bounds check tests the index 'i+c', not 'i'. I don't see an issue with range > analysis. Ahh, I see, I had assumed range analysis was running after we'd folded the +c into the heap access but I see now that EAA runs right after and ins->ptr() is still the add. > The loop analysis has never been extended to deal with a stride other than > one and it might not be working with the asm.js patterns. Lots of work to do > here. Try code generated by Emscripten with the POINTER_MASKING option and > you'll see most bounds checking is removed. But I assume that removal happens a priori in CheckArrayAccess(), not in RangeAnalysis::analyze?
Assignee | ||
Comment 13•8 years ago
|
||
On PTO thus week, but at a quick read through the conclusion here seems right. This is a case I missed in bug 986981. Extending the guard region isn't the only possible fix, but it seems like it should work. Also, it'd only be 6GiB, not 8, since the displacement is signed. Negative offsets are a different kettle of fish.
![]() |
||
Comment 14•8 years ago
|
||
6gb; sold!
Comment 15•8 years ago
|
||
How bad is this, security-wise? Would this allow any kind of controllable crash, or is it always going to crash on a particular address?
Assignee | ||
Comment 16•8 years ago
|
||
Carefully crafted web content can read and write arbitrary memory within a 2 GiB region in the address space immediately following the asm.js heap area, on x64. The problem was introduced in bug 986981.
Updated•8 years ago
|
Keywords: sec-critical
Updated•8 years ago
|
status-firefox38:
--- → unaffected
status-firefox39:
--- → affected
status-firefox40:
--- → affected
Comment 17•8 years ago
|
||
[Tracking Requested - why for this release]: sec-critical in the same general code area as one of our Pwn2Own exploits, so more people will be looking at it. Plus the demo crash is in a standard benchmark.
tracking-firefox39:
--- → ?
tracking-firefox40:
--- → ?
Updated•8 years ago
|
Updated•8 years ago
|
status-firefox-esr31:
--- → unaffected
Assignee | ||
Comment 19•8 years ago
|
||
This version also adds a testcase. Posting for review now.
Attachment #8589755 -
Attachment is obsolete: true
Attachment #8589785 -
Flags: review?(luke)
![]() |
||
Comment 20•8 years ago
|
||
Comment on attachment 8589785 [details] [diff] [review] ao-fix.patch Review of attachment 8589785 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/EffectiveAddressAnalysis.h @@ +20,5 @@ > + template<typename MAsmJSHeapAccessType> > + bool TryAddDisplacement(MAsmJSHeapAccessType *ins, int32_t o); > + > + template<typename MAsmJSHeapAccessType> > + void AnalyzeAsmHeapAccess(MAsmJSHeapAccessType* ins); Since this is non-static, I think the correct naming style is lowercase first letter of member functions. ::: js/src/jit/MIRGenerator.h @@ +259,5 @@ > + if (sizeof(intptr_t) == sizeof(int32_t)) > + return !access->needsBoundsCheck(); > + > + // Otherwise, only allow the checked size. > + return false; It'd be nice to document our case analysis a bit more to explain what we want to assume and why that's not true in 64-bit effective addressing. Also, can this function return the immediate limit instead of a bool (so we can more directly see the connection with this explanation)?
Attachment #8589785 -
Flags: review?(luke) → review+
Assignee | ||
Comment 21•8 years ago
|
||
Attached is an updated patch which addresses your review comments here and from irc. Carrying forward r+.
Attachment #8589785 -
Attachment is obsolete: true
Attachment #8589875 -
Flags: review+
Updated•8 years ago
|
status-b2g-v1.4:
--- → unaffected
status-b2g-v2.0:
--- → unaffected
status-b2g-v2.0M:
--- → unaffected
status-b2g-v2.1:
--- → unaffected
status-b2g-v2.1S:
--- → unaffected
status-b2g-v2.2:
--- → unaffected
status-b2g-master:
--- → affected
status-firefox-esr38:
--- → unaffected
Assignee | ||
Comment 22•8 years ago
|
||
Comment on attachment 8589875 [details] [diff] [review] ao-fix.patch [Security approval request comment] How easily could an exploit be constructed based on the patch? Once you know how to trigger the bug, accessing arbitrary memory in a 2 GiB region after the asm.js heap on x64 is pretty easy (and the included testcase demonstrates how). Finding something sensitive in that region without crashing the process probably requires some luck on systems with ASLR, especially given that this affects x64 only. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Yes, the test included in the patch paints a bulls-eye on the security problem. I can take the test out for now if that'd be better. Which older supported branches are affected by this flaw? mozilla-aurora If not all supported branches, which bug introduced the flaw? bug 986981 Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? I don't, but it'd be fairly easy to create, and pretty low risk. This code hasn't changed a lot since the aurora branch. How likely is this patch to cause regressions; how much testing does it need? Pretty unlikely. The main change in this patch is pretty straightforward.
Attachment #8589875 -
Flags: sec-approval?
Comment 23•8 years ago
|
||
Comment on attachment 8589875 [details] [diff] [review] ao-fix.patch sec-approval+ for trunk. If this applies "as is" to Aurora, please nominate it for that as well.
Attachment #8589875 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 24•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/9f2ad189edc9
Assignee | ||
Comment 25•8 years ago
|
||
And a followup, https://hg.mozilla.org/integration/mozilla-inbound/rev/bd9cb02d4da8 to fix an assert which I had rearranged to avoid a compiler warning, incorrectly, and it failed on ARM.
Comment 26•8 years ago
|
||
Backed out for OOM failures. https://hg.mozilla.org/integration/mozilla-inbound/rev/9015759a1ebc https://treeherder.mozilla.org/logviewer.html#?job_id=8886415&repo=mozilla-inbound
Reporter | ||
Comment 27•8 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #26) > Backed out for OOM failures. Notice it failed on a 32 bit machine, so might not have been caused by the actual patch to the compiler which would only have affected 64-bit, but rather the tests using too much memory? The amount of memory used in some of these test files caused problems testing on ARM devices with limited memory (which was necessary before we had the simulator). Perhaps just split the test file.
Assignee | ||
Comment 28•8 years ago
|
||
The problem was in the testcase, which attempted to create a very large array in an attempt to maximize the likelyhood of detecting a problem on x64, but it led to OOMs on x32. I've now reduced the size of the array in the testcase, which fixes the problem on Windows XP on try. https://hg.mozilla.org/integration/mozilla-inbound/rev/a1b728c86053
Comment 29•8 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/a1b728c86053
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Assignee | ||
Comment 30•8 years ago
|
||
Approval Request Comment [Feature/regressing bug #]: 986981 [User impact if declined]: sec-critical bug [Describe test coverage new/current, TreeHerder]: TreeHerder [Risks and why]: Pretty low-risk; the actual patch here is fairly simple, and it has seen a lot of testing. [String/UUID change made/needed]: none
Attachment #8597540 -
Flags: approval-mozilla-aurora?
Updated•8 years ago
|
Attachment #8597540 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 31•8 years ago
|
||
AWFY detected a regression/improvement on: - slave: Mac OS X 10.10 64-bit (Mac Pro, shell) - mode: Ion Regression(s)/Improvement(s): - asmjs-apps: zlib-throughput: -0.79% (improvement) Recorded range: - http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=e08e45fd8922&tochange=a1b728c86053 More details: http://arewefastyet.com/regressions/#/regression/734156
Updated•8 years ago
|
Group: javascript-core-security → core-security
Updated•8 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•