ARM asm.js crash with c64 emulator crash [@ js::jit::Assembler::bind(js::jit::Label*, js::jit::BufferOffset) ]

VERIFIED FIXED in Firefox 29, Firefox OS v1.2

Status

()

Core
JavaScript Engine: JIT
VERIFIED FIXED
4 years ago
2 years ago

People

(Reporter: vlad, Assigned: dougc)

Tracking

({sec-high})

unspecified
mozilla31
ARM
All
sec-high
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox28 wontfix, firefox29 verified, firefox30 verified, firefox31+ verified, firefox-esr24 verified, b2g18 wontfix, b2g-v1.1hd wontfix, b2g-v1.2 fixed, b2g-v1.3 fixed, b2g-v1.3T fixed, b2g-v1.4 fixed, b2g-v2.0 fixed)

Details

(Whiteboard: [games:p2][adv-main29+])

Attachments

(1 attachment, 2 obsolete attachments)

https://crash-stats.mozilla.com/report/index/e6472076-5b5b-4be3-93b7-3f1482130923

Fennec Nightly 2013-09-23, Nexus 7 (2013).  (Also happened with 09-12 before I upgraded.)

STR:
1. Open http://vice.janicek.co/x64/
2. Click on "play" button to start

Odd note in the crash report:
"Build Architecture Info 	ARMv0 | 4"
What does that mean?
(Assignee)

Comment 1

4 years ago
Reproducible on Android and B2G.

Program received signal SIGSEGV, Segmentation fault.
getInst (off=<optimized out>, this=<optimized out>) at ../../../src/js/src/jit/shared/IonAssemblerBuffer.h:207
207	            for (; cur != NULL; cur = cur->getPrev(), cur_off -= cur->size()) {


202	            cur = tail;
203	            cur_off = bufferSize;
204	        }
205	        int count = 0;
206	        if (local_off < cur_off) {
207	            for (; cur != NULL; cur = cur->getPrev(), cur_off -= cur->size()) {
208	                if (local_off >= cur_off) {
209	                    local_off -= cur_off;
210	                    break;
211	                }


#0  getInst (off=<optimized out>, this=<optimized out>) at ../../../src/js/src/jit/shared/IonAssemblerBuffer.h:207
#1  editSrc (bo=<optimized out>, this=<optimized out>) at ../../../src/js/src/jit/arm/Assembler-arm.h:1228
#2  nextLink (next=<synthetic pointer>, b=<optimized out>, this=<optimized out>) at ../../../src/js/src/jit/arm/Assembler-arm.cpp:2279
#3  js::jit::Assembler::bind (this=0xbeb4ecd8, label=0xbeb4eaf4, boff=<optimized out>) at ../../../src/js/src/jit/arm/Assembler-arm.cpp:2305
#4  0xb6666ec8 in patchableCallPreBarrier<js::jit::Address> (type=js::jit::MIRType_Value, address=..., this=0xbeb4ecd8) at ../../../src/js/src/jit/IonMacroAssembler.h:660
#5  js::jit::BaselineCompiler::emit_JSOP_SETALIASEDVAR (this=0xbeb4ecc8) at ../../../src/js/src/jit/BaselineCompiler.cpp:1874
#6  0xb6667790 in js::jit::BaselineCompiler::emitBody (this=0xbeb4ecc8) at ../../../src/js/src/jit/BaselineCompiler.cpp:694
#7  0xb6668498 in js::jit::BaselineCompiler::compile (this=0xbeb4ecc8) at ../../../src/js/src/jit/BaselineCompiler.cpp:93
#8  0xb65617ba in BaselineCompile (cx=0xb2cf7500, script=...) at ../../../src/js/src/jit/BaselineJIT.cpp:233
#9  0xb65618d0 in CanEnterBaselineJIT (cx=0xb2cf7500, script=..., osr=<optimized out>) at ../../../src/js/src/jit/BaselineJIT.cpp:298
#10 0xb6561a10 in js::jit::CanEnterBaselineAtBranch (cx=0xb2cf7500, fp=0xb390d078, newType=<optimized out>) at ../../../src/js/src/jit/BaselineJIT.cpp:317
#11 0xb6442ee2 in Interpret (cx=0xb2cf7500, state=...) at ../../../src/js/src/vm/Interpreter.cpp:1543
#12 0xb6446a44 in js::RunScript (cx=0xb2cf7500, state=...) at ../../../src/js/src/vm/Interpreter.cpp:419
#13 0xb64479d2 in RunScript (state=..., cx=0xb2cf7500) at ../../../src/js/src/vm/Interpreter.cpp:388
...
OS: Android → All
well, that code should be thoroughly tested.  Is it possible that we've run out of memory or something, and didn't notice until there?
(Assignee)

Comment 3

4 years ago
Testing with a debug build keeps crashing on an assertion failure
see bug 921809. Shall explore further when this has been addressed.
Depends on: 921809
(Assignee)

Comment 4

4 years ago
With 91809 fixed, the debugger suggests that it is overflowing the limited
range of the ARM branch instruction encoding when storing the buffer offsets
for unbound labels, and doing so in baseline compiled code.

It might be useful to ask someone familiar with the baseline compiler to
take a look at it.  In particular is it possible that the baseline compiler
is compiling all the data in the literal vectors into the code?
The emulator includes many disk images in the JS x64.js file,
and perhaps this might account for the huge baseline compiled
code buffers.

It might also be necessary to explore how to handle labels and
branches in large code buffers on the ARM backend.

One option when a label is unbound might be to store an offset
from the branch rather than the buffer offset.  This might at
least handle short branches within a much larger code buffer,
which might be the case here.  Although perhaps this alone
would not be big help.
Whiteboard: [games:p2]
(Assignee)

Comment 5

4 years ago
Narrowed down the main source of the code bloat.  It is being generated by a huge number of calls to: js::jit::BaselineCompiler::emit_JSOP_INITELEM_ARRAY

It does appear that the JS is being translated into JS ops that initialize the vectors and that the baseline compiler emits code for each OP leading to a huge code object.

Might it have been possible for these vectors to have been stored as literal objects rather than being parsed to JS ops that construct them?

If not then might it be possible for the baseline compiler to recognize and optimize such initialization sequences?

These appear to be generated from source code of the form:
r.FS_createDataFile("/bin/C64","pc64.vpl",[35,10,35,32,86,73,67, ....])
Flags: needinfo?(jdemooij)
Douglas, how many elements are there in these array literals? We optimize such literals in the global scope when we know they are evaluated only once, but not inside functions or loops.

We've been thinking about adding COW arrays (bug 919658) and that optimization should help here too.
Flags: needinfo?(jdemooij)
(Assignee)

Comment 7

4 years ago
(In reply to Jan de Mooij [:jandem] from comment #6)
> Douglas, how many elements are there in these array literals? We optimize
> such literals in the global scope when we know they are evaluated only once,
> but not inside functions or loops.

Thank you for the info.  There are around 350k elements combined in the array literals in this example.  It seems to generate a lot of baseline compiled code to initialize them.

Moving these literal vectors to the top level worked around the problem and it was possible to start the C64 emulator on an ARM tablet.  Emscripten appears to emit the memory FS initialization at the top level by default so this bug might not be that significant an issue because there is a good workaround.
Group: core-security
Fwiw, marked s-s because it was raised as a possible security issue. Please unhide the bug or let us know if the crash is confirmed to be harmless. Maybe dougc or jandem can tell more :)
(Assignee)

Comment 9

4 years ago
The crash in the ARM backend is not limited to this example, or to large literal arrays in baseline compiled code.  Any huge chunk of code could overflow the ARM branch instruction encoding limitation currently exposed.  Randomly crashing is not a good state to be in, and it would be better for the ARM backend to at least catch the overflow at bail out.
(In reply to Douglas Crosher [:dougc] from comment #9)
> Any huge chunk of code could
> overflow the ARM branch instruction encoding limitation currently exposed. 
> Randomly crashing is not a good state to be in, and it would be better for
> the ARM backend to at least catch the overflow at bail out.

needinfo? from Marty, this sounds bad.
Component: JavaScript Engine → JavaScript Engine: JIT
Flags: needinfo?(mrosenberg)
Keywords: sec-high
So there was an assertion on that path, but I was later informed that those don't exist in non-debug builds.  I have a patch that hopefully turns it into a code path that will cause us to abort compilation.  I think there is another path lurking here that can be exposed when we fail much later, during the linking stage.

I'm needinfo'ing luke because I'm not sure if there are any precautions we must take for bailing during the code generation phase of ASM.js
Flags: needinfo?(mrosenberg) → needinfo?(luke)
Created attachment 815957 [details] [diff] [review]
failOutOfRange-r0.patch
Attachment #815957 - Flags: review?(dtc-moz)

Comment 13

4 years ago
Will this failure ultimately bubble up as a 'return false' from GenerateLIR?  If so, that's fine, we'll emit a warning "internal compiler failure (probably out of memory)".  Is some sort of overflow what is happening here?
Flags: needinfo?(luke)
(Assignee)

Comment 14

4 years ago
Comment on attachment 815957 [details] [diff] [review]
failOutOfRange-r0.patch

Review of attachment 815957 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you, the patch is an incremental improvement. Could you please move the check after 'old' is initialized.

It appears to still allow bad branch offsets to pass through here in the case of a bound label and perhaps some more cases could be easily handled.

I would not that this bail out path is rather poorly tested, and when hit on constant pool failures it often results in crashes elsewhere, but this is probably a separate issue and the path is probably needed to OOM.

Could I ask what state the assembler is in to support larger branches?

Perhaps a collection of 'uses' could be added to the label object so that uses that can not be linked in a list via the instruction encoding could be spilled into this collection.  Alternatively could the full 32 bits of the instruction be used for the linked list, and the instruction encoded later?

This still leaves the issue of encoding a larger jump.  Does the assembler support variable length instruction sequences?  Might it be necessary to store the target in the constant pool, and might it be necessary to reserve a constant pool slot for every unbound jump just in case it is needed?

::: js/src/jit/arm/Assembler-arm.cpp
@@ +1959,5 @@
>      BufferOffset ret;
>      // See if the list was empty :(
>      if (l->used()) {
> +        if (!BOffImm::isInRange(old)) {
> +            m_buffer.bail();

Move after 'old' is initialized.
Attachment #815957 - Flags: review?(dtc-moz)
Marty, Doug, is this still an issue? Can we land the patch?
Flags: needinfo?(mrosenberg)
Flags: needinfo?(dtc-moz)
(Assignee)

Comment 16

4 years ago
Created attachment 8344328 [details] [diff] [review]
If a branch is out of range, just fail to compile, rather than attempt to throw an assertion, and continue running.

The patch needs to at least have a use-before-initializing error corrected.

With this patch it is now possible to run the C64 emulator on an ARM system.  It hits one of the checks in this patch, but keeps going without crashing.  There must have been some other fixes to the bailout paths because the bailout just caused other errors last time I checked.
Flags: needinfo?(dtc-moz)
(Assignee)

Comment 17

4 years ago
Comment on attachment 815957 [details] [diff] [review]
failOutOfRange-r0.patch

Review of attachment 815957 [details] [diff] [review]:
-----------------------------------------------------------------

r+ if the initialization of 'old' is moved before its use in the check, see comment 14 and attachment 8344328 [details] [diff] [review].
Attachment #815957 - Flags: review+
How far back does this go and when can we get the reviewed patch checked in?
Group: javascript-core-security
Can we get someone to check in this six month old patch?
Flags: needinfo?(nihsanullah)
(Assignee)

Comment 20

4 years ago
fwiw The wider problem of supporting large code objects on the ARM backend was tackled in bug 760642 and is being worked on in bug 972710.  There is still a lot of work to do, but there is a user repo with the current code and it runs this asm.js demo without crashing or bailing out.
Depends on: 972710
(Assignee)

Comment 21

4 years ago
Created attachment 8394554 [details] [diff] [review]
Guard against branches being out of range and bail out of compilation if so.

Retested by adding some verbose debug output and running the example and it still works around the reported failure.  The code catches the overflow in as_b and bails out of compilation without crashing.
Assignee: general → dtc-moz
Attachment #815957 - Attachment is obsolete: true
Attachment #8344328 - Attachment is obsolete: true
Attachment #8394554 - Flags: review?(jdemooij)
Flags: needinfo?(nihsanullah)
Flags: needinfo?(mrosenberg)
Comment on attachment 8394554 [details] [diff] [review]
Guard against branches being out of range and bail out of compilation if so.

Thanks for fixing this. Moving r? to Marty as I'm not very familiar with this.

(Is that "This will currently throw an assertion" comment still true?)
Attachment #8394554 - Flags: review?(jdemooij) → review?(mrosenberg)
Comment on attachment 8394554 [details] [diff] [review]
Guard against branches being out of range and bail out of compilation if so.

Review of attachment 8394554 [details] [diff] [review]:
-----------------------------------------------------------------

Strange, I thought we already fixed this.  Oh well, better late than never.
Attachment #8394554 - Flags: review?(mrosenberg) → review+
status-firefox28: --- → wontfix
status-firefox29: --- → affected
status-firefox30: --- → affected
status-firefox31: --- → affected
status-firefox-esr24: --- → ?
tracking-firefox31: --- → +
(Assignee)

Updated

4 years ago
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/1cdcddd3c656
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/1cdcddd3c656

Setting status-esr24 to wontfix since we don't ship any ARM products off that branch. Doug, please nominate this for Aurora/Beta uplift ASAP. B2G branch uplifts will be taken care of via the sec-high security rating.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
status-b2g18: --- → wontfix
status-b2g-v1.1hd: --- → wontfix
status-b2g-v1.2: --- → affected
status-b2g-v1.3: --- → affected
status-b2g-v1.4: --- → affected
status-b2g-v2.0: --- → fixed
status-firefox31: affected → fixed
status-firefox-esr24: ? → wontfix
Flags: needinfo?(dtc-moz)
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
(Assignee)

Comment 26

4 years ago
Comment on attachment 8394554 [details] [diff] [review]
Guard against branches being out of range and bail out of compilation if so.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): long standing issue.
User impact if declined: crashes for large JS code objects, and it might be exploitable.
Testing completed (on m-c, etc.): local, and m-c
Risk to taking this patch (and alternatives if risky): low
String or IDL/UUID changes made by this patch:
Attachment #8394554 - Flags: approval-mozilla-beta?
Attachment #8394554 - Flags: approval-mozilla-aurora?
Flags: needinfo?(dtc-moz)
Comment on attachment 8394554 [details] [diff] [review]
Guard against branches being out of range and bail out of compilation if so.

(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #25)
> Setting status-esr24 to wontfix since we don't ship any ARM products off
Well, Debian ships ESR24 under ARM. ;)
Attachment #8394554 - Flags: approval-mozilla-beta?
Attachment #8394554 - Flags: approval-mozilla-beta+
Attachment #8394554 - Flags: approval-mozilla-aurora?
Attachment #8394554 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/2dfaa9c55efd
https://hg.mozilla.org/releases/mozilla-beta/rev/7be150811dd8
https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/bad26c337b18
https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/a84b01a0bb5c
status-b2g-v1.2: affected → fixed
status-b2g-v1.3: affected → fixed
status-b2g-v1.4: affected → fixed
status-firefox29: affected → fixed
status-firefox30: affected → fixed
status-b2g-v1.3T: --- → fixed
Group: javascript-core-security
Duplicate of this bug: 925819
Verified on today's Nightly FxAndroid 31. Ioana please verify on Beta/Aurora.
status-firefox31: fixed → verified
Flags: needinfo?(ioana.chiorean)

Comment 31

4 years ago
Verified on latest-aurora (04/16) and latest beta (yesterday's build 29.0b8) 
Setting bug as verified fixed
Status: RESOLVED → VERIFIED
status-firefox29: fixed → verified
status-firefox30: fixed → verified
Flags: needinfo?(ioana.chiorean)
Whiteboard: [games:p2] → [games:p2][adv-main29+]
status-firefox-esr24: wontfix → affected
Comment on attachment 8394554 [details] [diff] [review]
Guard against branches being out of range and bail out of compilation if so.

Discussed with RyanVM on IRC. It seems that bug 1019684 needs this one.
Attachment #8394554 - Flags: approval-mozilla-esr24+
https://hg.mozilla.org/releases/mozilla-esr24/rev/a4216c392233
status-firefox-esr24: affected → fixed
Because I don't have a physical ARM machine, I had to build fx esr24 on a x86 VM using the following flags:
- ac_add_options --disable-webrtc (needed more space, kept receiving "/usr/bin/ld: final link failed: Memory exhausted")
- ac_add_options --enable-arm-simulator

Once fx was built, used the following link for the test cases (the original one isn't working anymore)
- http://retroplay.co/c64

- loaded the "C-64" demo by selecting "Play" and waited till it loaded without issues
- loaded the "C-128" demo by selecting "Play" and waited till it loaded without issues
- played around with both demo's for about 5 minutes each without any issues
- loaded the two demo's on several different tab(s) at the same time without issues
- loaded the two demo's on several different window(s) without any issues
QA Whiteboard: [qa!]
status-firefox-esr24: fixed → verified
Group: core-security
You need to log in before you can comment on or make changes to this bug.