Closed Bug 1312480 Opened 3 years ago Closed 3 years ago

Crash [@ DoMarking<JSObject>] or Crash [@ js::gc::IsInsideNursery]

Categories

(Core :: JavaScript Engine: JIT, defect, P1, critical)

x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla53
Tracking Status
firefox-esr45 --- unaffected
firefox50 --- unaffected
firefox51 + verified
firefox52 + verified
firefox53 + verified

People

(Reporter: decoder, Assigned: h4writer)

References

(Blocks 1 open bug)

Details

(5 keywords, Whiteboard: [jsbugmon:update,ignore][post-critsmash-triage])

Crash Data

Attachments

(1 file, 6 obsolete files)

The following testcase crashes on mozilla-central revision 215f96861176 (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --disable-debug --enable-optimize, run with --fuzzing-safe --thread-count=2 --ion-offthread-compile=off):

var lfLogBuffer = `
  z = new Int32Array;
  function f() {
    z.iterator = 2;
  }
for (var i=1; i<3; new f | this)
  f();
`;
var lfRunTypeId = -1;
loadFile(lfLogBuffer);
function loadFile(lfVarx) {
  try {
    switch (lfRunTypeId) {
      case 1: eval(lfVarx);
      default:
        evaluate(lfVarx);
    }
  } catch (lfVare) {}
}



Backtrace:

 received signal SIGSEGV, Segmentation fault.
#0  DoMarking<JSObject> (gcmarker=0x7ffff6961a80, thing=0x7ffff3681140) at js/src/gc/Marking.cpp:800
#1  0x000000000096ffb3 in js::InterpreterFrame::trace (this=this@entry=0x7ffff33c20a8, trc=trc@entry=0x7ffff6961a80, sp=0x7ffff33c2128, pc=0x7ffff6918ba6 ":") at js/src/vm/Stack.cpp:358
#2  0x0000000000970114 in MarkInterpreterActivation (act=0x7fffffffd280, trc=0x7ffff6961a80) at js/src/vm/Stack.cpp:417
#3  js::MarkInterpreterActivations (rt=<optimized out>, trc=trc@entry=0x7ffff6961a80) at js/src/vm/Stack.cpp:427
#4  0x0000000000ae8e6a in js::gc::GCRuntime::traceRuntimeCommon (this=this@entry=0x7ffff695f8f8, trc=trc@entry=0x7ffff6961a80, traceOrMark=traceOrMark@entry=js::gc::GCRuntime::MarkRuntime, lock=...) at js/src/gc/RootMarking.cpp:331
#5  0x0000000000ae92c2 in js::gc::GCRuntime::traceRuntimeForMajorGC (this=this@entry=0x7ffff695f8f8, trc=trc@entry=0x7ffff6961a80, lock=...) at js/src/gc/RootMarking.cpp:268
#6  0x00000000007ea59f in js::gc::GCRuntime::beginMarkPhase (this=this@entry=0x7ffff695f8f8, reason=reason@entry=JS::gcreason::ALLOC_TRIGGER, lock=...) at js/src/jsgc.cpp:3906
#7  0x00000000007fa835 in js::gc::GCRuntime::incrementalCollectSlice (this=this@entry=0x7ffff695f8f8, budget=..., reason=reason@entry=JS::gcreason::ALLOC_TRIGGER, lock=...) at js/src/jsgc.cpp:5844
#8  0x00000000007fbb59 in js::gc::GCRuntime::gcCycle (this=this@entry=0x7ffff695f8f8, nonincrementalByAPI=nonincrementalByAPI@entry=false, budget=..., reason=reason@entry=JS::gcreason::ALLOC_TRIGGER) at js/src/jsgc.cpp:6188
#9  0x00000000007fc041 in js::gc::GCRuntime::collect (this=0x7ffff695f8f8, nonincrementalByAPI=<optimized out>, budget=..., reason=<optimized out>) at js/src/jsgc.cpp:6326
#10 0x00000000007fcb2b in js::gc::GCRuntime::startGC (this=0x7ffff695f8f8, gckind=GC_NORMAL, reason=JS::gcreason::ALLOC_TRIGGER, millis=<optimized out>) at js/src/jsgc.cpp:6406
#11 0x00000000007fdbb2 in js::gc::GCRuntime::gcIfRequested (this=this@entry=0x7ffff695f8f8) at js/src/jsgc.cpp:6607
#12 0x0000000000907a3d in AutoGCIfRequested::~AutoGCIfRequested (this=<synthetic pointer>, __in_chrg=<optimized out>) at js/src/vm/Interpreter.cpp:414
#13 js::InternalCallOrConstruct (cx=0x7ffff695f000, args=..., construct=<optimized out>) at js/src/vm/Interpreter.cpp:434
#14 0x0000000000907d98 in js::Call (cx=cx@entry=0x7ffff695f000, fval=..., fval@entry=..., thisv=..., thisv@entry=..., args=..., rval=..., rval@entry=...) at js/src/vm/Interpreter.cpp:522
#15 0x00000000007dbc7c in js::Call (rval=..., thisObj=<optimized out>, fval=..., cx=0x7ffff695f000) at js/src/vm/Interpreter.h:96
#16 MaybeCallMethod (cx=cx@entry=0x7ffff695f000, obj=obj@entry=..., id=..., id@entry=..., vp=vp@entry=...) at js/src/jsobj.cpp:2926
#17 0x00000000007de87f in JS::OrdinaryToPrimitive (cx=cx@entry=0x7ffff695f000, obj=obj@entry=..., hint=hint@entry=JSTYPE_NUMBER, vp=vp@entry=...) at js/src/jsobj.cpp:3009
#18 0x00000000007dee82 in js::ToPrimitiveSlow (cx=cx@entry=0x7ffff695f000, preferredType=preferredType@entry=JSTYPE_NUMBER, vp=vp@entry=...) at js/src/jsobj.cpp:3057
#19 0x00000000007ed3fe in js::ToPrimitive (vp=..., preferredType=JSTYPE_NUMBER, cx=<optimized out>) at js/src/jsobj.h:1064
#20 js::ToNumberSlow (cx=0x7ffff695f000, v_=..., out=out@entry=0x7fffffffbe00) at js/src/jsnum.cpp:1616
#21 0x00000000007ed555 in js::ToNumberSlow (cx=<optimized out>, v=..., out=out@entry=0x7fffffffbe00) at js/src/jsnum.cpp:1629
#22 0x00000000007ee798 in js::ToInt32Slow (cx=<optimized out>, v=..., out=out@entry=0x7fffffffbe20) at js/src/jsnum.cpp:1735
#23 0x000000000057b2e8 in JS::ToInt32 (out=0x7fffffffbe20, v=..., cx=0x7ffff695f000) at /srv/jenkins/jobs/mozilla-central-build-jsshell/workspace/arch/64/compiler/gcc/sanitizer/none/type/opt/dist/include/js/Conversions.h:167
#24 js::BitOr (cx=0x7ffff695f000, lhs=..., rhs=..., out=0x7fffffffbe6c) at js/src/vm/Interpreter-inl.h:723
#25 0x00007ffff7e3fe9f in ?? ()
[...]
#37 0x0000000000000000 in ?? ()
rax	0xfff8800000000002	-2111062325329918
rbx	0x7ffff3681140	140737277071680
rcx	0x10228	66088
rdx	0x7ffff696b028	140737330458664
rsi	0x7ffff36fc0a0	140737277575328
rdi	0x0	0
rbp	0x7ffff6961a80	140737330420352
rsp	0x7fffffffb530	140737488336176
r8	0x7ffff36fe0e0	140737277583584
r9	0x10000000000	1099511627776
r10	0x4	4
r11	0x246	582
r12	0x7ffff6961a80	140737330420352
r13	0x7ffff33c2128	140737274192168
r14	0x7fffffffb8c0	140737488337088
r15	0x7ffff6918ba6	140737330121638
rip	0xad55ea <DoMarking<JSObject>(js::GCMarker*, JSObject*)+90>
=> 0xad55ea <DoMarking<JSObject>(js::GCMarker*, JSObject*)+90>:	mov    0x10(%rax),%rax
   0xad55ee <DoMarking<JSObject>(js::GCMarker*, JSObject*)+94>:	movb   $0x1,0x351(%rax)


Looks like a GC crash with a non-zero address, marking s-s until triaged.
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, result:
=== Treeherder Build Bisection Results by autoBisect ===

The "good" changeset has the timestamp "20160904060750" and the hash "9855046ffeb7c945f572de0c51e82cf3c9b30dfc".
The "bad" changeset has the timestamp "20161024092523" and the hash "26fcee410e9f006d1561535ad91906ca590aa507".

Likely regression window: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9855046ffeb7c945f572de0c51e82cf3c9b30dfc&tochange=26fcee410e9f006d1561535ad91906ca590aa507
Jon, can you please take a look?
Component: JavaScript Engine → JavaScript: GC
Flags: needinfo?(jcoppeard)
I get a crash like this when running the testcase:

#0  0x000000000045b084 in js::gc::IsInsideNursery (cell=0xfff8800000000002) at /home/jon/clone/dev/js/src/test2-build/dist/include/js/HeapAPI.h:333
#1  0x000000000045d73c in js::gc::Cell::isTenured (this=0xfff8800000000002) at /home/jon/clone/dev/js/src/gc/Heap.h:251
#2  0x000000000045d9fa in js::gc::TenuredCell::arena (this=0xfff8800000000002) at /home/jon/clone/dev/js/src/gc/Heap.h:1242
#3  0x0000000000538428 in js::gc::TenuredCell::zoneFromAnyThread (this=0xfff8800000000002) at /home/jon/clone/dev/js/src/gc/Heap.h:1271
#4  0x0000000000a97ed1 in JSObject::zoneFromAnyThread (this=0x7fb27e888140) at /home/jon/clone/dev/js/src/jsobj.h:318
#5  0x00000000010aba39 in js::CheckTracedThing<JSObject> (trc=0x7fb282f47cd8, thing=0x7fb27e888140) at /home/jon/clone/dev/js/src/gc/Marking.cpp:205
#6  0x00000000010c6ec5 in DoMarking<JSObject> (gcmarker=0x7fb282f47cd8, thing=0x7fb27e888140) at /home/jon/clone/dev/js/src/gc/Marking.cpp:796
#7  0x00000000010c1873 in DispatchToTracer<JSObject*> (trc=0x7fb282f47cd8, thingp=0x7fb27e4120c8, name=0x148af2f "arguments") at /home/jon/clone/dev/js/src/gc/Marking.cpp:661
#8  0x00000000010b16e6 in js::TraceRoot<js::ArgumentsObject*> (trc=0x7fb282f47cd8, thingp=0x7fb27e4120c8, name=0x148af2f "arguments")
    at /home/jon/clone/dev/js/src/gc/Marking.cpp:479
#9  0x0000000000e4e869 in js::InterpreterFrame::trace (this=0x7fb27e4120a8, trc=0x7fb282f47cd8, sp=0x7fb27e412128, pc=0x7fb282f38a46 ":")
    at /home/jon/clone/dev/js/src/vm/Stack.cpp:358

This is reading the group_ field of an object and getting 0xfff8800000000002.  I set a watchpoint in rr and found that this field is being overwritten by JIT code.  A little more digging shows that the value comes from the line |z.iterator = 2| in the testcase: changing this to e.g. |z.iterator = 0x131| gives a value of 0xfff8800000000131.

So I think this is a JIT issue... maybe h4writer will know more.
Flags: needinfo?(jcoppeard) → needinfo?(hv1989)
Keywords: sec-high
Am I doing doing something wrong here? Tried 32bit/64bit given comments in the description and runtime flags, but couldn't reproduce on the given revision?
Flags: needinfo?(hv1989) → needinfo?(choller)
Flags: needinfo?(choller)
Whiteboard: [jsbugmon:update] → [jsbugmon:update,reconfirm]
Whiteboard: [jsbugmon:update,reconfirm] → [jsbugmon:update,reconfirm,ignore]
JSBugMon: This bug has been automatically confirmed to be still valid (reproduced on revision 86f702229e32).
Hannes, the bot is able to reproduce just fine (see previous comment). Can you double-check? What compiler are you using?
Flags: needinfo?(hv1989)
Whiteboard: [jsbugmon:update,reconfirm,ignore] → [jsbugmon:update]
Flags: needinfo?(hv1989)
Priority: -- → P1
Putting as P1. Though possible I will only be able look at it on Monday.
Assignee: nobody → hv1989
Component: JavaScript: GC → JavaScript Engine: JIT
Make sure it is in my todo list
Flags: needinfo?(hv1989)
Could this be a discrepancy in moving slots to tenured?

Before we tenure:
> 0x7fd20a641415: movabs $0x7fd204079060,%rax   => rax = 0x7fd204079060 (global)
> 0x7fd20a64141f: mov 0x10(%rax), %rdx          => rdx = 0x7fd203d1a000 (slots)
> 0x7fd20a64142d: and 0xa60(%rdx), %rax         => rax = 0x7fd204100378 (specific item on slots)
> 0x7fd20a641489: mov 0x10(%rax), %rdx          => rdx = 0x7fd2041003b8 (slots)
=> we tenure and update rdx/rax on the stack to:
> rax = 0x7fd204088100
> rdx = 0x7fd204088140

The rax is correctly updated.
The rdx is incorrect. 0x10(rax) gives me 0x7fd203d20f80. The new location of slots.

We update this pointer at:
> #0  0x00000000010bbffc in js::Nursery::forwardBufferPointer (this=0x7fd208e6aa08, pSlotsElems=0x7ffe4de32e20) at /home/h4writer/Build/mozilla-inbound/js/src/gc/Nursery.cpp:470                             │
> #1  0x000000000081db6e in js::jit::UpdateIonJSFrameForMinorGC (trc=0x7ffe4de32640, frame=...) at /home/h4writer/Build/mozilla-inbound/js/src/jit/JitFrames.cpp:1150                                         │
> #2  0x000000000081eafa in js::jit::UpdateJitActivationsForMinorGC (rt=0x7fd208e6a208, trc=0x7ffe4de32640) at /home/h4writer/Build/mozilla-inbound/js/src/jit/JitFrames.cpp:1476                             │
> #3  0x00000000010bd0ff in js::Nursery::doCollection (this=0x7fd208e6aa08, rt=0x7fd208e6a208, reason=JS::gcreason::OUT_OF_NURSERY, tenureCounts=...)                                                         │
>     at /home/h4writer/Build/mozilla-inbound/js/src/gc/Nursery.cpp:731                                                                                                                                       │

I suspect the wrong value is in the "forwardedBuffers" ??

Interestingly before tenure:
Object: 0x7fd204100378
Slots: 0x7fd2041003b8
are separated by 0x40

After tenure:
Object: 0x7fd204088100
Wrong slots: 0x7fd204088140
are separated by 0x40

Could it be that instead of saving the pointer to the slots, we are saving the offset to the slots,
which we blindly think is the same after tenure?
Flags: needinfo?(hv1989) → needinfo?(jcoppeard)
Flags: needinfo?(jcoppeard)
With help from Jan, found out the issue here. I'm going to write it up later.
The issue here is that we mangle the TypedArray elements and the slots when tenuring.

TypedArrayObject has "elements()" and slots_:

>                          ----------------------------------------------
>                         |                ------                        |
>                         |               |      V                       V
> [ (TypedArrayObject)  slots_  ... DATA_SLOT FIXED_DATA_START ... ] [ (HeapSlot)     ]


Upon tenuring we need to update the pointers from slot_ and DATA_SLOT, since the TypedArrayObject can be moved. We keep track of this in the "forwardedBuffers" in the Nursery. This works all fine. In this example I see both the slots_ and DATA_SLOT added in forwardedBuffers.

The problem arises when the TypedArray data length is 0. In that case the DATA_SLOT points to the end of the TypedArrayObject. Which happen to be the same as the start of the HeapSlot. 

>                          -------------------------
>                         |                ------   |
>                         |               |       V V
> [ (TypedArrayObject)  slots_  ... DATA_SLOT ] [ (HeapSlot)     ]

When tenuring the slots are not directly located after the TypedArrayObject. We move the HeapSlot to another location.
We update the pointers, but get confused:
js::Nursery::setForwardingPointer (this=0x7fd208e6aa08, oldData=0x7fd2041003b8, newData=0x7fd203d20f80, direct=true)
js::Nursery::setForwardingPointer (this=0x7fd208e6aa08, oldData=0x7fd2041003b8, newData=0x7fd204088140, direct=false)

We encounter 0x7fd2041003b8 on the stack and have to replace that when getting back into ion.
Now we make the mistake by replacing it with "0x7fd204088140" (the elements). Instead of 0x7fd203d20f80 (the location of the new slots). As a result wrong things start to happen when we override the values.

Jan mentioned this is probably a regression of bug 1279992 or a dependency of that bug.

Not sure what the solutions would be:
1) Add some empty data after FIXED_DATA_START when length is zero. To make sure the DATA_SLOT can never point to the start of a new object?
2) Jan suggested to keep "slots" and "elements" in different forwardedBuffers. Which as a result we wouldn't mangle them.
...

What do you think Jon?
Flags: needinfo?(jcoppeard)
If it is probably caused by bug 1279992 or a depending bug. As a result FF51+
(In reply to Hannes Verschore [:h4writer] from comment #11)
Nice work, that sounds like the problem.

I like the idea of separating forwardedBuffers for slots and elements.  I'm also wondering if it's possible to have DATA_SLOT set to null for zero-length buffers.
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #13)
> ... I'm
> also wondering if it's possible to have DATA_SLOT set to null for
> zero-length buffers.

That was also one of the things I was wondering.
Attached patch patch (obsolete) — Splinter Review
Attachment #8814077 - Flags: review?(jdemooij)
I think this makes sense, but can't we still allocate a typed array with length 0 and non-null data here:

http://searchfox.org/mozilla-central/rev/b4e6fa3527436bdbcd9dd2e1982382fe423431f3/js/src/jit/MacroAssembler.cpp#1093-1095

As far as I know, |new Int32Array(0)| inside a loop should trigger this. It should be easy to fix there, but it would be good to make sure it triggers the assertion.
Flags: needinfo?(hv1989)
(In reply to Jan de Mooij [:jandem] from comment #16)
> I think this makes sense, but can't we still allocate a typed array with
> length 0 and non-null data here:
> 
> http://searchfox.org/mozilla-central/rev/
> b4e6fa3527436bdbcd9dd2e1982382fe423431f3/js/src/jit/MacroAssembler.cpp#1093-
> 1095
> 
> As far as I know, |new Int32Array(0)| inside a loop should trigger this. It
> should be easy to fix there, but it would be good to make sure it triggers
> the assertion.

It didn't trigger the assertion in the current tests. I'll add one and fix that too.
Attached patch patch (obsolete) — Splinter Review
Good find. The assert would have triggered. Added testcase for this and fixed the issue.
Attachment #8814077 - Attachment is obsolete: true
Attachment #8814077 - Flags: review?(jdemooij)
Flags: needinfo?(hv1989)
Attachment #8814841 - Flags: review?(jdemooij)
Comment on attachment 8814841 [details] [diff] [review]
patch

Review of attachment 8814841 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/TypedArrayObject.h
@@ +165,5 @@
>      bool hasInlineElements() const;
>      void setInlineElements();
>      uint8_t* elements() const {
> +        uint8_t* output = *(uint8_t **)((((char *)this) + this->dataOffset()));
> +        MOZ_ASSERT_IF(length() == 0 && !hasBuffer(), output == nullptr);

Great to have this assert.
Attachment #8814841 - Flags: review?(jdemooij) → review+
Calling Fx50 and ESR45 unaffected given the comment 12. Please request sec-approval so this can be landed.
Tracking 53+ for this sec high issue.
Track 51+ as sec-high.
Comment on attachment 8814841 [details] [diff] [review]
patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
It shows this issue happens when a TypedArray is 0 length. In that way it shows what is affected, but not how it is affected. It might be harder to reason how to create an exploit out of it. The testcase added doesn't show extra information and doesn't test for the patch. Though having this information and applying a fuzzer on it might make it easy to create an exploit out of it.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
The testcase added gives the same information as the patch itself. It doesn't test the actual issue. The testcase was added because the original patch forgot to update a part. That being said, the answer is the same as the above comment.

Which older supported branches are affected by this flaw?
FF51+

If not all supported branches, which bug introduced the flaw?
bug 1279992

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
They should be similar

How likely is this patch to cause regressions; how much testing does it need?
Could introduce new issues, but should be visible and fuzzers should catch it fast and less severe than this bug.
Flags: needinfo?(hv1989)
Attachment #8814841 - Flags: sec-approval?
(In reply to Hannes Verschore [:h4writer] from comment #23)
> Do you have backports for the affected branches? If not, how different, hard
> to create, and risky will they be?
> They should be similar

This patch applies on beta/aurora and nightly correctly.
Blocks: 1279992
Comment on attachment 8814841 [details] [diff] [review]
patch

[Triage Comment]
sec-approval and a=dveditz to land on beta and aurora
Attachment #8814841 - Flags: sec-approval?
Attachment #8814841 - Flags: sec-approval+
Attachment #8814841 - Flags: approval-mozilla-beta+
Attachment #8814841 - Flags: approval-mozilla-aurora+
Attached patch Patch (obsolete) — Splinter Review
Finally found the culprit. It seems that we also set the "elements()" to nullptr for template objects. (In order not having to create the buffer). This made my "hasInlineElements" check wrong.

Updated it:
-> for template objects: "hasInlineElements" should only be true when size < TypedArrayObject::INLINE_BUFFER_LIMIT

-> for non-template objects: I added an extra check to see if we have no buffer present (for size 0), just to be safe. I think we can add have an external buffer with size 0 ?

Locally this fixed the issue. Running try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=36aadfe5435bd8afc7d5d00c37fd76e14097e791
Attachment #8814841 - Attachment is obsolete: true
Attachment #8822149 - Flags: review?(jdemooij)
Comment on attachment 8822149 [details] [diff] [review]
Patch

Review of attachment 8822149 [details] [diff] [review]:
-----------------------------------------------------------------

Ah yes, this makes sense.
Attachment #8822149 - Flags: review?(jdemooij) → review+
Comment on attachment 8822149 [details] [diff] [review]
Patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
It shows this issue happens when a TypedArray is 0 length. In that way it shows what is affected, but not how it is affected. It might be harder to reason how to create an exploit out of it. The testcase added doesn't show extra information and doesn't test for the patch. Though having this information and applying a fuzzer on it might make it easy to create an exploit out of it.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
The testcase added gives the same information as the patch itself. It doesn't test the actual issue. The testcase was added because the original patch forgot to update a part. That being said, the answer is the same as the above comment.

Which older supported branches are affected by this flaw?
FF51+

If not all supported branches, which bug introduced the flaw?
bug 1279992

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
This patch applies on all affected branches.

How likely is this patch to cause regressions; how much testing does it need?
Could introduce new issues, but should be visible and fuzzers should catch it fast and less severe than this bug.
Attachment #8822149 - Flags: sec-approval?
Comment on attachment 8822149 [details] [diff] [review]
Patch

sec-approval+ for the fix *only*. Do *not* check in a test for this until we fix it on all affected branches.

Since this doesn't affect release, please nominate beta and aurora patches (or put nominations on existing patch if it applies clean). We will approve and then once it is fixed everywhere, you can check in the test. Otherwise, if we missed fixing it on beta but checked in the tests, we could 0day ourselves when beta goes to release.
Attachment #8822149 - Flags: sec-approval? → sec-approval+
Attached patch Patch for aurora/beta (obsolete) — Splinter Review
Approval Request Comment
[Feature/Bug causing the regression]:
Bug 1279992

[User impact if declined]:
Access to write/read random memory.

[Is this code covered by automated tests?]:
Yes

[Has the fix been verified in Nightly?]:
Not yet. Still has to land. Though it already had a cycle that failed and try shows the failing issues are now gone.

[Needs manual test from QE? If yes, steps to reproduce]: 
No

[List of other uplifts needed for the feature/fix]:
/

[Is the change risky?]:
[Why is the change risky/not risky?]:
There is a risk to it. It is none trivial code and has some corner cases. Though I think I got them all now. Try is happy. Less risky than having access to write/read random memory.

[String changes made/needed]:
/
Attachment #8822451 - Flags: approval-mozilla-beta?
Attachment #8822451 - Flags: approval-mozilla-aurora?
This landed to inbound this morning, but was then backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/b71f8d890b51cf0a2ad2ccbd98a007feee3af32b for breaking test_lz4.js and test_lz4_sync.js xpcshell tests

https://treeherder.mozilla.org/logviewer.html#?job_id=65292486&repo=mozilla-inbound
Flags: needinfo?(hv1989)
If we can land a fix that sticks by Thursday, I'm planning to go to build for beta 12 Thurs. afternoon.
I'm on it. Feel free to ping me if I haven't commented by that time.
Flags: needinfo?(hv1989)
Attachment #8822451 - Flags: approval-mozilla-beta?
Attachment #8822451 - Flags: approval-mozilla-aurora?
Attached patch Quick fix (obsolete) — Splinter Review
Before going further into the details why the previous failed I created a quick fix. This isn't the optimal solution, but it fixes the issue. The idea is to not put the address in the forwarded pointers when the length is 0. This means we would keep the old pointer (only) in jitcode or update it to another unrelated pointer. But this isn't an issue since the length is 0 and we never read anything from that pointer. I think it might be better to land this for now and land the better fix on nightly when I find a way to fix the issue that is apparently still present.
Attachment #8823976 - Flags: review?(jdemooij)
Comment on attachment 8823976 [details] [diff] [review]
Quick fix

Review of attachment 8823976 [details] [diff] [review]:
-----------------------------------------------------------------

This makes sense I think, but I'll request an additional review from jonco in case I'm missing something.
Attachment #8823976 - Flags: review?(jdemooij)
Attachment #8823976 - Flags: review?(jcoppeard)
Attachment #8823976 - Flags: review+
Comment on attachment 8823976 [details] [diff] [review]
Quick fix

Already asking to get this moving at a quicker pace.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
It shows this issue happens when a TypedArray is 0 length. Having this information and applying a fuzzer on it might make it easy to create an exploit out of it.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
no

Which older supported branches are affected by this flaw?
FF51+

If not all supported branches, which bug introduced the flaw?
bug 1279992

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
This patch applies on all affected branches.

How likely is this patch to cause regressions; how much testing does it need?
Could introduce new issues, but should be visible and fuzzers should catch it fast and less severe than this bug.
Attachment #8823976 - Flags: sec-approval?
Comment on attachment 8823976 [details] [diff] [review]
Quick fix

Review of attachment 8823976 [details] [diff] [review]:
-----------------------------------------------------------------

Yes this sounds good.
Attachment #8823976 - Flags: review?(jcoppeard) → review+
Comment on attachment 8823976 [details] [diff] [review]
Quick fix

sec-approval+ for trunk.
We'll want Aurora and Beta patches made and nominated.
Attachment #8823976 - Flags: sec-approval? → sec-approval+
Comment on attachment 8823976 [details] [diff] [review]
Quick fix

Approval Request Comment
[Feature/Bug causing the regression]:
Bug 1279992

[User impact if declined]:
Access to write/read random memory.

[Is this code covered by automated tests?]:
Yes

[Has the fix been verified in Nightly?]:
Not yet. Just landed on inbound. Try run for it was:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1150c4968c4355d0de17bf7719451e5a8159c789
But the testcase is fixed by this patch.

[Needs manual test from QE? If yes, steps to reproduce]: 
No

[List of other uplifts needed for the feature/fix]:
/

[Is the change risky?]:
[Why is the change risky/not risky?]:
This code is quite complicated. The original fix was the "nice one". Since the merge is coming closer I crafted this patch. I have more confidence in this patch. Also it touches less code (loc and impact of other code). Given all the failures with the previous patch I don't dare to say 100% sure, but I think this one should stick and don't give troubles.

[String changes made/needed]:
/
Attachment #8823976 - Flags: approval-mozilla-beta?
Attachment #8823976 - Flags: approval-mozilla-aurora?
Comment on attachment 8823976 [details] [diff] [review]
Quick fix

Let's get this in aurora now. But since we are starting the beta 12 build soon and this hasn't landed on m-c yet maybe we could aim for beta 13 on Monday.
Attachment #8823976 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Keywords: leave-open
Whiteboard: [jsbugmon:update] → [jsbugmon:update,ignore]
JSBugMon: The testcase found in this bug no longer reproduces (tried revision 0d823cf54df5).
(In reply to Jon Coppeard (:jonco) from comment #13)
> I like the idea of separating forwardedBuffers for slots and elements.  I'm
> also wondering if it's possible to have DATA_SLOT set to null for
> zero-length buffers.

(In reply to Hannes Verschore [:h4writer] from comment #28)
> Created attachment 8822149 [details] [diff] [review]
> Patch

(In reply to Wes Kocher (:KWierso) from comment #33)
> This landed to inbound this morning, but was then backed out in
> https://hg.mozilla.org/integration/mozilla-inbound/rev/
> b71f8d890b51cf0a2ad2ccbd98a007feee3af32b for breaking test_lz4.js and
> test_lz4_sync.js xpcshell tests
> 
> https://treeherder.mozilla.org/logviewer.html#?job_id=65292486&repo=mozilla-
> inbound

The failures are because we set DATA_SLOT to null for zero-length buffers:
http://searchfox.org/mozilla-central/rev/8144fcc62054e278fe5db9666815cc13188c96fa/js/src/ctypes/CTypes.cpp#3550

Here we treat nullptr as a failure. I already tried to understand that code and adjust it, but that seems complex.
Should we persist and kong-fu through that code or be happy with the current fix we have?
Flags: needinfo?(jdemooij)
(In reply to Hannes Verschore [:h4writer] from comment #47)
> Should we persist and kong-fu through that code or be happy with the current
> fix we have?

As I said on IRC, it's possible the current fix caused bug 1329635...
Flags: needinfo?(jdemooij)
Sounds like we should back this out of aurora and beta.
Flags: needinfo?(wkocher)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #49)
> Sounds like we should back this out of aurora and beta.

Indeed we should back this out of aurora and beta and central. There are issues with this patch (bug 1329635).
Depends on: 1329635
Regular fix had its problems. Since we use nullptr often as an OOM issue or not found issue, this had problems with ctypes. Lars convinced me against taking that approach.

Quick fix didn't work, since we actively iterate the stack and replace the values. I assumed we did this lazyily. As a result we tried to replace the pointer with a value not found.

New approach: Make sure that if length is zero we allocate a little bit more space. As a result the pointer to the inline data never equals the end of object or the beginning of another element. The change was quite simple, since we decide the length with a single function. Most of the patch contains extra debugging information to make sure fuzzers will find this quickly. This is done by putting 0 in that extra space. If that is not in bounds and get overwritten we will assert that it is not 0 anymore.
Attachment #8822149 - Attachment is obsolete: true
Attachment #8822451 - Attachment is obsolete: true
Attachment #8823976 - Attachment is obsolete: true
Attachment #8823976 - Flags: approval-mozilla-beta?
Attachment #8825699 - Flags: review?(jdemooij)
Comment on attachment 8825699 [details] [diff] [review]
Patch: Always reserve some space as inline elements for length 0

Review of attachment 8825699 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks again for fixing this.

::: js/src/jit/MacroAssembler.cpp
@@ +1107,5 @@
>          size_t numZeroPointers = ((nbytes + 7) & ~0x7) / sizeof(char *);
>          for (size_t i = 0; i < numZeroPointers; i++)
>              storePtr(ImmWord(0), Address(obj, dataOffset + i * sizeof(char *)));
> +#ifdef DEBUG
> +        if (nbytes == 0)

In MCallOptimize.cpp there's this check:

  if (providedLen < 0)

What do you think about changing that to |providedLen <= 0| and then here MOZ_ASSERT(nbytes > 0); ?

Something like |new Int32Array(0)| isn't common/important but supporting this adds substantial complexity, so let's just handle this completely in the VM for now. That also matches what you did in AllocateObjectBufferWithInit :)

::: js/src/vm/TypedArrayObject.cpp
@@ +140,5 @@
>  }
>  
> +#ifdef DEBUG
> +void
> +TypedArrayObject::assertCorrect() const

Can we also call this from TypedArrayObject::finalize ? That's called for all non-nursery typed arrays before they die, so it seems like a good place to check this one last time.

@@ +224,5 @@
>      }
>  
>      size_t headerSize = dataOffset() + sizeof(HeapSlot);
>      if (headerSize + nbytes <= GetGCKindBytes(newAllocKind)) {
>          MOZ_ASSERT(oldObj->hasInlineElements());

We can assert the extra byte we're adding to nbytes fits in GetGCKindBytes:

// See AllocKindForLazyBuffer.
MOZ_ASSERT_IF(nbytes == 0, headerSize + 1 < GetGCKindBytes(newAllocKind));

@@ +228,5 @@
>          MOZ_ASSERT(oldObj->hasInlineElements());
> +#ifdef DEBUG
> +        if (nbytes == 0) {
> +            uint8_t* output = newObj->fixedData(TypedArrayObject::FIXED_DATA_START);
> +            output[0] = 0;

Some places might zero-initialize anyway, so what do you think about adding:

#ifdef DEBUG
static const uint8_t ZeroLengthArrayData = 0x4A; (or whatever)
#endif

And then use ZeroLengthArrayData everywhere instead of 0?

::: js/src/vm/TypedArrayObject.h
@@ +174,5 @@
> +        return elementsRaw();
> +    }
> +
> +#ifdef DEBUG
> +    void assertCorrect() const;

Nit: something more descriptive maybe: assertZeroLengthArrayData ?
Attachment #8825699 - Flags: review?(jdemooij) → review+
Addresses review comments.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
It shows this issue happens when a TypedArray is 0 length. In that way it shows what is affected, but not how it is affected. It might be harder to reason how to create an exploit out of it.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No

Which older supported branches are affected by this flaw?
FF51+

If not all supported branches, which bug introduced the flaw?
bug 1279992

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
This patch works on all affected branches (beta, aurora)

How likely is this patch to cause regressions; how much testing does it need?
- Could introduce new issues, but should be visible and fuzzers should catch it fast and less severe than this bug. The patch itself is two lines. All others instrumentation is for fuzzers to quickly find issues if still present.
- We will have a performance hit on TypedArray with 0 length construction (compared to before this patch). Though it should be same performance as current release. And performance is not that important on TypedArray with 0 length.
Attachment #8825699 - Attachment is obsolete: true
Attachment #8825837 - Flags: sec-approval?
Attachment #8825837 - Flags: review+
let's get uplift requests in too, especially for beta
Flags: needinfo?(hv1989)
Comment on attachment 8825837 [details] [diff] [review]
Patch: Always reserve some space as inline elements for length 0

Review of attachment 8825837 [details] [diff] [review]:
-----------------------------------------------------------------

Approval Request Comment
[Feature/Bug causing the regression]:
Bug 1279992

[User impact if declined]:
Access to write/read random memory.

[Is this code covered by automated tests?]:
Yes

[Has the fix been verified in Nightly?]:
Not yet. Still has to land. Though it already had a cycle that failed and try shows the failing issues are now gone.

[Needs manual test from QE? If yes, steps to reproduce]: 
No

[List of other uplifts needed for the feature/fix]:
/

[Is the change risky?]:
[Why is the change risky/not risky?]:
There is a risk to it. It is none trivial code and has some corner cases. Though I think I got them all now. Try is happy. Less risky than having access to write/read random memory.

[String changes made/needed]:
/
Attachment #8825837 - Flags: approval-mozilla-beta?
Attachment #8825837 - Flags: approval-mozilla-aurora?
Flags: needinfo?(hv1989)
Comment on attachment 8825837 [details] [diff] [review]
Patch: Always reserve some space as inline elements for length 0

Sec-approval+ for trunk.
Giving branch approvals as well after discussing with Ritu. Please land on trunk first though.
Attachment #8825837 - Flags: sec-approval?
Attachment #8825837 - Flags: sec-approval+
Attachment #8825837 - Flags: approval-mozilla-beta?
Attachment #8825837 - Flags: approval-mozilla-beta+
Attachment #8825837 - Flags: approval-mozilla-aurora?
Attachment #8825837 - Flags: approval-mozilla-aurora+
GChang, Liz, FYI. This sec-high fix should be included before we gtb the last beta build of the 51 cycle.
Flags: needinfo?(lhenry)
Flags: needinfo?(gchang)
https://hg.mozilla.org/releases/mozilla-aurora/rev/1abb4b193740653f1161d86f38dbb79287e7d69b

This'll be in tomorrow's Trunk/Aurora nightlies. Let's see how that goes with some time for fuzzing and do the Beta uplift later tomorrow if things look good.
https://hg.mozilla.org/mozilla-central/rev/6cfebedd2040 (Fix will be also on trunk nightlys today)
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Depends on: 1330662
Flags: needinfo?(lhenry)
Group: javascript-core-security → core-security-release
Flags: needinfo?(gchang)
Whiteboard: [jsbugmon:update,ignore] → [jsbugmon:update,ignore][post-critsmash-triage]
JSBugMon: This bug has been automatically verified fixed.
JSBugMon: This bug has been automatically verified fixed on Fx51
JSBugMon: This bug has been automatically verified fixed on Fx52
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.