Closed Bug 1096026 Opened 5 years ago Closed 5 years ago

Assertion failure: !isInside(*pSlotsElems), at gc/Nursery.cpp:481

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla36
Tracking Status
firefox34 --- fixed
firefox35 --- fixed
firefox36 --- verified
firefox-esr31 --- unaffected
b2g-v1.4 --- unaffected
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- unaffected
b2g-v2.2 --- fixed

People

(Reporter: decoder, Assigned: bhackett)

References

(Blocks 1 open bug)

Details

(4 keywords, Whiteboard: [jsbugmon:update,ignore][adv-main34+][b2g-adv-main2.2-])

Attachments

(3 files, 1 obsolete file)

The following testcase asserts on mozilla-central revision d380166816dd (run with --no-threads --fuzzing-safe):


gczeal(2);
var Vec3u16Type = TypedObject.uint16.array(3);
function foo_u16() {
  for (var i = 0; i < 30000; i += 3) {
    var vec = new Vec3u16Type([i, i+1, i+2]);
    var sum = vec[0] + vec[1] + vec[(/[]/g )];
    foo_u16(sum, 3*i + 3);
  }
}
foo_u16();
Another GC assertion in TypedObject, marking s-s.
Flags: needinfo?(nmatsakis)
Whiteboard: [jsbugmon:update,bisect]
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, result:
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/a9cab8a15d2b
user:        Brian Hackett
date:        Wed Oct 29 11:19:51 2014 -0700
summary:     Bug 1085029 - Use common-descriptor logic more often in TypedObjectPrediction, r=nmatsakis.

This iteration took 331.308 seconds to run.
Is this caused by the above mentioned revision?
Flags: needinfo?(bhackett1024)
Blocks: 1085029
Keywords: regression, sec-high
Attached patch WIP (obsolete) — Splinter Review
The immediate problem here is that forwarding pointers aren't being set for typed objects.  While fixing this though I found some other problems with typed objects and tracing, namely that data pointers for typed arrays whose buffer has data in an inline typed object won't be updated properly.

Additionally, the forwarding logic for zero-length elements vectors is incorrect; the code just skips inserting a forwarding pointer in this case, but these zero length vectors can still be used to access array lengths and initialized lengths at negative offsets, so they need to be forwarded properly.

The attached patch should fix all this stuff.  I need to write tests for the various problems this patch is trying to fix, especially the zero length elements issue mentioned above, since that doesn't involve typed objects at all.  If I can get that to crash (and probably even if I can't) I'll write a smaller patch just targeting that which can be uplifted.
Assignee: nobody → bhackett1024
Flags: needinfo?(nmatsakis)
Flags: needinfo?(bhackett1024)
Attached patch patchSplinter Review
Above patch with some tests added.
Attachment #8521490 - Attachment is obsolete: true
Attachment #8521714 - Flags: review?(sphink)
Here's a testcase that shows a problem with the don't-forward-zero-length element arrays stuff.  Tested on 10.9 x64.

with({}){}
gczeal(2);
function foo(arr) {
    var z = arr.length;
    var y = [];
    var x = arr[0];
    print(0. + x + y + z);
    return y;
}
for (var i = 0; i < 100; i++)
    foo([1]);
for (var i = 0; i < 100; i++)
    foo([undefined]);
var arr = [];
Object.preventExtensions(arr);
for (var i = 0; i < 100; i++)
    foo(arr);

What happens is that Ion keeps the array elements live across the NewArray opcode.  When that opcode triggers a minor GC, the elements aren't forwarded properly since their capacity is zero.  Then the MInitializedLength associated with the element read accesses a negative index of the bogus elements, reads a bogus value and the element read then gets another bogus value rather than returning undefined.  This could also be used to write out values to bogus locations.
Attached patch elements patchSplinter Review
Chunk from the above patch which fixes the zero length elements issue.
Attachment #8521738 - Flags: review?(terrence)
Comment on attachment 8521738 [details] [diff] [review]
elements patch

Review of attachment 8521738 [details] [diff] [review]:
-----------------------------------------------------------------

Do you know why we can't just put the forwarding pointer somewhere in ObjectElements, since that memory is always present? I guess there may have been some path where we use the length before checking for forwarding, but it seems like it would be better fix those cases than add more complexity to our forwarding paths.

Also, is the reason you made forwardedPointers lazy performance? None of the extra overhead appears to be in particularly hot paths, so I think we should initialize it eagerly for the simplicity wins.

I'm fine with this approach, but I'd like to see it again in non-lazy form, assuming that there isn't some aspect I'm missing.

::: js/src/gc/Nursery.cpp
@@ +444,3 @@
>      MOZ_ASSERT(isInside(oldHeader));
>      MOZ_ASSERT(!isInside(newHeader));
> +    if (nelems - ObjectElements::VALUES_PER_HEADER < 1) {

Let's add a comment explaining why we need this test. Something like:
// If there is not enough room to store the forwarding pointer inline, put it in the forwardedBuffers map.

@@ +444,4 @@
>      MOZ_ASSERT(isInside(oldHeader));
>      MOZ_ASSERT(!isInside(newHeader));
> +    if (nelems - ObjectElements::VALUES_PER_HEADER < 1) {
> +        if (!forwardedBuffers.initialized() && !forwardedBuffers.init())

This is on the runtime, so let's initialize it eagerly in Nursery::init for simplicity.

@@ +453,1 @@
>      *reinterpret_cast<HeapSlot **>(oldHeader->elements()) = newHeader->elements();

// Otherwise store it inline in the first element.

@@ +472,5 @@
>  
> +    // The new location for this buffer is either stored inline with it or in
> +    // the forwardedBuffers table.
> +    do {
> +        if (forwardedBuffers.initialized()) {

And we can then remove this check and rephrase this as:
if (ForwardedBufferMap::Ptr p = forwardedBuffers.lookup(old))
    *pSlotsElems = reinterpret_cast<HeapSlot *>(p->value());
else
    *pSlotsElems =  *reinterpret_cast<HeapSlot **>(old);

I suppose unconditionally doing a lookup will be a bit slower here, but this code should be rarely run very rarely.

@@ +849,5 @@
>  
>      // Update any slot or element pointers whose destination has been tenured.
>      TIME_START(updateJitActivations);
>      js::jit::UpdateJitActivationsForMinorGC<Nursery>(&rt->mainThread, &trc);
> +    forwardedBuffers.finish();

And move this to ~Nursery.

::: js/src/gc/Nursery.h
@@ +212,5 @@
> +     * for buffers whose length is less than pointer width, or when different
> +     * buffers might overlap each other. For these, an entry in the following
> +     * table is used.
> +     */
> +    typedef HashMap<void *, void *, PointerHasher<void *, 1>, SystemAllocPolicy> ForwardedBufferMap;

ObjectElements should have Value alignment, so use a shift of 3 for PointerHasher.
Attachment #8521738 - Flags: review?(terrence)
Comment on attachment 8521714 [details] [diff] [review]
patch

Review of attachment 8521714 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/TypedObject.cpp
@@ +1552,5 @@
> +    if (buffer.forInlineTypedObject()) {
> +        InlineTypedObject &realOwner = buffer.firstView()->as<InlineTypedObject>();
> +        attach(cx, realOwner, offset);
> +        return;
> +    }

So does this mean we can assert in OutlineTypedObject::setOwnerAndData that the owner is either nullptr, an InlineTypedObject or an ArrayBuffer that holds its own data? If so, please do.

Oh, it looks like you assert this in the new obj_trace(). Please assert in setOwnerAndData as well (as documentation).

@@ +1689,5 @@
>  
>      // Update the data pointer if the owner moved and the owner's data is
> +    // inline with it. Note that if the owner is an array buffer we don't need
> +    // to handle the case where its data is from some inline typed object, as
> +    // in that case this object will be directly parented to that typed object.

I'd like this to read more like an invariant.

// Note that an array buffer pointing to data in an inline typed object will
// never be used as an owner for another view. In such cases, the owner will
// be the inline typed object itself.

@@ +1697,5 @@
>          (owner->is<InlineTypedObject>() ||
>           owner->as<ArrayBufferObject>().hasInlineData()))
>      {
> +        ptrdiff_t d = reinterpret_cast<uint8_t *>(owner) - reinterpret_cast<uint8_t *>(oldOwner);
> +        newData = oldData + d;

I would be fine with just

  newData += reinterpret_cast<uint8_t *>(owner) - reinterpret_cast<uint8_t *>(oldOwner);

but this is good too.

::: js/src/gc/Nursery.cpp
@@ +428,5 @@
>      return zone->allocator.arenas.allocateFromArena(zone, thingKind);
>  }
>  
>  void
> +Nursery::setForwardingPointer(void *oldData, void *newData, bool canInline)

The 'canInline' param confused me because of the other use of "inline". Can you make it the painfully verbose canInlineForwardingPointer instead?

@@ +437,5 @@
> +    if (canInline) {
> +        *reinterpret_cast<void **>(oldData) = newData;
> +    } else {
> +        if (!forwardedBuffers.initialized() && !forwardedBuffers.init())
> +            CrashAtUnhandlableOOM("Nursery::setForwardingPointer");

Please init this unconditionally instead of lazily.

@@ +485,5 @@
>  
> +    // The new location for this buffer is either stored inline with it or in
> +    // the forwardedBuffers table.
> +    do {
> +        if (forwardedBuffers.initialized()) {

If this is important for perf, is it possible to check the number of elements here instead?

@@ +834,5 @@
>  
>      // Update any slot or element pointers whose destination has been tenured.
>      TIME_START(updateJitActivations);
>      js::jit::UpdateJitActivationsForMinorGC<Nursery>(&rt->mainThread, &trc);
> +    forwardedBuffers.finish();

(.clear() if you end up making this non-lazy.)

::: js/src/vm/ArrayBufferObject.cpp
@@ +1208,5 @@
> +            obj->setPrivate(dstData);
> +
> +            // We can't use an inline forwarding pointer here, as there might
> +            // not be enough bytes available, and other views might have data
> +            // pointers whose forwarding pointers would overlap this one.

I don't follow this. Why can't lazyArrayBuffers guarantee enough space? AllocKindForLazyBuffer guarantees enough space, right? Or am I getting things mixed up?

And how would other views overlap?
So it basically looks good to me, but I'm worried that I'm not understanding everything correctly. ni? bhackett to answer my last question in that review.
Flags: needinfo?(bhackett1024)
(In reply to Terrence Cole [:terrence] from comment #9)
> Comment on attachment 8521738 [details] [diff] [review]
> elements patch
> 
> Review of attachment 8521738 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Do you know why we can't just put the forwarding pointer somewhere in
> ObjectElements, since that memory is always present? I guess there may have
> been some path where we use the length before checking for forwarding, but
> it seems like it would be better fix those cases than add more complexity to
> our forwarding paths.

Ion stack values which store slots/elements pointers don't distinguish between object dense elements and other types (slots, typed array buffers).  So when we try to get the forwarding pointer for a given array of values we don't know whether there is an ObjectElements header available to access.  This could be fixed but it would add a big mess of complexity to Ion and wouldn't fix the issues which typed objects bring to the table anyways, which you can see in the full patch.

> Also, is the reason you made forwardedPointers lazy performance? None of the
> extra overhead appears to be in particularly hot paths, so I think we should
> initialize it eagerly for the simplicity wins.

This is less for performance than it is for memory and simplicity.  Why have this table taking up room at all times when it is only used during nursery GCs?  If we end up sticking a lot of entries into this table during the nursery GC and want to reclaim the memory after the nursery GC finishes, we need to use HashMap::finish since HashMap::clear doesn't shrink the table.  So I think initializing this lazily is the best thing to do here, and the difference in complexity is pretty small.

> ::: js/src/gc/Nursery.cpp
> @@ +444,3 @@
> >      MOZ_ASSERT(isInside(oldHeader));
> >      MOZ_ASSERT(!isInside(newHeader));
> > +    if (nelems - ObjectElements::VALUES_PER_HEADER < 1) {
> 
> Let's add a comment explaining why we need this test. Something like:
> // If there is not enough room to store the forwarding pointer inline, put
> it in the forwardedBuffers map.

Since this bug allows people to read or write to arbitrary addresses, I don't think extra comments are the best idea.

> ::: js/src/gc/Nursery.h
> @@ +212,5 @@
> > +     * for buffers whose length is less than pointer width, or when different
> > +     * buffers might overlap each other. For these, an entry in the following
> > +     * table is used.
> > +     */
> > +    typedef HashMap<void *, void *, PointerHasher<void *, 1>, SystemAllocPolicy> ForwardedBufferMap;
> 
> ObjectElements should have Value alignment, so use a shift of 3 for
> PointerHasher.

This is true for ObjectElements, but not for nursery element pointers from typed objects and typed arrays which point into inline typed objects (which the full patch addresses).
Attachment #8521738 - Flags: review?(terrence)
(In reply to Steve Fink [:sfink, :s:] from comment #10)
> I don't follow this. Why can't lazyArrayBuffers guarantee enough space?
> AllocKindForLazyBuffer guarantees enough space, right? Or am I getting
> things mixed up?
> 
> And how would other views overlap?

This code is for the case where we create an inline typed object in the nursery, access its buffer, then use that buffer to create a typed array or data view pointing into the inline typed object's data.  This typed array might not be long enough to store a forwarding pointer since, even if the inline typed object was originally long enough, the typed array can point to any offset in it.  More importantly though, other typed arrays or typed objects can point to other offsets in the inline typed object's data, and if we store forwarding pointers for all of these they can overlap and we'll end up reading garbage values back out.

It's possible to do a little better here, if we store forwarding pointers in the inline typed object when (a) the view is large enough to accommodate a pointer, and (b) the array is aligned at pointer width.  But the complexity involved doesn't seem worth it since this case is pretty pathological.

This is all different from a typed array created in the nursery whose buffer is lazily constructed.  In that case, only the original typed array can point to its data, so there is no overlapping views issue, and AllocKindForLazyBuffer ensures there is enough space for a forwarding pointer.
Flags: needinfo?(bhackett1024)
I don't think those flags are set correctly.  The TypedObject thing only affects nightly, but the crash in comment 7 probably affects any builds with GGC enabled.
GGC is only enabled on B2G v2.2. Feel free to set the Fx flags correctly if you want.
Comment on attachment 8521738 [details] [diff] [review]
elements patch

Review of attachment 8521738 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for the answers! The fact that HashTable::clear doesn't shrink is a bit surprising, but does indeed make the lazy approach correct here.
Attachment #8521738 - Flags: review?(terrence) → review+
Comment on attachment 8521738 [details] [diff] [review]
elements patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

not easily

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

no

Which older supported branches are affected by this flaw?

builds with gGC

If not all supported branches, which bug introduced the flaw?

GGC

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

should be straightforward

How likely is this patch to cause regressions; how much testing does it need?

not likely

Approval Request Comment
[Feature/regressing bug #]: GGC
[User impact if declined]: potentially exploitable crashes
[Describe test coverage new/current, TBPL]: none
[Risks and why]: low, adds some new functionality but nothing complicated
Attachment #8521738 - Flags: sec-approval?
Attachment #8521738 - Flags: approval-mozilla-beta?
Attachment #8521738 - Flags: approval-mozilla-aurora?
Comment on attachment 8521738 [details] [diff] [review]
elements patch

sec-approval+ for trunk.
Attachment #8521738 - Flags: sec-approval? → sec-approval+
Comment on attachment 8521738 [details] [diff] [review]
elements patch

Beta+
Attachment #8521738 - Flags: approval-mozilla-beta?
Attachment #8521738 - Flags: approval-mozilla-beta+
Attachment #8521738 - Flags: approval-mozilla-aurora?
Attachment #8521738 - Flags: approval-mozilla-aurora+
Note that this bug is approved for Beta but Ryan is out. If this is to make beta10 is will need to land very soon (minutes not hours). Can you please land on mozilla-beta yourself?
Flags: needinfo?(bhackett1024)
Group: javascript-core-security
Keywords: leave-open
Brian - Can you also please land on Aurora?
Flags: needinfo?(bhackett1024)
Whiteboard: [jsbugmon:update] → [jsbugmon:update,ignore]
JSBugMon: The testcase found in this bug no longer reproduces (tried revision 64e7a6391916).
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #23)
> Brian - Can you also please land on Aurora?

I've been trying repeatedly to update my aurora tree and am getting some error I haven't seen before (some SSL thing: SSL3_GET_RECORD:decryption failed or bad record mac.)  Is there anyone else who can do this?
Thanks!
Comment on attachment 8521714 [details] [diff] [review]
patch

Review of attachment 8521714 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, ok. I think what I was missing was in your first reply to terrence re: slots/elements storage.
Attachment #8521714 - Flags: review?(sphink) → review+
Why is this marked 36 affected but fixed for 34 and 35? Oversight or ?
Flags: needinfo?(choller)
Whiteboard: [jsbugmon:update,ignore] → [jsbugmon:update,ignore][adv-main34+]
Per the last paragraph in comment 5, I think that there are two issues exposed by the initial test case.  One affects beta etc., and has had a patch landed.  The other issue only affects TypedObjects, and thus only affects trunk.  The latter issue has not been fixed.  Maybe we should just close out this one and open a new one for the specific TypedObjects issue, to simplify advisory tracking, etc.  Somebody correct me if I'm wrong.
Hmm, or maybe this is being left open for tests?  bhackett, can this be marked "fixed" for 36, and then this is just being left open for tests or something?
Flags: needinfo?(bhackett1024)
Blocks: 1103273
(In reply to Andrew McCreight [:mccr8] from comment #31)
> Per the last paragraph in comment 5, I think that there are two issues
> exposed by the initial test case.  One affects beta etc., and has had a
> patch landed.  The other issue only affects TypedObjects, and thus only
> affects trunk.  The latter issue has not been fixed.  Maybe we should just
> close out this one and open a new one for the specific TypedObjects issue,
> to simplify advisory tracking, etc.  Somebody correct me if I'm wrong.

Yeah, this is right, filed bug 1103273 for the typed objects issues.
Flags: needinfo?(choller)
Flags: needinfo?(bhackett1024)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Thanks.
Keywords: leave-open
Target Milestone: --- → mozilla36
Reproduced the original issue using changeset d380166816dd with the instructions from comment #0:

- Assertion failure: !isInside(*pSlotsElems), at /home/kjozwiak/code/mozilla-central/js/src/gc/Nursery.cpp:481

fx 36 (changeset: 74edfbf0e6a3) - PASSED

When I try to run the above poc in fx35 (changeset: 56f456359253) and fx34 (changeset: c8ff4c93ee85), I get the following error:

"poc.js:2:4 ReferenceError: TypedObject is not defined"

Christian, is there any way I can verify this on fx35 and fx34? Perhaps I'm doing something incorrectly?
Flags: needinfo?(choller)
You will have to ask the developers if they can write a test for fx35/34. The test in comment 0 won't work there since it required TypedObject which is Nightly only.
Flags: needinfo?(choller)
Brian, would you be able to add a test case that works for fx35/34 as TypedObject is only available under m-c? Would the test case that you added in comment # 7 be sufficient for verification?
Flags: needinfo?(bhackett1024)
(In reply to Kamil Jozwiak [:kjozwiak] from comment #37)
> Brian, would you be able to add a test case that works for fx35/34 as
> TypedObject is only available under m-c? Would the test case that you added
> in comment # 7 be sufficient for verification?

Yes, the comment 7 testcase is sufficient for this.
Flags: needinfo?(bhackett1024)
I ran the testcase from comment #7 using m-a 4af988d1770f but got the same results when I run the testcase against m-c d380166816dd which is the build I reproduced the original assertion with. The output appears to be the same.

The output lists several 11, NaN1, NaN0 entries.

Brian, I'm not really sure how the non-fixed output should look like as the output looks to be the same every time. (even against the changeset that crashes when using the poc from comment #0)
Flags: needinfo?(bhackett1024)
(In reply to Kamil Jozwiak [:kjozwiak] from comment #39)
> I ran the testcase from comment #7 using m-a 4af988d1770f but got the same
> results when I run the testcase against m-c d380166816dd which is the build
> I reproduced the original assertion with. The output appears to be the same.
> 
> The output lists several 11, NaN1, NaN0 entries.
> 
> Brian, I'm not really sure how the non-fixed output should look like as the
> output looks to be the same every time. (even against the changeset that
> crashes when using the poc from comment #0)

Like many JS testcases this is sensitive to the particular machine and platform on which you're testing.  Sometimes the comment 7 testcase will crash an affected build, other times one of the lines will be some random double value rather than NaN.
Flags: needinfo?(bhackett1024)
Group: javascript-core-security
Whiteboard: [jsbugmon:update,ignore][adv-main34+] → [jsbugmon:update,ignore][adv-main34+][b2g-adv-main2.2-]
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.