Closed Bug 1023158 Opened 6 years ago Closed 5 years ago

Assertion failure: ptr->isTenured(), at jit/shared/Assembler-shared.h:203 with asm.js

Categories

(Core :: JavaScript Engine: JIT, defect, critical)

x86
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla36
Tracking Status
firefox32 --- wontfix
firefox33 --- wontfix
firefox34 --- fixed
firefox35 --- fixed
firefox36 --- verified
firefox-esr31 --- unaffected
b2g-v1.4 --- unaffected
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- unaffected
b2g-v2.2 --- fixed

People

(Reporter: decoder, Assigned: terrence)

References

(Blocks 2 open bugs)

Details

(Keywords: assertion, sec-high, testcase, Whiteboard: [jsbugmon:update,ignore][only a sec issue with GGC][adv-main34+][b2g-adv-main2.2-])

Attachments

(3 files)

The following testcase asserts on mozilla-central revision 9dc0ffca10f4 (run with --fuzzing-safe --ion-eager):


this.__defineSetter__("x", (function(stdlib) {
    "use asm";
    function f() {}
    return f;
}));
evaluate('var x = "";', { compileAndGo : true });
Whiteboard: [jsbugmon:update,bisect]
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
Jan, do you know who might be able to look at this?
Flags: needinfo?(jdemooij)
Keywords: sec-high
Attached file backtrace.txt
I was able to reproduce, but only on 32-bit Linux debug (not 64) and only with --thread-count=1.  Attached is the full backtrace from gdb.  It looks like the IC code assumes that IsCacheableGetPropCallNative implies that the setter is tenured and that is being violated by an asm.js module function (which isNative() and gets cloned non-tenured by JSOP_LAMBDA).  I'm not sure if this is a generally valid assumption or not; the IC code predates GGC.  It seems like the fix would be to either update IsCacheableSetPropCallNative to additionally require shape->setterObject()->isTenured() or to change js::CloneFunctionObject to tenure cloned asm.js native functions.  Both fix the assert, but I'm not sure which is the "right" fix because I'm not familiar with our general tenuring invariants.  Should, e.g., fun->isNative() imply fun->isTenured?
Flags: needinfo?(jdemooij) → needinfo?(terrence)
(In reply to Luke Wagner [:luke] from comment #3)
> Created attachment 8441569 [details]
> backtrace.txt
> 
> I was able to reproduce, but only on 32-bit Linux debug (not 64) and only
> with --thread-count=1.  Attached is the full backtrace from gdb.  It looks
> like the IC code assumes that IsCacheableGetPropCallNative implies that the
> setter is tenured and that is being violated by an asm.js module function
> (which isNative() and gets cloned non-tenured by JSOP_LAMBDA).  I'm not sure
> if this is a generally valid assumption or not; the IC code predates GGC. 
> It seems like the fix would be to either update IsCacheableSetPropCallNative
> to additionally require shape->setterObject()->isTenured() or to change
> js::CloneFunctionObject to tenure cloned asm.js native functions.  Both fix
> the assert, but I'm not sure which is the "right" fix because I'm not
> familiar with our general tenuring invariants.  Should, e.g.,
> fun->isNative() imply fun->isTenured?

Initially the invariant was that nursery things would not flow into IC's -- this turned out to be impossible to ensure in practice, so we added ImmMaybeNurseryPtr to do the appropriate tracing. It sounds like I missed a pointer when making this change. Does making the ImmGCPtr at the top of the stack here into an ImmMaybeNurseryPtr fix the crash?

I think this would be the preferred solution over the other two: we shouldn't disable the prop cache ic just because the thing we're caching happens to initially be a nursery thing. Also, I don't see a reason why isNative() JSFunction would necessarily be long-lived, so I'm not sure it makes sense to always tenure those either.
Flags: needinfo?(terrence)
Yes, that does fix the assertion.

Looking around jit/*/MacroAssembler-*, there seem to be a bunch of places baking in pointers without any special ImmMaybeNurseryPtr.
(In reply to Luke Wagner [:luke] from comment #5)
> Yes, that does fix the assertion.
> 
> Looking around jit/*/MacroAssembler-*, there seem to be a bunch of places
> baking in pointers without any special ImmMaybeNurseryPtr.

Yes. The relevant invariant here that I should have stated explicitly above is that anything directly attached to MIR or LIR -- e.g. template objects and such -- must be tenured. Specifically, off-main-thread codegen is synchronized only against major gc, not against minor collection; if anything in the nursery wanders into MIR or LIR this is a serious bug and thus ImmGCPtr asserts that the thing is tenured.

The special case here is where we are compiling a new IC at runtime from an object that hasn't been created specifically as a template. Templates have explicit tenured heap creation, whereas IC's can sometimes be created from arbitrary objects floating around in the system.

Since having a nursery thing flow into codegen would be catastrophic, except in the special case where we are compiling in the foreground for an IC, ImmGCPtr and ImmManybeNurseryPtr are inconvertible and we have created totally separate, duplicated code paths in jit/*/MacroAssembler-* taking ImmMaybeNurseryPtr for IC usage. Fortunately this is only 2 functions at the moment.
Ah, I see, so the issue isn't the nursery pointer being baked into JIT code, we're just using ImmGCPtr as a sort of pinch-point to catch nursery pointers that passed through MIR/LIR.  Is that right?

In that case, then it seems like we shouldn't just change ImmGCPtr to ImmMaybeNurseryPtr inside pushValue, since that would add a hole in our assertion wall.  It seems like, instead, we'd have a pushValueMaybeInNursery, so that we bubble the maybe-nursery-ness up to the caller in IonCaches.cpp.  Does that make sense?
Flags: needinfo?(terrence)
(In reply to Luke Wagner [:luke] from comment #7)
> Ah, I see, so the issue isn't the nursery pointer being baked into JIT code,
> we're just using ImmGCPtr as a sort of pinch-point to catch nursery pointers
> that passed through MIR/LIR.  Is that right?

Yes, well said.

> In that case, then it seems like we shouldn't just change ImmGCPtr to
> ImmMaybeNurseryPtr inside pushValue, since that would add a hole in our
> assertion wall.  It seems like, instead, we'd have a
> pushValueMaybeInNursery, so that we bubble the maybe-nursery-ness up to the
> caller in IonCaches.cpp.  Does that make sense?

That is exactly right.
Flags: needinfo?(terrence)
See Also: → 1013001
Marty found a major hole in our assertions here: we don't have any ImmGCPtr equivalent for pointers embedded in Value. The particular case he found, Push(Value), isn't too bad because at least on x86 and arm, this ends up eventually doing push(ImmGCPtr()). On x64, however, we're doing push(ImmWord(layout.asBits)), totally skipping the relevant assertions. We should audit any Value taking entries in *Assembler* to make sure they're not baking in nursery pointers by accident.
(In reply to Terrence Cole [:terrence] from comment #9)
> The particular case he found,
> Push(Value), isn't too bad because at least on x86 and arm, this ends up
> eventually doing push(ImmGCPtr()). On x64, however, we're doing
> push(ImmWord(layout.asBits)), totally skipping the relevant assertions. We
> should audit any Value taking entries in *Assembler* to make sure they're
> not baking in nursery pointers by accident.

Good catch, but this is not just skipping an assertion, it's also a GC hazard.. push(ImmGCPtr) calls writeDataRelocation, to record the offset of the GC pointer, so that we can trace it later. If we use ImmWord instead of ImmGCPtr, we won't mark this pointer. pushValue on x64 should check val.isMarkable() and in that case use ImmGCPtr instead of ImmWord.

According to "hg annotate", this goes back to bug 713526, so it's an old bug.
Blocks: GC.stability
Group: javascript-core-security
Comment 10 suggests this is a regression from bug 713526.  Is this something you could look at Brian?  Thanks.
Flags: needinfo?(bhackett1024)
(In reply to Andrew McCreight [:mccr8] from comment #11)
> Comment 10 suggests this is a regression from bug 713526.  Is this something
> you could look at Brian?  Thanks.

I think the assertion / crash here is a GGC bug; comments 9 and 10 are sidetrack and talk about a problem with our x64 backend, whereas this report is for x86.

The method added in bug 713526 does indeed fail to trace the immediate correctly, but I don't think this is a GC hazard by itself without either a moving GC or another bug somewhere in Ion --- for Ion to execute code using a pointer to an object that has been swept, it would have to be executing the code incorrectly.  (I suppose Ion could execute code using a pointer to a swept string, which would require either it or baseline to be able to manufacture and bake in string constants for e.g. folded concat, but I don't think we do such things anywhere.)  This still needs to be fixed of course but I don't think it is a security risk, and fixing it with a call to writeDataRelocation() would still be problematic with GGC since calling writeDataRelocation() explicitly won't cause post barriers to trigger.
Attached patch fix pushValueSplinter Review
Attachment #8470215 - Flags: review?(jdemooij)
Flags: needinfo?(bhackett1024)
Comment on attachment 8470215 [details] [diff] [review]
fix pushValue

Review of attachment 8470215 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good but you're right, there's still the GGC issue..
Attachment #8470215 - Flags: review?(jdemooij) → review+
NI terrence for the ImmMaybeNurseryPtr issue discussed in comment 4 and later.
Flags: needinfo?(terrence)
Assignee: bhackett1024 → nobody
Whiteboard: [jsbugmon:update] → [jsbugmon:update,ignore]
JSBugMon: The testcase found in this bug no longer reproduces (tried revision 76b55c0850ca).
Duplicate of this bug: 1057625
Terrence, can you look at the GGC issue here mentioned in comment 15?
(In reply to Andrew McCreight [:mccr8] from comment #20)
> Terrence, can you look at the GGC issue here mentioned in comment 15?

Sorry for the delay! I thought that the problem here was purely theoretical because of the lack of a new fuzzer testcase in this bug. It turns out Christian actually found a great testcase for this bug back in May and filed it as bug 1013001. It was initially marked as a jit issue and I didn't get NI'd on it, missing it in the tide of bugmail, so was totally unaware of it until Jan linked it in related bugs.

The ball got dropped and I'm not entirely sure how to make sure this situation doesn't happen again. :-(
Flags: needinfo?(terrence)
And the additional thing discussed in comment 6+ is now fixed by bug 1013001, so closing the bug as FIXED via the original issue and testcase.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
JSBugMon: This bug has been automatically verified fixed.
Brian, wasn't this fixed in Firefox 34 per the checkin in comment 17? I'm unclear why this didn't go through sec-approval before it was checked in though.

As a sec-high bug, assuming ESR31 is affected, we need to backport this bug to ESR31 unless there is a reason not to do so.
Flags: needinfo?(bhackett1024)
Per comment 12, Brian thinks this is only a sec issue with generational GC enabled, which is maybe 32 or later, so 31 shouldn't be affected.
Whiteboard: [jsbugmon:update,ignore] → [jsbugmon:update,ignore][only a sec issue with GGC]
Brian, this is a sec-high. Can we get this nominated for 34 and 35 (Beta and Aurora) ASAP and get it checked in?
(In reply to Al Billings [:abillings] from comment #24)
> Brian, wasn't this fixed in Firefox 34 per the checkin in comment 17? I'm
> unclear why this didn't go through sec-approval before it was checked in
> though.
> 
> As a sec-high bug, assuming ESR31 is affected, we need to backport this bug
> to ESR31 unless there is a reason not to do so.

The patch in 36814bee6277 isn't fixing the crash in this bug, it is addressing a sidetrack unrelated to the original assertion/crash and per comment 12 I don't believe this can trigger any crashing / bad behavior.
Flags: needinfo?(bhackett1024)
Assignee: nobody → terrence
Depends on: 1013001
Keywords: leave-open
Target Milestone: --- → mozilla36
Group: javascript-core-security
Whiteboard: [jsbugmon:update,ignore][only a sec issue with GGC] → [jsbugmon:update,ignore][only a sec issue with GGC][adv-main34+]
Whiteboard: [jsbugmon:update,ignore][only a sec issue with GGC][adv-main34+] → [jsbugmon:update,ignore][only a sec issue with GGC][adv-main34+][b2g-adv-main2.2-]
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.