Closed Bug 1054538 Opened 10 years ago Closed 10 years ago

Crash [@ interpExitTrampoline] with js::jit::IonScript::unlinkFromRuntime and GC on the stack

Categories

(Core :: JavaScript Engine: JIT, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox32 --- wontfix
firefox33 --- wontfix
firefox34 --- wontfix
firefox35 --- fixed
firefox-esr31 --- wontfix
b2g-v1.4 --- wontfix
b2g-v2.0 --- wontfix
b2g-v2.0M --- wontfix
b2g-v2.1 --- wontfix
b2g-v2.2 --- fixed

People

(Reporter: gkw, Assigned: luke)

References

Details

(Keywords: sec-high, Whiteboard: [adv-main35+][b2g-adv-main2.2+])

Crash Data

Attachments

(2 files)

Attached file stack
On m-c rev 9fdb8047f07d, there is an unreproducible crash (but with a crash dump).

My configure flags are:

AR=ar sh /home/gkwong/trees/mozilla-central/js/src/configure --disable-debug --enable-optimize --enable-gczeal --enable-debug-symbols --disable-tests --enable-more-deterministic --with-ccache --enable-threadsafe

I'll be sending the files to Terrence and :jonco to look at.
Flags: needinfo?(terrence)
Flags: needinfo?(jcoppeard)
> I'll be sending the files to Terrence and :jonco to look at.

Done.
Crash Signature: [@ interpExitTrampoline] [@ js::jit::IonScript::unlinkFromRuntime]
A core file isn't too useful unless I can generate exactly the same binary. What compiler was that using?
Flags: needinfo?(terrence)
CCing Luke as it's asm.js related.
I'm using GCC 4.8:

$ gcc --version
gcc-4.8.real (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Flags: needinfo?(terrence)
The problem here is that code_ is dead (and nonnull) when updating a dependent module's stub. We initialize code_ to nullptr in the constructor and unmap it in the destructor, so the lifetime management is simple enough. Before unmapping the code, we unlink ourself from all the modules that have dependencies targeting us, unless the target reverse-dependence has no ion script. We check this when destructing though, so maybe it lost it's IonScript without updating its dependents? Seems unlikely. I'm not sure what else to look at here.

Luke, could you take a look? Ping Gary for ssh access.
Flags: needinfo?(terrence) → needinfo?(luke)
Can you just post the files to run?
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #6)
> Can you just post the files to run?

This is an unreproducible crash, so not sure if the testcase will be useful. I instead have the binary and the crash dump. Which of these would you like?
Flags: needinfo?(luke)
You can email me both, but I'm not sure I'll be able to do much.
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #8)
> You can email me both, but I'm not sure I'll be able to do much.

I've emailed them, but the last option is ssh access.

If you would not like to have access and poke around in gdb (and if the emailed files don't help), then I'm really not sure how to move this forward.
Flags: needinfo?(luke)
Looking at the faulting instruction, all the bits look correct relative to each other, so I think the problem must be that code+globaldata has been unmapped.  This would happen if the AsmJSModule was destroyed and didn't notify an associated IonScript (~AsmJSModule immediately munmaps the code+globaldata).

I *think* I see a potential problem that would be really hard to manifest, but would be most likely to show up in a shutdown-GC like we have here: ~AsmJSModule (which is called during finalization) accesses the JSFunction it depends on, despite not having marked that JSFunction in trace() (b/c AsmJSModule::trace wasn't called; it's dead).  That means that the JSFunction could be garbage and thus could have been clobbered and thus could appear not to have an associated IonScript.  The fix seems simple: store the IonScript in ExitDatum and use that instead.  Both IonScript and AsmJSModule have destroy steps during which they can tell the other before anything is garbage.

This reminds me of the problems where object finalizers want to look at their shape but can't b/c they can have been finalized.  This all has to happen in a single GC finalization.  I'd like to write a test that tries to make this happen; Terrence: do you know of a way to force clobbering to happen during finalization (or, at the least, flip the bits that determine fun->hasScript).

Tentatively marking s-s, although this seems extraordinarily hard to trigger.
Group: javascript-core-security
Flags: needinfo?(luke) → needinfo?(terrence)
(In reply to Luke Wagner [:luke] from comment #10)
> I *think* I see a potential problem that would be really hard to manifest,
> but would be most likely to show up in a shutdown-GC like we have here:
> ~AsmJSModule (which is called during finalization) accesses the JSFunction
> it depends on, despite not having marked that JSFunction in trace() (b/c
> AsmJSModule::trace wasn't called; it's dead).  That means that the
> JSFunction could be garbage and thus could have been clobbered and thus
> could appear not to have an associated IonScript.  The fix seems simple:
> store the IonScript in ExitDatum and use that instead.  Both IonScript and
> AsmJSModule have destroy steps during which they can tell the other before
> anything is garbage.

Ah, I did not notice that the IonScript lookup was through the function here. Given that, I concur with your hypothesis 100%. 

> This reminds me of the problems where object finalizers want to look at
> their shape but can't b/c they can have been finalized.  This all has to
> happen in a single GC finalization.  I'd like to write a test that tries to
> make this happen; Terrence: do you know of a way to force clobbering to
> happen during finalization (or, at the least, flip the bits that determine
> fun->hasScript).

Thankfully, it looks like this can only happen during shutdown: normally JSScript is foreground finalized and JSFunction is background finalized, so we will get the correct ordering by default. However, during shutdown, we destruct both foreground and background things in one go in the order: objects, strings/symbols, scripts, jitcode, and (lastly) shapes. I guess this crash would happen during any shutdown where both the function and AsmJSModule were still live.
Flags: needinfo?(terrence)
Attached patch fix-detach-ionSplinter Review
This patch fixes the theoretical bug.
Assignee: nobody → luke
Status: NEW → ASSIGNED
Attachment #8481631 - Flags: review?(hv1989)
Flags: needinfo?(jcoppeard)
We don't need to use the more restrictive component-specific security groups unless a bug is sec-high or sec-critical
Group: javascript-core-security → core-security
(In reply to Terrence Cole [:terrence] from comment #11)
> However, during shutdown, we destruct both foreground and background things in one go

Is that true?  I thought we did this in the same order regardless.  If not, I think we should make the behaviour the same in both cases.
Comment on attachment 8481631 [details] [diff] [review]
fix-detach-ion

Review of attachment 8481631 [details] [diff] [review]:
-----------------------------------------------------------------

Good find!
Attachment #8481631 - Flags: review?(hv1989) → review+
Keywords: sec-high
https://hg.mozilla.org/mozilla-central/rev/7229fe7c6215
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Do we know how far back this goes?
Flags: needinfo?(dveditz)
Luke: security bugs need to request sec-approval? before landing, in part to answer bkerensa's question above. https://wiki.mozilla.org/Security/Bug_Approval_Process

Please retroactively request sec-approval? on the patch and fill in the template so we can figure out where we need to worry about taking this fix.
status-b2g-v1.4: --- → ?
status-b2g-v2.0: --- → ?
Flags: needinfo?(dveditz) → needinfo?(luke)
Comment on attachment 8481631 [details] [diff] [review]
fix-detach-ion

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Very very hard.  We only have one intermittent, unreproducible failure in the shell using special shell flags.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

Nope

Which older supported branches are affected by this flaw?

It's hard to know since this has never reproduced in a stock browser or even shell, but, if it was reproducible, it might be on all branches -- this goes back to bug 860838.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

No and 'risky'.  I think this is definitely something that should ride the trains.

How likely is this patch to cause regressions; how much testing does it need?

Slightly risky.  (But, in retrospect, it seems to have stuck.)
Attachment #8481631 - Flags: sec-approval?
Flags: needinfo?(luke)
Comment on attachment 8481631 [details] [diff] [review]
fix-detach-ion

Ok. We should not take this outside of trunk given the comments about risk.
Attachment #8481631 - Flags: sec-approval? → sec-approval+
Depends on: 1094570
Whiteboard: [adv-main35+]
Whiteboard: [adv-main35+] → [adv-main35+][b2g-adv-main2.2+]
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: