Bug 677272 (JITHardening)

JIT hardening

NEW
Unassigned

Status

()

Core
JavaScript Engine
6 years ago
a month ago

People

(Reporter: cdleary, Unassigned)

Tracking

(Depends on: 5 bugs, Blocks: 1 bug, {sec-want})

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(14 attachments, 9 obsolete attachments)

7.96 KB, patch
Details | Diff | Splinter Review
464.88 KB, image/svg+xml
Details
615.15 KB, image/svg+xml
Details
7.82 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
54.56 KB, image/svg+xml
Details
64.55 KB, image/svg+xml
Details
646 bytes, text/plain
Details
59.37 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
2.73 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
1.13 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
5.78 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
2.95 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
3.14 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
21.61 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
Chris Rolph and Yan Ivnitskiy from Matasano have been discussing JIT spray mitigations with us. Quoting Blazakis, 2010: "JIT spraying is the process of coercing the JIT engine to write many executable pages with embedded shellcode."

There are a number of patches and performance results drafted up that implement the following laundry-list of mitigation techniques, as enumerated in their black hat presentation ( http://www.matasano.com/research/jit/ ):

0. W^X permissions on code pages -- this entails changing the permissions on code pages for each IC repatch.
1. Guard pages -- this prevents heap overflows from adjacent rw- pages from crossing into -wx page (writable code) space. This overlaps with W^X and is also made much less feasible by mapping randomization (to follow).
2. Mapping randomization -- VirtualAlloc, unlike mmap, is not subject to the randomization of ASLR. Because the system call permits a requested "target address", we can randomize mappings ourselves with reasonable effectiveness.
3. Constant blinding -- by selecting a magic value to XOR immediate constants with, the attacker can no longer predict what immediate values will present themselves in the instruction stream.
4. Allocation restrictions -- by assuming that web-based workloads conform to a particular allocation pattern, it's possible to cull out allocation patterns for spray-oriented payloads.
5. NOP insertion -- because JIT spray relies on the uniform and predictable nature of the JIT instruction stream, inserting various forms of NOPs at random intervals can cause the attacker to lose control.
6. Random base offset -- JITted code can have a NOP-padding prologue of unknown size to make ROP gadget discovery and use more difficult in terms of page-relative offsets. This can also randomize offset at which an instruction stream starts (i.e. it can avoid being DWORD aligned in the first instruction), but (5) has overlapping benefits there.

This bug will elaborate on each patch and the observed performance impact of the techniques.
I gave some thoughts into that yesterday:
0. We would need to change the memory permissions twice for every patching in our IC's. We maybe should investigate separating code that needs patching and those that doesn't need it.
1. This should be free on the performance side (except maybe when stuff doesn't fit into the cache because of that and otherwise would). If all our code was not writable we wouldn't need that.
2. free except we might have some problems because the code is scattered around in memory.
3. This sounds like the biggest problem to me, when we need to store constants in tight code.
4. I think we already kinda do this by throwing away dead code in the GC, maybe we should get a bit more restrictive about this beforehand.
5 & 6. Memory increase

Overall the biggest problem will be memory overhead created by this.
Forgot to note that some of these changes will apply to all platforms, but the primary hardening target is Win32/x86.
>Forgot to note that some of these changes will apply to all platforms, but the >primary hardening target is Win32/x86.
From what i gathered most of this stuff is not harder to do on Linux.
Apologies to Chris Rohlf, I misspelled his name in the original post!
Created attachment 551656 [details] [diff] [review]
0. WIP: W^X

This is the patch that I drafted to implement W^X. Obviously, you incur additional overhead from the IC patches, where you're at least forcing the OS to manage page tables. Data to follow, with pretty plots.
Created attachment 551657 [details]
SunSpider protect overhead

16k reprotects in SunSpider, at a mean of ~.8us each, running us about 13.85ms. This plot has a pretty clear single mode, which is a nice contrast to the V8 plot to come.
Created attachment 551658 [details]
V8-V6 protect overhead

V8-V6 has a bimodal look to its 33k repatches. One peak is close to the mean at ~900ns, the other is about a stddev away at ~1.3us. Would be interesting to see what the properties are for those longer repatches; for example, is it a stub vs. inline patch distinction? Or perhaps repatches that cross page boundaries?
Comment on attachment 551656 [details] [diff] [review]
0. WIP: W^X

Pulling the numbers out from my notebook for ``js -m -a`` runs in the SunSpider harness:

Machine: Lenovo T510 Win7/x86 [i7 Arrandale]
SS:    ~5ms loss  = ~2.5% (w/ ~2ms variance)
V8-V6: ~30ms loss = ~2%   (w/~10ms variance)

Machine: Dell T3500 Linux/x86 [i7 Bloomfield]
SS:    ~8ms loss  = ~3.5% (w/ very little variance)
V8-V6: ~30ms loss = ~2.5% (w/ very little variance)

Previous SVG data was collected via |clock_gettime(CLOCK_MONOTINIC, ...)| timers on the Dell.
Created attachment 551698 [details] [diff] [review]
2. Randomize mappings on Windows

Per comments, we get 13 bits of randomness in the VirtualAlloc requested address on x86, and 21 bits of randomness on x64. As one would expect, I see no observable performance impact.
Attachment #551698 - Flags: review?(dmandelin)
Created attachment 551701 [details] [diff] [review]
3a. WIP: assembler hardening scaffolding

This is the "scaffolding" I use as a basis for testing the other hardening techniques. Hardening is (aggressively) disabled in paths with assumed fixed instruction sequences (like ICs and jump tables). It's important to err on the side of caution here to avoid difficult-to-reproduce crashes as a result of self-inflicted randomness.
Created attachment 551842 [details] [diff] [review]
3b. WIP: movi32r blinding

This patch only blinds moves of imm32s into a register as a control experiment to avoid register pressure. On the Bloomfield box I see less than 0.5ms slowdown on SunSpider, but a 2-3% regression (~30ms) on V8-V6.
FWIW, I think V8 only blinds "large" constants. Noticed this a few months ago when comparing generated code and wondering why it was xor'ing this large integer in the loop test... Not saying that we have to follow them but it may help reduce the perf loss because small constants are probably much more common.
Comment on attachment 551698 [details] [diff] [review]
2. Randomize mappings on Windows

Review of attachment 551698 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. r+ with the 3 requested changes.

::: js/src/assembler/jit/ExecutableAllocatorWin.cpp
@@ +41,5 @@
>  
> +void *ExecutableAllocator::computeRandomAllocationAddress()
> +{
> +    /*
> +     * Inspration is V8's OS::Allocate in platform-win32.cc.

s/Inspration/Inspiration

@@ +51,5 @@
> +     * tries to avoid system default DLL mapping space. In the end, we get 13
> +     * bits of randomness in our selection. 
> +     * x64: [2GiB, 16TiB), with 21 bits of randomness.
> +     */
> +    static const uintN pageBits = 16;

s/pageBits/chunkBits

@@ +54,5 @@
> +     */
> +    static const uintN pageBits = 16;
> +#if WTF_CPU_X86_64
> +    static const uintptr_t base = 0x0000000080000000;
> +    static const uintptr_t mask = 0x000003ffffff0000;

I think |mask| is (2 ** 42 - 1), which goes with 4TiB, so either mask or the comment should change.
Attachment #551698 - Flags: review?(dmandelin) → review+
Created attachment 552546 [details]
Win32 SunSpider protection overhead

The discreteness of the values here is somewhat odd -- maybe contributed by use of RDTSC instead of clock_gettime, but the numbers sanity check.
Created attachment 552547 [details]
Win32 V8-V6 protection overhead
Created attachment 552548 [details]
Plotting script
Created attachment 552550 [details] [diff] [review]
5. WIP: NOP insertion

First of the patches that hadn't been posted yet. "Guard pages" (1) and "alloc restriction" (4) patches are not in a workable state and so will only be posted as followup bugs, if at all.

This patch regresses V8 by less than 1%, and actually gave a small-but-consistent-despite-the-randomness speedup to sunspider on my box. Maybe because of jump target alignment? Could be noise from elsewhere.
Created attachment 552552 [details] [diff] [review]
6. WIP: Random base offset for assemblies

Randomize starting location (unknown DWORD alignment) for JIT-assembled code.
Current WIP is mixing these together into a cocktail that doesn't significantly regress performance, along with enabled NanoJIT hardening flags. Can't get the JaegerMonkey sloshed.
Created attachment 555186 [details] [diff] [review]
3a. assembler hardening scaffolding.
Attachment #551701 - Attachment is obsolete: true
Attachment #555186 - Flags: review?(dmandelin)
Created attachment 555187 [details] [diff] [review]
5. NOP insertion.
Attachment #552550 - Attachment is obsolete: true
Attachment #555187 - Flags: review?
Created attachment 555188 [details] [diff] [review]
6. Random base offset for assemblies
Attachment #552552 - Attachment is obsolete: true
Attachment #555188 - Flags: review?(dmandelin)
Created attachment 555189 [details] [diff] [review]
3b. Blinding.
Attachment #551842 - Attachment is obsolete: true
Attachment #555189 - Flags: review?(dmandelin)
Created attachment 555190 [details] [diff] [review]
Swap in real rand.
Attachment #555190 - Flags: review?(dmandelin)
Created attachment 555191 [details] [diff] [review]
NanoJIT hardening.
Attachment #555191 - Flags: review?(dmandelin)
Attachment #555187 - Flags: review? → review?(dmandelin)
Comment on attachment 555186 [details] [diff] [review]
3a. assembler hardening scaffolding.

Review of attachment 555186 [details] [diff] [review]:
-----------------------------------------------------------------

Looks OK. I do have a concern that this way of controlling when the hardening transformations are applied could be somewhat error-prone. In particular, I imagine the user has to be pretty careful about when hardening is and is not applied in a fairly "manual" way (the AutoHarden class helps here, though). (An example of this would be the register pinning in JM2: that was manual and proved hard to get right.) When we redo this for IonMonkey, think about how to make it safer: perhaps containing these things more, or maybe using assertions to check which hardening is done where. But I think it's OK as-is for JM2.
Attachment #555186 - Flags: review?(dmandelin) → review+
Comment on attachment 555187 [details] [diff] [review]
5. NOP insertion.

Review of attachment 555187 [details] [diff] [review]:
-----------------------------------------------------------------

Basically good but I have some significant nitpicky stuff.

::: js/src/assembler/assembler/X86Assembler.h
@@ +381,5 @@
>  
> +    static const uintN NOP_INST_COUNTDOWN_MIN = 128;
> +    static const uintN NOP_INST_COUNTDOWN_RAND_SPAN = 1023;
> +
> +    int fakeRand() const

A function with 'fake' in the name needs an explanatory comment.

@@ +390,5 @@
> +    void resetNopCountdown()
> +    {
> +        int randVal = fakeRand();
> +        countdown = NOP_INST_COUNTDOWN_MIN + (randVal & NOP_INST_COUNTDOWN_RAND_SPAN);
> +        nopSize = (randVal >> 16) & 0x7;

Why is nopSize generated here instead of at the point where we emit the nops?

@@ +400,5 @@
>              return;
> +
> +        countdown -= 1;
> +        if (countdown >= 0)
> +            return;

if (!hardening || --countdown)
    return

would be more like house style.

@@ +2611,5 @@
>      }
>  
>      bool hardening;
> +    int countdown;
> +    size_t nopSize;

Please add comments explaining what these values are for and what the encoding is.
Attachment #555187 - Flags: review?(dmandelin) → review-
Comment on attachment 555188 [details] [diff] [review]
6. Random base offset for assemblies

Review of attachment 555188 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with requested comment added.

::: js/src/assembler/assembler/ARMAssembler.h
@@ +311,5 @@
>              int m_offset : 31;
>              bool m_used : 1;
>          };
>  
> +        void maybeNop() {}

Please add a comment explaining why this has null implementation.
Attachment #555188 - Flags: review?(dmandelin) → review+
Comment on attachment 555189 [details] [diff] [review]
3b. Blinding.

Review of attachment 555189 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but I want a little further discussion on the API before taking this.

::: js/src/assembler/assembler/X86Assembler.h
@@ +1529,5 @@
>          m_formatter.oneByteOp(OP_MOV_GvEv, dst, base, index, scale, offset);
>          postInst();
>      }
>  
> +    void movl_i32r(int imm, RegisterID dst, bool maybeHarden = true)

This seems like it might be a slightly confusing API, although it's clear enough with the context I have (knowing that you're hardening our JITs :-) ).

By default, I'd prefer having a separate method for the hardened case, maybe movl_i32r_blind, or something like that. But maybe this is better here--can you explain why you did it?

As is, I think it would be better to have the parameter be |bool maybeBlind|, and a comment explaining it is blinded only if hardening is on.
Attachment #555189 - Flags: review?(dmandelin)
(In reply to Jan de Mooij from comment #12)
> FWIW, I think V8 only blinds "large" constants. Noticed this a few months
> ago when comparing generated code and wondering why it was xor'ing this
> large integer in the loop test... Not saying that we have to follow them but
> it may help reduce the perf loss because small constants are probably much
> more common.

I missed this comment because I was just going down the list hitting reviews. But this sounds like a great thing to try--we should avoid regressions if we can, and the small constants be a far harder target for attack.
v8 only blinds constants in its untyped compiler, as far as I can tell
Comment on attachment 555191 [details] [diff] [review]
NanoJIT hardening.

Review of attachment 555191 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jstracer.cpp
@@ +2839,5 @@
> +            int candidate = rand();
> +            if (candidate < limit)
> +                return candidate;
> +        }
> +    }

What values is this called with? It seems to me that if limit is small, it could take a long time. Could you do a simple analysis of the EV of number of rounds and probability it will be big? If there's a min limit, this should assert harder on limit; e.g. limit == 1 is probably too small.
Attachment #555191 - Flags: review?(dmandelin)
Attachment #555190 - Flags: review?(dmandelin) → review+
dveditz brought up that it would also be useful to have a way of preffing off JIT hardening for maximizing reproducibility -- shouldn't be difficult, I'll make a patch today on top of this stack.
Created attachment 556700 [details] [diff] [review]
5. NOP insertion.

Addresses review points. Note that fakeRand is replaced later in the stack with the real rand function.
Attachment #555187 - Attachment is obsolete: true
Attachment #556700 - Flags: review?(dmandelin)
Created attachment 556702 [details] [diff] [review]
NanoJIT hardening.

Future-proofs bad usage with an assertion.
Attachment #555191 - Attachment is obsolete: true
Attachment #556702 - Flags: review?(dmandelin)
Created attachment 556707 [details] [diff] [review]
3b. Blinding.

You're right, the recursion/default-param was gross. This simply breaks it up.
Attachment #555189 - Attachment is obsolete: true
Attachment #556707 - Flags: review?(dmandelin)
Created attachment 556708 [details] [diff] [review]
3b. Blinding.

qref
Attachment #556707 - Attachment is obsolete: true
Attachment #556707 - Flags: review?(dmandelin)
Attachment #556708 - Flags: review?(dmandelin)
Created attachment 556715 [details] [diff] [review]
Add JSOPTION_SOFTEN.

Alloc behavior and RNG seed is per-allocator so that it can be set differently on independent threads.
Attachment #556715 - Flags: review?(dmandelin)
Attachment #556700 - Flags: review?(dmandelin) → review+
Attachment #556708 - Flags: review?(dmandelin) → review+
Comment on attachment 556702 [details] [diff] [review]
NanoJIT hardening.

Review of attachment 556702 [details] [diff] [review]:
-----------------------------------------------------------------

I just hope the future-proofing isn't too strict. But it will be easy enough to find that out.
Attachment #556702 - Flags: review?(dmandelin) → review+
Talked with Brian about coordinating to give TI (bug 608741) time to bake on trunk before intentionally introducing nondeterminism. I'm going to land this pref'd out once the JSOPTION_SOFTEN patch gets r+. When Brian feels comfortable he'll give me the word to flip the pref on.
Created attachment 557017 [details] [diff] [review]
Add JSOPTION_SOFTEN, disabled by default.

Disabled by default in libpref.
Attachment #556715 - Attachment is obsolete: true
Attachment #556715 - Flags: review?(dmandelin)
Comment on attachment 557017 [details] [diff] [review]
Add JSOPTION_SOFTEN, disabled by default.

Whoops, dropped the r?
Attachment #557017 - Flags: review?(dmandelin)
Attachment #557017 - Flags: review?(dmandelin) → review+
Rebasing over type inference now in order to land.
Depends on: 700822

Updated

6 years ago
Blocks: 504250
Articles like this are the price I pay for not landing:

http://arstechnica.com/business/news/2011/12/chrome-sandboxing-makes-it-the-most-secure-browser-vendor-study-claims.ars

Going to ask IT for a dedicated WinXP box on Monday to work through the WinXP-specific issues with bug 700822 in order to unblock everything else. Landing these will be my top priority after bug 686927 lands on IonMonkey (est: ~1.5 weeks).

Comment 45

6 years ago
I'd like to draw attention to this paper[0] by a group of European researchers on specialized code generation techniques designed to defeat ROP attacks by eliminating usable gadgets from compiled code. There may be some ideas here you can use for further JIT hardening.

[0]
http://iseclab.org/papers/gfree.pdf
Blocks: 729824

Comment 46

6 years ago
Hi Chris, did this land?
I've heard this landed. Does anyone know if we can close this bug?
(Re-poke?) cdleary - are we done here?
(In reply to Johnathan Nightingale [:johnath] from comment #48)
> (Re-poke?) cdleary - are we done here?

http://mxr.mozilla.org/mozilla-central/source/modules/libpref/src/init/all.js#669 would seem to suggest yes?

Comment 50

5 years ago
Hi there -- to follow up from Johnath's question -- are we done with this work?
(In reply to John Jensen from comment #50)
> Hi there -- to follow up from Johnath's question -- are we done with this
> work?

I think there are potentially more things to turn on (e.g., nop insertion), but instead of doing that, I'm inclined to wait for IonMonkey, and reformulate a JIT hardening strategy to work with IM.
(In reply to David Mandelin from comment #51)
> I think there are potentially more things to turn on (e.g., nop insertion),
> but instead of doing that, I'm inclined to wait for IonMonkey, and
> reformulate a JIT hardening strategy to work with IM.

Ballpark timeline? I get the sense from your reply that you don't think the value outweighs the cost for putting more hardening in now, but we know it's a place that security writers, if not researchers, are sniffing around so I want to make sure we don't write off easy wins now against a better version in ionmonkey that won't be available for a while.
(In reply to Johnathan Nightingale [:johnath] from comment #52)
> (In reply to David Mandelin from comment #51)
> > I think there are potentially more things to turn on (e.g., nop insertion),
> > but instead of doing that, I'm inclined to wait for IonMonkey, and
> > reformulate a JIT hardening strategy to work with IM.
> 
> Ballpark timeline? I get the sense from your reply that you don't think the
> value outweighs the cost for putting more hardening in now, but we know it's
> a place that security writers, if not researchers, are sniffing around so I
> want to make sure we don't write off easy wins now against a better version
> in ionmonkey that won't be available for a while.

Ballpark landing timeline is a couple of months.

I'm also disinclined to work too hard on this right now because it's really not obvious how valuable various hardening measures are for actual security--it seems like the marketing value is most of it at this point.

We could at least try turning on nop insertion, as long as the code is in fact all there and in shape. I think the main concern was about what that would do to topcrash debuggability, but I don't think we are actively debugging jitcode we get back anyway. So we just need to check if it works and perf is OK. I won't assign it a high importance, but I did put it on my short-term to-do list.
Makes sense, thanks.

Updated

5 years ago
Keywords: sec-want

Comment 55

5 years ago
Windows 8 apparently applies ASLR on VirtualAlloc.
Mass-reassigning cdleary's bugs to default. He won't work on any of them, anymore. I guess, at least.

@cdleary: shout if you take issue with this.
Assignee: cdleary → general
Status: ASSIGNED → NEW
David, do you know what the next steps are for this bug?
Flags: needinfo?(dvander)
This bug has bitrotted too much to be useful by now, especially with the upcoming removal of JM, it's not worth rebasing. As Dave said in comment #53, the value here is mostly marketing. No optimizing JIT that I know of takes the performance hit required to implement these hardening features.

V8 does some kind of limited constant blinding in their baseline JIT only. We could always consider these things for our baseline JIT, for marketing purposes. The only thing that will actually make the JIT secure is a content sandbox.
Flags: needinfo?(dvander)
Blocks: 862082
(Assignee)

Updated

3 years ago
Assignee: general → nobody
No longer blocks: 729824
Depends on: 729824

Updated

4 months ago
Depends on: 1361159
No longer blocks: 862082
Depends on: 1333005

Updated

2 months ago
Depends on: 1376819
Depends on: 1348341
You need to log in before you can comment on or make changes to this bug.