Closed Bug 595034 Opened 10 years ago Closed 10 years ago

nanojit: harden via random no-op instruction insertion

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P1)

x86
All
defect

Tracking

(Not tracked)

RESOLVED FIXED
flash10.1.x-Salt

People

(Reporter: rreitmai, Assigned: rreitmai)

References

Details

(Whiteboard: fixed-in-salt,fixed-in-nanojit,fixed-in-tracemonkey,fixed-in-tamarin)

Attachments

(3 files, 4 obsolete files)

Attached patch patch applies to salt not redux. (obsolete) — Splinter Review
Insert no-op instructions when JIT'ing with the intention of making exploitable patterns within the generated code less predicable.
Attachment #473845 - Attachment is patch: true
Attachment #473845 - Attachment mime type: application/octet-stream → text/plain
Attachment #473845 - Flags: review?(edwsmith)
Attachment #473845 - Flags: review?(wmaddox)
Blocks: 593517
> Should we limit these random points to places adjacent to constants 
> the attacker can control?  or, is it important that they go randomly 
> between any two instructions.

The paper below does an analysis of ret-based attacks and seems to hint that useful sequences are also available outside of constants.  I have yet to confirm this as I haven't yet fully read the paper nor examined our generated code (as opposed to analyzing libc).

The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)
http://cseweb.ucsd.edu/~hovav/papers/s07.html
Assignee: nobody → rreitmai
Status: NEW → ASSIGNED
Flags: flashplayer-qrb+
Priority: -- → P1
Target Milestone: --- → flash10.1.x-Salt
Just a note re frequency: the above patch inserts a nop every 64-128 LIR ops, not machine instructions.
Attached patch v2 - cleanup (obsolete) — Splinter Review
Halted review until dependency on random number generator is complete, will open new bug regarding this piece.
Attachment #473845 - Attachment is obsolete: true
Attachment #473845 - Flags: review?(wmaddox)
Attachment #473845 - Flags: review?(edwsmith)
Depends on: 596056
Group: tamarin-security
Inserts random nops every 128-1151 bytes that are generated by the jit.

The patch is based on Salt and so may not apply cleanly to the latest redux.
Attachment #474891 - Attachment is obsolete: true
Attachment #475360 - Flags: review?(edwsmith)
Attachment #475360 - Attachment is patch: true
Attachment #475360 - Attachment mime type: application/octet-stream → text/plain
Attachment #475360 - Flags: feedback?(nnethercote)
Adding Nick for feedback as this is not yet ready to land in nanojit central without bug 596056.
Oh should also point out that nop insertion will occur once every 128-1151 Bytes of generated code. 

The rule of thumb that I've heard tossed about is that inserting one every 64 instructions has a negligible performance impact, I've yet to confirmed.

But in any case, it would be good if anyone with experience on this topic, could provide feedback.

Will run more benchmarks and post results here using the v3 patch above.
Comment on attachment 475360 [details] [diff] [review]
v3 - relies on VMPI_random function patch in blocking bug

I don't know anything about the attack and how effective this is at countering it, but if I assume that it is valid the implementation of it seems reasonable.  So f+ from me.

>+#ifdef NJ_HARDENING_RANDOM_NOP_INSERTION
>+        VMPI_randomInit(&rgenerator);
>+
>+        int32_t nopInsertTrigger;
>+        VMPI_random(&rgenerator, &nopInsertTrigger, sizeof(nopInsertTrigger));

The size parameter seems weird.  Seems like in practice you'll not need any random number bigger than 32-bits.  If that's true, you could just return an int32_t and then mask it if you need a smaller value.


>+    // number of bytes that any instruction can take
>+    #define NJ_MAX_BYTES_OF_INSTRUCTION     10

AFAIK on x86 instructions can be at least 15 bytes.  Googling around I found several people claim that instructions up to 17 bytes are conceivable but anything more than 15 bytes cause some kind of exception.  Eg. http://coding.derkeiler.com/Archive/Assembler/alt.lang.asm/2006-12/msg00028.html and http://www.rhinocerus.net/forum/lang-asm-x86/256816-longest-instruction.html.
Attachment #475360 - Flags: feedback?(nnethercote) → feedback+
(In reply to comment #7)
> Comment on attachment 475360 [details] [diff] [review]
> v3 - relies on VMPI_random function patch in blocking bug
> 
> I don't know anything about the attack and how effective this is at countering
> it

The paper referenced above has some interesting details on this.

> The size parameter seems weird.  Seems like in practice you'll not need any
> random number bigger than 32-bits.  If that's true, you could just return an
> int32_t and then mask it if you need a smaller value.

True, Ed suggested wrapping the random number retrieval in a call.  I'll post an updated patch shortly.

> 
> 
> >+    // number of bytes that any instruction can take
> >+    #define NJ_MAX_BYTES_OF_INSTRUCTION     10
> 
> AFAIK on x86 instructions can be at least 15 bytes.  Googling around I found
> several people claim that instructions up to 17 bytes are conceivable but
> anything more than 15 bytes cause some kind of exception.  Eg.
> http://coding.derkeiler.com/Archive/Assembler/alt.lang.asm/2006-12/msg00028.html
> and
> http://www.rhinocerus.net/forum/lang-asm-x86/256816-longest-instruction.html.

Hmm, this is interesting, thanks.   The value is used as a cap for how many bytes can be removed per iteration of gen() , which now that I think about it is not quite correct.  I'll update the constant name/value.
Attached patch v4 - update (obsolete) — Splinter Review
This patch introduces asm_insert_random_nop() which is empty for all but x86-32, and updates calls to use the getNoise() functions.
Attachment #475360 - Attachment is obsolete: true
Attachment #476326 - Flags: review?
Attachment #475360 - Flags: review?(edwsmith)
Attachment #476326 - Flags: review?(nnethercote)
Attachment #476326 - Flags: review?(edwsmith)
Attachment #476326 - Flags: review?
Comment on attachment 476326 [details] [diff] [review]
v4 - update

(In reply to comment #4)
> Created attachment 475360 [details] [diff] [review]

The main assembler loop in gen() definitely *is* in the critical path for jit time.   So, the predicate that checks for noise insertion should be optimized for the common case (no noise), at least by checking the config flag first.  

Will the config flag be static const = false on non-x86-32 platforms?  if not, we need ifdefs around it.

You've got NJ_ESTIMATED_MAX_BYTES_GENERATED_PER_LIR_INSTRUCTION = 64, but elsewhere are hardcoding the values 128 and 1023.  How do these three numbers interact?  Should they all be defined constants?  128 and 1023 appear more than once in the source code, which justifies defining a constant for them.

I didn't find any bugs, but I think between making constants, streamlining the predicate in gen(), and factoring out the definition of class Noise (which is in another R+ patch already), its worth re-reviewing; so R- just for that reason.  I think its almost done.
Attachment #476326 - Flags: review?(edwsmith) → review-
Every call to hasNoise() also checks && _config.harden_nop_... would be slightly cleaner to write a predicate for each one:

  bool hardenNopInsertion() { return _noise != NULL && _config.harden_nop_insertion; }
  bool hardenSomethingElse() { return _noise != NULL && _config.harden_something_else; }

And so on.  Forget what i said in the last comment about checking the config flag first; its a wash either way, probably.
Attachment #476326 - Flags: review?(nnethercote)
Looks like this was checked into Salt in P4 CL 721542.  What work remains?
Whiteboard: fixed-in-salt
Attached patch v5Splinter Review
Addressing Edwin's concerns and requesting Nick to re-review since there are quite a few changes.

(1) hardenNopInsert() is const on all platforms but x86-32
(2) isolated noise range used for insertion; noiseForNopInsertion()
(3) sanity of 'delta' is now determined exactly by seeing if priorIns is in the prior block, via codeList->isInBlock() call.

Also this code disables the feature on all platforms for now.  

There will be separate patches for enabling it, once we can toggle it on/off on the command-line.
Attachment #476326 - Attachment is obsolete: true
Attachment #483585 - Flags: review?(nnethercote)
Attachment #483585 - Flags: review?(edwsmith)
This code disables hardening features when generating the thunks; the thinking being that thunks are highly controlled code that is not open to user manipulation.
Attachment #483588 - Flags: review?(edwsmith)
Attachment #483588 - Attachment is patch: true
Attachment #483588 - Attachment mime type: application/octet-stream → text/plain
Attachment #483588 - Flags: review?(edwsmith) → review?(wmaddox)
Attachment #483675 - Flags: review?(edwsmith)
Comment on attachment 483588 [details] [diff] [review]
Codegen nop changes

Could you set up 'cfg' just once in the CodegenLIR() constructor, rather than on each call to emitMD()?
Attachment #483588 - Flags: review?(wmaddox) → review+
Attachment #483585 - Flags: review?(nnethercote) → review+
Attachment #483585 - Flags: review?(edwsmith) → review?(wmaddox)
Comment on attachment 483675 [details] [diff] [review]
option to enable hardening

Nothing wrong, but this isn't finished -- need to update usage() too.  R+ with that fixed before landing.
Attachment #483675 - Flags: review?(edwsmith) → review+
Comment on attachment 483585 [details] [diff] [review]
v5

R+, but please consider the following issues and request re-review if you
make significant changes.

If the delta is invalid (priorIns and _nIns not in same block), you could just skip everything up to setting the new priorIns, instead of setting delta to 0.

Is VMPI_getVMPageSize() guaranteed to be fast on all platforms and for all host applications?  Perhaps we should grab the page size at initialization and hold it in the Assembler object.

On non-x86 platforms, we use a different definition of hardenNopInsertion() that just returns false instead of querying the config structure.  I presume that you did this to allow the hardenNopInsertion() and the code conditioned upon it to optmize away as dead code.  NOP insertion makes sense on other platforms as well.  Perhaps we want to leave this infrastructure more uniform on all platforms, anticipating asm_assert_random_nop() support.  Features that are intrinsically X86-specific have been conditionally compiled out entirely,
e.g., x86-only LIR opcodes.
Attachment #483585 - Flags: review?(wmaddox) → review+
(In reply to comment #18)
> Comment on attachment 483585 [details] [diff] [review]
> v5
> 
> If the delta is invalid (priorIns and _nIns not in same block), you could just
> skip everything up to setting the new priorIns, instead of setting delta to 0.
> 

done

> Is VMPI_getVMPageSize() guaranteed to be fast on all platforms and for all host
> applications?  Perhaps we should grab the page size at initialization and hold
> it in the Assembler object.
> 

Its sitting inside a debug-only assert, so I'm not too concerned.

> On non-x86 platforms, we use a different definition of hardenNopInsertion()
> that just returns false instead of querying the config structure.  I presume
> that you did this to allow the hardenNopInsertion() and the code conditioned
> upon it to optmize away as dead code. 

Exactly.

>  Perhaps we want to leave this infrastructure more uniform
> on all platforms, anticipating asm_assert_random_nop() support.  Features that
> are intrinsically X86-specific have been conditionally compiled out entirely,
> e.g., x86-only LIR opcodes.

I prefer the static approach, where it can be used, rather than ifdef'ing.  We do have a slight gain in code size; re: empty asm_insert_random_nop() methods if the linker is not able to weed them out, but I'd argue that this cost is negligible compared against increased code maintainability.
rreitmai http://hg.mozilla.org/projects/nanojit-central/rev/d3116f2abd87
Whiteboard: fixed-in-salt → fixed-in-salt,fixed-in-nanojit
(In reply to comment #19)

It looks like that last patch got a bit garbled.  It contains the following code, effectively dead:

+
+                if (codeList) {
+                    codeList = codeList;
+                }
Hmm interesting...thanks for pointing out, I'll have a careful look and see what happened.
http://hg.mozilla.org/tracemonkey/rev/3be4ae3c2b98
Whiteboard: fixed-in-salt,fixed-in-nanojit → fixed-in-salt, fixed-in-nanojit, fixed-in-tracemonkey
rreitmai http://hg.mozilla.org/tamarin-redux/rev/a29d6b305d1f
Whiteboard: fixed-in-salt, fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-salt,fixed-in-nanojit,fixed-in-tracemonkey,fixed-in-tamarin
http://hg.mozilla.org/mozilla-central/rev/3be4ae3c2b98
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.