Closed
Bug 595034
Opened 14 years ago
Closed 14 years ago
nanojit: harden via random no-op instruction insertion
Categories
(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
flash10.1.x-Salt
People
(Reporter: rreitmai, Assigned: rreitmai)
References
Details
(Whiteboard: fixed-in-salt,fixed-in-nanojit,fixed-in-tracemonkey,fixed-in-tamarin)
Attachments
(3 files, 4 obsolete files)
10.71 KB,
patch
|
n.nethercote
:
review+
wmaddox
:
review+
|
Details | Diff | Splinter Review |
729 bytes,
patch
|
wmaddox
:
review+
|
Details | Diff | Splinter Review |
729 bytes,
patch
|
edwsmith
:
review+
|
Details | Diff | Splinter Review |
Insert no-op instructions when JIT'ing with the intention of making exploitable patterns within the generated code less predicable.
Assignee | ||
Updated•14 years ago
|
Attachment #473845 -
Attachment is patch: true
Attachment #473845 -
Attachment mime type: application/octet-stream → text/plain
Attachment #473845 -
Flags: review?(edwsmith)
Assignee | ||
Updated•14 years ago
|
Attachment #473845 -
Flags: review?(wmaddox)
Assignee | ||
Comment 1•14 years ago
|
||
> Should we limit these random points to places adjacent to constants > the attacker can control? or, is it important that they go randomly > between any two instructions. The paper below does an analysis of ret-based attacks and seems to hint that useful sequences are also available outside of constants. I have yet to confirm this as I haven't yet fully read the paper nor examined our generated code (as opposed to analyzing libc). The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86) http://cseweb.ucsd.edu/~hovav/papers/s07.html
Assignee: nobody → rreitmai
Status: NEW → ASSIGNED
Flags: flashplayer-qrb+
Priority: -- → P1
Target Milestone: --- → flash10.1.x-Salt
Assignee | ||
Comment 2•14 years ago
|
||
Just a note re frequency: the above patch inserts a nop every 64-128 LIR ops, not machine instructions.
Assignee | ||
Comment 3•14 years ago
|
||
Halted review until dependency on random number generator is complete, will open new bug regarding this piece.
Attachment #473845 -
Attachment is obsolete: true
Attachment #473845 -
Flags: review?(wmaddox)
Attachment #473845 -
Flags: review?(edwsmith)
Assignee | ||
Updated•14 years ago
|
Group: tamarin-security
Assignee | ||
Comment 4•14 years ago
|
||
Inserts random nops every 128-1151 bytes that are generated by the jit. The patch is based on Salt and so may not apply cleanly to the latest redux.
Attachment #474891 -
Attachment is obsolete: true
Attachment #475360 -
Flags: review?(edwsmith)
Assignee | ||
Updated•14 years ago
|
Attachment #475360 -
Attachment is patch: true
Attachment #475360 -
Attachment mime type: application/octet-stream → text/plain
Assignee | ||
Updated•14 years ago
|
Attachment #475360 -
Flags: feedback?(nnethercote)
Assignee | ||
Comment 5•14 years ago
|
||
Adding Nick for feedback as this is not yet ready to land in nanojit central without bug 596056.
Assignee | ||
Comment 6•14 years ago
|
||
Oh should also point out that nop insertion will occur once every 128-1151 Bytes of generated code. The rule of thumb that I've heard tossed about is that inserting one every 64 instructions has a negligible performance impact, I've yet to confirmed. But in any case, it would be good if anyone with experience on this topic, could provide feedback. Will run more benchmarks and post results here using the v3 patch above.
Comment 7•14 years ago
|
||
Comment on attachment 475360 [details] [diff] [review] v3 - relies on VMPI_random function patch in blocking bug I don't know anything about the attack and how effective this is at countering it, but if I assume that it is valid the implementation of it seems reasonable. So f+ from me. >+#ifdef NJ_HARDENING_RANDOM_NOP_INSERTION >+ VMPI_randomInit(&rgenerator); >+ >+ int32_t nopInsertTrigger; >+ VMPI_random(&rgenerator, &nopInsertTrigger, sizeof(nopInsertTrigger)); The size parameter seems weird. Seems like in practice you'll not need any random number bigger than 32-bits. If that's true, you could just return an int32_t and then mask it if you need a smaller value. >+ // number of bytes that any instruction can take >+ #define NJ_MAX_BYTES_OF_INSTRUCTION 10 AFAIK on x86 instructions can be at least 15 bytes. Googling around I found several people claim that instructions up to 17 bytes are conceivable but anything more than 15 bytes cause some kind of exception. Eg. http://coding.derkeiler.com/Archive/Assembler/alt.lang.asm/2006-12/msg00028.html and http://www.rhinocerus.net/forum/lang-asm-x86/256816-longest-instruction.html.
Attachment #475360 -
Flags: feedback?(nnethercote) → feedback+
Assignee | ||
Comment 8•14 years ago
|
||
(In reply to comment #7) > Comment on attachment 475360 [details] [diff] [review] > v3 - relies on VMPI_random function patch in blocking bug > > I don't know anything about the attack and how effective this is at countering > it The paper referenced above has some interesting details on this. > The size parameter seems weird. Seems like in practice you'll not need any > random number bigger than 32-bits. If that's true, you could just return an > int32_t and then mask it if you need a smaller value. True, Ed suggested wrapping the random number retrieval in a call. I'll post an updated patch shortly. > > > >+ // number of bytes that any instruction can take > >+ #define NJ_MAX_BYTES_OF_INSTRUCTION 10 > > AFAIK on x86 instructions can be at least 15 bytes. Googling around I found > several people claim that instructions up to 17 bytes are conceivable but > anything more than 15 bytes cause some kind of exception. Eg. > http://coding.derkeiler.com/Archive/Assembler/alt.lang.asm/2006-12/msg00028.html > and > http://www.rhinocerus.net/forum/lang-asm-x86/256816-longest-instruction.html. Hmm, this is interesting, thanks. The value is used as a cap for how many bytes can be removed per iteration of gen() , which now that I think about it is not quite correct. I'll update the constant name/value.
Assignee | ||
Comment 9•14 years ago
|
||
This patch introduces asm_insert_random_nop() which is empty for all but x86-32, and updates calls to use the getNoise() functions.
Attachment #475360 -
Attachment is obsolete: true
Attachment #476326 -
Flags: review?
Attachment #475360 -
Flags: review?(edwsmith)
Assignee | ||
Updated•14 years ago
|
Attachment #476326 -
Flags: review?(nnethercote)
Attachment #476326 -
Flags: review?(edwsmith)
Attachment #476326 -
Flags: review?
Comment 10•14 years ago
|
||
Comment on attachment 476326 [details] [diff] [review] v4 - update (In reply to comment #4) > Created attachment 475360 [details] [diff] [review] The main assembler loop in gen() definitely *is* in the critical path for jit time. So, the predicate that checks for noise insertion should be optimized for the common case (no noise), at least by checking the config flag first. Will the config flag be static const = false on non-x86-32 platforms? if not, we need ifdefs around it. You've got NJ_ESTIMATED_MAX_BYTES_GENERATED_PER_LIR_INSTRUCTION = 64, but elsewhere are hardcoding the values 128 and 1023. How do these three numbers interact? Should they all be defined constants? 128 and 1023 appear more than once in the source code, which justifies defining a constant for them. I didn't find any bugs, but I think between making constants, streamlining the predicate in gen(), and factoring out the definition of class Noise (which is in another R+ patch already), its worth re-reviewing; so R- just for that reason. I think its almost done.
Attachment #476326 -
Flags: review?(edwsmith) → review-
Comment 11•14 years ago
|
||
Every call to hasNoise() also checks && _config.harden_nop_... would be slightly cleaner to write a predicate for each one: bool hardenNopInsertion() { return _noise != NULL && _config.harden_nop_insertion; } bool hardenSomethingElse() { return _noise != NULL && _config.harden_something_else; } And so on. Forget what i said in the last comment about checking the config flag first; its a wash either way, probably.
Updated•14 years ago
|
Attachment #476326 -
Flags: review?(nnethercote)
Comment 12•14 years ago
|
||
Looks like this was checked into Salt in P4 CL 721542. What work remains?
Assignee | ||
Comment 13•14 years ago
|
||
Addressing Edwin's concerns and requesting Nick to re-review since there are quite a few changes. (1) hardenNopInsert() is const on all platforms but x86-32 (2) isolated noise range used for insertion; noiseForNopInsertion() (3) sanity of 'delta' is now determined exactly by seeing if priorIns is in the prior block, via codeList->isInBlock() call. Also this code disables the feature on all platforms for now. There will be separate patches for enabling it, once we can toggle it on/off on the command-line.
Attachment #476326 -
Attachment is obsolete: true
Attachment #483585 -
Flags: review?(nnethercote)
Assignee | ||
Updated•14 years ago
|
Attachment #483585 -
Flags: review?(edwsmith)
Assignee | ||
Comment 14•14 years ago
|
||
This code disables hardening features when generating the thunks; the thinking being that thunks are highly controlled code that is not open to user manipulation.
Attachment #483588 -
Flags: review?(edwsmith)
Assignee | ||
Updated•14 years ago
|
Attachment #483588 -
Attachment is patch: true
Attachment #483588 -
Attachment mime type: application/octet-stream → text/plain
Updated•14 years ago
|
Attachment #483588 -
Flags: review?(edwsmith) → review?(wmaddox)
Assignee | ||
Comment 15•14 years ago
|
||
Attachment #483675 -
Flags: review?(edwsmith)
Comment 16•14 years ago
|
||
Comment on attachment 483588 [details] [diff] [review] Codegen nop changes Could you set up 'cfg' just once in the CodegenLIR() constructor, rather than on each call to emitMD()?
Attachment #483588 -
Flags: review?(wmaddox) → review+
Updated•14 years ago
|
Attachment #483585 -
Flags: review?(nnethercote) → review+
Updated•14 years ago
|
Attachment #483585 -
Flags: review?(edwsmith) → review?(wmaddox)
Comment 17•14 years ago
|
||
Comment on attachment 483675 [details] [diff] [review] option to enable hardening Nothing wrong, but this isn't finished -- need to update usage() too. R+ with that fixed before landing.
Attachment #483675 -
Flags: review?(edwsmith) → review+
Comment 18•14 years ago
|
||
Comment on attachment 483585 [details] [diff] [review] v5 R+, but please consider the following issues and request re-review if you make significant changes. If the delta is invalid (priorIns and _nIns not in same block), you could just skip everything up to setting the new priorIns, instead of setting delta to 0. Is VMPI_getVMPageSize() guaranteed to be fast on all platforms and for all host applications? Perhaps we should grab the page size at initialization and hold it in the Assembler object. On non-x86 platforms, we use a different definition of hardenNopInsertion() that just returns false instead of querying the config structure. I presume that you did this to allow the hardenNopInsertion() and the code conditioned upon it to optmize away as dead code. NOP insertion makes sense on other platforms as well. Perhaps we want to leave this infrastructure more uniform on all platforms, anticipating asm_assert_random_nop() support. Features that are intrinsically X86-specific have been conditionally compiled out entirely, e.g., x86-only LIR opcodes.
Attachment #483585 -
Flags: review?(wmaddox) → review+
Assignee | ||
Comment 19•14 years ago
|
||
(In reply to comment #18) > Comment on attachment 483585 [details] [diff] [review] > v5 > > If the delta is invalid (priorIns and _nIns not in same block), you could just > skip everything up to setting the new priorIns, instead of setting delta to 0. > done > Is VMPI_getVMPageSize() guaranteed to be fast on all platforms and for all host > applications? Perhaps we should grab the page size at initialization and hold > it in the Assembler object. > Its sitting inside a debug-only assert, so I'm not too concerned. > On non-x86 platforms, we use a different definition of hardenNopInsertion() > that just returns false instead of querying the config structure. I presume > that you did this to allow the hardenNopInsertion() and the code conditioned > upon it to optmize away as dead code. Exactly. > Perhaps we want to leave this infrastructure more uniform > on all platforms, anticipating asm_assert_random_nop() support. Features that > are intrinsically X86-specific have been conditionally compiled out entirely, > e.g., x86-only LIR opcodes. I prefer the static approach, where it can be used, rather than ifdef'ing. We do have a slight gain in code size; re: empty asm_insert_random_nop() methods if the linker is not able to weed them out, but I'd argue that this cost is negligible compared against increased code maintainability.
Assignee | ||
Comment 20•14 years ago
|
||
rreitmai http://hg.mozilla.org/projects/nanojit-central/rev/d3116f2abd87
Whiteboard: fixed-in-salt → fixed-in-salt,fixed-in-nanojit
Comment 21•14 years ago
|
||
(In reply to comment #19) It looks like that last patch got a bit garbled. It contains the following code, effectively dead: + + if (codeList) { + codeList = codeList; + }
Assignee | ||
Comment 22•14 years ago
|
||
Hmm interesting...thanks for pointing out, I'll have a careful look and see what happened.
Comment 23•14 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/3be4ae3c2b98
Whiteboard: fixed-in-salt,fixed-in-nanojit → fixed-in-salt, fixed-in-nanojit, fixed-in-tracemonkey
Assignee | ||
Comment 24•14 years ago
|
||
rreitmai http://hg.mozilla.org/tamarin-redux/rev/a29d6b305d1f
Whiteboard: fixed-in-salt, fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-salt,fixed-in-nanojit,fixed-in-tracemonkey,fixed-in-tamarin
Assignee | ||
Comment 25•14 years ago
|
||
rreitmai http://hg.mozilla.org/tamarin-redux/rev/f75c9e567e10
Comment 26•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/3be4ae3c2b98
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•