Closed Bug 860254 Opened 11 years ago Closed 10 years ago

Poison memory on free for all small allocations

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: benjamin, Assigned: away)

References

Details

(Keywords: helpwanted, sec-want, Whiteboard: [mentor=glandium][lang=c++][good second bug])

Attachments

(3 files, 4 obsolete files)

As a way of making crashes more predictable/stable and hopefully reducing use-after-free attack surface, I would like to poison memory from small allocations on free (anything smaller than 512 bytes, say?)

Ideally, would like the poison address to be in a non-addressable block of memory. On Windows, we would try to MEM_RESERVE the 32k page block at 0xFAFAFAFA or another poison block.

I have some implementation questions for jlebar/vladan/glandium:

* should we do this in jemalloc or is replace-malloc ready to do this on all platforms (I've been waiting on replace-malloc for bug 811483, so if it's not available soon, I want to implement that in jemalloc also)

* Will it be sufficient to test the performance of this by landing it to inbound/central and measuring perf?
* Are there existing telemetry metrics which can usefully measure the impact of this in the field?
* Should we make this pref-controllable and do we even need to consider doing a field A/B test on the perf impact?

Anyone want to take this or volunteer to mentor it? It sounds like it could be a good student project.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #0)
> As a way of making crashes more predictable/stable and hopefully reducing
> use-after-free attack surface, I would like to poison memory from small
> allocations on free (anything smaller than 512 bytes, say?)

I guess the right value would be that of the larger class we have in the tree.

> Ideally, would like the poison address to be in a non-addressable block of
> memory. On Windows, we would try to MEM_RESERVE the 32k page block at
> 0xFAFAFAFA or another poison block.

We already have a poison area in layout/base/nsPresArena.cpp. We should probably globalize it.

> I have some implementation questions for jlebar/vladan/glandium:
> 
> * should we do this in jemalloc or is replace-malloc ready to do this on all
> platforms (I've been waiting on replace-malloc for bug 811483, so if it's
> not available soon, I want to implement that in jemalloc also)

replace-malloc is not ready to be used for always-on features. That being said, both mozjemalloc and jemalloc3 have a feature to fill freed allocations with 0x5a (search for opt_junk under memory/). That could be extended to allow configurable values and maximum length.

> Anyone want to take this or volunteer to mentor it? It sounds like it could
> be a good student project.

I can certainly mentor.
> I guess the right value would be that of the larger class we have in the tree.

I'm not sure exactly what you mean, but if we pick a size-class as listed at the top of memory/mozjemalloc/jemalloc.c, I don't think we can go wrong...

> * Should we make this pref-controllable and do we even need to consider doing a field A/B test on 
> the perf impact?

I'd love to, but telemetry is in such a sorry state, this is probably a lot of effort for a tiny reward.

Maybe it's obvious, but to say it: Using an XPCOM pref to control free is a bit tricky.  Changing the poison pref midway through execution isn't a problem, but we probably call free a bunch of times before we can read prefs, and it's not clear what the default value should be.
(In reply to Justin Lebar [:jlebar] from comment #2)
> > I guess the right value would be that of the larger class we have in the tree.
> 
> I'm not sure exactly what you mean, but if we pick a size-class as listed at
> the top of memory/mozjemalloc/jemalloc.c, I don't think we can go wrong...

I was thinking C++ class size.

> Maybe it's obvious, but to say it: Using an XPCOM pref to control free is a
> bit tricky.  Changing the poison pref midway through execution isn't a
> problem, but we probably call free a bunch of times before we can read
> prefs, and it's not clear what the default value should be.

We could do A/B testing based on rand() or something similar.
Whiteboard: [mentor=glandium][lang=c++][good second bug]
Keywords: sec-want
(In reply to Mike Hommey [:glandium] from comment #1)
> > Ideally, would like the poison address to be in a non-addressable block of
> > memory. On Windows, we would try to MEM_RESERVE the 32k page block at
> > 0xFAFAFAFA or another poison block.
> 
> We already have a poison area in layout/base/nsPresArena.cpp. We should
> probably globalize it.

Bug 867530 is moving this into MFBT.
Depends on: 867530
Hi I am interested in working on this bug,but it's my first time to work on with debug,can anybody guide me on how to get started with it?Thanks a lot.
Assignee: nobody → dmajor
I've started looking into the code, and I have some questions:

Should the poisoning use the same 0x5a value from opt_junk, or should it somehow try to tie into the mozPoisonValue() that's now in MFBT?

Should this have its own "opt_" infrastructure with all the ifdefs and variables, or just be baked in directly, like the MOZ_TEMP_INVESTIGATION poisoning? (By the way, is the MOZ_TEMP_INVESTIGATION still needed?)

Perhaps the size cutoff could be determined empirically? Start out poisoning everything and dial it back until perf runs are acceptable?
Should have set needinfo for comment 6.
Flags: needinfo?(mh+mozilla)
(In reply to David Major [:dmajor] from comment #6)
> I've started looking into the code, and I have some questions:
> 
> Should the poisoning use the same 0x5a value from opt_junk, or should it
> somehow try to tie into the mozPoisonValue() that's now in MFBT?

As much as this would be nice, we should also keep jemalloc self-contained. Also, mozPoisonValueInit is *very* likely to not have been run when jemalloc initializes.

> Should this have its own "opt_" infrastructure with all the ifdefs and
> variables, or just be baked in directly, like the MOZ_TEMP_INVESTIGATION
> poisoning?

For mozjemalloc, it's reasonable to bake in directly. For jemalloc3, it would probably be better to make it have its own "opt_" infrastructure, but it would be even better to ask on the upstream mailing list. Jemalloc3 could be dealt with in a followup, though.

> (By the way, is the MOZ_TEMP_INVESTIGATION still needed?)

I guess not.
 
> Perhaps the size cutoff could be determined empirically? Start out poisoning
> everything and dial it back until perf runs are acceptable?

Depending on the size that ends up being, it could very well be too small.
Flags: needinfo?(mh+mozilla)
Attached patch First draft (obsolete) — Splinter Review
Here is a first draft. (It took a while because I've put this bug on a background thread)

I've pretty much made the code act as if opt_junk were defined and set to true for all free-like functions. The remaining instances of opt_junk are only in the malloc-like functions. 

This is a perf-worst-case starting point, with no size considerations. We can dial it back from here as necessary. 

Glandium: What do you think of this patch, and what does it take to get from here to a real code change? Do you have particular perf tests in mind?
Attachment #812891 - Flags: feedback?(mh+mozilla)
Comment on attachment 812891 [details] [diff] [review]
First draft

Review of attachment 812891 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozjemalloc/Makefile.in
@@ -22,5 @@
>  LOCAL_INCLUDES += -I$(topsrcdir)/memory/build
>  
>  # For non release/esr builds, enable (some) fatal jemalloc assertions.  This
> -# helps us catch memory errors.  See bug 764192 for details on what
> -# MOZ_TEMP_INVESTIGATION is for.

Can you remove that in a separate bug?

::: memory/mozjemalloc/jemalloc.c
@@ +4576,5 @@
>  arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr)
>  {
> +	size_t pageind;
> +	size_t size;
> +  

Beware of trailing whitespaces. If you haven't done so already, try running mach mercurial-setup, which should setup things such that added trailing whitespaces are apparent in hg diff output.

@@ +4583,5 @@
>  
> +	pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow;
> +	size = chunk->map[pageind].bits & ~pagesize_mask;
> +
> +	memset(ptr, 0x5a, size);

Large allocations are > 2k and < ~4MB. There's potential for serious performance impact here. There should probably be a (configurable?) cap.

Note that, on an implementation level, you don't really need to remove opt_junk and MALLOC_FILL. Just adjust their default at the beginning of the file. If there's concern about the if (opt_junk) branches, opt_junk could be turned into a static const, at which point the compiler would optimize the branches out.

Please also discuss adding something similar to current jemalloc on the jemalloc-discuss list.
Attachment #812891 - Flags: feedback?(mh+mozilla)
Depends on: 931196
need help .how to set up this project?
Comment on attachment 812891 [details] [diff] [review]
First draft

>+	memset(ptr, 0x5a, size);

IIRC, jemalloc allocations are word aligned and at least the size of a word
so mozWritePoison() in mfbt/Poison.h seems like a better alternative.

The poison value is described in:
http://mxr.mozilla.org/mozilla-central/source/mfbt/Poison.cpp
http://mxr.mozilla.org/mozilla-central/source/mfbt/tests/TestPoisonArea.cpp
(In reply to Mats Palmgren (:mats) from comment #12)
> Comment on attachment 812891 [details] [diff] [review]
> First draft
> 
> >+	memset(ptr, 0x5a, size);
> 
> IIRC, jemalloc allocations are word aligned and at least the size of a word
> so mozWritePoison() in mfbt/Poison.h seems like a better alternative.

While this is something we could do for mozjemalloc, it's not really something I'd want to do with jemalloc3, and jemalloc3 is the future (okay, it's been the future for way too long, but still).
The junk/poison code is still disabled (in MALLOC_PRODUCTION) via a static const bool set to false. I spot-checked an opt build to verify that this doesn't change the generated code.
Attachment #812891 - Attachment is obsolete: true
Code still disabled. Spot-checked an opt build.
Comment on attachment 8360511 [details] [diff] [review]
Part 2: Split opt_junk into opt_junk (allocate) and opt_poison (deallocate)

Review of attachment 8360511 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozjemalloc/jemalloc.c
@@ +5460,5 @@
>  		    "\n", "");
>  		_malloc_message("Boolean MALLOC_OPTIONS: ",
>  		    opt_abort ? "A" : "a", "", "");
>  #ifdef MALLOC_FILL
> +		_malloc_message(opt_poison ? "C" : "c", "", "", "");

The letter P was taken. Think C for "clobber". Also "cheesy" and "could be replaced with something better if you have suggestions".
Still disabled.

arena_dalloc_large had some strange ifdef control flow. I *think* this version is still equivalent.
Feedback is welcome on this new round of patches, but they're not ready to go yet. Still need to do perf tests to figure out the size limit (and whether we can get away with doing this at all).
I ran some perf tests with poisoning enabled for all sizes, and the world didn't blow up. 

http://compare-talos.mattn.ca/?oldRevs=99c49a35bb4e&newRev=69b9243022b0&server=graphs.mozilla.org&submit=true

I asked mbrubeck to look at the numbers and he didn't find them alarming. The only one that really sticks out to me is dromaeo-css-yui. I was able to reproduce the loss with the try builds, but it disappeared when I ran under a profiler. I've seen that particular test swing wildly for unrelated changes before, so I think we'll need the post-checkin average-of-12 numbers to get the full picture.

If SIZE_T_MAX is too scary then we can dial it down, but otherwise I'm ready to move forward with this change.
Attachment #8360510 - Flags: review?(mh+mozilla)
Attachment #8360511 - Flags: review?(mh+mozilla)
Attachment #8360514 - Flags: review?(mh+mozilla)
Attachment #8360515 - Flags: review?(mh+mozilla)
Attachment #8360510 - Flags: review?(mh+mozilla) → review+
Attachment #8360511 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8360514 [details] [diff] [review]
Part 3: Add size condition for poisoning

Review of attachment 8360514 [details] [diff] [review]:
-----------------------------------------------------------------

This looks correct, but I don't see a point in landing this. A size_t is always going to be <= SIZE_T_MAX, so in practice, should_poison is equivalent to opt_poison. I'd suggest keeping this for a followup if we end up needing to cut the limit down at which point should_poison would have a meaning. Until then, we can keep the code as it is with part 1, 2 and 4.
Attachment #8360514 - Flags: review?(mh+mozilla) → feedback+
Attachment #8360515 - Flags: review?(mh+mozilla) → review+
This is timing out OSX jsreftests on Try, where a certain test has a 1GB memory block that takes several minutes to memset (it only takes 3 seconds on the linux machines, and on my own Mac Mini).
Depends on: 969692
So close... Is comment 22 a (temporary) show stopper? Is anyone digging into that issue?
I would expect that when we free a 1GB memory block we would immediately unmap the memory region. Can we know in the allocator whether that's the case and avoid poisoning just those blocks?

Is the 1G memory block an essential part of the test (and which test is it)? Can we just fix that test?

I support landing this with an upper bound of 10MB if that would help get this landed!
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #24)
> I would expect that when we free a 1GB memory block we would immediately
> unmap the memory region. Can we know in the allocator whether that's the
> case and avoid poisoning just those blocks?

This is coming from huge_dalloc which does immediately unmap afterwards, so yes we could just skip poisoning there. That would cover all sizes 1MB+. Maybe there's still value in poisoning though, in case someone remaps later? (Or do all OSes guarantee zeroing?)

> Is the 1G memory block an essential part of the test (and which test is it)?
> Can we just fix that test?

It is regress-617935.js which does require huge allocations.

> I support landing this with an upper bound of 10MB if that would help get
> this landed!

I can pass Try with a bound of 0x1FFFFFFF (just shy of 512MB), if that's what it comes down to...
(In reply to David Major [:dmajor] from comment #25)
> (Or do all OSes guarantee zeroing?)

Yes they do.

Although Linux has this:
       MAP_UNINITIALIZED (since Linux 2.6.33)
              Don't clear anonymous pages.  This flag is intended to improve performance on embedded devices.  This flag  is  hon‐
              ored  only  if  the  kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED option.  Because of the security
              implications, that option is normally enabled only on embedded devices (i.e., devices where one has complete control
              of the contents of user memory).

But that's explicit on mmap, and requires a specific kernel configuration (not enabled by default)
Yes, there is no reason to poison before unmap. Accesses to that region should just fault cleanly while it's unmapped. It could be mapped later for something else, so theoretically we could re-poison on re-map, but let's not do that yet...
Attachment #8360514 - Attachment is obsolete: true
Attachment #8374600 - Flags: review?(mh+mozilla)
Comment on attachment 8374600 [details] [diff] [review]
Part 3v2: No need to poison immediately before unmap

Review of attachment 8374600 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozjemalloc/jemalloc.c
@@ -5269,5 @@
>  	malloc_mutex_unlock(&huge_mtx);
>  
>  	/* Unmap chunk. */
> -#ifdef MALLOC_FILL
> -	if (opt_poison)

You should just leave the original code, here, doing the memset on opt_junk (sure, it's useless, but there's no need to diverge here since it's not enabled).
Attachment #8374600 - Flags: review?(mh+mozilla) → review-
Removed Part 3v2. Reverted change to huge_dalloc in Part 2.
Attachment #8360511 - Attachment is obsolete: true
Attachment #8374600 - Attachment is obsolete: true
Attachment #8374853 - Flags: review?(mh+mozilla)
Attachment #8374853 - Flags: review?(mh+mozilla) → review+
Does anything prevent 0x5a5a5a5a from being mapped/allocated?

I understand the reluctance to make jemalloc call code in mfbt. Would it make sense to *move* the code for reserving the poison address into jemalloc? Or set up some kind of callback?
Depends on: 977955
Sorry to spam this bug report, but I had a problem with nouveau and plugin container going crazy. Reverting all patches from this bug fixed my bug.

Cf bug #978439
Depends on: 1045958
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: