Open Bug 1446040 Opened 7 years ago Updated 3 days ago

[meta] Add guard pages before and after arenas, chunks, and runs

Categories

(Core :: Memory Allocator, task, P3)


Tracking


Tracking Status
firefox61 --- affected

People

(Reporter: tjr, Assigned: gcp)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(4 keywords)

Attachments

(6 obsolete files)

Without guard pages, linear buffer overflows can run between different structures and corrupt other objects. Guard pages cause an access violation during the overflow (assuming the overflow is large enough to reach the guard page). The Windows 10 heap places trailing guard pages after segments (which hold allocated blocks) and after large allocations (greater than 1 MB) that come directly from the backend allocator; guard pages are only added to segments once they grow to their maximum bounded size. PartitionAlloc places a leading and a trailing guard page on every superpage (a linear sequence of pages) that services allocations.
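A minimal POSIX sketch of the general mechanism (a hypothetical AllocWithGuards helper, not mozjemalloc's or PartitionAlloc's actual code): reserve the payload plus two extra pages and make the first and last page inaccessible, so a linear overflow off either end faults instead of corrupting a neighbor.

#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of any allocator): map `aPayload` usable
// bytes with one inaccessible page before and after them.
static void* AllocWithGuards(size_t aPayload) {
  const size_t page = size_t(sysconf(_SC_PAGESIZE));
  const size_t payload = (aPayload + page - 1) & ~(page - 1); // round up
  const size_t total = payload + 2 * page;

  void* raw = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED) {
    return nullptr;
  }
  uint8_t* base = static_cast<uint8_t*>(raw);
  mprotect(base, page, PROT_NONE);                  // leading guard
  mprotect(base + page + payload, page, PROT_NONE); // trailing guard
  return base + page;                               // usable region
}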
Blocks: 1446045
Assignee: nobody → mhowell
Priority: -- → P2
Attached patch WIP Patch (broken) (obsolete) — Splinter Review
The idea with this patch is to place the guard pages at the start and end of each new chunk, without increasing the size of the region that's mmap'd when creating a chunk (so the usable space inside each chunk is reduced by two pages). For huge allocations, the guard pages go around the borders of the entire mmap'd region, which is usually larger than the client-requested size.

This patch doesn't work; that's an understatement, as you'll see if you take a look at this try push [0]. Believe it or not, that represents a noticeable improvement over my first few attempts over the past several weeks. I believe the failures I'm seeing now are mostly coming from null pointer dereferences deep in the JS engine. I'm having a very difficult time debugging these; it's extremely hard to trace the data flow and find where the null pointer originally comes from, even using something like rr (and it doesn't help that the crashes tend to disappear when rr is recording). So I'm not making much progress right now, and I already didn't feel great about how long it took me to get this far. The way this is going, I don't know whether there's just some isolated bug I haven't tracked down yet, or whether the whole approach is invalid.

[0] https://treeherder.mozilla.org/#/jobs?repo=try&revision=659c38f374cc416d9196222c06b782c198f95206&selectedJob=214637377
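For reference, the layout this scheme implies, using illustrative numbers (1 MiB chunks and 4 KiB pages are assumptions for the example; the patch derives the real values from mozjemalloc's own constants):

#include <cstddef>

// Illustrative values only; mozjemalloc defines its own kChunkSize/gPageSize.
constexpr size_t kExampleChunkSize = size_t(1) << 20; // 1 MiB
constexpr size_t kExamplePageSize = size_t(4) << 10;  // 4 KiB

// Layout of one chunk-aligned mapping under this scheme:
//   [base,                 base + page)          leading guard
//   [base + page,          base + chunk - page)  usable space
//   [base + chunk - page,  base + chunk)         trailing guard
constexpr size_t kExampleUsableChunkSize =
    kExampleChunkSize - 2 * kExamplePageSize;
static_assert(kExampleUsableChunkSize == 0xfe000,
              "usable space is two pages smaller than the chunk");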
(In reply to Matt Howell (he/him) [:mhowell] from comment #1)
> I believe the failures I'm seeing now are mostly coming from null pointer
> dereferences deep in the JS engine.

I don't think this was right; I think these are actually assertion failures that I was mistaking for null derefs when viewing them through the lens of gdb.

> (and it doesn't help that the crashes tend to disappear when rr is recording).

I might have gotten around this by using chaos mode.
Comment on attachment 9028774 [details] [diff] [review]
WIP Patch (broken)

Review of attachment 9028774 [details] [diff] [review]:
-----------------------------------------------------------------

I'm concerned about the overhead this is going to cause on systems with 64K pages. We may want to counter with an increased chunk size on those platforms... but then there are platforms where the page size can vary, depending on the kernel configuration, for the same executable...

::: memory/build/mozjemalloc.cpp
@@ +437,5 @@
> #endif
>
> +// We configure guard pages at the beginning and end of each chunk,
> +// so the available space is 2 pages less than the total chunk size.
> +static const size_t kUsableChunkSize = kChunkSize - (2 * gPageSize);

This should be g-prefixed instead of k-prefixed (because it's not a real constant), and should be defined with DECLARE_GLOBAL/DEFINE_GLOBAL.

@@ +1323,5 @@
> // Return the chunk address for allocation address a.
> static inline arena_chunk_t*
> GetChunkForPtr(const void* aPtr)
> {
> +  return !aPtr ? nullptr : (arena_chunk_t*)((uintptr_t(aPtr) & ~kChunkSizeMask) + gPageSize);

Doing this without changing all the callers that do further pointer arithmetic (mostly to compute the chunk map index) instead of GetChunkOffsetForPtr is dangerous. If you do that change of callers, please do so in a separate bug. That's cleanup that's nice to have anyways, but could have some codegen impact that would be better measured independently.

@@ +1330,5 @@
> // Return the chunk offset of address a.
> static inline size_t
> GetChunkOffsetForPtr(const void* aPtr)
> {
> +  return !aPtr ? 0 : (size_t)(uintptr_t(aPtr) & kChunkSizeMask) - gPageSize;

A pointer within the first page guard in the real chunk is going to return a very large number.

@@ +2131,5 @@
> // Coalesce chunk with the following address range. This does
> // not change the position within gChunksByAddress, so only
> // remove/insert from/into gChunksBySize.
> gChunksBySize.Remove(node);
> +  chunk_remove_guards(node->mAddr);

I think the guards should be removed one level above, in chunk_dealloc. Or even a level further above, in DeallocChunk and huge_dalloc, separately, because huge allocations should have a different guard page strategy (further below).

@@ +2544,5 @@
>
> #ifdef MALLOC_DECOMMIT
> // Start out decommitted, in order to force a closer correspondence
> // between dirty pages and committed untouched pages.
> +  pages_decommit(run, gMaxLargeClass - gPageSize);

This doesn't decommit the last page. (gMaxLargeClass doesn't include the guard page)

@@ +3842,5 @@
> extent_node_t* node;
> bool zeroed;
>
> // Allocate one or more contiguous chunks for this request.
> +  csize = CHUNK_CEILING(aSize + (2 * gPageSize));

I think huge allocations should put guard page*s* (plural) at the tail, putting the first one at the end of the (page-aligned) requested size, rather than chunk size. So that if you have a 1.5MB allocation, and try to deref at 1.7MB, you hit a guard page.

@@ +4014,5 @@
> MOZ_RELEASE_ASSERT(!aArena || node->mArena == aArena);
> huge.Remove(node);
>
> huge_allocated -= node->mSize;
> +  huge_mapped -= node->mSize + (2 * gPageSize);

I don't think that's right. The pages are still mapped. They won't ever be committed, but it's still important to report how much address space we're using accurately. Even more when we have guard pages around "wasting" address space.

BTW, we should track how much address space is used by guard pages.

@@ +4019,4 @@
> }
>
> // Unmap chunk.
> +  chunk_dealloc(node->mAddr, node->mSize, HUGE_CHUNK);

That is not good. That'd be leaking address space between node->mSize and CHUNK_CEILING(node->mSize).

::: memory/gtest/TestJemalloc.cpp
@@ +230,5 @@
> // Chunk header.
> UniquePtr<int> p = MakeUnique<int>();
> size_t chunksizeMask = stats.chunksize - 1;
> +  char* chunk = (char*)((uintptr_t(p.get()) & ~chunksizeMask) + stats.page_size);
> +  size_t chunkHeaderSize = stats.chunksize - stats.large_max - (stats.page_size * 2);

You should actually do a jemalloc_ptr_info check with pointers in both page guards.
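A sketch of the huge-allocation guard strategy suggested above (illustrative only: PageCeiling/ChunkCeiling mirror the intent of jemalloc's CHUNK_CEILING-style rounding, GuardHugeTail is a made-up helper, and a real implementation would also over-map so that a request that is already a chunk multiple still gets at least one trailing guard page):

#include <sys/mman.h>
#include <cstddef>
#include <cstdint>

// Assumed sizes for the example: 1 MiB chunks, 4 KiB pages.
constexpr size_t kChunk = size_t(1) << 20;
constexpr size_t kPage = size_t(4) << 10;

constexpr size_t PageCeiling(size_t aSize) {
  return (aSize + kPage - 1) & ~(kPage - 1);
}
constexpr size_t ChunkCeiling(size_t aSize) {
  return (aSize + kChunk - 1) & ~(kChunk - 1);
}

// Protect every page between the page-aligned end of the requested size and
// the chunk-aligned end of the mapping.  For a 1.5 MiB request mapped as
// 2 MiB, a dereference at offset 1.7 MiB then faults immediately.
static void GuardHugeTail(void* aBase, size_t aRequestedSize) {
  uint8_t* base = static_cast<uint8_t*>(aBase);
  const size_t used = PageCeiling(aRequestedSize);
  const size_t mapped = ChunkCeiling(aRequestedSize);
  if (mapped > used) {
    mprotect(base + used, mapped - used, PROT_NONE);
  }
}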
Hi, I got pinged about this bug to help with SpiderMonkey issues. I do see some GC-related assertion failures on that Try push, for instance the one in Arena::finalize. It's curious because the GC uses mmap to map chunks/arenas.

I'm happy to help but it would be useful if you could first:

(1) See if this repros with a standalone JS shell build and shell tests.

(2) Try to minimize the patch. See if it also fails with the actual mprotect calls commented out, etc.
Flags: needinfo?(mhowell)
(In reply to Jan de Mooij [:jandem] from comment #4)
> Hi, I got pinged about this bug to help with SpiderMonkey issues. I do see
> some GC-related assertion failures on that Try push, for instance the one
> in Arena::finalize.

Thanks for taking a look!

> It's curious because the GC uses mmap to map chunks/arenas.

I noticed that as well, which has only multiplied my confusion.

> (1) See if this repros with a standalone JS shell build and shell tests.

It does reproduce in some of the JS shell tests, but not all and not consistently; I'm currently using non262/TypedArray/sort_small.js because it seems to fail more frequently than many others (enough that I was able to get an rr trace).

> (2) Try to minimize the patch. See if it also fails with the actual
> mprotect calls commented out, etc.

I did already try commenting out the mprotect's, and it doesn't seem to have any effect on this. The rest of the patch is really just a bunch of chunk sizing/addressing changes that all work together to make the mprotect calls possible, so I'm afraid there isn't much else to minimize out.

I also tried building with --enable-small-chunk-size, and that seemed to make the problem go away. Which again is confusing, because that's the chunk size of the *internal* allocator. Do you know of anywhere that there might be an assumption that the chunk size the system allocator/jemalloc uses is at least equal to js::gc::ChunkSize?
Flags: needinfo?(mhowell)
Nothing comes to mind right now, I think we'll just have to dig in. It's Friday night here and I'll be traveling to Orlando tomorrow, but I'll do my best to look into it before/on Monday. Can you post an updated patch or is the one from your Try push still the most recent? Also please post here if you figure something out so we don't duplicate work :)
Flags: needinfo?(jdemooij)
Flags: needinfo?(mhowell)
(In reply to Jan de Mooij [:jandem] from comment #6)
> Can you post an updated patch or is the one from your Try push still the
> most recent? Also please post here if you figure something out so we don't
> duplicate work :)

The one in the try push and attached here is still the most recent right now; I'll definitely update if I make any progress. Thanks again!
Flags: needinfo?(mhowell)
(Jan asked me to take a look since I've studied our allocation behaviors around SpiderMonkey GC and jemalloc in the context of crash analysis.)

Looking at the crash dump from https://treeherder.mozilla.org/#/jobs?repo=try&author=mhowell%40mozilla.com&selectedJob=214658342 I see some very weird memory maps. There are many allocations as follows:

> 00002e200000    4kB  NA  MAPPED
> 00002e201000 1024kB  RW  MAPPED
> 00002e301000    4kB  NA  MAPPED

Where MAPPED is the Windows VirtualQuery MEM_MAPPED flag. I'm not 100% sure what the 1024 kB blocks are, but my best guess is that they are jemalloc arena_chunk_t, which require 1 MB alignment for correctness. The patch I see here (https://hg.mozilla.org/try/rev/cc75dcc7e1e37c0680606181ad71e65dc8c09322) doesn't assert alignment after calling |chunk_apply_guards|.

While the SpiderMonkey GC largely maintains its own data structures, it still uses jemalloc for some key data structures (e.g. StoreBuffer), and jemalloc problems could easily manifest as GC asserts.
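For what it's worth, a guess at the kind of assertion being asked for here. This assumes the usable area is meant to start one page past a chunk boundary, which is only one reading of the patch; the helper name and the 1 MiB / 4 KiB values are assumptions for the example, not the patch's code.

#include <cstdint>
#include "mozilla/Assertions.h"

// Hypothetical helper; `aUsableStart` stands for whatever address the chunk's
// usable area begins at after the guards are applied.
static void AssertGuardedChunkAlignment(const void* aUsableStart) {
  constexpr uintptr_t kChunkMask = (uintptr_t(1) << 20) - 1; // 1 MiB chunks
  constexpr uintptr_t kPage = 4096;                          // 4 KiB pages
  MOZ_RELEASE_ASSERT((uintptr_t(aUsableStart) & kChunkMask) == kPage,
                     "usable area must start one page past a 1 MiB boundary");
}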
(In reply to Ted Campbell [:tcampbell] from comment #8)
> > 00002e200000    4kB  NA  MAPPED
> > 00002e201000 1024kB  RW  MAPPED
> > 00002e301000    4kB  NA  MAPPED

Also, I think the second one should be 1016 KB (the usable chunk size) instead of 1024, if I'm reading the patch right?
So, with this patch, the size of a jemalloc chunk is 0xfe000 bytes, with two guard pages (those 4 KB allocations) on either side of that for a total of 1024 KB. The entire chunk including the guard pages is aligned to 1024 KB, but the non-guard usable area starts at that alignment plus one page (the end of the leading guard page); fixing up the alignment and sizing issues that creates is the majority of the patch. This means that pointers allocated in a chunk will never have the 1 MB alignment that the chunk itself has, but that would already have been the case because of the chunk header.

My reasoning for doing it that way was that, if I had kept the first usable address aligned to 1 MB, the leading guard page would sit just before it at a page-aligned address and the trailing guard page would sit just after it at a 1 MB-aligned address, and each of those pages would block an entire 1 MB region from holding another chunk (since a chunk needs the whole aligned 1 MB to itself). I thought the extreme fragmentation that created would be prohibitive.

It occurs to me now that I could avoid some of this trouble by allocating only a single guard page at the end of each chunk; if there's a chunk immediately following it, the same guard page would service both chunks. That creates a problem when creating a new chunk that doesn't already have a chunk immediately before it, because we can't just not have a leading guard page, so we would need to be able to allocate a guard page before the chunk itself and keep track of the fact that that chunk is responsible for it. It may not end up being worth that.

What you're seeing at 0x2e201000 is probably a jemalloc "huge" allocation, meaning that a client of the allocator requested 1024 KB and that's too large to fit into a chunk, so it got its own separate allocation outside of any chunk. I'm assuming that memory isn't committed because whatever allocated it has since freed it, and jemalloc is now keeping the address space around for potential future recycling; the allocator likes to decommit pages in that state on Windows. Huge allocations *would* return 1 MB-aligned pointers before this patch, because they don't have any header, so it's possible that the fact that I've broken that alignment is the source of some of the problems; I'll see what I can do about that.
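To make the single-shared-guard idea above concrete, a sketch under assumed 1 MiB chunks and 4 KiB pages (none of these helpers exist in the tree, and this is only one reading of the idea):

#include <cstdint>

// Assumed sizes for the illustration.
constexpr uintptr_t kChunk = uintptr_t(1) << 20; // 1 MiB
constexpr uintptr_t kPage = 4096;                // 4 KiB

// With a single trailing guard per chunk, chunk N's usable area stays 1 MiB
// aligned and simply ends one page early; the PROT_NONE page in that last
// slot doubles as the leading guard of a contiguous chunk N+1.
constexpr uintptr_t UsableStart(uintptr_t n) { return n * kChunk; }
constexpr uintptr_t TrailingGuard(uintptr_t n) { return (n + 1) * kChunk - kPage; }

static_assert(TrailingGuard(0) + kPage == UsableStart(1),
              "chunk 0's trailing guard sits immediately below chunk 1");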
Note that one problem is that the patch actually breaks memalign for alignments larger than the page size. This might not be the cause of the crashes, but it certainly is a problem.
Clearing NI because there are known issues with the patch.
Flags: needinfo?(jdemooij)
(In reply to Matt Howell (he/him) [:mhowell] from comment #10)
> Huge allocations *would* return 1 MB-aligned pointers before this patch,
> because they don't have any header, so it's possible that the fact that
> I've broken that alignment is the source of some of the problems; I'll
> see what I can do about that.

I got this alignment issue fixed, but it doesn't seem to have helped any:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ca1d14a58c64ce0524f7a34950ddf8732f638300

Note that I deliberately disabled recycling huge allocations in that patch, because I thought it would be hard to get it to work with the rest of the change and I didn't want to go to the trouble of fixing it just to test the idea; I wouldn't try to land it that way.

I've started rewriting this patch from scratch, for two reasons: I felt like I was going to have to anyway in order to implement the feedback in comment 3, and I don't feel optimistic about the code I had ever getting into a working state.

All I have so far with this version is guards following huge allocations (extending up to the next chunk-aligned address this time, rather than always a single page), and even that isn't quite finished (for instance, I intend to add tracking of the total amount of guard pages allocated). But what's here does appear to be working, so I'm attaching it because I thought it might provide a baseline to build the rest on top of.

I just started this try push, but it already looks much better:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=70c00c0a4c21a72429536e365250c0d197b88535

Building on my last post, this patch adds trailing guard pages for normal arena (not huge) chunks, and then adds leading and trailing guards around base allocations. That first addition seems to have gone well, but the second may have caused some issues that I'm so far having trouble reproducing locally.

Attachment #9040231 - Attachment is obsolete: true

This patch series removes the previous third part, which was the base allocations portion, and replaces it with a leading guard page on each huge allocation. So far I've only run this through the jemalloc gtest suite locally, but passing those is the most progress I've had in a few days, so I thought I'd go ahead and post it. Here's a try push to test all the things:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a3c861f90a850ce2b97679417a4eec0e3a5dba0

Attachment #9041627 - Attachment is obsolete: true
Depends on: 1529922

This is based on top of (the current state of) the bug 1529922 patches, which obsolete the ones I had previously posted here. This does need a test added, but it appears to work; this try push seems to be doing well.

Attachment #9043971 - Attachment is obsolete: true

For the record, I don't currently plan on working on this any further, so I'm turning this bug over to GCP, who has already taken on bug 1529922.

Assignee: mhowell → gpascutto
Depends on: 1537781
Keywords: meta
Priority: P2 → P3
Summary: Add guard pages before and after arenas, chunks, and runs → [meta] Add guard pages before and after arenas, chunks, and runs
Depends on: 1542290
Type: defect → task
Depends on: 1545825
Blocks: 1546442
Attachment #9028774 - Attachment is obsolete: true
Attachment #9049061 - Attachment is obsolete: true
Priority: P3 → P1
Severity: normal → S4
Priority: P1 → P3