Closed Bug 470217 Opened 16 years ago Closed 8 years ago

pages_map() in jemalloc.c makes incorrect assumptions about mmap() leading to infinite loop

Categories

(Core :: Memory Allocator, defect)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 613089

People

(Reporter: DarlingStick, Assigned: anarchy)

References

Details

Attachments

(1 file, 1 obsolete file)

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.4) Gecko/2008111701 Gentoo Firefox/3.0.4 Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b2) Gecko/20081213 Gentoo Firefox/3.1b2 After upgrading to firefox 3.1_beta2 to fix a bug I'm having with 3.0.*, starting firefox results in it using 100% CPU and memory use jumping up and down by a megabyte. After trying serveral things I resorted to strace, this is a partial strace output that shows the beginning of the infinite loop: open("/proc/cpuinfo", O_RDONLY) = 3 read(3, "processor\t: 0\nvendor_id\t: Authent"..., 1024) = 577 read(3, ""..., 1024) = 0 close(3) = 0 readlink("/etc/malloc.conf", 0x7e17cebf16d0, 4096) = -1 ENOENT (No such file or directory) mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec8f73000 mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec7883000 munmap(0x6f2ec7883000, 2097152) = 0 mmap(0x6f2ec7a00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec7883000 munmap(0x6f2ec7883000, 2097152) = 0 mmap(NULL, 3145728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec7783000 munmap(0x6f2ec7783000, 3145728) = 0 mmap(0x6f2ec7800000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec7883000 munmap(0x6f2ec7883000, 2097152) = 0 mmap(NULL, 3145728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec7783000 munmap(0x6f2ec7783000, 3145728) = 0 mmap(0x6f2ec7800000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6f2ec7883000 munmap(0x6f2ec7883000, 2097152) = 0 This continues forever. If I'm reading it correctly, it tries to mmap a fixed address with size 2097152, the kernel instead returns a mapping at a different address (page boundary?) which it destroys, then it requests a larger mapping of 3145728 at any address, which it gets but doesn't like either, destroys it, rinse, repeat. After taking a hint from bug 467189 which I believe is a related issue (on solaris) the loop in chunk_alloc_mmap() in memory/jemalloc/jemalloc.c (at http://mxr.mozilla.org/mozilla-central/source/memory/jemalloc/jemalloc.c#2365 ) seems responsible. Specifically the fact that pages_map() in that file (at line 2186) assumes that mmap() will always return the requested address. man mmap(2) shows that it's only a hint to the kernel, this assumption is false. In fact on my system with a grsecurity pax enabled kernel, which randomizes the mmap() base, it always returns a different address, leading to the infinite loop. If I temporarily disable pax (sysctl -w kernel.pax.softmode=1) the loop succeeds and firefox starts normally (strace output): readlink("/etc/malloc.conf", 0x7fffffff8cc0, 4096) = -1 ENOENT (No such file or directory) mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2f7daa2000 mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2f7c3b2000 munmap(0x7f2f7c3b2000, 2097152) = 0 mmap(0x7f2f7c500000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2f7c3b2000 munmap(0x7f2f7c3b2000, 2097152) = 0 mmap(NULL, 3145728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2f7c2b2000 munmap(0x7f2f7c2b2000, 3145728) = 0 mmap(0x7f2f7c300000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2f7c300000 madvise(0x7f2f7c300000, 2097152, MADV_DONTNEED) = 0 mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2f7c200000 madvise(0x7f2f7c200000, 1048576, MADV_DONTNEED) = 0 open("/usr/lib64/mozilla-firefox/application.ini", O_RDONLY) = 3 I believe mozilla is at fault here, especially since the bug also occurs on Solaris, it has also been confirmed on the gentoo forums; but if necessary I can file a bug on the gentoo bugzilla and CC the PaX team since I know they listen in on it. Reproducible: Always
I meant bug 457189 which is the Solaris report
Component: General → jemalloc
Product: Firefox → Core
QA Contact: general → jemalloc
Your linux strace looks remarkably like the truss I looked at last night on Solaris while trying to replicate bug 457189. Additionally, my Solaris test case does not involve the browser, only spidermonkey's js shell. I'm surprised jemalloc is this sensitive to the address of the mapping. Here's what mmap(3C) has to say about addr: A non-zero value of addr is taken to be a suggestion of a process address near which the mapping should be placed. When the system selects a value for pa, it will never place a mapping at address 0, nor will it replace any extant mapping, nor map into areas considered part of the potential data or stack "segments".
I took another look at this bug with a newer FF release. The bug is still present and Linux' mmap does not provide MAP_ALIGN so Wesley's fix for bug 457189 does not apply to this bug. Also, after having taken a closer look at the code, relying on unportable, flawed assumptions about mmap bevahior to make your allocator's threading model work seems like a Really Bad Idea (tm). IMHO the current retry-on-fail model is broken by design.
Firefox 3.1 up fails on any Linux which randomizes mmap addresses (which is in fact any "hardened" or security-related linux), because on all these systems, mmap *never* returns a block at the requested address, it always shifts the address by a random amount to make code-inject exploits harder and block brute-force address-guessing exploits by using a different address each time. Security mechanisms like randomized mmap are especially important for web browsers, firefox is seriously broken if it requires disabling a central security mechanism. And as it was stated in the previous comment: The assumption that mmap will return a block at the requested address is fundamentally wrong. Neither POSIX nor anything else ever required mmap to obey the given address: By definition, it is just a hint which may or may not be honored.
Attachment #394266 - Flags: review?(jasone)
Assignee: nobody → geekypenguin
Jasone can we get some feedback on this please. I would like to see this land before we get to far into the 1.9.2 release schedule.
Flags: wanted1.9.2?
this looks a lot more expensive than the previous version. what is the performance hit?
Stuart how does this patch look a lot more expensive? The patch has been explained here: http://bugs.gentoo.org/show_bug.cgi?id=278698#c26
Attachment #394266 - Attachment is obsolete: true
Attachment #394266 - Flags: review?(jasone)
Patch attached against current mozilla-central, can we please get this review complete so we can get this landed.
Attachment #423948 - Flags: review?(jasone)
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #423948 - Flags: review?(pavlov)
Jory, would you mind explaining how this patch works to insure the required alignment of the memory-mapped segment? It looks to me that the patch slightly over-allocates, then unmaps a little bit at the beginning, and a little bit at the end. What concerns me, if that is the case, is that it may yield difficulty on some conformant SUSv3 munmap implementations. The OpenGroup says, "The implementation shall require that addr be a multiple of the page size {PAGESIZE}." The Solaris man page goes on to say that mention that it unmaps by "rounding the len argument up to the next multiple of the page size as returned by sysconf(3C)." So, while this patch presumably works for Hardened Linux, I am concerned that it may cause a regression on Solaris and potentially other SUSv3 conformant platforms under certain conditions. I don't know if those conditions are satisfied by the current browser/jsapi call patterns or not. The current code, under Solaris, simply asks the kernel for a properly-aligned page, and the kernel returns it every time, in a single system call. Also, is it known if the MALLOC_PAGEFILE code is reached? I note that in the Solaris-specific code which is removed that this appears to the algorithm when MALLOC_PAGEFILE is enabled: - If we have a page file, try to get an aligned mapping from it - If that fails, get an aligned anonymous mapping (i.e. from /dev/zero) But in the patch, I believe this is what happens: - Get an anonymous mapping and align it - Drop that mapping in favour a mapping in the page file and hope it succeeds Why the extra mapping? Why the change to the fallback algorithm? Does it matter? Can the second mapping ever fail? If it does fail, does the original mapping need to be unmapped?
I will leave the explanation up to pageexec he can better explain then I can.
first of all please note that a variant of this patch has already made it into Jason Evan's jemalloc tree (see bug 527356 and the discussion there) so i expect firefox will eventually use it. as for the concerns: > It looks to me that the patch slightly over-allocates, then unmaps a little > bit at the beginning, and a little bit at the end. correct. > I don't know if those conditions are satisfied by the current browser/jsapi > call patterns or not. all 'addr' parameters the patch uses are either NULL (no possible problems) or are guaranteed to be page aligned (either because it's the return value of mmap or because 'size' and 'alignment' are both multiples of the page size, at least as far as i could determine from reading the code, if i overlooked some caller, please let me know). > Also, is it known if the MALLOC_PAGEFILE code is reached? it should be reached under the same circumstances as the original code, that is, if we have enough address space left to satisfy the mmap request, we'll do that with the file mapping in the end. > Drop that mapping in favour a mapping in the page file and hope it succeeds actually, now that i look at that code again, there's a bug in there, if the file mmap fails, we should either clean up the anon mmap and return NULL or keep using the still existing anon mmap. i don't know which way solaris et al. prefer here, can someone elaborate? the thing is, if the anon/file mmap cases must be strictly separated, there will be code duplication... > Can the second mapping ever fail? it depends on how the given kernel implements the 'mmap replaces an existing mapping' case, but it can certainly fail (in which case we're left with the anon mapping whose fate is yet to be decided in fact, see above).
Soon as that lands we can close :)
Depends on: jemalloc3
Hey, Any update on this?
It depends on 580408, which I am hoping to find time for in the medium-term (~= next 2 months).
Up to 6.0.2 there is workaround using paxctl. Since 7.0 workaround doesn't work mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3291633c000 munmap(0x3291633c000, 2097152) = 0 mmap(0x32916400000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3291643c000 munmap(0x3291643c000, 1048576) = 0
It seems another executable (firefox) needs paxctl since 7.0.
(In reply to wbrana from comment #17) > Up to 6.0.2 there is workaround using paxctl. > Since 7.0 workaround doesn't work > > mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) > = 0x3291633c000 > munmap(0x3291633c000, 2097152) = 0 > mmap(0x32916400000, 1048576, PROT_READ|PROT_WRITE, > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3291643c000 > munmap(0x3291643c000, 1048576) = 0 weird, ff7 works fine here with PaX/ASLR enabled on it.
Attachment #423948 - Flags: review?(jasone)
I think the original problem has been fixed a long time ago. However, firefox still does not work on hardened linux: It crashes regularly (get's killed by PaX), but for reasons very different from those described here.
Attachment #423948 - Flags: review?(pavlov)
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: