Closed Bug 1015957 Opened 10 years ago Closed 10 years ago
HTTP cache v2: use mmap to access files
For those who don't know, cache2 is NOT using mmap at all. There were suggestions during the design phase, but without a wider discussion the file back-end was implemented without mmap. We should be using it on all supporting platforms unless there are strong arguments against it.
The strong argument is that there would be no benefit if we used mmap. What exactly do you expect from using mmap?
mmap is excellent if used properly on Linux. However, it has some downsides that make it not viable:
* Can't control IO patterns on Mac OS X or Windows (e.g. no madvise). Those OSes also read in stupidly small increments (16K on Windows, see http://taras.glek.net/blog/2010/04/19/windows-sucks-at-memory-mapped-io-during-startup/).
* Error handling is a PITA. You can't use mmap for the best-case zero-memcopy use case. Any error (e.g. stale NFS, IO timeout) will force you to handle and recover from SIGBUS on Linux (not fun) or SEH exceptions on Windows (less painful, but it still sucks) everywhere the memory is used. Note this is a painful problem even if you memcopy (to localize error handling). This happens more frequently than one would think, possibly due to the point below.
* Memory fragmentation. Having lots of mmapped areas causes virtual memory fragmentation, which causes OOMs to happen MUCH sooner (depending on the shittiness of the virtual memory subsystem). This basically makes it non-viable for the lots-of-cache-files use case you guys have.
* The "convenience" of the mmap API and the difficulty of extracting the actual IO patterns that the OS executes generally lead to naive codepaths that do very seek-heavy IO (the worst case being backwards IO: http://taras.glek.net/blog/2010/05/27/startup-backward-constructors/).

I implemented mmap for http://mxr.mozilla.org/mozilla-central/source/modules/libjar/nsZipArchive.cpp before I knew the above downsides. We get around some of these problems because Firefox opens relatively few zips and preloads the important jar files with normal IO. Jars are also somewhat similar to .exe files in their memory access pattern, so having similar limitations is somewhat OK. Now that we have patched various mmap gotchas (except for handling SIGBUS on Unix platforms), the engineering effort to go back to normal IO isn't worthwhile, but I don't recommend mmap for anything other than something like mapping a single large file on Linux for random IO.
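To make the trade-off above concrete, here is a minimal sketch (not code from cache2 or nsZipArchive) contrasting the two approaches in Python. The function names `read_range` and `read_range_mmap` and the `CHUNK_SIZE` value are hypothetical, chosen only for illustration. With ordinary reads, an IO failure surfaces as an `OSError` at the call site; with mmap, the advice hint is only available on platforms where Python exposes `madvise`.

```python
import mmap

# Hypothetical read size for illustration; much larger than the 16K
# increments mentioned above for Windows memory-mapped IO.
CHUNK_SIZE = 256 * 1024


def read_range(path, offset, length):
    """Read up to `length` bytes at `offset` using ordinary file IO.

    Any IO failure is raised as OSError right here, so error handling
    stays localized to this call.
    """
    parts = []
    with open(path, "rb", buffering=0) as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = f.read(min(CHUNK_SIZE, remaining))
            if not chunk:  # reached end of file early
                break
            parts.append(chunk)
            remaining -= len(chunk)
    return b"".join(parts)


def read_range_mmap(path, offset, length):
    """Same result via mmap, for the single-large-file random-IO case.

    The slice copies the bytes out of the mapping (the "memcopy to
    localize error handling" variant), so this is not zero-copy.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # madvise is only exposed on some platforms (Python 3.8+).
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_RANDOM"):
                m.madvise(mmap.MADV_RANDOM)
            return m[offset:offset + length]
```

Both functions return the same bytes for a healthy local file. The difference is in the failure modes and in how much control the caller has over the IO pattern, which is the point of the list above.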
Taras, thanks for this exhaustive outline. Closing as WONTFIX based on that.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX