Closed Bug 965936 (defrag) Opened 11 years ago Closed 2 years ago

[meta] Virtual address space fragmentation

Categories

(Core :: General, defect)

RESOLVED WORKSFORME

People

(Reporter: away, Unassigned)

References

(Depends on 2 open bugs)

Details

(Keywords: meta, Whiteboard: [MemShrink:meta])

Attachments

(1 file, 1 obsolete file)

No description provided.
Whiteboard: [MemShrink:meta]
Well, let's leave this untriaged so we'll see it in the next MemShrink meeting.
Whiteboard: [MemShrink:meta] → [MemShrink]
We already have bug 859955, and it's already MemShrink:P1.
Whiteboard: [MemShrink]
Depends on: 987160
Depends on: 1001760
Depends on: 1005844
Depends on: 1005849
dmajor: do we need both this bug and bug 859955?
Flags: needinfo?(dmajor)
I consider this to be the general bug. Bug 859955 is a specific instance of it (okay, two instances: it started out with a suspected file mapping issue and then morphed into a graphics investigation).
Flags: needinfo?(dmajor)
Whiteboard: [MemShrink:P1]
Depends on: 1101179
See Also: → 1266389
Component: Tracking → General
Depends on: 1123465
Attached file memlist.out (obsolete) —
I just created a simple script [1] that downloads Nightly crash dumps from the past week with the signature "OOM | small", and updated bsmedberg's minidump-memorylist with dmajor's work in bug 1001760 to output lines like the ones below. I haven't analyzed them yet, but WriteCombine does not seem to be the only cause of fragmentation:

c0942a73-c7e7-4609-91d5-95d782160729,Free=134211699M,Tiny=157M,WriteCombine=0M,Misaligned=26M,Usable=18M,Other=134211497M
ae940481-639d-4fd6-b264-875122160728,Free=786M,Tiny=786M,WriteCombine=2198M,Misaligned=0M,Usable=0M,Other=0M
e2eee53c-9df1-46cb-a767-7162c2160729,Free=127M,Tiny=127M,WriteCombine=91M,Misaligned=0M,Usable=0M,Other=0M
0ddf19f2-d558-4e95-9935-9870a2160801,Free=132M,Tiny=132M,WriteCombine=0M,Misaligned=0M,Usable=0M,Other=0M

[1] https://github.com/janus926/minidump-memorylist/blob/master/oom-small-memlist.sh
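For anyone who wants to sift through lines in this format, here is a minimal Python sketch of a parser. It is not part of the linked script; the function names are hypothetical, and it treats the fields as opaque name/value pairs because the thread does not define exactly what each category covers.

```python
def parse_memlist_line(line):
    """Split 'crash-id,Free=786M,Tiny=786M,...' into (crash_id, {name: megabytes})."""
    crash_id, *fields = line.strip().split(",")
    stats = {}
    for field in fields:
        name, value = field.split("=", 1)
        stats[name] = int(value.rstrip("M"))  # values are reported in whole megabytes
    return crash_id, stats


def no_write_combine(lines):
    """Yield crash ids whose dumps report WriteCombine=0M (hypothetical helper)."""
    for line in lines:
        crash_id, stats = parse_memlist_line(line)
        if stats.get("WriteCombine") == 0:
            yield crash_id


if __name__ == "__main__":
    sample = [
        "ae940481-639d-4fd6-b264-875122160728,Free=786M,Tiny=786M,WriteCombine=2198M,Misaligned=0M,Usable=0M,Other=0M",
        "0ddf19f2-d558-4e95-9935-9870a2160801,Free=132M,Tiny=132M,WriteCombine=0M,Misaligned=0M,Usable=0M,Other=0M",
    ]
    for crash_id in no_write_combine(sample):
        print(crash_id)
```

Filtering on WriteCombine=0M is one quick way to pull out the dumps where fragmentation cannot be blamed on write-combined memory.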
Attached file oom-small-memlist.out
Please ignore comment 5. I thought bug 1202523 changed mozjemalloc's chunk size to 2M, but actually the change is for jemalloc.
Attachment #8777657 - Attachment is obsolete: true
On Windows, calling VirtualAlloc() for less than 64kb creates an unusable memory region. For instance, allocating 4kb leaves 60kb of unusable memory, because allocation addresses have to be aligned to 64kb.

With that in mind, I checked the memory blocks in the minidump of 9fd40f67-d6a5-4ddf-9547-286e62160809, which has:

Free=113M,Tiny=113M,WriteCombine=9M,Misaligned=0M,Usable=0M,Other=0M

Most of its fragmentation comes from 64kb-unaligned allocations. The following 5 patterns contribute ~60% of the tiny (<1M) blocks:

#1 107 chunks, 9108k unusable memory
base              alloc_base        alloc_protect  size    state   protect  type
9c30000           9c30000           4              1000    COMMIT  4        20000
9c31000           0                 0              f000    FREE    1        0

#2 204 chunks, 12144k unusable memory
base              alloc_base        alloc_protect  size    state   protect  type
73d70000          73d70000          4              e7000   COMMIT  4        20000
73e57000          0                 0              9000    FREE    1        0

#3 216 chunks, 12096k unusable memory
base              alloc_base        alloc_protect  size    state   protect  type
ffffffffb7090000  ffffffffb7090000  4              152000  COMMIT  4        40000
ffffffffb71e2000  0                 0              e000    FREE    1        0

#4 141 chunks, 8212k unusable memory
base              alloc_base        alloc_protect  size    state   protect  type
ffffffffaa190000  ffffffffaa190000  4              a7000   COMMIT  4        40000
ffffffffaa237000  0                 0              9000    FREE    1        0

#5 245 chunks, 27968k unusable memory
base              alloc_base        alloc_protect  size    state   protect  type
9e50000           0                 0              10000   FREE    1        0
9ed0000           0                 0              20000   FREE    1        0

Note that the blocks in #5 are still usable by VirtualAlloc since they are 64kb aligned; they are just not usable by mozjemalloc.

My plan now is to figure out who creates those. Unfortunately, the tool VMMap crashes easily when I run it alongside Firefox...
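The arithmetic behind the FREE rows in patterns #1 to #4 can be sketched in a few lines of Python. This is only an illustration, assuming the 64kb allocation granularity described above and an allocation base that is itself 64kb aligned; the function name is made up for this example.

```python
ALLOCATION_GRANULARITY = 0x10000  # 64kb, the alignment VirtualAlloc uses for base addresses


def unusable_tail(size, granularity=ALLOCATION_GRANULARITY):
    """Bytes left between the end of an allocation and the next granularity boundary.

    Assumes the allocation starts on a granularity boundary.
    """
    return -size % granularity


if __name__ == "__main__":
    # COMMIT sizes taken from patterns #1 to #4 in this comment.
    for size in (0x1000, 0xE7000, 0x152000, 0xA7000):
        print(f"size={size:x} unusable_tail={unusable_tail(size):x}")
```

For the four COMMIT sizes listed, this gives f000, 9000, e000 and 9000, which matches the size of the FREE block that follows each of them.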
I've got MS Detours to hook NtAllocateVirtualMemory(), I'll see if I can filter out some useful data.
I'm stuck on translating the addresses to symbols; somehow it causes an access violation. I am trying to debug it with windbg.
Depends on: 1299747
I ran the tool from bug 1299747 for 2 days of daily use, but didn't see any significant allocations causing fragmentation other than those in bug 1299747 comment 12. For some unknown reason I only got logs from the parent process; it could be that the child process is killed by something other than TerminateProcess(). I will do another round for 1M unaligned allocations.
js::jit::ExecutableAllocator::systemAlloc accounts for the majority of 1M unaligned allocations; it allocates 64k.
(In reply to Ting-Yu Chou [:ting] from comment #11)
> js::jit::ExecutableAllocator::systemAlloc is the majority of 1M unaligned
> allocations, it allocates 64k.

This could be the cause of #5 in comment 7. I think making it 1M aligned would help lower the fragmentation, but I don't know why 64k was used.

:njn, do you think there would be any bad impact from making ExecutableAllocator::createPool allocate a 1M-aligned pool?
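To make the "usable for VirtualAlloc but not for mozjemalloc" distinction concrete, here is a rough Python sketch that checks whether a free region can hold a chunk. It assumes, based only on the wording in this bug, that a chunk is 1M in size and must start on a 1M boundary; the helper names are hypothetical.

```python
CHUNK_SIZE = 1 << 20  # assumption from this thread: 1M chunks, aligned to 1M


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def fits_aligned_chunk(base, size, chunk=CHUNK_SIZE):
    """True if the free region [base, base + size) contains a chunk-aligned chunk."""
    start = align_up(base, chunk)
    return start + chunk <= base + size


if __name__ == "__main__":
    # FREE blocks from pattern #5 in comment 7: 64kb aligned, but far too small.
    print(fits_aligned_chunk(0x9E50000, 0x10000))
    print(fits_aligned_chunk(0x9ED0000, 0x20000))
    # A hypothetical free region that starts on a 1M boundary and spans 1M.
    print(fits_aligned_chunk(0x9E00000, 0x100000))
```

Under that assumption, a 64k pool placed in the middle of an otherwise free 1M-aligned range is enough to make the whole range fail this check, which is why pool placement matters more than pool size here.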
Flags: needinfo?(n.nethercote)
> :njn, do you think there'll be any bad impacts from making ExecutableAllocator::createPool to allocate 1M aligned pool?

I don't really know. It sounds a bit scary. I suggest trying it and taking some measurements before and after.
Flags: needinfo?(n.nethercote)
I'll try to run with 64K and 1M ExecutableAllocator::createPool, and check the memory stats from crash dump.
Please ignore comment 15. I just found that the 64k results are actually from the 1M binary. I'll redo the test and update later.
The output from minidump-memorylist [1] for 64K and 1M ExecutableAllocator::createPool after 24 hours of daily use on my desktop:

parent-64k    Free=3061M,Tiny=31M,WriteCombine=15M,Misaligned=6M,Usable=10M,Other=3013M
child-64k     Free=2003M,Tiny=55M,WriteCombine=44M,Misaligned=7M,Usable=65M,Other=1875M
parent-1m(1)  Free=2752M,Tiny=24M,WriteCombine=15M,Misaligned=4M,Usable=13M,Other=2709M
child-1m(1)   Free=2689M,Tiny=35M,WriteCombine=20M,Misaligned=1M,Usable=38M,Other=2615M
parent-1m(2)  Free=3018M,Tiny=33M,WriteCombine=15M,Misaligned=8M,Usable=7M,Other=2968M
child-1m(2)   Free=2099M,Tiny=70M,WriteCombine=47M,Misaligned=3M,Usable=29M,Other=1996M

I still don't see significant differences...

[1] https://github.com/janus926/minidump-memorylist
Switching to meta, we can prioritize the blockers accordingly.
Whiteboard: [MemShrink:P1] → [MemShrink:meta]
Depends on: 1361354
Severity: normal → S3

A six-year-old meta bug is not very useful. If somebody wants to start a new investigation into fragmentation, a new bug is likely in order. Generally speaking, address space fragmentation is less of an issue now that 64-bit systems are much more common.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME