Closed
Bug 782477
Opened 12 years ago
Closed 12 years ago
Cut-the-Rope subprocess is OOM'ing
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dhylands, Unassigned)
Details
Attachments
(2 files)
If I launch Cut-the-Rope on otoro, b2g dies. Even under gdb, the process gets wiped out, so unfortunately, no backtrace. It looks like something sent a SIGKILL, which killed the process right out from under gdb.
Reporter
Comment 1•12 years ago
Comment 2•12 years ago
As far as I know, the current version of Cut-the-Rope was designed for desktop and hasn't been optimized for mobile memory usage. If you check with 'procrank' in an adb shell, you'll see memory become exhausted and b2g crash.
It turns out that the memory usage of the b2g process has regressed quite a bit since the last time I measured: it's using 40-50MB more than previously :(. We need to investigate that.
Comment 4•12 years ago
Citing cjones:

> we've made a bunch of changes to free up memory for apps lately, but
> I've noticed one frustrating problem. If you
> - load Cut the Rope
> - play the first level
>
> it works now. But if you try to move on to the second level, the CTR
> app is lowmem killed.
>
> Its memory usage gets up around 90MB by that point, but we still have
> plenty of RAM in pagecache, 20MB or so.
>
> So the project is
> - why isn't linux dropping those pagecache pages instead of
>   lowmem-killing? is it locked from mapping or what?
> - is there anything we can do about it?
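One quick way to probe the first question (a diagnostic sketch, not from this bug; it needs root on the device) is to ask the kernel to drop clean pagecache explicitly and watch whether the numbers move:

```shell
# Diagnostic sketch (run as root in an adb shell): drop clean pagecache,
# dentries, and inodes, then compare free memory before and after. Cached
# pages that don't go away are pinned somehow (mlock'd, dirty, or
# otherwise unevictable).
grep -E 'MemFree|Cached' /proc/meminfo
sync                                   # flush dirty pages first
echo 3 > /proc/sys/vm/drop_caches      # 1=pagecache, 2=dentries/inodes, 3=both
grep -E 'MemFree|Cached' /proc/meminfo
```

If `Cached` stays high after this, the remaining pagecache is not reclaimable, which would point at pinning rather than a reclaim-threshold issue.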
Status: NEW → ASSIGNED
Updated•12 years ago
Assignee: nobody → tdz
Updated•12 years ago
OS: Linux → Android
Hardware: x86_64 → ARM
Comment 5•12 years ago
My results so far... I managed to reproduce the OOM on my otoro phone. The free memory while CTR runs varies from 24 MiB down to 3 MiB. I see a lot of pmem usage in logcat, with frequent OOMs:

> D/memalloc( 2937): /dev/pmem: Mapped buffer base:0x47a00000 size:987136 offset:921600 fd:74
> D/memalloc( 2290): /dev/pmem: Allocated buffer base:0x4abb5000 size:614400 offset:4358144 fd:168
> D/memalloc( 2937): /dev/pmem: Mapped buffer base:0x494ee000 size:4972544 offset:4358144 fd:77
> D/memalloc( 2937): /dev/pmem: Unmapping buffer base:0x4e8e2000 size:11075584 offset:11010048
> D/memalloc( 2290): Out of PMEM. Dumping PMEM stats for debugging
> D/memalloc( 2290): ------------- PRINT PMEM STATS --------------
> D/memalloc( 2290): Node 0 -> Start Address : 0 Size 9600 Free info 0
> D/memalloc( 2290): Node 1 -> Start Address : 9600 Size 9600 Free info 0
> D/memalloc( 2290): Node 2 -> Start Address : 19200 Size 9600 Free info 0
> D/memalloc( 2290): Node 3 -> Start Address : 28800 Size 2048 Free info 0
> D/memalloc( 2290): Node 4 -> Start Address : 30848 Size 2048 Free info 0
> D/memalloc( 2290): Node 5 -> Start Address : 32896 Size 5504 Free info 1
> D/memalloc( 2290): Node 6 -> Start Address : 38400 Size 43008 Free info 0
> D/memalloc( 2290): Node 7 -> Start Address : 81408 Size 18560 Free info 1
> D/memalloc( 2290): Node 8 -> Start Address : 99968 Size 9600 Free info 0
> D/memalloc( 2290): Node 9 -> Start Address : 109568 Size 13312 Free info 0
> D/memalloc( 2290): Node 10 -> Start Address : 122880 Size 13312 Free info 0
> D/memalloc( 2290): Node 11 -> Start Address : 136192 Size 19200 Free info 0
> D/memalloc( 2290): Node 12 -> Start Address : 155392 Size 14976 Free info 1
> D/memalloc( 2290): Node 13 -> Start Address : 170368 Size 60800 Free info 0
> D/memalloc( 2290): Node 14 -> Start Address : 231168 Size 73728 Free info 0
> D/memalloc( 2290): Node 15 -> Start Address : 304896 Size 19200 Free info 0
> D/memalloc( 2290): Node 16 -> Start Address : 324096 Size 19200 Free info 0
> D/memalloc( 2290): Node 17 -> Start Address : 343296 Size 3072 Free info 1
> D/memalloc( 2290): Node 18 -> Start Address : 346368 Size 43008 Free info 0
> D/memalloc( 2290): Node 19 -> Start Address : 389376 Size 73728 Free info 0
> D/memalloc( 2290): Node 20 -> Start Address : 463104 Size 43008 Free info 0
> D/memalloc( 2290): Node 21 -> Start Address : 506112 Size 18176 Free info 1
> D/memalloc( 2290): Total Allocated: Total Free:
> D/memalloc( 2290): ----------------------------------------------
> E/memalloc( 2290): /dev/pmem: No more pmem available
> W/memalloc( 2290): Falling back to ashmem
> D/memalloc( 2290): ashmem: Allocated buffer base:0x44f10000 size:614400 fd:171
> D/memalloc( 2937): ashmem: Mapped buffer base:0x45668000 size:614400 fd:62
> D/memalloc( 2290): /dev/pmem: Freeing buffer base:0x4af75000 size:425984 offset:3932160 fd:162
> D/memalloc( 2937): /dev/pmem: Unmapping buffer base:0x48dc0000 size:4358144 offset:3932160
> D/memalloc( 2290): /dev/pmem: Freeing buffer base:0x4af0d000 size:425984 offset:3506176 fd:56
> D/memalloc( 2937): /dev/pmem: Unmapping buffer base:0x48a00000 size:3932160 offset:3506176

Pmem hands out physically contiguous memory it receives from kmalloc, which could explain why the available pages remain unused. IIRC kmalloc is implemented directly on top of the SLAB allocator and only uses the configured pools. The output of /proc/slabinfo before the crash is shown below. You can see that the allocator is short on memory.

> root@android:/ # cat /proc/slabinfo
> slabinfo - version: 2.1
> # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
> ...
> size-4194304 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
> size-2097152 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0
> size-1048576 1 1 1048576 1 256 : tunables 1 1 0 : slabdata 1 1 0
> size-524288 1 1 524288 1 128 : tunables 1 1 0 : slabdata 1 1 0
> size-262144 0 0 262144 1 64 : tunables 1 1 0 : slabdata 0 0 0
> size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
> size-65536 4 4 65536 1 16 : tunables 8 4 0 : slabdata 4 4 0
> size-32768 3 3 32768 1 8 : tunables 8 4 0 : slabdata 3 3 0
> size-16384 36 36 16384 1 4 : tunables 8 4 0 : slabdata 36 36 0
> size-8192 41 41 8192 1 2 : tunables 8 4 0 : slabdata 41 41 0
> size-4096 136 141 4096 1 1 : tunables 24 12 0 : slabdata 136 141 0
> size-2048 172 172 2048 2 1 : tunables 24 12 0 : slabdata 86 86 0
> size-1024 224 224 1024 4 1 : tunables 54 27 0 : slabdata 56 56 0
> size-512 861 864 512 8 1 : tunables 54 27 0 : slabdata 108 108 0
> size-256 165 165 256 15 1 : tunables 120 60 0 : slabdata 11 11 0
> size-192 148 160 192 20 1 : tunables 120 60 0 : slabdata 8 8 0
> size-128 328 330 128 30 1 : tunables 120 60 0 : slabdata 11 11 0
> size-96 1199 1240 96 40 1 : tunables 120 60 0 : slabdata 31 31 0
> size-64 4003 4012 64 59 1 : tunables 120 60 0 : slabdata 68 68 0
> size-32 6670 6780 32 113 1 : tunables 120 60 0 : slabdata 60 60 0
> kmem_cache 144 160 96 40 1 : tunables 120 60 0 : slabdata 4 4 0
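As an aside, tallying which size classes actually hold the memory makes dumps like the one above easier to read. This is a minimal sketch (not part of the original report) that multiplies `active_objs` by `objsize` per cache; the `SAMPLE` text is a hypothetical excerpt in the same format:

```python
# Minimal sketch (not from the bug): tally active bytes per slab cache
# from /proc/slabinfo-style lines like the dump above.
SAMPLE = """\
size-16384 36 36 16384 1 4 : tunables 8 4 0 : slabdata 36 36 0
size-8192 41 41 8192 1 2 : tunables 8 4 0 : slabdata 41 41 0
size-64 4003 4012 64 59 1 : tunables 120 60 0 : slabdata 68 68 0
"""

def slab_usage(text):
    """Return {cache_name: active_objs * objsize} in bytes."""
    usage = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or line.startswith("#"):
            continue  # skip headers and short lines
        name, active_objs, _num_objs, objsize = fields[:4]
        usage[name] = int(active_objs) * int(objsize)
    return usage

if __name__ == "__main__":
    for name, nbytes in sorted(slab_usage(SAMPLE).items(), key=lambda kv: -kv[1]):
        print(f"{name:12s} {nbytes} bytes")
```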
Reporter | ||
Comment 6•12 years ago
There are actually 3 different kmalloc implementations: SLAB, SLOB, and SLUB. You may want to compare them to see what impact they have on performance and/or memory usage. The otoro kernel is currently configured to use SLAB. IIRC SLAB is also a power-of-2 allocator, so the pmem request above that allocates 11075584 bytes would actually end up consuming noticeably more once rounded up to the next size class. Changing the SLAB/SLUB/SLOB setting will require all of the modules to be rebuilt as well (or at least it has in the past).

http://census-labs.com/news/2012/01/03/linux-kernel-heap-exploitation/
http://lwn.net/Articles/229984/
http://www.socallinuxexpo.org/scale9x/presentations/status-linux-slab-allocators.html
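To illustrate the overhead a power-of-2 allocator implies (a generic sketch, not the kernel's actual size-class logic), rounding a request up to the next power of two can waste close to half the block:

```python
# Generic illustration (not kernel code): a power-of-2 allocator rounds
# each request up to the next power of two, so slack can approach 50%.
def next_pow2(n: int) -> int:
    """Smallest power of two >= n (n >= 1)."""
    return 1 << (n - 1).bit_length()

def slack(n: int) -> int:
    """Bytes wasted if n is served from a power-of-2 size class."""
    return next_pow2(n) - n

if __name__ == "__main__":
    # Buffer sizes taken from the memalloc log earlier in this bug.
    for req in (614400, 987136, 4972544):
        print(req, "->", next_pow2(req), "slack", slack(req))
```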
Running out of pmem is perfectly fine. We just fall back on non-physically-contiguous shared memory (which forces our screen compositor code out of its fast path). I'd like to find out why the kernel can't satisfy the allocation requests when there's apparently plenty of pagecache memory left. Is it pinned somehow? Does linux not allow it to fall below a certain threshold? Etc.
Comment 8•12 years ago
Chris, are you sure you have 20 MiBs of pagecache left? I managed to get the output of /proc/meminfo shortly before CTR gets killed, which is while the first level is being loaded. But from the numbers, it doesn't seem like there is enough memory to run CTR.

> root@android:/ # cat /proc/meminfo
> MemTotal: 172880 kB
> MemFree: 3252 kB
> Buffers: 4 kB
> Cached: 25516 kB
> SwapCached: 0 kB
> Active: 139260 kB
> Inactive: 6040 kB
> Active(anon): 129020 kB
> Inactive(anon): 80 kB
> Active(file): 10240 kB

10 MiB of the pagecache are actively in use and cannot be reclaimed.

> Inactive(file): 5960 kB

These 6 MiB could be reclaimed.

> Unevictable: 9204 kB

It's probably worth investigating why there are 9 MiBs of unevictable memory. This means that the memory pages are mlock'd, or so.

> Mlocked: 8640 kB

See above.

> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 172880 kB
> LowFree: 3252 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 0 kB
> Writeback: 0 kB
> AnonPages: 129008 kB
> Mapped: 20744 kB
> Shmem: 8756 kB
> Slab: 7596 kB
> SReclaimable: 1312 kB
> SUnreclaim: 6284 kB
> KernelStack: 1672 kB
> PageTables: 1648 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 86440 kB
> Committed_AS: 289468 kB
> VmallocTotal: 720896 kB
> VmallocUsed: 34444 kB
> VmallocChunk: 661052 kB

To release pages ASAP, I reduced the time for Linux to move pages from active to inactive state (/proc/sys/vm/dirty_expire_centisecs) to 1 cs, and increased the scheduling interval of the flusher thread to 10 cs (/proc/sys/vm/dirty_writeback_centisecs). This freed up some memory earlier, but didn't improve the overall results.

From these results, I'd say this is not a problem within Linux, but CTR using too much memory.
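For reference, the tuning described above boils down to two sysctl writes (a sketch of the settings only; it needs root, and the defaults shown in the comments are the usual mainline values, not measured on otoro):

```shell
# Sketch of the vm tuning from this comment (run as root on the device).
# Values are in centiseconds; mainline defaults are typically 3000 and 500.
echo 1  > /proc/sys/vm/dirty_expire_centisecs      # expire dirty pages almost immediately
echo 10 > /proc/sys/vm/dirty_writeback_centisecs   # wake the flusher thread every 0.1 s
```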
Thanks for the analysis, Thomas! Here's the result I get from just before OOM death. I'm actually not able to get through the first level anymore.

> MemTotal: 188248 kB

Thomas: make sure you have the kernel update 8. It frees up more memory.

> MemFree: 2700 kB
> Buffers: 24 kB
> Cached: 17456 kB
> SwapCached: 0 kB
> Active: 158516 kB
> Inactive: 3240 kB
> Active(anon): 151804 kB
> Inactive(anon): 80 kB
> Active(file): 6712 kB
> Inactive(file): 3160 kB
> Unevictable: 7480 kB
> Mlocked: 6936 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 188248 kB
> LowFree: 2700 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 0 kB
> Writeback: 0 kB
> AnonPages: 151792 kB
> Mapped: 16968 kB
> Shmem: 7052 kB
> Slab: 8672 kB
> SReclaimable: 1480 kB
> SUnreclaim: 7192 kB
> KernelStack: 1672 kB
> PageTables: 1580 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 94124 kB
> Committed_AS: 316376 kB
> VmallocTotal: 720896 kB
> VmallocUsed: 26444 kB
> VmallocChunk: 666460 kB

So my results look similar to yours. I'm starting with more free memory, so it's odd that you're getting further along. My kernel is also not able to reclaim about 3MB. I'll try setting the parameters you suggest and see if I get any further.
(In reply to Thomas Zimmermann from comment #8)
> From these results, I'd say this is not a problem within Linux, but CTR
> using too much memory.

I should have added here: this is indeed the case; the CTR app is egregious. We're running code that was designed to run in IE on "desktop". But it's a good test case for our low-memory handling. Since we don't control the code of third-party apps like this, we'll need to deal with memory hogs.
With the kernel params tweaked, I also see
> Inactive(file): 6620 kB
but the results are the same. My logging isn't very precise, so maybe this memory is actually being reclaimed and the timing is just bad.
Comment 12•12 years ago
The JS engine might mlock stacks. Grep the source for mlock.
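Following up on that suggestion, a grep along these lines would surface the call sites. This is a sketch: the throwaway tree and sample file are made up so the command runs anywhere; on a real checkout you would point SRC at the JS engine sources instead.

```shell
# Hypothetical sketch: locate mlock()/mlockall() call sites in a source tree.
# A throwaway tree is created here for illustration; on a real Gecko checkout
# you would set SRC to the JS engine directory instead.
SRC=$(mktemp -d)
cat > "$SRC/stack.c" <<'EOF'
int init(void) { return mlock(base, len); }
EOF
grep -rn -E '\bmlock(all)?\s*\(' "$SRC"
```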
Updated•12 years ago
Summary: Cut-the-Rope kills b2g on otoro (with SIGKILL) → Cut-the-Rope subprocess is OOM'ing
The PSS for the CTR process gets up to 90MB+ before it dies its OOM death. This is in the neighborhood of the available memory we were targeting for app processes for v1, so this isn't necessarily a bug. However, what's somewhat disturbing is that it appears the memory usage of the b2g process has gone up significantly since the last time I measured, possibly up to 8MB USS. That probably explains why I can't play the first level anymore. My kingdom for about:memory --> logcat! :)
Comment 14•12 years ago
> My kingdom for about:memory --> logcat! :)

Actively working on it, bug 788021.
Hum, after a reboot, the b2g process was using *10 MB* less USS, and I got past the first level on CTR. Trying to load the second level, its PSS usage goes past 95MB and then it's OOM killed.
Comment 16•12 years ago
Thanks everyone. I'm currently investigating these unevictable pages. Searching through the sources hasn't been successful, so I'm working on a kernel module to output the relevant bits of the kernel's VMAs.
Comment 17•12 years ago
> Thanks everyone. I'm currently investigating these unevictable pages.
Sorry...what exactly are you doing?
Comment 18•12 years ago
(In reply to Justin Lebar [:jlebar] from comment #17)
> > Thanks everyone. I'm currently investigating these unevictable pages.
>
> Sorry...what exactly are you doing?

Typically, almost all cached memory pages can be evicted from the pagecache. The output of /proc/meminfo in comments 8 and 9 shows 9 MiBs of memory that cannot be evicted by Linux's flusher threads. I'm trying to find the owner of this memory; afterwards I'll try to make the respective memory pages evictable.
Comment 19•12 years ago
I found that the unevictable pages are graphics buffers within '/system/b2g/b2g' (pid 105), which get allocated when CTR uses canvas elements. Shown below is the output of the kernel module I used for inspecting the kernel's VMAs. The format roughly resembles the content of /proc/<pid>/maps.
> <6>[09-11 22:21:57.067] [879: insmod]pid nr is 105
> <6>[09-11 22:21:57.067] [879: insmod]47900000-47a50000 rw-sl 00000000 00:04 3358
> <6>[09-11 22:21:57.067] [879: insmod] /dev/ashmem/gralloc-buffer-4a89dd40 (deleted)
> <6>[09-11 22:21:57.067] [879: insmod]
> <6>[09-11 22:21:57.067] [879: insmod]4d8fb000-4db3b000 rw-sl 00000000 00:04 3743
> <6>[09-11 22:21:57.067] [879: insmod] /dev/ashmem/gralloc-buffer-4a93d640 (deleted)
> <6>[09-11 22:21:57.067] [879: insmod]
> <6>[09-11 22:21:57.067] [879: insmod]4db3b000-4dd7b000 rw-sl 00000000 00:04 3781
> <6>[09-11 22:21:57.067] [879: insmod] /dev/ashmem/gralloc-buffer-4aa368c0 (deleted)
> <6>[09-11 22:21:57.067] [879: insmod]
We obviously cannot throw away the graphics buffers, but TowerJelly does not have unevictable pages.
- Does this only depend on the application, or is there anything to optimize at the lower levels?
- Or does this depend on the use of pmem vs ashmem?
It would be helpful to me if someone with knowledge in the graphics system could comment on these questions. Thanks!
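To put a number on those mappings, the region sizes in a maps-style dump can simply be summed. A small sketch (not the kernel module itself); the `SAMPLE` lines are the three gralloc regions from the dump above with the insmod log prefix stripped:

```python
# Sketch (not from the bug): sum the sizes of gralloc-buffer mappings from
# /proc/<pid>/maps-style lines like the kernel-module dump above.
import re

SAMPLE = """\
47900000-47a50000 rw-sl 00000000 00:04 3358 /dev/ashmem/gralloc-buffer-4a89dd40 (deleted)
4d8fb000-4db3b000 rw-sl 00000000 00:04 3743 /dev/ashmem/gralloc-buffer-4a93d640 (deleted)
4db3b000-4dd7b000 rw-sl 00000000 00:04 3781 /dev/ashmem/gralloc-buffer-4aa368c0 (deleted)
"""

RANGE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s")

def gralloc_bytes(text):
    """Total bytes covered by gralloc-buffer mappings in the given text."""
    total = 0
    for line in text.splitlines():
        if "gralloc-buffer" not in line:
            continue
        m = RANGE.match(line)
        if m:
            total += int(m.group(2), 16) - int(m.group(1), 16)
    return total

if __name__ == "__main__":
    n = gralloc_bytes(SAMPLE)
    print(f"{n} bytes ({n / (1 << 20):.1f} MiB) in gralloc buffers")
```

The three regions shown in the dump alone cover close to 6 MiB, which accounts for a large share of the ~9 MiB of unevictable memory reported in /proc/meminfo.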
Ah! Nice find.

In the course of normal operation, we use pmem-backed "gralloc" memory for gfx. But it's only allocated from a fixed-size pool, so it may run out even when there's RAM still available. This game is running at something like 16x the screen resolution of the otoro, and lord knows what crazy things it might be doing. I always see it exhaust gralloc memory when it runs. Usually exhausting gralloc just results in allocation failures, but the vendor allocator library here will fall back on ashmem when pmem is gone, transparent to the caller.

The semantics of ashmem are a little odd: as I recall it can be in one of two states, pinned or not-pinned. When it's not pinned, the kernel is free to throw away the backing memory whenever it wants. When it's pinned, the kernel can't do that. IIRC ashmem segments start out pinned. So I would bet that ashmem pinning is implemented by the same mechanism as mlock(), and the mlock()'d memory here is for gfx buffers.

The problem is, we can't unpin this memory without causing crashes or graphical corruption. So I guess we're doing about as well as we can do here :/. Need to look for wins elsewhere. (I have an idea that might let us not run out of pmem so soon.)
Comment 21•12 years ago
Ah, ok. Thanks for the info.
Comment 22•12 years ago
We could experiment with linking /dev/pmem and /dev/pmem_adsp together, but then we'd need an in-gecko mechanism to evict gralloc buffers from pmem when it comes time to turn on the camera or do OMX video decoding.
Good to know, thanks. We have another 5MB win in hand, and I'm starting to think we may be able to get by with 8MB of gralloc, which would get us another 5MB. We're already at the initial target for available app memory, so at this point it's probably not worth the risk of trying that.
Updated•12 years ago
Assignee: tzimmermann → nobody
Status: ASSIGNED → NEW
Updated•12 years ago
Whiteboard: [3rd-party-preloaded-apps]
Memory usage has improved enough to run the unmodified game.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
Whiteboard: [3rd-party-preloaded-apps]