Closed Bug 959966 Opened 10 years ago Closed 10 years ago

[tarako] Avoid preallocated process from being killed for low memory devices.

Categories

(Core :: IPC, defect)

All
Gonk (Firefox OS)
defect
Not set
normal

Tracking

RESOLVED FIXED
mozilla31
blocking-b2g 1.3T+
Tracking Status
b2g-v1.3T --- fixed

People

(Reporter: sinker, Assigned: cyu)

References

Details

(Whiteboard: [demo])

Attachments

(2 files, 4 obsolete files)

Tarako used to struggle with a shortage of memory, but that is no longer the case (see bug 945174).  App launch time is now a prominent problem on Tarako.  The preallocated process is usually killed on Tarako.  From a rough study, launching an app without the preallocated process could add ~1s to the launch time.

This bug conflicts with bug 908995.  We may close bug 908995 once this bug is proven to improve launch time.
typo s/bug 908995/bug 947571/g
Ultimately won't it be possible to avoid the need for a pre-allocated process with NUWA enabled?  I thought that was our desired end goal.
It could be, once we resolve all the IPC issues of preload slow things.  For now, a preallocated process costs only 4xxk bytes of USS, so going without a preallocated process gains little.  So I don't plan to resolve the preload slow things issues for now.  If someone is interested in doing it, just go ahead.  If no one starts it, I will find someone to do it once we are free from the current Tarako project.
WIP: This raises the priority of the Nuwa process and the preallocated process (as high as the b2g process) so it won't be killed by the lowmem killer.
Depends on: 957509
Make this bug depend on bug 957509. We need to reduce the USS of the preallocated process so that even if we give it a high priority it still doesn't require much memory.
See Also: → 947571
Summary: [taroko] Avoid preallocated process from being killed for low memory devices. → [tarako] Avoid preallocated process from being killed for low memory devices.
Whiteboard: [tarako]
use 1.3T?, remove [tarako] whiteboard
blocking-b2g: --- → 1.3T?
Whiteboard: [tarako]
triage: 1.3T+ for tarako
Assignee: nobody → cyu
blocking-b2g: 1.3T? → 1.3T+
Whiteboard: [demo]
Attachment 8360989 [details] [diff] [review] makes the foreground application have nice 18, not 1.

The reason is that it initially sets the preallocated process priority to PROCESS_PRIORITY_MASTER, which makes ComputeCPUPriority() return PROCESS_CPU_PRIORITY_LOW when ParticularProcessPriorityManager::SetPriorityNow() is called for the other processes, e.g. the foreground one. Note that PROCESS_CPU_PRIORITY_LOW maps to "hal.processPriorityManager.gonk.LowCPUNice", which is 18.

Should we set it to PROCESS_PRIORITY_FOREGROUND instead?
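The interaction described in this comment can be sketched as a small model. This is only an illustration of the behaviour reported here, not the actual ComputeCPUPriority() code: the enum values, the function name foreground_app_nice() and the threshold check are assumptions, while the nice values 18 and 1 come from the comment.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical values; only the relative order matters in this sketch.
    BACKGROUND = 0
    FOREGROUND = 1
    FOREGROUND_HIGH = 2
    MASTER = 3


LOW_CPU_NICE = 18     # "hal.processPriorityManager.gonk.LowCPUNice", per the comment
FOREGROUND_NICE = 1   # nice value the foreground app is expected to get, per the comment


def foreground_app_nice(other_child_priorities):
    """Return the nice value a foreground app ends up with.

    Assumption: if any other child process is at FOREGROUND_HIGH or above,
    the foreground app is demoted to the low CPU priority.
    """
    if any(p >= Priority.FOREGROUND_HIGH for p in other_child_priorities):
        return LOW_CPU_NICE
    return FOREGROUND_NICE


# Preallocated process at MASTER: the foreground app is deprioritized.
nice_with_master_prealloc = foreground_app_nice([Priority.MASTER])
# Preallocated process at FOREGROUND: the foreground app keeps its nice value.
nice_with_foreground_prealloc = foreground_app_nice([Priority.FOREGROUND])
```

Under these assumptions, a preallocated process at PROCESS_PRIORITY_FOREGROUND would not push the foreground app to nice 18, which is the motivation for the question above.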
(In reply to Ting-Yu Chou [:ting] from comment #8)
> The reason is it sets preallocated process priority to
> PROCESS_PRIORITY_MASTER initially, which makes ComputeCPUPriority() to
> return PROCESS_CPU_PRIORITY_LOW when
> ParticularProcessPriorityManager::SetPriorityNow() is called for the other
> processes e.g., the foreground one. Be noted PROCESS_CPU_PRIORITY_LOW maps
> to "hal.processPriorityManager.gonk.LowCPUNice" which is 18.
> 
> Should we set it PROCESS_PRIORITY_FOREGROUND instead?

Careful with PROCESS_PRIORITY_MASTER, that's reserved for the main b2g process and should not be used. I suggest using PROCESS_PRIORITY_BACKGROUND_PERCEIVABLE instead, here's why:

- If you use PROCESS_PRIORITY_MASTER or PROCESS_PRIORITY_FOREGROUND_HIGH, then in a low memory condition the LMK will try to kill the largest application at that level, which means that it could kill the master process or a process doing something very important (e.g. receiving a call). We obviously don't want that.

- If you use PROCESS_PRIORITY_FOREGROUND the same thing will happen with the foreground app: it will always be chosen by the LMK over the preallocated process. Are we sure we want that?

If we use PROCESS_PRIORITY_BACKGROUND_PERCEIVABLE then the killing will go like this:

- First the PRIORITY_BACKGROUND apps
- Then the homescreen (BACKGROUND_HOMESCREEN)
- Then the largest of the background perceivable apps. If we have something like the music player in the background that's gonna be it, otherwise it will be the preallocated process if we set it at that level
- Then the foreground apps and so on
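The killing order listed above can be sketched as a sort over (level, size). This is a simplified model that ignores the free-memory thresholds at which each level becomes eligible; the rank numbers, process names and sizes are made up for illustration.

```python
# Hypothetical kill ranks: a higher rank is killed earlier.
KILL_RANK = {
    "BACKGROUND": 3,
    "BACKGROUND_HOMESCREEN": 2,
    "BACKGROUND_PERCEIVABLE": 1,
    "FOREGROUND": 0,
}


def kill_order(processes):
    """Order (name, level, size_mb) tuples the way the list above describes:
    lowest-priority level first, and the largest process first within a level."""
    ranked = sorted(processes, key=lambda p: (-KILL_RANK[p[1]], -p[2]))
    return [name for name, _level, _size_mb in ranked]


# Made-up process list; sizes are illustrative only.
PROCESSES = [
    ("foreground app", "FOREGROUND", 30),
    ("music player", "BACKGROUND_PERCEIVABLE", 20),
    ("preallocated", "BACKGROUND_PERCEIVABLE", 4),
    ("homescreen", "BACKGROUND_HOMESCREEN", 15),
    ("background app", "BACKGROUND", 10),
]

ORDER = kill_order(PROCESSES)
```

With these inputs the music player is chosen before the preallocated process because it is the larger of the two at the same level, and the foreground app is chosen last.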
blocking-b2g: 1.3T+ → 1.3T?
Sorry for incorrectly setting the flag to 1.3T?. Please change it back to 1.3T
Comment on attachment 8360989 [details] [diff] [review]
WIP: Adjust  the priority of the Nuwa and preallocated process

We need this change to keep the preallocated process from being killed.
Attachment #8360989 - Flags: review?(khuey)
(In reply to Cervantes Yu from comment #11)
> We need this change to make the preallocated process not killed.

Please don't use PRIORITY_MASTER for the preallocated process. That level is meant for the b2g main process only; use PRIORITY_FOREGROUND_HIGH if you really want to keep the preallocated process alive, but not PRIORITY_MASTER.
blocking-b2g: 1.3T? → 1.3T+
Letting the preallocated process have a high priority seems to be a double-edged sword. It keeps the process alive, but it can also compete with the foreground app for CPU time (like GC due to memory pressure). Maybe we need to create another priority class that is unlikely to be killed but is nicer to other processes.
This patch adds a new priority class for the preallocated process so it doesn't get killed easily and doesn't compete for CPU time with other processes.
Attachment #8360989 - Attachment is obsolete: true
Attachment #8360989 - Flags: review?(khuey)
Attachment #8378201 - Flags: review?(khuey)
Attachment #8378201 - Flags: feedback?(gsvelto)
> This patch adds a new priority class for the preallocated process so it
> doesn't get killed easily and doesn't compete for CPU time with other
> processes.

Tested this patch against the slow ringtone issue: this version keeps the preallocated process alive and doesn't compete for CPU with the newly created app.
Comment on attachment 8378201 [details] [diff] [review]
Avoid the background process from being killed

Unfortunately this won't work as you're introducing a 7th KillUnderMB entry and the kernel can only accept a maximum of 6 (see the comment just above the first entries). What you can do however is to just set an OomScoreAdjust parameter and nice value such as we do for the FOREGROUND_KEYBOARD level. With the parameters you set this should be enough to keep it alive above the 5MB free memory mark and my feedback would become a + :)
Attachment #8378201 - Flags: feedback?(gsvelto) → feedback-
I tested ringtone with the preallocated process at the FOREGROUND_KEYBOARD priority level. With the original patch, ringtone starts faster than with the FOREGROUND_KEYBOARD level.
This loosens the constraint that we can only have 6 ProcessPriorities. Multiple ProcessPriorities can share an LMK parameter. This allows for PROCESS_PRIORITY_PREALLOC.
Attachment #8378904 - Flags: feedback?(gsvelto)
Comment on attachment 8378904 [details] [diff] [review]
Part 2: Allow more than 6 process priorities

Review of attachment 8378904 [details] [diff] [review]:
-----------------------------------------------------------------

This change is not really needed: the original code sets the LMK parameters only if a certain priority level has both the killUnderMB and oomScoreAdj entries. This is a bit obscure in the original code; the key to this mechanism is the |continue| statement here that skips a level if it can't find its killUnderMB value:

http://hg.mozilla.org/mozilla-central/file/660b62608951/hal/gonk/GonkHal.cpp#l1252

I'd be glad if instead of adding these changes you'd add a comment clarifying the existing behavior.

As I've said in my previous feedback your first patch (attachment 8378201 [details] [diff] [review]) is fine, the only change you must make is to remove the killUnderMB parameter for the new PREALLOC priority and the rest should work without the need for further modifications. So this:

pref("hal.processPriorityManager.gonk.PREALLOC.OomScoreAdjust", 67);
pref("hal.processPriorityManager.gonk.PREALLOC.Nice", 19);

Instead of:

pref("hal.processPriorityManager.gonk.PREALLOC.OomScoreAdjust", 67);
pref("hal.processPriorityManager.gonk.PREALLOC.KillUnderMB", 5);
pref("hal.processPriorityManager.gonk.PREALLOC.Nice", 19);
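The skip mechanism described in this review can be sketched as follows. This is a simplified model, not the GonkHal.cpp code: the function name lmk_parameters() is hypothetical, and every pref value except the two PREALLOC entries quoted above is made up for illustration.

```python
PREFIX = "hal.processPriorityManager.gonk."

# Hypothetical pref table. Only the PREALLOC values (67 and 19) come from the
# review; the other levels' values are placeholders.
PREFS = {
    PREFIX + "FOREGROUND.OomScoreAdjust": 134,
    PREFIX + "FOREGROUND.KillUnderMB": 6,
    PREFIX + "PREALLOC.OomScoreAdjust": 67,
    PREFIX + "PREALLOC.Nice": 19,
    PREFIX + "BACKGROUND.OomScoreAdjust": 667,
    PREFIX + "BACKGROUND.KillUnderMB": 20,
}

LEVELS = ["FOREGROUND", "PREALLOC", "BACKGROUND"]


def lmk_parameters(prefs, levels):
    """Collect (level, oom_score_adj, kill_under_mb) for levels that define both
    prefs; a level without a KillUnderMB entry contributes no LMK entry."""
    params = []
    for level in levels:
        kill_under_mb = prefs.get(PREFIX + level + ".KillUnderMB")
        oom_score_adj = prefs.get(PREFIX + level + ".OomScoreAdjust")
        if kill_under_mb is None or oom_score_adj is None:
            continue  # skipped: this level does not add an LMK parameter
        params.append((level, oom_score_adj, kill_under_mb))
    return params


LMK_PARAMS = lmk_parameters(PREFS, LEVELS)
```

In this model PREALLOC still gets its own OomScoreAdjust and nice value, but it does not consume one of the six LMK slots because it has no KillUnderMB entry.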
Attachment #8378904 - Flags: feedback?(gsvelto) → feedback-
Attachment #8378201 - Attachment is obsolete: true
Attachment #8378904 - Attachment is obsolete: true
Attachment #8378201 - Flags: review?(khuey)
Attachment #8378937 - Flags: review?(khuey)
Attachment #8378937 - Flags: review?(gsvelto)
Instead of nice value 19, I used 18, since the comment says NSPR adds 1 to it for lower priority threads.
Comment on attachment 8378937 [details] [diff] [review]
Avoid the preallocated process from being killed.

Review of attachment 8378937 [details] [diff] [review]:
-----------------------------------------------------------------

> Instead of nice value 19, I used 18 since the comment says NSPR add 1 to it
> for lower priority threads.

Excellent, sorry I didn't mention it (I wrote that NSPR code but forgot about it :-p).
Attachment #8378937 - Flags: review?(gsvelto) → review+
Comment on attachment 8378937 [details] [diff] [review]
Avoid the preallocated process from being killed.

Review of attachment 8378937 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, lost track of this one. r=me
Attachment #8378937 - Flags: review?(khuey) → review+
Component: General → IPC
Product: Firefox OS → Core
Cervantes, is this ready for landing? Thanks
Flags: needinfo?(cyu)
I saw this patch fail a local mochitest run on a Linux build. I am fixing the bug.
Flags: needinfo?(cyu)
The updated patch that fixes mochitest-2 crash.

Try submission: https://tbpl.mozilla.org/?tree=Try&rev=eccd97b7ae12
Attachment #8378937 - Attachment is obsolete: true
Attachment #8392800 - Flags: review+
For the record, the mochitest-2 crash results from an assertion failure in ParticularProcessPriorityManager::SetPriorityNow() with PROCESS_PRIORITY_UNKNOWN. PROCESS_PRIORITY_PREALLOC shouldn't be placed under PROCESS_PRIORITY_FOREGROUND_HIGH, because it will then be treated as a high-priority process and incorrectly tracked in mHighPriorityChildIDs. The updated patch changes the order so that the prealloc process isn't treated as a high-priority process like the foreground or foreground-high ones.
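The ordering problem can be illustrated with a small sketch. It assumes that "high priority" is decided by comparing a priority against PROCESS_PRIORITY_FOREGROUND_HIGH; the two enums below are hypothetical and shortened, and only the position of PREALLOC relative to FOREGROUND_HIGH is meant to reflect the comment.

```python
from enum import IntEnum


class OrderBeforeFix(IntEnum):
    # Hypothetical, shortened ordering: PREALLOC listed after FOREGROUND_HIGH.
    BACKGROUND = 0
    FOREGROUND = 1
    FOREGROUND_HIGH = 2
    PREALLOC = 3
    MASTER = 4


class OrderAfterFix(IntEnum):
    # Hypothetical, shortened ordering: PREALLOC moved before FOREGROUND_HIGH.
    BACKGROUND = 0
    PREALLOC = 1
    FOREGROUND = 2
    FOREGROUND_HIGH = 3
    MASTER = 4


def is_high_priority(priority, order):
    """Assumed check: anything at or above FOREGROUND_HIGH counts as high priority."""
    return priority >= order.FOREGROUND_HIGH


TRACKED_BEFORE = is_high_priority(OrderBeforeFix.PREALLOC, OrderBeforeFix)
TRACKED_AFTER = is_high_priority(OrderAfterFix.PREALLOC, OrderAfterFix)
```

With the first ordering the preallocated process would be counted among the high-priority children; with the second it would not.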
Keywords: checkin-needed
 https://hg.mozilla.org/integration/b2g-inbound/rev/3efee7a512e1

Cervantes, please apply for L3 access. I'll vouch.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/3efee7a512e1
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
We want to raise the "hal.processPriorityManager.gonk.PREALLOC.OomScoreAdjust" value above "hal.processPriorityManager.gonk.FOREGROUND_HIGH.OomScoreAdjust".

Two reasons:
1. Fabrice has merged the home screen into the system app, so b2g is bigger now.
2. We need to pass the top 3rd-party app test cases, so we should keep the foreground app and kill the preallocated process on LMK/OOM.
Flags: needinfo?(ttsai)
Flags: needinfo?(tlee)
Flags: needinfo?(styang)
Flags: needinfo?(kkuo)
Flags: needinfo?(gal)
Flags: needinfo?(fabrice)
Have you done a full test pass on 3rd party apps with the in-process homescreen? 
The preallocated process ends up using very little memory under memory pressure:

fabrice@fabrice-x240:~/dev/birch$ b2g-info 
                          |     megabytes     |
           NAME  PID PPID CPU(s) NICE  USS  PSS  RSS VSIZE OOM_ADJ USER    
            b2g 1268    1  138.9    0 19.6 23.0 27.7 132.8       0 root    
         (Nuwa) 1302 1268    1.2    0  0.0  0.2  1.0  45.2       0 root    
        Twitter 3898 1302   42.4    1 23.0 26.8 32.0  82.9       2 app_3898
(Preallocated a 4004 1302    2.3   18  0.3  1.8  4.6  52.2       1 root
Flags: needinfo?(fabrice)
Below is a log from danny.liang.

We think it is very likely a memory fragmentation problem.

Free memory was plentiful, but free swap was low.

So we prefer to kill the preallocated process and keep the foreground app.

05-23 04:27:36.243 <4>0[ 4074.723214] Normal: 4885*4kB 1192*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 29076kB

05-23 04:27:36.243 <4>0[ 4074.723301] Free swap  = 2036kB
05-23 04:27:36.243 <4>0[ 4074.723310] Total swap = 65532kB


log start ---------
05-23 04:27:36.243 <4>0[ 4074.722641] lowmem_shrink select 29490 (YouTube), adj 2, size 8049, to kill
05-23 04:27:36.243 <4>0[ 4074.722669] lowmem_shrink send sigkill to 29490 (YouTube), adj 2, size 8049
05-23 04:27:36.243 <4>0[ 4074.722685] kswapd0 invoked lowmemorykiller: gfp_mask=0xd0, oom_adj=0, oom_score_adj=0
05-23 04:27:36.243 <4>0[ 4074.722698] Backtrace: 
05-23 04:27:36.243 <4>0[ 4074.722750] [<c4537ad8>] (dump_backtrace+0x0/0x110) from [<c488a278>] (dump_stack+0x18/0x1c)
05-23 04:27:36.243 <4>0[ 4074.722861]  r7:c70d5f78 r6:000000d0 r5:c70d4000 r4:c64357a0
05-23 04:27:36.243 <4>0[ 4074.722905] [<c488a260>] (dump_stack+0x0/0x1c) from [<c4779968>] (lowmem_shrink+0x480/0x57c)
05-23 04:27:36.243 <4>0[ 4074.722931] [<c47794e8>] (lowmem_shrink+0x0/0x57c) from [<c45a4960>] (shrink_slab+0x114/0x1c0)
05-23 04:27:36.243 <4>0[ 4074.722954] [<c45a484c>] (shrink_slab+0x0/0x1c0) from [<c45a4f94>] (kswapd+0x588/0x924)
05-23 04:27:36.243 <4>0[ 4074.722979] [<c45a4a0c>] (kswapd+0x0/0x924) from [<c456b7d0>] (kthread+0x8c/0x94)
05-23 04:27:36.243 <4>0[ 4074.723003] [<c456b744>] (kthread+0x0/0x94) from [<c45547a4>] (do_exit+0x0/0x5fc)
05-23 04:27:36.243 <4>0[ 4074.723017]  r7:00800013 r6:c45547a4 r5:c456b744 r4:c7029f20
05-23 04:27:36.243 <4>0[ 4074.723041] Mem-info:
05-23 04:27:36.243 <4>0[ 4074.723050] Normal per-cpu:
05-23 04:27:36.243 <4>0[ 4074.723061] CPU    0: hi:   42, btch:   7 usd:  41
05-23 04:27:36.243 <4>0[ 4074.723085] active_anon:2280 inactive_anon:2315 isolated_anon:14
05-23 04:27:36.243 <4>0[ 4074.723093]  active_file:656 inactive_file:651 isolated_file:0
05-23 04:27:36.243 <4>0[ 4074.723101]  unevictable:83 dirty:1 writeback:2 unstable:0
05-23 04:27:36.243 <4>0[ 4074.723108]  free:7269 slab_reclaimable:249 slab_unreclaimable:1176
05-23 04:27:36.243 <4>0[ 4074.723116]  mapped:2471 shmem:528 pagetables:400 bounce:0
05-23 04:27:36.243 <4>0[ 4074.723151] Normal free:29076kB min:1352kB low:2200kB high:2540kB active_anon:9120kB inactive_anon:9260kB active_file:2624kB inactive_file:2604kB unevictable:332kB isolated(anon):56kB isolated(file):0kB present:114408kB mlocked:0kB dirty:4kB writeback:8kB mapped:9884kB shmem:2112kB slab_reclaimable:996kB slab_unreclaimable:4704kB kernel_stack:2104kB pagetables:1600kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:12 all_unreclaimable? no
05-23 04:27:36.243 <4>0[ 4074.723196] lowmem_reserve[]: 0 0 0
05-23 04:27:36.243 <4>0[ 4074.723214] Normal: 4885*4kB 1192*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 29076kB
05-23 04:27:36.243 <4>0[ 4074.723267] 4165 total pagecache pages
05-23 04:27:36.243 <4>0[ 4074.723277] 199 pages in swap cache
05-23 04:27:36.243 <4>0[ 4074.723289] Swap cache stats: add 425325, delete 425126, find 18961/212896
05-23 04:27:36.243 <4>0[ 4074.723301] Free swap  = 2036kB
05-23 04:27:36.243 <4>0[ 4074.723310] Total swap = 65532kB
05-23 04:27:36.243 <4>0[ 4074.724865] 28854 pages of RAM
05-23 04:27:36.243 <4>0[ 4074.724875] 7944 free pages
05-23 04:27:36.243 <4>0[ 4074.724883] 1709 reserved pages
05-23 04:27:36.243 <4>0[ 4074.724892] 1021 slab pages
05-23 04:27:36.243 <4>0[ 4074.724900] 6384 pages shared
05-23 04:27:36.243 <4>0[ 4074.724908] 199 pages swap cached
05-23 04:27:36.243 <6>0[ 4074.724918] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
05-23 04:27:36.243 <6>0[ 4074.724949] [   65]     0    65       78       25   0     -16          -941 ueventd
05-23 04:27:36.243 <6>0[ 4074.724972] [   80]  1000    80      207       15   0     -16          -941 servicemanager
05-23 04:27:36.243 <6>0[ 4074.724992] [   81]     0    81     1005       29   0     -16          -941 vold
05-23 04:27:36.243 <6>0[ 4074.725012] [   82]  1000    82       62        3   0     -16          -941 vcharged
05-23 04:27:36.243 <6>0[ 4074.725032] [   83]  1000    83      170        5   0     -16          -941 sprd_monitor
05-23 04:27:36.243 <6>0[ 4074.725052] [   87]  1001    87      205       10   0     -16          -941 rilproxy
05-23 04:27:36.243 <6>0[ 4074.725072] [   88]     0    88     1855       36   0     -16          -941 netd
05-23 04:27:36.243 <6>0[ 4074.725092] [   89]     0    89      627       26   0     -16          -941 debuggerd
05-23 04:27:36.243 <6>0[ 4074.725112] [   91]  1002    91      333        7   0     -16          -941 dbus-daemon
05-23 04:27:36.243 <6>0[ 4074.725132] [   92]  1017    92      433        5   0     -16          -941 keystore
05-23 04:27:36.243 <6>0[ 4074.725153] [   93]     0    93      462        7   0     -16          -941 mfserial
05-23 04:27:36.243 <6>0[ 4074.725173] [   94]     0    94      319       10   0     -16          -941 nvm_daemon
05-23 04:27:36.243 <6>0[ 4074.725193] [   95]  1000    95      461       10   0     -16          -941 modemd
05-23 04:27:36.243 <6>0[ 4074.725213] [   99]  1001    99      205       10   0     -16          -941 rilproxy
05-23 04:27:36.243 <6>0[ 4074.725233] [  101]     0   101      278        7   0     -16          -941 engmoded
05-23 04:27:36.244        18   0     -16          -941 phoneserver_2si
05-23 04:27:36.244 <6>0[ 4074.725274] [  171]  1000   171     2103       18   0     -16          -941 engpcclient
05-23 04:27:36.244 <6>0[ 4074.725294] [  172]  1000   172      468        6   0     -16          -941 engmodemclient
05-23 04:27:36.244 <6>0[ 4074.725315] [  173]  1000   173      464        4   0     -16          -941 engservice
05-23 04:27:36.244 <6>0[ 4074.725335] [  346]     0   346      462        4   0     -16          -941 mfserial
05-23 04:27:36.244 <6>0[ 4074.725355] [  364]     0   364     1327       61   0     -16          -941 slog
05-23 04:27:36.244 <6>0[ 4074.725375] [ 3146]     0  3146     1633       55   0     -16          -941 adbd
05-23 04:27:36.244 <6>0[ 4074.725395] [13638]  1001 13638     2419       28   0     -16          -941 rild_sp
05-23 04:27:36.244 <6>0[ 4074.725415] [13639]  1001 13639     1907       24   0     -16          -941 rild_sp
05-23 04:27:36.244 <6>0[ 4074.725434] [13657]     0 13657    32964     3698   0       0             0 b2g
05-23 04:27:36.244 <6>0[ 4074.725454] [13811]     0 13811    11715      632   0       0             0 (Nuwa)
05-23 04:27:36.244 <6>0[ 4074.725476] [25747]  1010 25747      628       36   0     -16          -941 wpa_supplicant
05-23 04:27:36.244 <6>0[ 4074.725497] [26511]  1014 26511      229       22   0     -16          -941 dhcpcd
05-23 04:27:36.244 <6>0[ 4074.725517] [28052]  1013 28052     7581      415   0     -16          -941 mediaserver
05-23 04:27:36.244 <6>0[ 4074.725537] [29490] 39490 29490    22601     3930   0       2           134 YouTube
05-23 04:27:36.244 <6>0[ 4074.725557] [31445]     0 31445    13761     1113   0       1            67 (Preallocated a
05-23 04:27:36.244 <6>0[ 4074.725578] [32348]     0 32348      189       92   0       0             0 sh
05-23 04:27:36.244 <6>0[ 4074.725597] [32354]     0 32354      180       60   0       0             0 cat
05-23 04:27:36.244 <6>0[ 4074.725616] [32356]     0 32356        2        1   0       0             0 sh
05-23 04:27:36.244 <6>0[ 4074.725634] [32358]     0 32358      176       15   0       0             0 sh
05-23 04:27:36.244 <4>0[ 4074.725655] zram0 status unit(page):
05-23 04:27:36.244 <4>0[ 4074.725660]      mem_used_total:  6611 
05-23 04:27:36.244 <4>0[ 4074.725666]      compr_data_size: 6415 
05-23 04:27:36.244 <4>0[ 4074.725672]      orig_data_size:  15520 
05-23 04:27:36.244 <4>0[ 4074.725678]      num_reads:       193935 
05-23 04:27:36.244 <4>0[ 4074.725683]      num_writes:      231212
(In reply to ying.xu from comment #33)
> below is some log ,from danny.liang
> 
> we think it was a memory fragmentation problem with high possibility.
> 
> free memory was plenty but free swap was low.
> 
> So we prefer to kill the Preallocated process to keep the foreground app .

I don't see how you reach that conclusion from your reasons.  Could you explain it more?
Flags: needinfo?(tlee)
(In reply to James Zhang from comment #31)
> We want to change the
> "hal.processPriorityManager.gonk.PREALLOC.OomScoreAdjust" value higher than
> "hal.processPriorityManager.gonk.FOREGROUND_HIGH.OomScoreAdjust".
> 
> Two reason.
> 1. Fabrice has grouped home screen to system app, b2g is bigger now.
> 2. We need pass top 3rd-party test case, we should keep foreground app and
> kill preallocated process when LMK/OOM.

How much USS did you see the preallocated process consume? Remember that we made the adjustment in this bug because the preallocated process was repeatedly forked and killed by the lowmem killer when its priority was too low. If we lower the priority we could bring the problem back and cause regressions. Even making it lower than the foreground high priority might cause the regression when there is a foreground high app.

After launch you may see the preallocated process have about 4MB of USS. We can reduce this number to ~0.4 MB by not calling PreloadSlowThings() in the preallocated process. The cost is that app launch time will increase by ~500 ms. If this is acceptable we can make this change.
(In reply to Thinker Li [:sinker] from comment #34)
> (In reply to ying.xu from comment #33)
> > below is some log ,from danny.liang
> > 
> > we think it was a memory fragmentation problem with high possibility.
> > 
> > free memory was plenty but free swap was low.
> > 
> > So we prefer to kill the Preallocated process to keep the foreground app .
> 
> I don't get how you get the conclusion from your reasons.  Could you explain
> it more?

Normally, if there is enough free memory (above the kernel watermark), kswapd won't run at all.
But in this case there was plenty of free memory and kswapd still invoked the LMK,
which means kswapd was working, swapping physical memory out to the swap disk.

The only explanation for this situation is that
one thread was allocating a large block of memory (at least larger than 8K), but no free block was large enough to satisfy the allocation, and merging free blocks in the memory pool did not succeed.
So kswapd kept working to try to satisfy the large allocation,
which drove free swap lower and lower and caused the LMK to kill some processes.

05-23 04:27:36.243 <4>0[ 4074.722685] kswapd0 invoked lowmemorykiller: gfp_mask=0xd0, oom_adj=0, oom_score_adj=0
05-23 04:27:36.243 <4>0[ 4074.723214] Normal: 4885*4kB 1192*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 29076kB
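The fragmentation argument rests on the free-list line quoted above: the total free memory is large, but every free block is small. A minimal sketch that parses that line and reports the total and the largest available block size; the function name parse_free_list() is hypothetical.

```python
import re


def parse_free_list(line):
    """Parse a 'count*sizekB' free-list line from the kernel log.

    Returns (total_free_kb, largest_available_block_kb).
    """
    pairs = [(int(count), int(size))
             for count, size in re.findall(r"(\d+)\*(\d+)kB", line)]
    total_kb = sum(count * size for count, size in pairs)
    largest_kb = max((size for count, size in pairs if count > 0), default=0)
    return total_kb, largest_kb


# The "Normal:" line from the log above.
LINE = ("Normal: 4885*4kB 1192*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
        "0*512kB 0*1024kB 0*2048kB 0*4096kB = 29076kB")

TOTAL_KB, LARGEST_KB = parse_free_list(LINE)
# TOTAL_KB is 29076 and LARGEST_KB is 8: plenty of free memory, but no
# contiguous block larger than 8 kB.
```

This matches the reasoning in the comment: an allocation that needs more than 8 kB of contiguous memory could not be satisfied from these free lists even though about 29 MB is free.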
(In reply to Cervantes Yu from comment #35)
> How much USS did you see the preallocated process consume? Remember that we
> made the adjustment of in this bug for the problem that the preallocated
> process was repeatedly forked and killed by the lomem killer because its
> priority is too low. If we lower the priority we could bring the problem
> back and cause regressions. Even making it lower than the foreground high
> priority might cause the regression when there is a foreground high app.

I fully second Cervantes' analysis here. The reason we did this was both to improve startup time and to prevent the preallocated process from entering kill/relaunch cycles that would cause significant CPU and memory churn. Disabling this change might introduce a lot of regressions and we don't even have a way to spot them during testing since they would be dependent on the usage pattern.
We ran monkey tests to verify.
Please see this case: the preallocated process is killed and the foreground app (Camera) isn't, so we can protect the foreground app now.

05-23 22:42:08.144 <4>0[42984.376982] lowmem_shrink select 13168 (Camera), adj 2, size 12490, to kill
05-23 22:42:08.144 <4>0[42984.377003] lowmem_shrink select 13288 ((Preallocated a), adj 4, size 3233, to kill
05-23 22:42:08.144 <4>0[42984.377018] lowmem_shrink send sigkill to 13288 ((Preallocated a), adj 4, size 3233
05-23 22:42:08.144 <4>0[42984.377034] b2g invoked lowmemorykiller: gfp_mask=0x200da, oom_adj=0, oom_score_adj=0
05-23 22:42:08.144 <4>0[42984.377046] Backtrace: 
05-23 22:42:08.144 <4>0[42984.377083] [<c4537ad8>] (dump_backtrace+0x0/0x110) from [<c488a278>] (dump_stack+0x18/0x1c)
05-23 22:42:08.144 <4>0[42984.377099]  r7:c6fadcd0 r6:000200da r5:c6fac000 r4:c708a360
05-23 22:42:08.144 <4>0[42984.377135] [<c488a260>] (dump_stack+0x0/0x1c) from [<c4779968>] (lowmem_shrink+0x480/0x57c)
05-23 22:42:08.144 <4>0[42984.377160] [<c47794e8>] (lowmem_shrink+0x0/0x57c) from [<c45a4960>] (shrink_slab+0x114/0x1c0)
05-23 22:42:08.144 <4>0[42984.377181] [<c45a484c>] (shrink_slab+0x0/0x1c0) from [<c45a5518>] (try_to_free_pages+0x1e8/0x368)
05-23 22:42:08.144 <4>0[42984.377209] [<c45a5330>] (try_to_free_pages+0x0/0x368) from [<c459d230>] (__alloc_pages_nodemask+0x368/0x598)
05-23 22:42:08.144 <4>0[42984.377236] [<c459cec8>] (__alloc_pages_nodemask+0x0/0x598) from [<c45bc5a0>] (read_swap_cache_async+0x58/0x1b0)
05-23 22:42:08.144 <4>0[42984.377261] [<c45bc548>] (read_swap_cache_async+0x0/0x1b0) from [<c45bc784>] (swapin_readahead+0x8c/0x94)
05-23 22:42:08.144 <4>0[42984.377288] [<c45bc6f8>] (swapin_readahead+0x0/0x94) from [<c45aeb58>] (handle_pte_fault+0x2bc/0x6cc)
05-23 22:42:08.144 <4>0[42984.377312] [<c45ae89c>] (handle_pte_fault+0x0/0x6cc) from [<c45af4dc>] (handle_mm_fault+0xd0/0xe4)
05-23 22:42:08.144 <4>0[42984.377338] [<c45af40c>] (handle_mm_fault+0x0/0xe4) from [<c453b808>] (do_page_fault+0xe8/0x28c)
05-23 22:42:08.144 <4>0[42984.377369] [<c453b720>] (do_page_fault+0x0/0x28c) from [<c4528288>] (do_DataAbort+0x3c/0xa0)
05-23 22:42:08.144 <4>0[42984.377399] [<c452824c>] (do_DataAbort+0x0/0xa0) from [<c4533dcc>] (ret_from_exception+0x0/0x10)
05-23 22:42:08.144 <4>0[42984.377415] Exception stack(0xc6fadfb0 to 0xc6fadff8)
05-23 22:42:08.144 <4>0[42984.377430] dfa0:                                     431e7820 48d8f004 00000070 48d8f000
05-23 22:42:08.144 <4>0[42984.377451] dfc0: 431e7820 0000000e 00000000 431e7800 00000062 fffffae4 403fe270 403fe270
05-23 22:42:08.144 <4>0[42984.377469] dfe0: 485e0ca0 bed2baf8 41753fb9 41750f28 00800030 ffffffff
05-23 22:42:08.144 <4>0[42984.377481]  r7:431e7800 r6:00000000 r5:00000007 r4:0000040f
05-23 22:42:08.144 <4>0[42984.377498] Mem-info:
05-23 22:42:08.144 <4>0[42984.377506] Normal per-cpu:
05-23 22:42:08.144 <4>0[42984.377515] CPU    0: hi:   42, btch:   7 usd:  40
05-23 22:42:08.144 <4>0[42984.377536] active_anon:5752 inactive_anon:5766 isolated_anon:64
05-23 22:42:08.144 <4>0[42984.377544]  active_file:1849 inactive_file:318 isolated_file:0
05-23 22:42:08.144 <4>0[42984.377552]  unevictable:83 dirty:0 writeback:0 unstable:0
05-23 22:42:08.144 <4>0[42984.377559]  free:394 slab_reclaimable:248 slab_unreclaimable:1112
05-23 22:42:08.144 <4>0[42984.377567]  mapped:3086 shmem:1125 pagetables:405 bounce:0
05-23 22:42:08.144 <4>0[42984.377597] Normal free:1576kB min:1352kB low:2200kB high:2540kB active_anon:23008kB inactive_anon:23064kB active_file:7396kB inactive_file:1272kB unevictable:332kB isolated(anon):256kB isolated(file):0kB present:114408kB mlocked:0kB dirty:0kB writeback:0kB mapped:12344kB shmem:4500kB slab_reclaimable:992kB slab_unreclaimable:4448kB kernel_stack:2016kB pagetables:1620kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
05-23 22:42:08.144 <4>0[42984.377637] lowmem_reserve[]: 0 0 0
05-23 22:42:08.144 <4>0[42984.377650] Normal: 388*4kB 3*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1576kB
05-23 22:42:08.144 <4>0[42984.377687] 5758 total pagecache pages
05-23 22:42:08.144 <4>0[42984.377695] 335 pages in swap cache
05-23 22:42:08.144 <4>0[42984.377705] Swap cache stats: add 16881361, delete 16881026, find 544631/9116811
05-23 22:42:08.144 <4>0[42984.377716] Free swap  = 2004kB
05-23 22:42:08.144 <4>0[42984.377723] Total swap = 65532kB
05-23 22:42:08.144 <4>0[42984.379258] 28854 pages of RAM
05-23 22:42:08.144 <4>0[42984.379268] 1040 free pages
05-23 22:42:08.144 <4>0[42984.379275] 1707 reserved pages
05-23 22:42:08.144 <4>0[42984.379282] 995 slab pages
05-23 22:42:08.144 <4>0[42984.379289] 6316 pages shared
05-23 22:42:08.144 <4>0[42984.379296] 335 pages swap cached
05-23 22:42:08.144 <6>0[42984.379304] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
05-23 22:42:08.144 <6>0[42984.379335] [   65]     0    65       78       22   0     -16          -941 ueventd
05-23 22:42:08.144 <6>0[42984.379356] [   80]  1000    80      207        6   0     -16          -941 servicemanager
05-23 22:42:08.144 <6>0[42984.379375] [   81]     0    81     1005       11   0     -16          -941 vold
05-23 22:42:08.175 79392] [   82]  1000    82       62        0   0     -16          -941 vcharged
05-23 22:42:08.175 <6>0[42984.379411] [   83]  1000    83      170        0   0     -16          -941 sprd_monitor
05-23 22:42:08.175 <6>0[42984.379429] [   87]  1001    87      205        1   0     -16          -941 rilproxy
05-23 22:42:08.175 <6>0[42984.379447] [   88]     0    88     1855       20   0     -16          -941 netd
05-23 22:42:08.175 <6>0[42984.379465] [   89]     0    89      173        0   0     -16          -941 debuggerd
05-23 22:42:08.175 <6>0[42984.379483] [   91]  1002    91      342       13   0     -16          -941 dbus-daemon
05-23 22:42:08.175 <6>0[42984.379501] [   92]  1017    92      433        0   0     -16          -941 keystore
05-23 22:42:08.175 <6>0[42984.379519] [   93]     0    93      462        0   0     -16          -941 mfserial
05-23 22:42:08.175 <6>0[42984.379537] [   94]     0    94      319        3   0     -16          -941 nvm_daemon
05-23 22:42:08.175 <6>0[42984.379555] [   95]  1000    95      461        0   0     -16          -941 modemd
05-23 22:42:08.175 <6>0[42984.379573] [   99]  1001    99      205        1   0     -16          -941 rilproxy
05-23 22:42:08.175 <6>0[42984.379591] [  101]     0   101      278        0   0     -16          -941 engmoded
05-23 22:42:08.175 <6>0[42984.379609] [  166]  1000   166     7179       10   0     -16          -941 phoneserver_2si
05-23 22:42:08.175 <6>0[42984.379628] [  169]  1000   169     2103        4   0     -16          -941 engpcclient
05-23 22:42:08.175 <6>0[42984.379646] [  170]  1000   170      468        0   0     -16          -941 engmodemclient
05-23 22:42:08.175 <6>0[42984.379665] [  172]  1000   172      464        0   0     -16          -941 engservice
05-23 22:42:08.175 <6>0[42984.379683] [  311]     0   311      462        0   0     -16          -941 mfserial
05-23 22:42:08.175 <6>0[42984.379701] [  319]     0   319     1320       57   0     -16          -941 slog
05-23 22:42:08.175 <6>0[42984.379719] [  337]     0   337     1116       32   0     -16          -941 adbd
05-23 22:42:08.175 <6>0[42984.379738] [  502]  1001   502     2420       20   0     -16          -941 rild_sp
05-23 22:42:08.175 <6>0[42984.379756] [  503]  1001   503     1651       18   0     -16          -941 rild_sp
05-23 22:42:08.175 <6>0[42984.379774] [  611]     0   611    41408     5562   0       0             0 b2g
05-23 22:42:08.175 <6>0[42984.379791] [  624]     0   624    11739      112   0     -16          -941 (Nuwa)
05-23 22:42:08.175 <6>0[42984.379809] [ 8264]  1013  8264     5456       68   0     -16          -941 mediaserver
05-23 22:42:08.175 <6>0[42984.379830] [13168] 23168 13168    29727     9938   0       2           134 Camera
05-23 22:42:08.175 <6>0[42984.379847] [13267]     0 13267      189       24   0       0             0 sh
05-23 22:42:08.175 <6>0[42984.379864] [13269]     0 13269      172       34   0       0             0 orng
05-23 22:42:08.175 <6>0[42984.379882] [13288]     0 13288    13529     1011   0       4           267 (Preallocated a
(In reply to Gabriele Svelto [:gsvelto] from comment #37)
> I fully second Cervantes analysis here. The reason we did this was both to
> improve startup time and to prevent the preallocated process from entering
> kill/relaunch cycles that would cause significant CPU and memory churn.

From what I saw and tested, the cycle doesn't exist.

There are two points where the preallocated process is initialized:
ContentParent::RecvFirstIdle()
ContentParent::StartUp()

If I kill the preallocated process manually, it isn't recreated
unless a new app process is forked and ContentParent::RecvFirstIdle() is called.
(In reply to ying.xu from comment #40)
> (In reply to Gabriele Svelto [:gsvelto] from comment #37)
> > I fully second Cervantes analysis here. The reason we did this was both to
> > improve startup time and to prevent the preallocated process from entering
> > kill/relaunch cycles that would cause significant CPU and memory churn.
> 
> From what I saw and test, the cycles doesn't exist.
> 
> There are two points where the preallocated process would be initialized .
> ContentParent::RecvFirstIdle()
> ContentParent::StartUp()
> 
> and if I killed the preallocated process manually, it won't be recreated
> again.
> Unless a new app process was forked and ContentParent::RecvFirstIdl was
> called.

Just killing the preallocated process doesn't make it enter the restart cycle. That isn't the problem we are solving in this bug. The problem is that it takes a long time to launch an app if the preallocated process has a low priority:

1. The user launches an app.
2. A new process is forked from Nuwa, with low priority.
3. The new process is killed because it has a low priority.
4. We observe that the new process was killed. Because we still have an app to launch, we need to fork another process.
5. Go to step 2.

This launch/kill cycle can repeat for a long time if the preallocated process has a low priority.
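
The steps above can be sketched as a retry loop. This is only a toy model in Python, not Gecko code: `launch_app` and the `survives_lowmem_killer` callback are hypothetical names, and the callback stands in for whatever the low-memory killer decides about each freshly forked process.

```python
def launch_app(survives_lowmem_killer, max_attempts=10):
    """Return how many forks it took to launch one app, or None if every
    forked process was killed before the app could start."""
    for attempt in range(1, max_attempts + 1):
        # Step 2: fork a new process from Nuwa.
        # Step 3: the low-memory killer may kill it right away.
        if survives_lowmem_killer(attempt):
            return attempt
        # Steps 4-5: the app is still pending, so loop and fork again.
    return None


# A process that is never picked as a victim launches on the first fork.
forks_high_priority = launch_app(lambda attempt: True)
# A process that gets killed three times needs a fourth fork.
forks_low_priority = launch_app(lambda attempt: attempt > 3)
```

Every extra trip through the loop adds fork time to what the user perceives as app launch time.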
(In reply to Cervantes Yu from comment #41)
> Just killing the preallocated process doesn't make it enter the restart
> cycle. This isn't the problem we solve in this bug. The problem is that it
> take a long time to launch an app if the preallocated process has a low
> priority:
> 
> 1. The user launches an app.
> 2. A new process is forked from Nuwa, with low priority.
> 3. The new process is killed because it has a low priority.
> 4. We observed that the new process is killed. Because we still have an app
> to launch. We need to fork a process.
> 5. Go to step 2.
> 
> This launch/kill cycle can repeat for a long time if the preallocated
> process has a low priority.

OK, this is a problem.
Can we pass a parameter to ContentParent::PreallocateAppProcess
that indicates whether a real app is needed or just a preallocated process?
(In reply to ying.xu from comment #42)
> (In reply to Cervantes Yu from comment #41)
> > Just killing the preallocated process doesn't make it enter the restart
> > cycle. This isn't the problem we solve in this bug. The problem is that it
> > take a long time to launch an app if the preallocated process has a low
> > priority:
> > 
> > 1. The user launches an app.
> > 2. A new process is forked from Nuwa, with low priority.
> > 3. The new process is killed because it has a low priority.
> > 4. We observed that the new process is killed. Because we still have an app
> > to launch. We need to fork a process.
> > 5. Go to step 2.
> > 
> > This launch/kill cycle can repeat for a long time if the preallocated
> > process has a low priority.
> 
> OK, this's a problem.
> Can we passing a parameter to ContentParent::PreallocateAppProcess?
> which indicated a real app was needed or a PreallocateAppProcess

I don't think it's a good idea to rush changes that twiddle with the priority of the preallocated process. The problem is that the preallocated process is fat, right? We can make it thinner by not running PreloadSlowThings() in ContentChild.cpp. That will reduce the USS to ~0.4 MB. The cost is ~500 ms of app launch time. I think this is a safer solution.
(In reply to Cervantes Yu from comment #43)
> I don't think it a good idea for rushing changes to twiddle with priority of
> the preallocated process. The problem is that the preallocated process is
> fat, right? We can make it thinner by not running PreloadSlowThings() in
> ContentChild.cpp. It will reduce the USS to ~0.4 MB. The cost is ~500 ms of
> app launch time. I think this to be a safer solution.

We also want to avoid the memory fragmentation we have run into.

With the Settings app in the foreground, we scrolled the settings menu and looked at the buddy info.
There was no free 16K memory block in the memory pool:

root@yingxuubt:~/b2gsource/ffos/gecko/dom/ipc#  adb shell cat /proc/buddyinfo 
Node 0, zone   Normal    729    214      0      0      0      0      0      0      0      0      0 

But if the preallocated process is killed, we see the number of free 16K memory blocks increase by more than 100.
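
As a reading aid for the buddyinfo output above: each column is the number of free blocks of order 0, 1, 2, and so on in that zone. Here is a minimal Python sketch that turns the line into block sizes, assuming a 4 KiB page size (so order 2 is the 16K block discussed here). `parse_buddyinfo_line` and `PAGE_KIB` are names made up for this illustration.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages on this device


def parse_buddyinfo_line(line):
    """Return (zone name, {block size in KiB: free block count})."""
    _, _, rest = line.partition("zone")
    fields = rest.split()
    zone = fields[0]
    counts = [int(n) for n in fields[1:]]
    # Column i holds the number of free blocks of 2**i contiguous pages.
    return zone, {PAGE_KIB << order: n for order, n in enumerate(counts)}


line = ("Node 0, zone   Normal    729    214      0      0      0"
        "      0      0      0      0      0      0")
zone, free = parse_buddyinfo_line(line)
total_free_kib = sum(size * n for size, n in free.items())
```

For the line quoted above this gives 729 free 4K blocks, 214 free 8K blocks and nothing of 16K or larger, about 4.5 MB free in total, none of it in a contiguous run of 16K.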
(In reply to ying.xu from comment #44)
> (In reply to Cervantes Yu from comment #43)
> > I don't think it a good idea for rushing changes to twiddle with priority of
> > the preallocated process. The problem is that the preallocated process is
> > fat, right? We can make it thinner by not running PreloadSlowThings() in
> > ContentChild.cpp. It will reduce the USS to ~0.4 MB. The cost is ~500 ms of
> > app launch time. I think this to be a safer solution.
> 
> We also want to avoid the memory fragment we have meet. 
> 
> when setting was at foreground, we slided the setting menu, the buddy info
> was as follows:
> there was no 16k memory-block in memory pool.
> 
> root@yingxuubt:~/b2gsource/ffos/gecko/dom/ipc#  adb shell cat
> /proc/buddyinfo 
> Node 0, zone   Normal    729    214      0      0      0      0      0     
> 0      0      0      0 
> 
> but if the preallocated process was killed, we can found the increasement of
> 16k memory-block , greater than 100

If I read it correctly, are you saying that you want to kill a user space program because of kernel memory fragmentation?
(In reply to Cervantes Yu from comment #45)
> If I read it correctly, are you saying that you want to kill a user space
> program because of kernel memory fragmentation?

Yes.
The memory in the buddy lists is used by the whole system, not only by the kernel.
(In reply to ying.xu from comment #46)
> (In reply to Cervantes Yu from comment #45)
> > If I read it correctly, are you saying that you want to kill a user space
> > program because of kernel memory fragmentation?
> 
> yes.
> the memory in buddy-list was used for the whole system, not only for the
> kernel

A user space program can only affect how many free pages there are in kernel space. It *can't* directly affect memory fragmentation. We can only kill the preallocated process to free pages back to the kernel, not to reduce fragmentation.

We can reduce the USS of the preallocated process to 0.4 MB. We don't need to kill it. Killing a user space program just because of kernel memory fragmentation doesn't make sense.
Flags: needinfo?(ttsai)
(In reply to Cervantes Yu from comment #47)
> The user space program can only affect how much free page in the kernel
> space. It *can't* directly affect memory fragmentation. We can only kill the
> preallocated process to free pages back to the kernel, not to reduce
> fragmentation.

I don't think so.

If the preallocated process is killed, its memory is freed back to the buddy system.
This makes the buddy system combine adjacent free blocks.

If the preallocated process's memory was allocated from the high-order pools (16K or bigger),
we can get that high-order memory back through the buddy system's combining.
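
The combining described here can be illustrated with a toy model. This is a Python sketch of the general buddy rule (a free block of order k starting at page p merges with its buddy at p XOR 2^k), not the kernel's actual implementation. `coalesce` is a hypothetical helper, and page numbers are page frame indexes.

```python
def coalesce(free_pages, max_order=10):
    """Merge free pages into the largest possible buddy blocks.

    Returns {order: [start page of each free block]} with empty orders
    left out.
    """
    blocks = {0: set(free_pages)}
    for order in range(max_order):
        size = 1 << order
        current = blocks[order]
        # A block and its buddy merge only when both are free; record the
        # lower of the two as the start of the merged block.
        merged = {start for start in current
                  if start < (start ^ size) and (start ^ size) in current}
        for start in merged:
            current.discard(start)
            current.discard(start ^ size)
        blocks[order + 1] = merged
    return {order: sorted(b) for order, b in blocks.items() if b}


# Pages 0, 1 and 3 are free while some process still holds page 2.
before = coalesce({0, 1, 3})
# Once page 2 is freed, all four pages merge into one order-2 block.
after = coalesce({0, 1, 2, 3})
```

With 4K pages, the order-2 block in `after` is one 16K block, which did not exist while page 2 was still in use.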
(In reply to ying.xu from comment #48)
> (In reply to Cervantes Yu from comment #47)
> > The user space program can only affect how much free page in the kernel
> > space. It *can't* directly affect memory fragmentation. We can only kill the
> > preallocated process to free pages back to the kernel, not to reduce
> > fragmentation.
> 
> I don't think so.
> 
> If the preallocated process was killed, the memory of the process was freed
> to buddy system.
> This will lead to a combining action of the buddy system.
> 
> If the memory allocations from preallocated process were happened in high
> order memory pool(16K, or bigger),
> We could retrieve the high-order memory by the combining action of buddy
> system.

Yes, fragmentation is reduced because memory is returned, and some memory regions can be combined. But this is entirely a kernel implementation detail and none of a user space program's concern. You can only ask user space programs to reduce memory usage; you can't ask them to reduce fragmentation. That's why I said user space programs *can't directly* affect fragmentation. Killing a process to reduce fragmentation is just like deleting a user's file on the disk because the file is causing disk fragmentation.

Back to the root cause, I think the problem is why we need to allocate contiguous memory in the kernel. Is it that scrolling in the settings app results in too much graphic buffer being allocated? If this is the case, we need to figure out a way to reduce it.

Also one possible cause of memory fragmentation is that the system usually runs short of memory. Reducing user space programs' memory usage will alleviate the problem. Killing the process is one way to reduce memory. We can safely reduce the preallocated process's USS to ~0.4MB, and I think this to be a safer and cleaner way to do
OK.

The problem is that we already have some bugs about app launch time.
We really don't want any more.

(They are written in Chinese, sorry about that.)
http://bugzilla.spreadtrum.com/bugzilla/show_bug.cgi?id=315517    music
http://bugzilla.spreadtrum.com/bugzilla/show_bug.cgi?id=314497    camera

Are there other ways to fix this that meet these goals?

give the foreground app more memory
no memory fragmentation problem
no harm to app launch time
no kill/launch cycle for the preallocated process
(In reply to ying.xu from comment #50)
> (they are described with chinese, sorry for this)
> http://bugzilla.spreadtrum.com/bugzilla/show_bug.cgi?id=315517    music
> http://bugzilla.spreadtrum.com/bugzilla/show_bug.cgi?id=314497    camera
> 

I took a look at these 2 bugs. They are about the time from launching an app until it is ready.
The 1st is about launching the music app and loading the songs, which took 4.x sec. The 2nd is about launching the camera, which also took 4.x sec.

There are 2 problems here:

1. We take longer than the competition because we load cover art. A fair comparison should be made without loading cover art in the music app. This has to be done at the application level.

2. The camera took 4.x seconds to run. I didn't see the camera app taking so long to launch on my engineering build. It took 2.x seconds to launch the camera (from clicking the icon to the camera preview being shown), estimated using my watch. Maybe there is a regression in camera launch time.
(In reply to Cervantes Yu from comment #51)
 
> 2. The camera took 4.x seconds to run. I didn't see the camera app taking so
> long to launch on my engineering build. It took 2.x seconds to launch the
> camera (from clicking the icon to camera preview being shown), estimated
> using my watch. Maybe there is regression in camera launch time.

I also get ~ 2s on a user build here.
(In reply to ying.xu from comment #50)

What I mean is that we should rank these goals by priority. Then we can work through the ordered list.

> give foreground app more memory
> no harm app launching time
> no memory fragment problem
> no cycle of killing/launching preallocated process
Flags: needinfo?(kkuo)
Flags: needinfo?(styang)
Flags: needinfo?(gal)