Closed Bug 964599 Opened 12 years ago Closed 11 years ago

[Tarako] Fine tune the kernel parameters to get better launch time

Categories

(Firefox OS Graveyard :: Performance, defect, P3)

Other
Gonk (Firefox OS)
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: sinker, Assigned: ting)

Details

(Keywords: perf, Whiteboard: [c=progress p= s=2014.06.06.t u=])

On Tarako, zRAM takes about 300 ms while launching an app. The pages being written should be dirty pages, so in theory flushing dirty pages more aggressively would reduce the number of page-outs during launch. For now, Tarako uses the same configuration as a Linux desktop. With proper settings, the launch time of apps could be reduced.
* initial settings
    dirty_background_bytes = 0
    dirty_background_ratio = 5
    dirty_bytes = 0
    dirty_ratio = 20
    dirty_expire_centisecs = 200
    dirty_writeback_centisecs = 500

* time to load / zram num_reads & num_writes (Clock)
    1188  3r4w
    1088  0r0w
    1227  0r0w
    1270  14r4w
    1284  158r1w
    1225  16r1w
    1099  0r6w
    1039  5r0w
    1251  6r9w
    1118  69r0w
* the other settings
    swappiness = 100
    low-memory killer parameters
      notify_trigger = 10240 KB
      oom_adj  min_free
         0      1024 KB
         1      2048 KB
         2      4096 KB
         6      6144 KB
         8      8192 KB
        10     15360 KB
Thinker told me it would be easier to read if the information were put together in one comment, so I have rearranged it a bit. Please let me know if I can make it better.

Test environment:
- screenshot is disabled at windowClosed()
- low-memory killer
    notify_trigger = 10240 KB
    oom_adj  min_free
       0      1024 KB
       1      2048 KB
       2      4096 KB
       6      6144 KB
       8      8192 KB
      10     15360 KB
- /proc/sys/vm/swappiness = 100
- /proc/sys/vm/laptop_mode = 0

Test steps:
1. tap on Clock
2. check the number differences of /sys/block/zram0 after Clock is shown
3. back to homescreen
4. launch Settings and back to homescreen
5. goto step 1 and repeat

| settings                        | time to load | zram num_reads & num_writes |
+---------------------------------+--------------+-----------------------------+
| <initial>                       | 1188         | 3r4w                        |
| dirty_background_bytes = 0      | 1088         | 0r0w                        |
| dirty_background_ratio = 5      | 1227         | 0r0w                        |
| dirty_bytes = 0                 | 1270         | 14r4w                       |
| dirty_ratio = 20                | 1284         | 158r1w                      |
| dirty_expire_centisecs = 200    | 1225         | 16r1w                       |
| dirty_writeback_centisecs = 500 | 1099         | 0r6w                        |
|                                 | 1039         | 5r0w                        |
|                                 | 1251         | 6r9w                        |
|                                 | 1118         | 69r0w                       |
+---------------------------------+--------------+-----------------------------+
| dirty_ratio = 10                | 1117         | 16r383w                     |
|                                 | 1416         | 0r90w                       |
|                                 | 1490         | 16r193w                     |
|                                 | 1207         | 8r266w                      |
|                                 | 1330         | 208r299w                    |
|                                 | 1162         | 0r53w                       |
|                                 | 1312         | r102w                       |
|                                 | 1073         | 16r13w                      |
|                                 | 1075         | 0r73w                       |
|                                 | 1207         | 2r40w                       |
+---------------------------------+--------------+-----------------------------+
| dirty_ratio = 40                | 1133         | 123r0w                      |
|                                 | 1223         | 108r1w                      |
|                                 | 1175         | 35r0w                       |
|                                 | 1118         | 19r6w                       |
|                                 | 1116         | 135r35w                     |
|                                 | 1255         | 2r1w                        |
|                                 | 1245         | 36r4w                       |
|                                 | 1201         | 17r0w                       |
|                                 | 1140         | 21r88w                      |
|                                 | 1342         | 19r99w                      |
+---------------------------------+--------------+-----------------------------+
| dirty_writeback_centisecs = 250 | 1091         | 175r3w                      |
|                                 | 1166         | 43r0w                       |
|                                 | 1194         | 33r5w                       |
|                                 | 1274         | 52r45w                      |
|                                 | 1216         | 21r14w                      |
|                                 | 1057         | 13r7w                       |
|                                 | 1194         | 0r1w                        |
|                                 | 1124         | 10r6w                       |
|                                 | 1212         | 290r0w                      |
|                                 | 1342         | 172r2w                      |
+---------------------------------+--------------+-----------------------------+

The numbers above suggest that zram reads/writes and load time are not linearly related.
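Step 2 of the test steps (checking the number differences under /sys/block/zram0) could be scripted roughly as below. This is only a sketch: it assumes the zram driver exposes `num_reads` and `num_writes` as plain integer files in that directory, as the table header suggests, and the helper names `read_zram_counters` and `format_delta` are made up for illustration.

```python
from pathlib import Path


def read_zram_counters(root="/sys/block/zram0"):
    """Return (num_reads, num_writes) read from the zram sysfs directory.

    Assumes each counter is a file holding a single integer.
    """
    base = Path(root)
    reads = int((base / "num_reads").read_text().strip())
    writes = int((base / "num_writes").read_text().strip())
    return reads, writes


def format_delta(before, after):
    """Format the counter difference in the "<reads>r<writes>w" style used in the table."""
    reads = after[0] - before[0]
    writes = after[1] - before[1]
    return f"{reads}r{writes}w"
```

Typical use would be to call `read_zram_counters()` before tapping Clock and again once it is shown, then pass both results to `format_delta`.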
Additional information for comment 3: kernel = gecko = gaia = nuwa = enabled and there's no background app in most cases, which is killed by LMK.
Pressed enter accidentally; here is the completed revision.

(In reply to Ting-Yu Chou [:ting] from comment #4)
> Additional information for comment 3:
>
> kernel = 44e3a037
> gecko = cd8bc54e
> gaia = 22bc6be5
> nuwa = enabled
>
> and there's no background app in most cases, which is killed by LMK.
Retested with the latest code (nuwa enabled):

  kernel = 02167f67
  gecko = 86a280a3
  gaia = 7c686491

This kernel has a patch that changes NAND flash writes from polling to interrupt, and an updated victim selection logic for the low-memory killer.

0 background app
================

Steps:
1. make sure there's no background app, swipe home screen left/right
2. tap on Clock when zram r/w activity is low
3. check the number differences of /sys/block/zram0 after Clock and time to load is shown
4. back to home screen
5. kill Clock manually
6. goto step 1 and repeat

| settings                        | time to load | zram num_reads & num_writes |
+---------------------------------+--------------+-----------------------------+
| <initial>                       | 1472         | 20r1222w                    |
| dirty_background_bytes = 0      | 1464         | 66r0w                       |
| dirty_background_ratio = 5      | 1244         | 229r0w                      |
| dirty_bytes = 0                 | 1587         | 635r186w                    |
| dirty_ratio = 20                | 1581         | 268r292w                    |
| dirty_expire_centisecs = 200    | 2514         | 487r128w                    |
| dirty_writeback_centisecs = 500 | 2041         | 522r26w                     |
|                                 | 2219         | 1219r150w                   |
|                                 | 2706         | 778r80w                     |
|                                 | 2426         | 1266r195w                   |
|                                 +--------------+-----------------------------+
|                                 | 1925 (avg.)  | 549r228w (avg.)             |
+---------------------------------+--------------+-----------------------------+
| dirty_ratio = 10                | 2333         | 346r90w                     |
|                                 | 2457         | 1501r144w                   |
|                                 | 2003         | 175r111w                    |
|                                 | 2973         | 1498r307w                   |
|                                 | 3048         | 2491r206w                   |
|                                 | 2417         | 573r34w                     |
|                                 | 2295         | 512r154w                    |
|                                 | 2565         | 414r205w                    |
|                                 | 1930         | 238r91w                     |
|                                 | 2354         | 787r189w                    |
|                                 +--------------+-----------------------------+
|                                 | 2438 (avg.)  | 854r153w (avg.)             |
+---------------------------------+--------------+-----------------------------+
| dirty_ratio = 40                | 2317         | 1096r30w                    |
|                                 | 2223         | 593r81w                     |
|                                 | 1633         | 287r0w                      |
|                                 | 1354         | 153r97w                     |
|                                 | 2435         | 1011r115w                   |
|                                 | 1976         | 173r98w                     |
|                                 | 3120         | 557r155w                    |
|                                 | 3257         | 2234r254w                   |
|                                 | 2377         | 1037r18w                    |
|                                 | 1656         | 346r64w                     |
|                                 +--------------+-----------------------------+
|                                 | 2235 (avg.)  | 749r91w (avg.)              |
+---------------------------------+--------------+-----------------------------+
| dirty_writeback_centisecs = 250 | 2953         | 2579r115w                   |
|                                 | 1781         | 334r76w                     |
|                                 | 2537         | 862r114w                    |
|                                 | 1349         | 94r0w                       |
|                                 | 1603         | 212r58w                     |
|                                 | 1694         | 289r119w                    |
|                                 | 1981         | 519r6w                      |
|                                 | 1660         | 667r56w                     |
|                                 | 2140         | 1085r11w                    |
|                                 | 2446         | 1117r22w                    |
|                                 +--------------+-----------------------------+
|                                 | 2014 (avg.)  | 776r58w (avg.)              |
+---------------------------------+--------------+-----------------------------+

1 background app
================

Steps:
1. launch Settings and back to home screen
2. tap on Clock after zram r/w activity is low
3. check the number differences of /sys/block/zram0 after Clock and time to load is shown
4. back to home screen
5. kill Clock manually
6. goto step 1 and repeat

| settings                        | time to load | zram num_reads & num_writes |
+---------------------------------+--------------+-----------------------------+
| <initial>                       | 2054         | 334r212w                    |
| dirty_background_bytes = 0      | 3858         | 1684r285w                   |
| dirty_background_ratio = 5      | 1782         | 322r300w                    |
| dirty_bytes = 0                 | 2093         | 813r551w                    |
| dirty_ratio = 20                | 1595         | 966r379w                    |
| dirty_expire_centisecs = 200    | 1739         | 590r141w                    |
| dirty_writeback_centisecs = 500 | 1862         | 1013r151w                   |
|                                 | 1531         | 595r273w                    |
|                                 | 1491         | 240r278w                    |
|                                 | 1798         | 263r394w                    |
|                                 +--------------+-----------------------------+
|                                 | 1980 (avg.)  | 682r296w (avg.)             |
+---------------------------------+--------------+-----------------------------+
| dirty_ratio = 10                | 1966         | 420r483w                    |
|                                 | 1555         | 784r295w                    |
|                                 | 1680         | 796r228w                    |
|                                 | 2403         | 665r24w                     |
|                                 | 2332         | 345r81w                     |
|                                 | 1902         | 531r149w                    |
|                                 | 1826         | 789r292w                    |
|                                 | 2373         | 355r100w                    |
|                                 | 2090         | 767r102w                    |
|                                 | 1812         | 521r115w                    |
|                                 +--------------+-----------------------------+
|                                 | 1994 (avg.)  | 597r187w (avg.)             |
+---------------------------------+--------------+-----------------------------+
| dirty_ratio = 40                | 1698         | 131r144w                    |
|                                 | 1688         | 766r389w                    |
|                                 | 1883         | 1094r525w                   |
|                                 | 1497         | 206r118w                    |
|                                 | 2583         | 800r228w                    |
|                                 | 2345         | 342r263w                    |
|                                 | 1667         | 741r169w                    |
|                                 | 1816         | 369r89w                     |
|                                 | 1622         | 362r124w                    |
|                                 | 2017         | 845r452w                    |
|                                 +--------------+-----------------------------+
|                                 | 1882 (avg.)  | 566r250w (avg.)             |
+---------------------------------+--------------+-----------------------------+
| dirty_writeback_centisecs = 250 | 2208         | 822r340w                    |
|                                 | 2081         | 912r338w                    |
|                                 | 2405         | 1344r168w                   |
|                                 | 1999         | 821r85w                     |
|                                 | 2065         | 552r808w                    |
|                                 | 2339         | 1214r159w                   |
|                                 | 1984         | 1791r495w                   |
|                                 | 1575         | 202r278w                    |
|                                 | 1556         | 160r359w                    |
|                                 | 2009         | 261r351w                    |
|                                 +--------------+-----------------------------+
|                                 | 2022 (avg.)  | 808r338w (avg.)             |
+---------------------------------+--------------+-----------------------------+
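For anyone re-running these measurements, the per-setting "(avg.)" rows can be recomputed from the raw samples with something like the following. It is a sketch only: `parse_sample` and `summarize` are illustrative names, and rounding to the nearest integer is an assumption about how the averages in the tables were produced.

```python
import re
from statistics import mean

# Matches samples written as "<reads>r<writes>w", e.g. "20r1222w".
SAMPLE_RE = re.compile(r"^(\d+)r(\d+)w$")


def parse_sample(text):
    """Split one "<reads>r<writes>w" cell into (reads, writes)."""
    match = SAMPLE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a zram read/write sample: {text!r}")
    return int(match.group(1)), int(match.group(2))


def summarize(load_times, zram_samples):
    """Return (average load time, averaged "<reads>r<writes>w" string)."""
    pairs = [parse_sample(sample) for sample in zram_samples]
    avg_time = round(mean(load_times))
    avg_reads = round(mean(reads for reads, _ in pairs))
    avg_writes = round(mean(writes for _, writes in pairs))
    return avg_time, f"{avg_reads}r{avg_writes}w"
```

Feeding it the ten `<initial>` rows of the "0 background app" run reproduces the 1925 and 549r228w averages. A cell with a missing count is rejected with a `ValueError` rather than guessed.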
Retested with kernel d6992f28, and created a spreadsheet for easier reading: http://goo.gl/4RQPzh. I will add weighted numbers later, after Ting-Yuan gets zram I/O numbers from Tarako (like those in bug 945174, comment 21).
Ting-Yuan found no noticeable difference in zram I/O performance between Buri and Tarako, so I used the results from bug 945174, comment 21:

  zRam page-in: 43.29us
  zRam page-out: 68.08us

as weights:

  avg. zram read * 0.39
  avg. zram write * 0.61

and updated the spreadsheet.
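The 0.39 and 0.61 weights are consistent with each latency's share of the combined page-in plus page-out time. A small sketch of that arithmetic follows. The normalization, and summing the two weighted terms into one score, are assumptions inferred from the numbers (the spreadsheet's exact formula is not quoted here), and the function names are illustrative.

```python
def latency_weights(page_in_us, page_out_us):
    """Return (read_weight, write_weight) as each latency's share of the total."""
    total = page_in_us + page_out_us
    return page_in_us / total, page_out_us / total


def weighted_zram_cost(avg_reads, avg_writes, read_weight, write_weight):
    """Combine average read and write counts into a single weighted score.

    Summing the two weighted terms is an assumption, not taken from the bug.
    """
    return avg_reads * read_weight + avg_writes * write_weight


# Latencies from bug 945174, comment 21.
read_weight, write_weight = latency_weights(43.29, 68.08)
```

Rounded to two decimals, `read_weight` and `write_weight` come out as 0.39 and 0.61.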
blocking-b2g: --- → 1.3T?
triage: 1.3T+, we want this to be finetuned for the tarako release
blocking-b2g: 1.3T? → 1.3T+
Assignee: nobody → tchou
kernel = d6992f28
gecko = 19aabbc5

The default settings under /proc/sys/vm:

  block_dump = 0
  dirty_background_bytes = 0
  dirty_background_ratio = 5
  dirty_bytes = 0
  dirty_expire_centisecs = 200
  dirty_ratio = 20
  dirty_writeback_centisecs = 500
  drop_caches = 0
  highmem_is_dirtyable = 0
  laptop_mode = 0
  lowmem_reserve_ratio = 32 32
  max_map_count = 65530
  min_free_kbytes = 1350
  min_free_order_shift = 4
  mmap_min_addr = 32768
  nr_pdflush_threads = 0
  oom_dump_tasks = 1
  oom_kill_allocating_task = 0
  overcommit_memory = 1
  overcommit_ratio = 50
  page-cluster = 0
  panic_on_oom = 0
  percpu_pagelist_fraction = 0
  scan_unevictable_pages = 0
  swappiness = 100
  vfs_cache_pressure = 100
I have tried:

a) swap out to zram aggressively to make more free pages: dirty_ratio, dirty_background_ratio, dirty_writeback_centisecs
b) swap out to zram lazily to keep pages in memory longer
c) read 4 pages in a single swap-in from zram: page-cluster

but I can't see any differences from general usage (playing music in background, launch different application, browser, switch to homescreen, etc.).
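Experiments like (a) and (c) amount to writing new values into files under /proc/sys/vm. A minimal sketch of doing that while remembering the old values is shown below. `apply_vm_settings` is a hypothetical helper, writing to the real directory needs root, and the example values are assumptions: `dirty_ratio = 10` and `dirty_writeback_centisecs = 250` are taken from the earlier tables, while `page-cluster = 2` is inferred from the kernel's definition (2^n pages per swap-in) and is not stated in this bug.

```python
from pathlib import Path

# Example values only (see the lead-in for where they come from).
AGGRESSIVE_WRITEBACK = {"dirty_ratio": 10, "dirty_writeback_centisecs": 250}
SWAP_IN_4_PAGES = {"page-cluster": 2}  # assumption: 2**2 = 4 pages


def apply_vm_settings(settings, root="/proc/sys/vm"):
    """Write each setting to its file under root and return the previous values.

    The returned dict can be passed back in to restore the old configuration.
    """
    previous = {}
    for name, value in settings.items():
        path = Path(root) / name
        previous[name] = path.read_text().strip()
        path.write_text(f"{value}\n")
    return previous
```

Keeping the previous values makes it easy to return to the defaults between test runs, for example `old = apply_vm_settings(AGGRESSIVE_WRITEBACK)` followed later by `apply_vm_settings(old)`.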
(In reply to Ting-Yu Chou [:ting] from comment #11)
> but I can't see any differences from general usage (playing music in background,
> launch different application, browser, switch to homescreen, etc.).

I didn't write that clearly. I meant that I couldn't see any "improvements"; besides, (b) makes the situation much worse.
The Dirty field of /proc/meminfo stays below 100 kB, and is usually just 2x~4x kB even when music is playing in the background. I guess that's why swapping out more aggressively doesn't bring any improvement.
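Watching the Dirty field can be automated by parsing /proc/meminfo, roughly as follows. This is a sketch: `parse_meminfo` is an illustrative helper that assumes the usual "Name: value kB" line layout and keeps only the numeric part.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Values are returned as given in the file (kB for most fields).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name:" prefix
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields
```

On a device this would be used as `parse_meminfo(open("/proc/meminfo").read())["Dirty"]`, sampled periodically while the workload runs.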
Tried making the zram disksize smaller: 32, 38, 44, 50 MB. Thinker would like to know whether this forces the OOM killer to be triggered and to kill Homescreen more easily. I installed reference-workload-light and tested with music playing in the background:

- Homescreen does not get killed more easily with disksize changes.
- The device reaches high system load (8x~9x) more easily when disksize is smaller.
- With 50 MB, /proc/meminfo still sometimes shows 0 kB SwapFree.
- Scrolling in a long list is not smooth when disksize is smaller.

# /sys/block/zram0/disksize represents the limit on the uncompressed worth of data that can be stored in this disk.
(In reply to Ting-Yu Chou [:ting] from comment #14)
> Tried to make zram disksize smaller: 32, 38, 44, 50 MB

Double-checked 32 MB, and tried 26 and 20 MB as well, since Thinker expects Homescreen to be killed with a smaller size:

  32 MB: Homescreen was killed once in 10 mins of playing
  26 MB: 3 times/10 mins
  20 MB: 3 times/10 mins

The kills happened after a long period (> 30 secs) of high system load; one time even the foreground application got killed (20 MB).

# Tested by playing music in the background, then launching and scrolling gallery, camera, call log, and contacts in random order.
Ting-You, do we have anything actionable from this bug?
Flags: needinfo?(tchou)
(In reply to Fabrice Desré [:fabrice] from comment #16)
> Ting-You, do we have anything actionable from this bug?

Currently, no.
Flags: needinfo?(tchou)
(In reply to Ting-Yu Chou [:ting] from comment #17)
> (In reply to Fabrice Desré [:fabrice] from comment #16)
> > Ting-You, do we have anything actionable from this bug?
>
> Currently, no.

Ok, so I think we should not block on this bug.
blocking-b2g: 1.3T+ → ---
Keywords: perf
Whiteboard: [Memshrink]
Status: NEW → ASSIGNED
Priority: -- → P3
Hardware: x86_64 → Other
Whiteboard: [Memshrink] → [c=progress p= s= u=] [Memshrink]
Whiteboard: [c=progress p= s= u=] [Memshrink] → [c=progress p= s= u=]
Re comment 17: since there are no action items, closing as WONTFIX.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Component: General → Performance
Whiteboard: [c=progress p= s= u=] → [c=progress p= s=2014.06.06.t u=]