Closed
Bug 1082290
Opened 10 years ago
Closed 10 years ago
Implement cgroup swappiness feature for low memory target (256 MB)
Categories
(Firefox OS Graveyard :: Performance, defect, P1)
Tracking
(feature-b2g:2.2+, firefox36 wontfix, firefox37 wontfix, firefox38 fixed, b2g-v2.1 affected, b2g-v2.2 fixed, b2g-master fixed)
People
(Reporter: tkundu, Assigned: dhylands)
References
Details
(Whiteboard: [caf priority: p2][CR 787261][ft:media] [2.2 target])
Attachments
(4 files)
60 bytes,
text/x-github-pull-request
|
mwu
:
review+
|
Details | Review |
11.16 KB,
patch
|
gsvelto
:
review+
bajaj
:
approval-mozilla-b2g37+
|
Details | Diff | Splinter Review |
526.78 KB,
application/x-gzip
|
Details | |
227.60 KB,
text/plain
|
Details |
Hi,
we saw that Android kitkat has a nice feature as below :
If memory cgroups are available, the ActivityManager will mark lower priority threads as being more swappable than other threads. If memory is needed, the Android kernel will start migrating memory pages to zRAM swap, giving a higher priority to those memory pages that have been marked by ActivityManager.
ref: https://source.android.com/devices/low-ram.html#zram
This will help not to swap high priority process from memory and it will increase performance over all on zram enabled device.
Reporter | ||
Comment 1•10 years ago
|
||
[Blocking Requested - why for this release]:
No longer blocks: CAF-v2.1-FC-metabug
blocking-b2g: --- → 2.1?
Reporter | ||
Updated•10 years ago
|
blocking-b2g: 2.1? → 2.2?
Comment 2•10 years ago
|
||
This looks like a useful feature. I've filed bug 1081871 just a few days ago to resurrect the experimental cgroup support we had and see if it works well on KK devices.
See Also: → 1081871
Reporter | ||
Updated•10 years ago
|
Blocks: CAF-v2.2-metabug
Updated•10 years ago
|
blocking-b2g: 2.2? → ---
feature-b2g: --- → 2.2?
Comment 3•10 years ago
|
||
It's supported after Linux Kernel 3.10. Since it depends on kernel configure and kernel version. It seems to be better to enable this feature by partner in current stage. For long term, we may need to create a certified API for this kind of use for system app. Then system app can have more flexible control for other apps. But we need to have more discussion with it.
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → dhylands
Comment 4•10 years ago
|
||
(In reply to Ken Chang[:ken] from comment #3)
> It's supported after Linux Kernel 3.10.
We can assume the 3.10 kernel for the purposes of implementing this bug in the v2.2 timeframe.
Comment 5•10 years ago
|
||
This could be implemented as part of the HAL's SetProcessPriority() function and be transparent to the rest of the codebase:
http://hg.mozilla.org/mozilla-central/file/d1adecc7adad/hal/gonk/GonkHal.cpp#l1536
We should probably create and populate the /mnt/gentoo/sys/fs/cgroup/memory group ourselves and verify it's working by looking for cgroup.procs file.
(In reply to Michael Vines [:m1] [:evilmachines] from comment #4)
> We can assume the 3.10 kernel for the purposes of implementing this bug in
> the v2.2 timeframe.
It would be worth tracking bug 1094121 and its dependencies to figure out when we have something usable for testing.
See Also: → gonk-L
Updated•10 years ago
|
Whiteboard: [ft:media] [2.2 target]
Target Milestone: --- → 2.2 S4 (23jan)
Comment 6•10 years ago
|
||
Quick note, once this lands we should remember to update the documentation here:
https://developer.mozilla.org/en-US/Firefox_OS/Platform/Architecture#Processes_and_threads
Updated•10 years ago
|
Blocks: FxOS-v2.2-features
Comment 8•10 years ago
|
||
(In reply to Kevin Hu [:khu] from comment #7)
> Hema, is it something related to media?
No, this touches all applications, it's not media-specific.
Flags: needinfo?(hkoka)
Comment 9•10 years ago
|
||
That's why I asked. I don't think it's a media issue. But, ft:media was tagged in the whiteboard.
Assignee | ||
Comment 10•10 years ago
|
||
It's probably ft:media because I'm on the media team, and working on it.
Updated•10 years ago
|
feature-b2g: 2.2? → 2.2+
Updated•10 years ago
|
QA Whiteboard: [2.2-feature-qa+]
Updated•10 years ago
|
Flags: in-moztrap?(twen)
Updated•10 years ago
|
Status: NEW → ASSIGNED
Comment 11•10 years ago
|
||
Do we have any update for this bug? Do we plan to finish this bug by 1/12, the branch date?
Thank you.
Flags: needinfo?(dhylands)
Comment 12•10 years ago
|
||
(In reply to Kevin Hu [:khu] from comment #11)
> Do we have any update for this bug? Do we plan to finish this bug by 1/12,
> the branch date?
This kind-of depends on bug 1081871 which is close to being finished and which we also want to land before the 1/12 date. If that bug lands then implementing this on top of it should be very easy.
Assignee | ||
Comment 13•10 years ago
|
||
(In reply to Kevin Hu [:khu] from comment #11)
> Do we have any update for this bug? Do we plan to finish this bug by 1/12,
> the branch date?
> Thank you.
I've been reading up on cgroups and just reviewing 1081871 (will be r+'ing today).
My reading of this now, is that once bug 1081871 lands, all that's required are some tweaks in files which are under vendor control (as discussed in https://source.android.com/devices/tech/low-ram.html#zram - there are some kernel configs to enable, and a line that needs to be added to fstab).
khu - the changes needed (kernel config and fstab change) will be phone specific. Is there a particular phone that this is being targeted for?
I can test on my flame, but I don't have any 256 Mb phone.
Flags: needinfo?(dhylands)
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(khu)
Comment 14•10 years ago
|
||
(In reply to Dave Hylands [:dhylands]{PTO until Jan 5) from comment #13)
> My reading of this now, is that once bug 1081871 lands, all that's required
> are some tweaks in files which are under vendor control (as discussed in
> https://source.android.com/devices/tech/low-ram.html#zram - there are some
> kernel configs to enable, and a line that needs to be added to fstab).
We should probably add a little bit of logic on top of bug 1081871 too. The cgroup hierarchy which I've implemented in bug 1081871 only applies to the cpu cgroup. The memory cgroup will have a separate hierarchy living under another path (/sys/fs/cgroup/memory apparently). We can re-use the same layout we have for both hierarchies but we'll need to treat them separately: we'll have two cgroup.procs file per priority level, one under /dev/cpuctl/<path>/cgroup.procs and another under /sys/fs/cgroup/memory/<path>/cgroup.procs and the configuration files will also need to be adjusted separately (/dev/cpuctl/<path>/cpu.shares vs /sys/fs/cgroup/memory/<path>/memory.swappiness).
Assignee | ||
Comment 15•10 years ago
|
||
The flame already has zram0 swap enabled, so once bug 1081871 lands, then this should be implemented.
Attachment #8546280 -
Flags: review?(mwu)
Updated•10 years ago
|
Attachment #8546280 -
Flags: review?(mwu) → review+
Comment 16•10 years ago
|
||
(In reply to Dave Hylands [:dhylands] from comment #13)
> (In reply to Kevin Hu [:khu] from comment #11)
> > Do we have any update for this bug? Do we plan to finish this bug by 1/12,
> > the branch date?
> > Thank you.
>
> I've been reading up on cgroups and just reviewing 1081871 (will be r+'ing
> today).
>
> My reading of this now, is that once bug 1081871 lands, all that's required
> are some tweaks in files which are under vendor control (as discussed in
> https://source.android.com/devices/tech/low-ram.html#zram - there are some
> kernel configs to enable, and a line that needs to be added to fstab).
>
> khu - the changes needed (kernel config and fstab change) will be phone
> specific. Is there a particular phone that this is being targeted for?
>
> I can test on my flame, but I don't have any 256 Mb phone.
Hi, Dave, I talked with QA. They are also using Flame with 319MB configuration now. We need a better solution for 256MB memory usage testing. I will raise the discussion and figure it out. Sorry for the inconvenience.
Keep the ni to me for tracking.
Assignee | ||
Comment 17•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 18•10 years ago
|
||
Reopening, since gsvelto pointed out that I'm missing some stuff.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 19•10 years ago
|
||
Android seems to keep the memory cgroups in /sys/fs/cgroup/memory
Sigh - I don't know why I didn't appreciate that this patch would be required. But here it is.
I verified that the groups get created and that the process moves around as it changes priority (from fg to bg).
Attachment #8547794 -
Flags: review?(gsvelto)
Comment 20•10 years ago
|
||
Comment on attachment 8547794 [details] [diff] [review]
Added memory cgroup to FxOS process control
Review of attachment 8547794 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM save for a few code formatting nits.
::: hal/gonk/GonkHal.cpp
@@ +1400,5 @@
> NS_NAMED_LITERAL_CSTRING(kDevCpuCtl, "/dev/cpuctl/");
> NS_NAMED_LITERAL_CSTRING(kSlash, "/");
>
> + nsAutoCString groupName(aGroup);
> + HAL_LOG("EnsureCpuCGroupExists for group '%s'", groupName.get());
nit: Do you think this is really needed? I'd prefer to have log messages only when something unexpected happens so that we aren't too chatty when everything works.
@@ +1432,5 @@
> }
>
> offset++;
> }
> + HAL_LOG("EnsureCpuCGroupExists created group '%s'", groupName.get());
nit: Likewise
@@ +1461,5 @@
> + NS_NAMED_LITERAL_CSTRING(kMemCtl, "/sys/fs/cgroup/memory/");
> + NS_NAMED_LITERAL_CSTRING(kSlash, "/");
> +
> + nsAutoCString groupName(aGroup);
> + HAL_LOG("EnsureMemCGroupExists for group '%s'", groupName.get());
nit: Likewise
@@ +1471,5 @@
> + if (!aGroup.IsEmpty()) {
> + prefPrefix += aGroup + NS_LITERAL_CSTRING(".");
> + }
> +
> + nsAutoCString memSwappinessPref(prefPrefix + NS_LITERAL_CSTRING("memory_swappiness"));
nit: This and other lines go beyond the 80-characters-per-line limit we have
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Line_Length
I'm not to fond of it, especially for C++ code that tends to be quite verbose, but consistency is also important.
@@ +1488,5 @@
> + return false;
> + }
> +
> + offset++;
> + }
nit: Since we use it in two places it might be worth extracting this code into its own function, I'd call it MkdirDashP() :-P
@@ +1489,5 @@
> + }
> +
> + offset++;
> + }
> + HAL_LOG("EnsureMemCGroupExists created group '%s'", groupName.get());
nit: Same as for the previous log messages
@@ +1498,5 @@
> + nsPrintfCString("%d", memSwappiness).get())) {
> + HAL_LOG("Could not set the memory.swappiness for group %s", memSwappinessPath.get());
> + return false;
> + }
> + HAL_LOG("Set memory.swappiness for group %s to %d", memSwappinessPath.get(), memSwappiness);
nit: Same as for the previous log messages
@@ +1569,1 @@
> }
nit: I'd add an empty line between the closing brace and the following if() for extra clarity
Attachment #8547794 -
Flags: review?(gsvelto) → review+
Assignee | ||
Comment 21•10 years ago
|
||
Yeah - the HAL_LOG's were just for debugging and I forgot to remove them.
I factored out the directory creation and created:
static bool MakeCGroupDir(const nsACstring& aRootDir,
const nsACString& aGroupName)
rather than MakeDirDashP since it isn't really mkdir -p (i.e. it only creates the directories within the group name.
Assignee | ||
Comment 22•10 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Comment 24•10 years ago
|
||
(In reply to Dave Hylands [:dhylands] from comment #21)
> Yeah - the HAL_LOG's were just for debugging and I forgot to remove them.
>
> I factored out the directory creation and created:
>
> static bool MakeCGroupDir(const nsACstring& aRootDir,
> const nsACString& aGroupName)
>
> rather than MakeDirDashP since it isn't really mkdir -p (i.e. it only
> creates the directories within the group name.
My suggestion was meant only as a joke, this makes the intent clearer too, thanks!
Comment 25•10 years ago
|
||
About 256MB testing devices, we're targeting Dolphin 2.1 and Woodduck now, and around the end of January, we will change another one. Please let me know if you want to know more in details, and/or anything I can help here. Thanks.
Flags: needinfo?(khu)
Comment 26•10 years ago
|
||
Talked with Teri. Now, the first testing we would like to perform is to see if we have improvement with this bug fix. In this case, before we have better 256MB device to test, we can try to verify it on Flame 319MB.
Comment 27•10 years ago
|
||
(In reply to Kevin Hu [:khu] from comment #25)
> About 256MB testing devices, we're targeting Dolphin 2.1 and Woodduck now,
> and around the end of January, we will change another one.
Note that his is not scheduled for uplift to the 2.1 branch. I'm setting the appropriate flags to have it uplifted to 2.2 because it was scheduled for it but for 2.1 we'd need approval for both this and its dependency, bug 1081871.
Comment 28•10 years ago
|
||
Backed out per bug 1081871 comment 31.
https://hg.mozilla.org/integration/b2g-inbound/rev/869662d3bd76
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 2.2 S4 (23jan) → ---
Updated•10 years ago
|
Whiteboard: [ft:media] [2.2 target] → [CR 787261][ft:media] [2.2 target]
Updated•10 years ago
|
Whiteboard: [CR 787261][ft:media] [2.2 target] → [caf priority: p2][CR 787261][ft:media] [2.2 target]
Comment 29•10 years ago
|
||
Now that bug 1081871 has re-landed we can re-land this too.
Keywords: checkin-needed
Comment 30•10 years ago
|
||
Keywords: checkin-needed
Comment 31•10 years ago
|
||
This was merged a few days ago and got missed.
https://hg.mozilla.org/mozilla-central/rev/1affa6bf62b8
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Comment 32•10 years ago
|
||
Comment on attachment 8547794 [details] [diff] [review]
Added memory cgroup to FxOS process control
[Approval Request Comment]
Same as bug 1081871 comment 48
Attachment #8547794 -
Flags: approval-mozilla-b2g37?
Updated•10 years ago
|
Attachment #8547794 -
Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Comment 33•10 years ago
|
||
status-firefox36:
--- → wontfix
status-firefox37:
--- → wontfix
status-firefox38:
--- → fixed
Target Milestone: --- → 2.2 S5 (6feb)
Comment 34•10 years ago
|
||
Flags: in-moztrap?(twen) → in-moztrap+
Comment 36•10 years ago
|
||
I happened to finish a run of yesterday's build. I think there is no white screen / hung issue. However, the MTBF time is too short. There must be some memory issues or other issues. I will try to look into it tomorrow.
Blocks: MTBF-2015Q1
Comment 37•10 years ago
|
||
The MTBF results is worse comparing to pre-FL. It used to be more than 10 hours. It's now only around 9 hours or less. We see lots of strange stuffs. For example, sudden shutdown of device and high memory usage.
I have attached some of our logs. This is for one devices that shutdown automatically. We suspect there are some issues regarding the NICE value or power. Please let us know what we can do to see further into this issues.
We have our memory bug filed in another bug.
Flags: needinfo?(wachen)
Comment 38•10 years ago
|
||
Comment 39•10 years ago
|
||
I saw Nexus 5 running in MTBF shutdown in the same way -> device shutdown when we are running test. I think we might need further help into this issue.
Flags: needinfo?(twen)
Comment 40•10 years ago
|
||
If your MTBF setup is simple to get going, we can try to run it over here as well. If the issues are reproducible on our hardware then we're better equipped to debug sudden device shutdowns (ie, kernel or other non-userspace crashes).
Blocks: CAF-v3.0-FL-metabug
No longer blocks: CAF-v3.0-FL-metabug
Comment 42•10 years ago
|
||
Hi, Michael, I think the setup should be as easy as gaiatest (or even easier).
I believe we met another issue might relate with this bug again.
Let's Alison do the clarification :P
Blocks: MTBF-B2G
Flags: needinfo?(wachen) → needinfo?(ashiue)
Comment 43•10 years ago
|
||
I encountered Nexus 5 shut down issue twice when running MTBF testing (about 12 hours).
Although the device looks like shut down, I still could use adb shell but have no logcat and no data from b2g-ps.
Besides, the serialNumber has been changed to 0123456789ABCDEF.
Any idea for this symptom?
Flags: needinfo?(ashiue)
Comment 44•10 years ago
|
||
I hit this issue again yesterday (run MTBF about 8 hours), attached last_kmsg for reference.
Comment 45•10 years ago
|
||
(In reply to Alison Shiue from comment #44)
> I hit this issue again yesterday (run MTBF about 8 hours), attached
> last_kmsg for reference.
The warning messages seems to come from one of the kernel drivers; are you sure that the issue is related to the patch here? Otherwise I'd move testing this issue to another bug so that we better track it.
Comment 46•10 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #45)
> (In reply to Alison Shiue from comment #44)
> > I hit this issue again yesterday (run MTBF about 8 hours), attached
> > last_kmsg for reference.
>
> The warning messages seems to come from one of the kernel drivers; are you
> sure that the issue is related to the patch here? Otherwise I'd move testing
> this issue to another bug so that we better track it.
No, I'm not sure if this issue is related to this patch. This warning message is just for reference if it can provide more idea about this problem since I cannot get much information from the test device under the strange status.
And I think track this issue to another bug is a good idea.
You need to log in
before you can comment on or make changes to this bug.
Description
•