Closed Bug 1021400 Opened 10 years ago Closed 9 years ago

Flame: disable ksmd because it shortens battery life drastically

Categories

(Firefox OS Graveyard :: GonkIntegration, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: n.nethercote, Assigned: gerard-majax)

Details

(Keywords: dogfood, perf, power, Whiteboard: [c=power p= s= u=flame])

User Story

+++ This bug was initially created as a clone of Bug #1020000 +++

My flame running 1.3 or master doesn't survive the night without having it plugged in.
We need to fix this or nobody will use it as dogfood phone.

Attachments

(5 files)

Bug 1020000 has the full context, but the short version is this: my Flame's battery life was terrible. Even with very little use, and with wireless and bluetooth disabled, it would drain the battery in about 24 hours.

I worked out that ksmd (kernel samepage merging daemon) was running constantly, and using about 5% of CPU. Alexandre Lissy provided me with a ksmd-less kernel image which I flashed, and the battery life since then has improved by about 3x. (Which still seems sub-optimal, but let's fix one problem at a time.)

This bug is about doing any necessary follow-up. Do all flames have ksmd running? Do other users see the same CPU usage and battery drain that I did? Can we do an automatic update of the kernel?

It would be great if any Flame users could report on whether ksmd is running on their phone, and if so, how much CPU it's using. I used |adb shell top -m 5| to do that.
Keywords: qawanted
Component: Performance → GonkIntegration
ksmd *is* running on my phone, and top says (one output only):

[froydnj@cerebro ~]$ adb shell top -m 5



User 2%, System 13%, IOW 0%, IRQ 0%
User 5 + Nice 2 + Sys 43 + Idle 274 + IOW 0 + IRQ 0 + SIRQ 0 = 324

  PID PR CPU% S  #THR     VSS     RSS PCY UID      Name
 8693  0   8% R     1   1236K    548K     root     top
   96  0   4% S     1      0K      0K     root     ksmd
13607  0   0% S    13  80420K  27608K     u0_a1360 /system/b2g/plugin-container
  166  0   0% S     1      0K      0K     root     jbd2/mmcblk0p29
 2936  0   0% S    21  90204K  29404K     u0_a2936 /system/b2g/plugin-container

But I've had excellent battery life thus far from the phone; it lasts for ~4 days between charges, with light apps/web/phone/text usage.
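To compare ksmd CPU figures across reports like the one above, the relevant column can be pulled out of the top output on the host. This is only a minimal sketch, not something that was used in this bug: the function name proc_cpu is hypothetical, it assumes the column layout shown above (CPU% in the third column, process name last), and it assumes tr and awk are available on the host.

```shell
# Print the CPU% that top reports for a process (default: ksmd).
# Reads top output on stdin; returns non-zero if the process is not listed.
# Assumption: CPU% is the third column and the process name is the last one.
proc_cpu() {
    name="${1:-ksmd}"
    # adb output often carries CR line endings, so strip them first.
    tr -d '\r' | awk -v name="$name" '
        $NF == name && $3 ~ /%$/ { sub(/%$/, "", $3); print $3; found = 1; exit }
        END { if (!found) exit 1 }
    '
}
```

For example, `adb shell top -m 5 -n 1 | proc_cpu ksmd` would print the percentage for ksmd, assuming the device's top accepts -n to limit the number of iterations.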
KSM support could be enabled in the kernel but actual page scanning could be turned off. Check the contents of /sys/kernel/mm/ksm/run, if it's set to 0 then the daemon will never attempt a page scan.
(In reply to Gabriele Svelto [:gsvelto] from comment #2)
> KSM support could be enabled in the kernel but actual page scanning could be
> turned off. Check the contents of /sys/kernel/mm/ksm/run, if it's set to 0
> then the daemon will never attempt a page scan.

/sys/kernel/mm/ksm/run is set to 1 here.
(In reply to Michael Wu [:mwu] from comment #3)
> /sys/kernel/mm/ksm/run is set to 1 here.

That means that ksmd will scan all memory at fixed intervals (1 minute by default) and try to merge identical pages from different processes. I wonder if /sys/kernel/mm/ksm/run is being set to 1 by an init script or if it's the default value. Anyway turning it to 0 will fix the issue even though the ksmd daemon will be sticking around.
It's being configured by init.target.rc - https://www.codeaurora.org/cgit/quic/la/platform/vendor/qcom/msm8610/tree/init.target.rc?h=b2g_jb_3.2#n30

Don't know if those values are reasonable.
Those values imply that the daemon will scan 100 pages before going to sleep, and will sleep for 500ms before starting again. On a phone, something that wakes up every 500ms, even if it's not doing much, sounds like a bad idea.
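As a rough worked example of what those two settings add up to, the arithmetic can be sketched as below. The function name ksm_scan_rate is hypothetical, the 4 KiB page size is an assumption (it is not stated in this bug), and the result is an upper bound because it ignores the time ksmd spends doing the scan itself.

```shell
# Estimate ksmd activity from pages_to_scan and sleep_millisecs.
# Args: pages_to_scan sleep_millisecs [page_size_kib, assumed 4]
ksm_scan_rate() {
    pages_to_scan="$1"; sleep_ms="$2"; page_kib="${3:-4}"
    awk -v p="$pages_to_scan" -v s="$sleep_ms" -v k="$page_kib" 'BEGIN {
        if (s <= 0) exit 1
        wakeups = 1000 / s                      # wakeups per second
        printf "wakeups_per_sec=%.1f pages_per_sec=%.0f kib_per_sec=%.0f\n",
               wakeups, wakeups * p, wakeups * p * k
    }'
}
```

With the values discussed here, `ksm_scan_rate 100 500` gives 2 wakeups per second and at most 200 pages (800 KiB) scanned per second.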

KSMD's documentation provides some more information that might be useful:

https://www.kernel.org/doc/Documentation/vm/ksm.txt

First of all, contrary to what I recalled, it won't try to scan all memory but only pages marked with madvise(addr, length, MADV_MERGEABLE). Jemalloc doesn't do that by default, so the memory consumed by our content processes and the main process shouldn't be touched. Apparently there are some counters to ascertain the effectiveness of KSM; see the last paragraph:

"The effectiveness of KSM and MADV_MERGEABLE is shown in /sys/kernel/mm/ksm/:

pages_shared     - how many shared pages are being used
pages_sharing    - how many more sites are sharing them i.e. how much saved
pages_unshared   - how many pages unique but repeatedly checked for merging
pages_volatile   - how many pages changing too fast to be placed in a tree
full_scans       - how many times all mergeable areas have been scanned


A high ratio of pages_sharing to pages_shared indicates good sharing, but
a high ratio of pages_unshared to pages_sharing indicates wasted effort.
pages_volatile embraces several different kinds of activity, but a high
proportion there would also indicate poor use of madvise MADV_MERGEABLE."

Unless it's showing some spectacular memory gains my opinion would be to turn it off. It seems to me that it could increase standby power usage for very little benefit.
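The two ratios named in the quoted documentation can be computed from the counter files. The following is a sketch under assumptions: the function name ksm_ratios is hypothetical, and it is meant to run on a host with awk against a directory holding copies of the counter files (the stock device shell may not ship awk).

```shell
# Print the two ratios the KSM documentation describes.
# Arg: directory holding pages_shared, pages_sharing and pages_unshared
#      (defaults to the live sysfs path).
ksm_ratios() {
    dir="${1:-/sys/kernel/mm/ksm}"
    shared=$(cat "$dir/pages_shared") || return 1
    sharing=$(cat "$dir/pages_sharing") || return 1
    unshared=$(cat "$dir/pages_unshared") || return 1
    awk -v sh="$shared" -v sg="$sharing" -v un="$unshared" 'BEGIN {
        if (sh == 0 || sg == 0) { print "no pages merged yet"; exit }
        # High sharing/shared is good; high unshared/sharing is wasted effort.
        printf "sharing/shared=%.2f unshared/sharing=%.2f\n", sg / sh, un / sg
    }'
}
```

A high first ratio with a low second one would argue for keeping KSM; the reverse supports turning it off.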
m1 - based on comment 6, it sounds like we should turn off KSM to improve power usage. What do you think?
Flags: needinfo?(mvines)
(In reply to Michael Wu [:mwu] from comment #7)
> m1 - based on comment 6, it sounds like we should turn off KSM to improve
> power usage. What do you think?

On the Flame with 1GB of RAM I agree that KSM could be turned off.
Flags: needinfo?(mvines)
Whiteboard: [c=power p= s= u=] → [c=power p= s= u=flame]
(In reply to Michael Vines [:m1] [:evilmachines] from comment #8)
> (In reply to Michael Wu [:mwu] from comment #7)
> > m1 - based on comment 6, it sounds like we should turn off KSM to improve
> > power usage. What do you think?
> 
> On the Flame with 1GB of RAM I agree that KSM could be turned off.

According to comment 6, it doesn't matter how much RAM the device has because our allocator doesn't mark any pages as mergeable.
In bug 1020000 comment 16, Gabriele asked:

> Where can I find the Alexander's kernel ? I can see that ksmd running on mine

I've put it at http://njn.valgrind.org/boot-noksm.img. Checksums for the file:

- md5:    64cc9778460291b5f071064bcd726bd7
- sha1:   939b90e9de04fdcb9431c2c025156fa329a9b3d6
- sha256: e5923347f1311bb80d7cf995741baddc9d549a5fb0fb6a79ac6803ba98461fe0

Here are the instructions I used to flash it:

> Plug in the phone and then do:
> $ adb reboot bootloader
>
> Then, once the device is visible in the output of |sudo fastboot devices|:
> $ sudo fastboot flash boot boot-noksm.img
>
> Then, to reboot normally:
> $ sudo fastboot reboot

WARNING: these steps worked for me, and I don't see why they wouldn't work for you, but I cannot guarantee it won't brick your phone or something else like that.
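Since the image is downloaded from a personal server, it seems prudent to check it against the published checksum before flashing. A minimal sketch, assuming a host with GNU coreutils sha256sum (other systems may need a different tool); the function name verify_sha256 is hypothetical.

```shell
# Succeed only if the file's SHA-256 matches the expected hex digest.
# Args: file expected_sha256
verify_sha256() {
    file="$1"; expected="$2"
    actual=$(sha256sum "$file" | awk '{print $1}') || return 1
    [ "$actual" = "$expected" ]
}
```

It could be chained with the flashing step so nothing is written on a mismatch, e.g. `verify_sha256 boot-noksm.img e5923347f1311bb80d7cf995741baddc9d549a5fb0fb6a79ac6803ba98461fe0 && sudo fastboot flash boot boot-noksm.img`.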
I confirm it works, but unfortunately I can't access the internal multimedia memory nor the SD card anymore.
And I don't have the original boot.img.
I can confirm both. no ksmd, but without access to memory card.
(In reply to ralf tauscher from comment #12)
> I can confirm both. no ksmd, but without access to memory card.

Do you have a backup to share?
sure. I sent you an email with a link.
(In reply to ralf tauscher from comment #14)
> sure. I sent you an email with a link.

Thanks for the boot image, however:
With the modified kernel I also noticed a strange screen flickering.
I reflashed the original boot image you sent me and now the screen still flickers when on auto brightness.
And a new problem arose: now Wi-Fi is not working anymore !!!
Hooray!  Catching old enemies?
One more thing: while browsing the dev options the phone activated the "voice over tapping" flag, so now I'm stuck at the first homepage. I can't go to the second homepage to access settings, enable debug and use adb to get control over the phone.
I can only fastboot it :D
How do I reset it via fastboot now ???
I ran these:

fastboot devices
fastboot erase userdata
fastboot erase cache

And now gaia does not start anymore, I can see the logo running forever
At least I can adb it now
Any hints to have it back working ?
(In reply to Gabriele Svelto [:gsvelto] from comment #2)

> turned off. Check the contents of /sys/kernel/mm/ksm/run, if it's set to 0
> then the daemon will never attempt a page scan.

Actually it will still trigger 33 wakeups/sec by default. The sleep interval also needs to be adjusted
(In reply to gabriele.vidali from comment #11)
> I confirm it works but unfortunately I can't access the internal multimedia
> memory nor the sd card anymore
> And I don't have the original boot.img

That's not a surprise; I produced this boot.img just for the sake of checking the impact. Those bugs were expected, and actually this image should not have been distributed to people since it's old WIP code, with bugs. The T2M base image and this boot.img have diverged a bit. Your best chance would be to build everything yourself, making sure you disable KSMD when building the kernel.
I would like to ask for an original boot.img so I can use SD cards again.
Thanks.
(In reply to Krzysztof Sobiecki from comment #20)
> I would like to ask for an original boot.img so I can use sd cards again.
> Thanks.

You can find the original build here  http://pan.baidu.com/s/1eQh86GU
Thanks for your help.
For now I have changed init.target.rc so /sys/kernel/mm/ksm/run is set to 0.
No ksmd running.
I'm also suffering from this problem; the battery is dead after a bit more than 24h. For the last couple of days I kept getting complaints that I wasn't reachable, until I figured out that the battery was empty again.

ksmd is running on my devices and consumes between 3 and 4% all the time.
m1: this is a clear problem with a (conceptually) trivial fix. How do we move forward now?
Flags: needinfo?(mvines)
For Flame, you don't need my help to disable ksmd.  For our 256MB config I don't see a reason to disable it at this point though.
Flags: needinfo?(mvines)
Comment 9 explained that the RAM size doesn't matter; i.e. ksmd is useless on any of our configurations.

So we're shipping a useless, power-hungry feature by default on our main developer phone. This is causing poor battery life for lots of people. I don't want each of these people to have to disable it themselves one by one. I want the phone to not ship with this feature, and for the already-shipped phones to get an appropriate update. What are the steps required to make this happen?
Flags: needinfo?(mvines)
Comment 9 assumes that Gecko is the only code running in the system, which is not accurate.
Flags: needinfo?(mvines)
We know that ksmd is having bad effects for at least some people. (Turning it off improved my Flame's battery life by 3x.)

Do we have any evidence that ksmd is helping anybody?
I think it would be very interesting to get to the bottom of why ksmd is seen to be running so much on some Flame devices first.  I doubt that ksmd will prevent the kernel from suspending regardless of how much it's running, so it sounds like these devices are not entering suspend for other reasons and the high ksmd usage is just a side-effect.
How do you determine if the kernel is not suspending?
And even if these phones aren't suspending, ksmd still seems bad -- tripling power consumption when idle but not suspended is not good.
(In reply to Nicholas Nethercote [:njn] from comment #28)
> Do we have any evidence that ksmd is helping anybody?

After running for a couple of days it seems to be saving ~6MiB worth of memory on my Flame by sharing a limited number of pages. Considering the amount of battery life it seems to drain and that we're talking about a 1GiB device I think it's safe to say that the memory reduction is not worth the battery cost.
(In reply to Nicholas Nethercote [:njn] from comment #26)
> What are the steps required to make this happen?

It should be just a matter of disabling the CONFIG_KSM=y switch in the t2m-flame-3.4-jb branch of the kernel repository:

https://github.com/mozilla-b2g/codeaurora_kernel_msm/blob/t2m-flame-3.4-jb/arch/arm/configs/msm8610_defconfig#L84

I'm not sure what policy we have for doing reviews and landing patches there though.

BTW the sister bug of this issue - bug 1022486 - has just been fixed by doing precisely what I described above.
So how could I turn off ksmd on my own on my Flame?
(In reply to aleadom from comment #34)
> So how could I turn off ksmd on my own on my Flame?

Run echo "0" > /sys/kernel/mm/ksm/run
Can we address the symptom immediately since a workaround is known? It's painful dogfooding these devices when they're constantly out of battery.

These devices are not consumer-focused. If they were, we could block a release on finding the root cause of this issue. There's no release to block, it's totally fine to wallpaper.
Hm. I fear if we do that we won't even try to understand why it happens on the Flame but not on QRD devices. I'm curious to see what happens on a KK based Flame for instance.
I completely understand that concern.

If someone commits to dropping everything and looking into the root cause immediately, that's fantastic.

Otherwise, history has shown that developer-only issues like this will languish because the priority is lower than issues for commercializable releases (for good reason), causing all the testers and developers to live with the situation for long periods of time, slowing everyone down with yet more environmental papercuts.
I'm with Dietrich on this.
(In reply to Fabrice Desré [:fabrice] from comment #37)
> Hm. I fear if we do that we won't even try to understand why it happens on
> the Flame but not on QRD devices. I'm curious to see what happens on a KK
> based Flame for instance.

KSM can be configured to run more or less frequently via the parameters under /sys. It might be possible that QRD devices have less aggressive settings. That being said we've seen this on other phones too as I've pointed out above, bug 1022486.
(In reply to Fabrice Desré [:fabrice] from comment #35)
> (In reply to aleadom from comment #34)
> > So how could I turn off ksmd on my own on my Flame?
> 
> Run echo "0" > /sys/kernel/mm/ksm/run

Thanks, adb shell "echo "0" > /sys/kernel/mm/ksm/run" works, no more ksmd. Thanks.
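A note on the quoting: in `adb shell "echo "0" > /sys/kernel/mm/ksm/run"` the inner quotes simply close and reopen the outer string, so the device still receives `echo 0 > /sys/kernel/mm/ksm/run`. A small wrapper makes that explicit. This is a sketch under assumptions: the function name ksm_set_run and the ADB override variable are hypothetical, and it assumes adb shell runs with enough privileges to write to sysfs. The meaning of the three values comes from the kernel's KSM documentation.

```shell
# Write a value to /sys/kernel/mm/ksm/run on the attached device.
#   0 = stop ksmd but keep already-merged pages
#   1 = run ksmd
#   2 = stop ksmd and unmerge all currently merged pages
# The ADB variable (hypothetical) lets a different adb binary be substituted.
ksm_set_run() {
    value="$1"
    case "$value" in
        0|1|2) ;;
        *) echo "usage: ksm_set_run 0|1|2" >&2; return 2 ;;
    esac
    ${ADB:-adb} shell "echo $value > /sys/kernel/mm/ksm/run"
}
```

`ksm_set_run 0` is then equivalent to the command above. The setting does not survive a reboot, since init.target.rc sets it back to 1.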
(In reply to Gabriele Svelto [:gsvelto] from comment #40)
> (In reply to Fabrice Desré [:fabrice] from comment #37)
> > Hm. I fear if we do that we won't even try to understand why it happens on
> > the Flame but not on QRD devices. I'm curious to see what happens on a KK
> > based Flame for instance.
> 
> KSM can be configured to run more or less frequently via the parameters
> under /sys. It might be possible that QRD devices have less aggressive
> settings. That being said we've seen this on other phones too as I've
> pointed out above, bug 1022486.

That's interesting. Can you check which parameters we use on the Flame, and then m1 could tell us if that matches what they have on QRD?
The config looked the same the last time I checked.  I'm pretty interested in some better STR around this.  Just because we haven't observed this here yet certainly doesn't mean it won't reproduce.
Here's some data from my Flame, these are the values of the files under /sys/kernel/mm/ksm:

full_scans = 274
pages_shared = 303
pages_sharing = 1770
pages_to_scan = 100
pages_unshared = 11464
pages_volatile = 7855
run = 1
sleep_millisecs = 500

This is with an uptime of ~22 hours. That's more than a scan every 5 minutes which sounds a tad too much. That being said while investigating another bug with :gerard-majax we found that the update service is running way too often, see bug 1020000 comment 7. That's definitely making things worse because we see one check every 2 minutes.
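For reference, the arithmetic behind those figures can be sketched as follows. The function name ksm_summary is hypothetical and the 4 KiB page size is an assumption; pages_sharing is used as the number of pages saved, as the kernel documentation describes it.

```shell
# Summarise KSM benefit and cost from the counters.
# Args: pages_sharing full_scans uptime_hours [page_size_kib, assumed 4]
ksm_summary() {
    pages_sharing="$1"; full_scans="$2"; uptime_hours="$3"; page_kib="${4:-4}"
    awk -v ps="$pages_sharing" -v fs="$full_scans" -v up="$uptime_hours" -v k="$page_kib" 'BEGIN {
        if (fs <= 0) exit 1
        # Memory saved = pages_sharing * page size; scan period = uptime / scans.
        printf "saved_mib=%.1f minutes_per_full_scan=%.1f\n", ps * k / 1024, up * 60 / fs
    }'
}
```

`ksm_summary 1770 274 22` prints `saved_mib=6.9 minutes_per_full_scan=4.8`, i.e. roughly 7 MiB saved at the cost of a full scan a little more often than every 5 minutes.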
Same with 2.0 nightly:
> root@flame:/sys/kernel/mm/ksm # for i in *; do
> > echo $i:`cat $i`
> > done
> full_scans:505
> pages_shared:320
> pages_sharing:1531
> pages_to_scan:100
> pages_unshared:12969
> pages_volatile:5812
> run:0
> sleep_millisecs:500
Excuse me, I changed run to `0` manually. It is `1` by default.
We should see if an android build (be it aosp, miui or whatever) would last more than half a day
I suspect battery drain is due to the 'old' CPU used by the flame.
Another way to prove this would be to compare battery drain on a nexus with both android and fxos
Blocks: 1061012
In the meantime I searched for an Android phone with similar specs and found this review: http://www.trustedreviews.com/motorola-moto-e_mobile-phones_review_speaker-battery-life-and-verdict_Page-4
The Moto E has a slightly smaller screen (4.3"), the same CPU and a slightly bigger battery (1980 mAh), but it lasts almost two days.
Actually, after my last comment I did an update or two. Without running any fancy apps, the battery lasts for 3 days at least. It might be some app that's draining the battery, like gauth. Whoever reports short battery life, try closing all apps when you don't use them.
(In reply to Alexander from comment #50)
> Actually after my last comment I did an update or two. Without running any
> fancy apps battery lasts for 3 days at least. Might be some app that's
> draining battery like gauth. Whoever reports short battery life, try closing
> all apps when you don't use them.

Note that we had three other battery-draining bugs so check that they've all been fixed on your device before testing again: bug 1062119, bug 1056034 and bug 1061012.
I'm now back to 1.3 and must say that at least it can do one day with average use.
2.2 will kill your battery after 4 hours
2.0 is not bad but thirstier than 1.3
Is anybody testing 1.4? Since there's almost no improvements compared to 1.3 (same ui, no webgl nor webrtc), the battery could last more with it.
To make the fix `adb shell "echo "0" > /sys/kernel/mm/ksm/run"` permanent I added 

echo "0" > /sys/kernel/mm/ksm/run

to the b2g.sh script as the second last line, following https://developer.mozilla.org/en-US/Firefox_OS/Developing_Firefox_OS/Customizing_the_b2g.sh_script

Seems to work fine but maybe someone more knowledgeable can comment on the merits/pitfalls of this approach.
I do have a power harness for the Flame now, thanks to :jhylands! So I'll check and assess whether ksmd really has a role here, and how much. If anyone can sum up scenarios and use cases for which we should measure power, please do so.
Assignee: nobody → lissyx+mozillians
blocking-b2g: --- → 2.2?
Values on my system:
> $ for f in /sys/kernel/mm/ksm/*; do echo -n "$f: "; cat $f; done
> /sys/kernel/mm/ksm/deferred_timer: 1
> /sys/kernel/mm/ksm/full_scans: 5
> /sys/kernel/mm/ksm/pages_shared: 53
> /sys/kernel/mm/ksm/pages_sharing: 475
> /sys/kernel/mm/ksm/pages_to_scan: 100
> /sys/kernel/mm/ksm/pages_unshared: 3114
> /sys/kernel/mm/ksm/pages_volatile: 7046
> /sys/kernel/mm/ksm/run: 1
> /sys/kernel/mm/ksm/sleep_millisecs: 500
Attached file R script
Attached file ksm_n_ts.pdf
Attached file ksm_y_ts.pdf
This has been tested by comparing power consumption on my Flame:
 - unplugged from USB
 - battery almost full
 - power harness plugged in

My device has 1GB of RAM and 2 CPUs.

KSM enabled means:
 - CONFIG_KSM=y
 - 1 in /sys/kernel/mm/ksm/run

KSM disabled means:
 - no CONFIG_KSM set

Raw collected data are provided in attachment 8527748 and attachment 8527749. The R script used to produce the plots in attachment 8527750 and attachment 8527751 is in attachment 8527747.

What I can conclude from this material is that:
 - the frequency of power consumption spikes is very close
 - the height of those spikes is very similar

In light of this, I cannot blame KSM for anything.

Nicholas, since you reported this, can you document:
 - do you still hit this issue ?
 - if so, can you get us the status of ksm as in comment 55 ?
> Nicholas, since you reported this, can you document:
>  - do you still hit this issue ?
>  - if so, can you get us the status of ksm as in comment 55 ?

I flashed that ksm-less kernel that you specially built for me, so ksmd is no longer running on my Flame. If you tell me how to re-flash a vanilla kernel I could check it then.
If you have not, upgrade your device to the v188 base image; this will be enough, I think. At worst, take boot.img from flame's zip on pvtbuilds.
Flags: needinfo?(n.nethercote)
(In reply to Alexandre LISSY :gerard-majax from comment #61)
> What I can conclude from this material is that:
>  - the frequency of power consumption spikes is very close
>  - the height of those spikes is very similar
> 
> In light of this, I cannot blame KSM for anything.

Why don't we try to use ftrace to figure out what's going on? AFAIK we can gather scheduler wakeup events to see which process is causing those spikes so that we've got a starting point. ftrace should show us everything including kernel threads which makes it suitable for this task.

In a nutshell we could do (off the top of my head):

- Enable monitoring of task runtime events

echo 1 >/sys/kernel/debug/tracing/events/sched/sched_stat_runtime/enable

- Start tracing

echo 1 >/sys/kernel/debug/tracing/tracing_enabled

- Unplug the phone, wait as much as needed

- Plug the phone again, turn off tracing

echo 0 >/sys/kernel/debug/tracing/tracing_enabled
echo 0 >/sys/kernel/debug/tracing/events/sched/sched_stat_runtime/enable

- Dump the trace

cat /sys/kernel/debug/tracing/trace

I don't remember all the parameters to control the trace size and stuff right now but we could start experimenting with it first and see if we're able to gather some relevant info.

The samples in the trace should look like this:

b2g-205   [001] d.h2 18468.120471: sched_stat_runtime: comm=b2g pid=205 runtime=9960885 [ns] vruntime=162681673790 [ns]

The first field is the process name + PID (or TID since this works per thread), then you get the CPU on which the task ran, then some stuff I don't remember the purpose of, then a timestamp and finally the actual stats (where runtime is the relevant parameter; it's the timeslice in ns the task ran for).
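Once a trace has been dumped, the per-task totals can be summed on the host. A minimal sketch, assuming every sample follows the sched_stat_runtime format shown above; the function name trace_runtime_by_task is hypothetical, and tr, awk and sort are assumed to be available on the host.

```shell
# Sum sched_stat_runtime samples per task from an ftrace dump on stdin.
# Output: "<total runtime in ms><TAB><comm>/<pid>", busiest task first.
trace_runtime_by_task() {
    tr -d '\r' | awk '
        /sched_stat_runtime:/ {
            c = $0; sub(/^.*comm=/, "", c); sub(/ pid=.*$/, "", c)   # task name
            p = $0; sub(/^.* pid=/, "", p); sub(/ .*$/, "", p)       # pid
            # The leading space keeps this from matching "vruntime=".
            r = $0; sub(/^.* runtime=/, "", r); sub(/ .*$/, "", r)   # ns
            ns[c "/" p] += r
        }
        END { for (k in ns) printf "%.3f\t%s\n", ns[k] / 1e6, k }
    ' | sort -rn
}
```

For instance, `adb shell cat /sys/kernel/debug/tracing/trace | trace_runtime_by_task` would show whether ksmd or some other task accounts for most of the CPU time while the phone is supposed to be idle.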
Sure, we can do this, Gabriele, but my goal was to assess the impact of CONFIG_KSM=y in terms of power consumption.
(In reply to Alexandre LISSY :gerard-majax from comment #63)
> If you have not, upgrade your device to the v188 base image; this will be
> enough, I think. At worst, take boot.img from flame's zip on pvtbuilds.

Can you give me more detailed steps? This isn't something I do regularly. Thank you.
Flags: needinfo?(n.nethercote)
I probably won't get to this soon. Sorry.
Flags: needinfo?(n.nethercote)
Giving this a 2.2+ in B2G triage because it relates to battery life, but given the dates on most of these comments we want to ensure it still reproduces on new CPUs and build images.

Setting ni? to Lissy to see if he can repro this and update the bug. Please reassign as appropriate.
blocking-b2g: 2.2? → 2.2+
Flags: needinfo?(lissyx+mozillians)
This is a device configuration issue, so unless we're doing a commercial launch of the flame it doesn't really make sense for this to block a release.
Stephany, this bug is:
 - independent of shipped devices
 - filed against a very old version of the Flame kernel
 - filed against a now unsupported base system of the Flame (JB)

Also, I was never able to reproduce this back then; only some people were. My current investigation with the power harness showed absolutely no impact. We have much graver things to block on than this bug.
Flags: needinfo?(lissyx+mozillians) → needinfo?(swilkes)
Very helpful. Thanks, Lissy. Those were exactly the things that were not clear to the triage team yesterday, from this bug.
blocking-b2g: 2.2+ → ---
Flags: needinfo?(swilkes)
I still have this same issue, as well.
(In reply to Alexandre LISSY :gerard-majax from comment #71)
> Stephany, this bug is:
>  - independent of shipped devices
>  - filed against a very old version of the Flame kernel
>  - filed against a now unsupported base system of the Flame (JB)
> 
> Also, I have never been able to reproduce this back in this time. Only some
> people were able. My Current investigation with the power harness showed
> absolutely no impact. We have much more grave things to block than this bug.

I have a Flame that was shipped to me on May 19th of this year.  I still have the battery life issue.
My Flame's battery runs out in two hours of usage or less. It runs out in less than 8 hours if the phone is switched on but unused.
(In reply to Stephany Wilkes from comment #72)
> Very helpful. Thanks, Lissy. Those were exactly the things that were not
> clear to the triage team yesterday, from this bug.

I have a Flame that was shipped to me on May 19th of this year.  I still have the battery life issue. My Flame's battery runs out in two hours of usage or less. It runs out in less than 8 hours if the phone is switched on but unused.
(In reply to Stephany Wilkes from comment #72)
> Very helpful. Thanks, Lissy. Those were exactly the things that were not
> clear to the triage team yesterday, from this bug.

By the way, I charged my phone to 100% yesterday, then turned it off and kept it off for 24 hours.  I turned it on just now to check the battery, and it says 84%. Not sure why this is happening.  This isn't the first time either: even after sitting powered off, the battery shows less than 90% once turned on.  Can changing settings improve battery life in this case?
https://bugzilla.mozilla.org/show_bug.cgi?id=1178869 I filed this bug report just now, in case my issues are slightly different than this issue.  If it is a duplicate of this bug, please state so in my bug report I filed.
Thanks for the spam. By the way, without proper documentation of the system on your device we cannot diagnose anything. If you see constant network activity it might also be the push API bug.
(In reply to Alexandre LISSY :gerard-majax from comment #79)
> Thanks for the spam.

Comments that are "spam" == unsolicited and zero value. 
I think this person is making a genuine effort here to help.
(In reply to Alexandre LISSY :gerard-majax from comment #79)
> Thanks for the spam. By the way without proper documentation of the system
> on your device we cannot diagnose anything. If you see constant network
> activity it might also be the push api bug.

My system: Flame v188 and v18D Base Image Build ID 20150527004816 Platform version 41.0a1 OS version 3.0.0.0-prerelease Build nmbr eng.cltbld.20150212.043653

Is that enough info, or do you still think it is spam?  Confused why it might be spam that you think I am doing.
I'd suggest closing this bug as INVALID and moving on to bug 1130725 or other battery life related bugs. The reason is that in this bug we specifically targeted ksmd and found out that it wasn't the culprit of the battery life issue hence we should move the discussion on what might be causing this problem elsewhere.

That being said, one way to check what may be causing the battery drain is to use the tracing script I've recently landed on B2G. You don't need a B2G build or checkout to use it, just download the script from here:

https://github.com/mozilla-b2g/B2G/blob/master/scripts/trace.sh

Now, when you notice that the phone is suffering from an unexpected battery drain do the following:

1. Enable ADB debugging in the settings' developer menu
2. Plug the phone in your PC
3. Put the phone in standby and wait a minute or two for it to settle down completely
4. Start tracing by executing my script: trace.sh --start
5. Wait 10-15 minutes
6. Stop tracing: trace.sh --stop
7. Wait patiently for the output, it might take a while

Once this is done please post the output here, you'll see something like this:

            NAME   PID WAKEUPS      RUNTIME NEW
             b2g   210     171       216 ms  no
          (Nuwa)   332       0         0 ms  no
      Homescreen   945       1         0 ms  no
 Built-in Keyboa  1032       1         0 ms  no
           Usage  1503       1         0 ms  no
        Settings  1517       9        28 ms  no
  Find My Device  1558       1         0 ms  no
 Smart Collectio  1663       1         0 ms  no
 (Preallocated a  1805       1         0 ms  no

The NAME and PID columns will show all running applications and their PIDs, WAKEUPS counts the number of times each application has woken up, RUNTIME is the total CPU time consumed during those wakeups and NEW tells you if the application was already there when tracing started (no) or was launched while the trace was being gathered (yes).
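To compare reports taken over different durations, the WAKEUPS column can be normalised to wakeups per minute. A sketch under assumptions: the function name trace_wakeups_per_min is hypothetical, and it assumes the column layout shown above, where every data row ends with "<runtime> ms <yes|no>". Rows are parsed from the right because application names may contain spaces.

```shell
# Rank trace.sh output rows by wakeups per minute.
# Arg: tracing duration in minutes. Reads the table on stdin.
# Output: "<wakeups/min><TAB><pid><TAB><name>", most active first.
trace_wakeups_per_min() {
    minutes="$1"
    tr -d '\r' | awk -v m="$minutes" '
        BEGIN { if (m <= 0) exit 1 }
        NF >= 6 && $(NF-1) == "ms" {
            name = $1
            for (i = 2; i <= NF - 5; i++) name = name " " $i
            printf "%.1f\t%s\t%s\n", $(NF-3) / m, $(NF-4), name
        }
    ' | sort -rn
}
```

For a 10 minute trace, the b2g row in the sample above would come out at 17.1 wakeups per minute.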

April, if you can reproduce the issue and attach the output of my script this would be very helpful. However please do so in bug 1130725 so that we can close this one. I'll CC everybody else on that bug when we move the discussion there.
(In reply to Garvan Keeley [:garvank] from comment #80)
> (In reply to Alexandre LISSY :gerard-majax from comment #79)
> > Thanks for the spam. 
> 
> Comments that are "spam" == unsolicited and zero value. 
> I think this person is making a genuine effort here to help.

Six different comments to say "me too". We're close to this definition, sorry. Especially given the latest comments on the bug, where I have spent hours checking the behavior.

(In reply to April Morone from comment #81)
> (In reply to Alexandre LISSY :gerard-majax from comment #79)
> > Thanks for the spam. By the way without proper documentation of the system
> > on your device we cannot diagnose anything. If you see constant network
> > activity it might also be the push api bug.
> 
> My system: Flame v188 and v18D Base Image Build ID 20150527004816 Platform
> version 41.0a1 OS version 3.0.0.0-prerelease Build nmbr
> eng.cltbld.20150212.043653
> 
> Is that enough info, or do you still think it is spam?  Confused why it
> might be spam that you think I am doing.

Then as stated in this bug, and without any proof ksmd process is related to what you see, then it's totally unrelated.
(In reply to April Morone from comment #81)

[...]
 
> Is that enough info, or do you still think it is spam?  Confused why it
> might be spam that you think I am doing.

My apologies if you received my first reply as rude, that was not the case at all.
Closing this one as INVALID since it seems to be falsely leading people to think this issue still makes sense. Summary: this dates back to the Jelly Bean builds, the early releases of the first Flame images. Since then we have moved to KK and the kernel is not the same. Further investigation shows we cannot reproduce the behavior described, and measurements showed no impact at all.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
(In reply to Alexandre LISSY :gerard-majax from comment #84)
> (In reply to April Morone from comment #81)
> 
> [...]
>  
> > Is that enough info, or do you still think it is spam?  Confused why it
> > might be spam that you think I am doing.
> 
> My apologies if you received my first reply as rude, that was not the case
> at all.

K.  Awesome.  :)  Ty for letting me know that it was not meant in a bad way.  :)
(In reply to Alexandre LISSY :gerard-majax from comment #83)
> (In reply to Garvan Keeley [:garvank] from comment #80)
> > (In reply to Alexandre LISSY :gerard-majax from comment #79)
> > > Thanks for the spam. 
> > 
> > Comments that are "spam" == unsolicited and zero value. 
> > I think this person is making a genuine effort here to help.
> 
> Six different comments to say "me too". We're close to this definition,
> sorry. Specifically given the latest comments on the bug where I have spent
> hours checking the behavior.
> 
> (In reply to April Morone from comment #81)
> > (In reply to Alexandre LISSY :gerard-majax from comment #79)
> > > Thanks for the spam. By the way without proper documentation of the system
> > > on your device we cannot diagnose anything. If you see constant network
> > > activity it might also be the push api bug.
> > 
> > My system: Flame v188 and v18D Base Image Build ID 20150527004816 Platform
> > version 41.0a1 OS version 3.0.0.0-prerelease Build nmbr
> > eng.cltbld.20150212.043653
> > 
> > Is that enough info, or do you still think it is spam?  Confused why it
> > might be spam that you think I am doing.
> 
> Then as stated in this bug, and without any proof ksmd process is related to
> what you see, then it's totally unrelated.

K.  I understand and I agree with your decision about closing this particular bug report.