Closed Bug 1105639 Opened 10 years ago Closed 10 years ago

Settings observer leaking

Categories

(Firefox OS Graveyard :: Gaia::System, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

(blocking-b2g:2.1+, b2g-v1.4 ?, b2g-v2.0 wontfix, b2g-v2.1 wontfix, b2g-v2.2 fixed)

RESOLVED FIXED
2.2 S2 (19dec)
blocking-b2g 2.1+
Tracking Status
b2g-v1.4 --- ?
b2g-v2.0 --- wontfix
b2g-v2.1 --- wontfix
b2g-v2.2 --- fixed

People

(Reporter: gerard-majax, Assigned: gerard-majax)

References

Details

Attachments

(1 file)

46 bytes, text/x-github-pull-request
kgrandon
: review+
arthurcc
: review+
Details | Review
As reported in bug 1105511 comment 15 and bug 1105511 comment 16: we are leaking some observers.

STR:
 0. Get about:memory report
 1. Lock/Unlock the device a couple of times. You may need to do it more than 20 times.
 2. Get about:memory report

Expected:
 We should have no leak reported in the "settings-observers-suspect" section

Actual:
 We have leaks reported for at least:
  - powersave.threshold
  - lockscreen.lock-message

[Blocking Requested - why for this release]: Chances are high we are hitting this on v2.1 too, given nearly all the settings changes were uplifted.
This may be because of Gaia's SettingsListener: powersave.threshold access is being used solely via this, and is triggered on each battery level change.
I reproduce the problem on 2.1[1] and 2.0[2]: at boot time, I don't have the section called "settings-observers-suspect" in the memory report. After 20 lock/unlocks, I got:
> 23 (100.0%) -- settings-observers-suspect
> └──23 (100.0%) ── referent(topic=lockscreen.lock-message)

It's not possible to get a report for 1.4 or earlier because this report appeared in gecko 31 (see bug 993203). Clearing the qa*, while we find another way to measure the number of observers.


[1] Build ID               20141126161202
Gaia Revision          5372b675e018b6aac97d95ff5db8d4bd16addb9b
Gaia Date              2014-11-26 18:54:49
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/f34377ae402b
Gecko Version          34.0
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20141126.194010
Firmware Date          Wed Nov 26 19:40:22 EST 2014
Bootloader             L1TC00011880

[2] Build ID               20141126160203
Gaia Revision          8d1e868864c8a8f1e037685f0656d1da70d08c06
Gaia Date              2014-11-26 13:51:52
Gecko Revision         https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/c756bd8bf3c3
Gecko Version          32.0
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20141126.192736
Firmware Date          Wed Nov 26 19:27:46 EST 2014
Bootloader             L1TC00011880
The lockscreen.lock-message leak seems to come from find my device via SettingsHelper, indeed:
> 11-27 15:14:10.656 14787 14787 I Gecko   : -*- SettingsManager: addObserver lockscreen.lock-message
> 11-27 15:14:10.656 14787 14787 I Gecko   : -*- SettingsManager: addObserver lockscreen.lock-message from: addObserver@jar:file:///system/b2g/omni.ja!/components/SettingsManager.js:361:60
> 11-27 15:14:10.656 14787 14787 I Gecko   : sh_init@app://system.gaiamobile.org/shared/js/settings_helper.js:78:7
> 11-27 15:14:10.656 14787 14787 I Gecko   : SettingsHelper@app://system.gaiamobile.org/shared/js/settings_helper.js:83:5
> 11-27 15:14:10.656 14787 14787 I Gecko   : @app://system.gaiamobile.org/js/findmydevice_launcher.js:78:16
> 11-27 15:14:10.656 14787 14787 I Gecko   : ls_dispatchEvent@app://system.gaiamobile.org/lockscreen/js/lockscreen.js:1017:5
> 11-27 15:14:10.656 14787 14787 I Gecko   : ls_unlock@app://system.gaiamobile.org/lockscreen/js/lockscreen.js:685:5
> 11-27 15:14:10.656 14787 14787 I Gecko   : ls_activateUnlock@app://system.gaiamobile.org/lockscreen/js/lockscreen.js:615:7
> 11-27 15:14:10.656 14787 14787 I Gecko   : ls_handleEvent@app://system.gaiamobile.org/lockscreen/js/lockscreen.js:277:9
> 11-27 15:14:10.656 14787 14787 I Gecko   : lss_publish@app://system.gaiamobile.org/shared/js/lockscreen_slide.js:1192:7
> 11-27 15:14:10.656 14787 14787 I Gecko   : lss_onSlideEnd@app://system.gaiamobile.org/shared/js/lockscreen_slide.js:593:9
> 11-27 15:14:10.656 14787 14787 I Gecko   : lss_endGesture@app://system.gaiamobile.org/shared/js/lockscreen_slide.js:573:7
> 11-27 15:14:10.656 14787 14787 I Gecko   : LockScreenSlidePrototype.handleEvent@app://system.gaiamobile.org/shared/js/lockscreen_slide.js:321:11
And powersave.threshold comes from battery level changes via SettingsListener:
> 11-27 16:17:17.791 19198 19198 I Gecko   : -*- SettingsManager: addObserver powersave.threshold
> 11-27 16:17:17.791 19198 19198 I Gecko   : -*- SettingsManager: addObserver powersave.threshold from: addObserver@jar:file:///system/b2g/omni.ja!/components/SettingsManager.js:361:60
> 11-27 16:17:17.791 19198 19198 I Gecko   : sl_observe@app://system.gaiamobile.org/shared/js/settings_listener.js:77:5
> 11-27 16:17:17.791 19198 19198 I Gecko   : PowerSave.prototype.onBatteryChange@app://system.gaiamobile.org/js/power_save.js:127:7
> 11-27 16:17:17.791 19198 19198 I Gecko   : bm_handleEvent@app://system.gaiamobile.org/js/battery_overlay.js:77:11
> 11-27 16:17:17.791 19198 19198 I Gecko   :
Attached file Gaia PR
A first fix that looks to be doing a good job, for SettingsHelper code.
Blocks: 1105511
Hm, that may break usages where we chain get/set ...
As far as I can say, the leak will only concern code that:
 - does addObserver() without the matching removeObserver()
 - does addObserver() several times
 - does not get cleaned up (window killed/closed)

So that's probably mostly system app code, that reacts to external things. For example, addObserver() done:
 - on onbatterylevelchange in power_save.js
 - on lockscreen-request-unlock event in findmydevice_launcher.js

All other cases that I could see were properly handled. For instance, Callscreen (which lives in system app) in its init() adds some observers. Those are not leaking because we are properly releasing ressources of the callscreen window when it's closing.
Arthur, can you check this ? I don't think my PR will be valid for all usecases of SettingsHelper. I don't see a proper moment with the current code to do a removeObserver automatically.

I suspect you added the observer to mirror any setting change and avoid races.
Flags: needinfo?(arthur.chen)
SettingsHelper could be considered as a cache of a settings field. It reduces the time required for retrieving the value of a field. The observer cannot be removed because we need it for reflecting the change done by outside the object.

We may have to add an `uninit` function in which we remove the observer. And then we call to `uninit` in a proper timing.
Flags: needinfo?(arthur.chen)
(In reply to Arthur Chen [:arthurcc] from comment #9)
> SettingsHelper could be considered as a cache of a settings field. It
> reduces the time required for retrieving the value of a field. The observer
> cannot be removed because we need it for reflecting the change done by
> outside the object.
> 
> We may have to add an `uninit` function in which we remove the observer. And
> then we call to `uninit` in a proper timing.

Thanks for confirming. I've updated my PR, with an uninit method :)
Adding an observer in SettingsListener.observe() goes back to https://github.com/mozilla-b2g/gaia/commit/f847bbe2e3632f846c51031aaa462c7b1cdd073a. First use for powersave.threshold is coming from https://github.com/mozilla-b2g/gaia/commit/23f4273fb8e60659f98dee5f5f069eafb7ef65a9

So any build since before 1.0.0 is hit by the leaking.
Rudy, in your change https://github.com/mozilla-b2g/gaia/commit/23f4273fb8e60659f98dee5f5f069eafb7ef65a9#diff-97ec30d3160e3afe16d9c3e028e44a48R200 can you explain me why you need to SettingsListener.observe() ?

Because this will add an observer on the settings. But, since you are calling this in the onbatterychange() method, we will add an observer each time this is called. Which makes us leaking.

It seems to me that you used SettingsListener.observe() to make your life easier regarding handling this setting value. Thus, what we should use is rather:
> SettingsHelper.observe('powersave.threshold', -1).get(... your callback ...).uninit();

With uninit() being the new method suggested earlier by Arhtur to make sure we are properly releasing ressources.
Flags: needinfo?(rlu)
Comment on attachment 8529786 [details] [review]
Gaia PR

Arthur, I've updated the PR, added an uninit() method, and moved the lockscreen.lock-message usage to SettingsHelper(). This way, after measuring on my Flame, I don't see any leak reported anymore.
Attachment #8529786 - Flags: feedback?(arthur.chen)
Comment on attachment 8529786 [details] [review]
Gaia PR

Arthurn, would you mind reviewing this ?
Attachment #8529786 - Flags: feedback?(arthur.chen) → review?(arthur.chen)
Comment on attachment 8529786 [details] [review]
Gaia PR

Please check my comments in github, thanks. And for both cases, maybe we can simply use the original settings API as what the code does is really simple.
Attachment #8529786 - Flags: review?(arthur.chen)
(In reply to Alexandre LISSY :gerard-majax from comment #12)
> Rudy, in your change
> https://github.com/mozilla-b2g/gaia/commit/
> 23f4273fb8e60659f98dee5f5f069eafb7ef65a9#diff-
> 97ec30d3160e3afe16d9c3e028e44a48R200 can you explain me why you need to
> SettingsListener.observe() ?

I think I did this because I want the system to enter power save mode immediately when the user changes the threshold in settings and the current battery level is under that threshold.
If we get the threshold each time when the battery level is changed, and then switch to power save mode, it would have some delay before the mode switching, i.e. depends on how the battery level changes. 

> 
> Because this will add an observer on the settings. But, since you are
> calling this in the onbatterychange() method, we will add an observer each
> time this is called. Which makes us leaking.
> 
> It seems to me that you used SettingsListener.observe() to make your life
> easier regarding handling this setting value. Thus, what we should use is
> rather:
> > SettingsHelper.observe('powersave.threshold', -1).get(... your callback ...).uninit();
> 
> With uninit() being the new method suggested earlier by Arhtur to make sure
> we are properly releasing ressources.

I think maybe we should move that observer out of batterylevelchange handler, so it won't cause leaking.
That is long time ago when I wrote that patch, so bear with me if I missed any context.
Flags: needinfo?(rlu)
Pushed new PR that should address all comments.
Updated PR with unit tests on PowerSave changes, and fixed a scoping issue for the settings request onsuccess.
Attachment #8529786 - Flags: review?(ggoncalves)
Attachment #8529786 - Flags: review?(arthur.chen)
Comment on attachment 8529786 [details] [review]
Gaia PR

Looks good to me, aside from the nit I pointed out on GitHub. However, since this is actually a change in the System app, let's get a peer to give the final r+. :kgrandon, can you please have a look at this?
Attachment #8529786 - Flags: review?(kgrandon)
Attachment #8529786 - Flags: review?(ggoncalves)
Attachment #8529786 - Flags: feedback+
Alex - can you point me to where in the patch we call uninit()? I can't seem to find it. Code generally looks good though. Thanks!
Flags: needinfo?(lissyx+mozillians)
(In reply to Kevin Grandon :kgrandon from comment #20)
> Alex - can you point me to where in the patch we call uninit()? I can't seem
> to find it. Code generally looks good though. Thanks!

Well we don't use it for now, I've followed Arthur advice in comment 15 :)
Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(kgrandon)
(In reply to Guilherme Gonçalves [:ggp] from comment #19)
> Comment on attachment 8529786 [details] [review]
> Gaia PR
> 
> Looks good to me, aside from the nit I pointed out on GitHub. However, since
> this is actually a change in the System app, let's get a peer to give the
> final r+. :kgrandon, can you please have a look at this?

Thanks. We are still doing changes to find my device code, so it's worth that you give a r+ and/or a f+ at least, I think. You're comment makes sense, we discussed about it during the work week. I'll trigger a set of try with the async removed :)
Addressed in the latest push. Try is green for a couple of retriggers already, at https://tbpl.mozilla.org/?tree=Gaia-Try&rev=39bc73a45130
Flags: needinfo?(ggoncalves)
Thanks, looks good!
Flags: needinfo?(ggoncalves)
Comment on attachment 8529786 [details] [review]
Gaia PR

Sure, I think it seems fine to me then, thanks.
Flags: needinfo?(kgrandon)
Attachment #8529786 - Flags: review?(kgrandon) → review+
triage: blocking 2.1+.   blocks bug 1105511, already a blocker.
blocking-b2g: 2.1? → 2.1+
Comment on attachment 8529786 [details] [review]
Gaia PR

Rudy, can you make sure I have not broken your logic ? Thanks!
Attachment #8529786 - Flags: feedback?(rlu)
Comment on attachment 8529786 [details] [review]
Gaia PR

In this patch there are still chances of creating multiple instances of SettingsHelper. And it also breaks the check of power threshold changes. Please check my comments in github for details, thanks!
Attachment #8529786 - Flags: review?(arthur.chen)
(In reply to Arthur Chen [:arthurcc] from comment #28)
> Comment on attachment 8529786 [details] [review]
> Gaia PR
> 
> In this patch there are still chances of creating multiple instances of
> SettingsHelper. And it also breaks the check of power threshold changes.
> Please check my comments in github for details, thanks!

Right, I think your comment about powersave.threshold just answers my question towards Rudy. Meanwhile, you're right there may still be some leaking, but those should have been much much less important. Anyway, new PR should address all of those.
Attachment #8529786 - Flags: feedback?(rlu) → review?(arthur.chen)
Comment on attachment 8529786 [details] [review]
Gaia PR

r=me, thank you!
Attachment #8529786 - Flags: review?(arthur.chen) → review+
https://github.com/mozilla-b2g/gaia/commit/086a26be553638cfd36e7d5c3ea007b5237a9562
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Blocks: 1110872
Sounds like this might need a blocking 2.0 request? At a minimum, please request Gaia v2.1 approval on this when you get a chance :)
Flags: needinfo?(lissyx+mozillians)
Target Milestone: --- → 2.2 S2 (19dec)
FYI, on a Flame 2.0 with 1d 10h of uptime:
> 299 (100.0%) -- settings-observers-suspect
> ├──176 (58.86%) ── referent(topic=lockscreen.lock-message)
> └──123 (41.14%) ── referent(topic=powersave.threshold)

Ryan, I fear the patch for 2.1 and even 2.0 may need to be slightly different. And tricky.
That may be a dumb question to ask: why could a gaia 3rd-party privileged application break the phone so bad? Why can merely adding an observer break the phone?

Also, you have a "uninit" method but it seems you never call it?
(In reply to Julien Wajsberg [:julienw] from comment #34)
> That may be a dumb question to ask: why could a gaia 3rd-party privileged
> application break the phone so bad? Why can merely adding an observer break
> the phone?

It's not just "a gaia 3rd party privileged app", it's system app abusing ressources. Other applications will have their observers cleaned up when being killed/closed. But System is a never ending story ... Maybe we can be more clever on the API side, but:
 - Settings is used everywhere and changes can expose misassumptions/bugs
 - I still have no confirmation that this leaking is related to the API getting broken ; it's just that I noticed it now, but as documented in comment 33 and earlier, this is something that has been here for a long time. Previously, only battery change would trigger a leak, but starting with 2.0 we have also the lockscreen.lock-message used by find my device.
 - We still may have other things breaking the API that needs urgent care

> 
> Also, you have a "uninit" method but it seems you never call it?

Yes, I added it because otherwise people may miss it, but for the current fix we can live without using it.
Flags: needinfo?(lissyx+mozillians)
Don't lose your NI for the 2.1 patch ;)
Flags: needinfo?(lissyx+mozillians)
(In reply to Julien Wajsberg [:julienw] from comment #36)
> Don't lose your NI for the 2.1 patch ;)

Thanks :)

I have been dogfooding this since it landed, and while I may be non objective, I have my Nexus S up for a couple of days now:

> $ adb shell uptime
> up time: 4 days, 18:55:10, idle time: 12:01:52, sleep time: 4 days, 02:26:55

And I would say that I feel my device being more responsive after that much time than before.
I'd like some help of QA to assert the benefits here: reverting this change and getting some feedback on the performances after a couple of hours/days of use.
Keywords: qaurgent, qawanted
We do not have the ability to revert the change as requested.  Please provide us with a compiled flame build that we can flash to the build if you would like this test performed as well as some indication of where we might actually see performance flag with the naked eye and then retag this issue for testing.
Keywords: qaurgent, qawanted
(In reply to Jayme Mercado [:JMercado] from comment #39)
> We do not have the ability to revert the change as requested.  Please
> provide us with a compiled flame build that we can flash to the build if you
> would like this test performed as well as some indication of where we might
> actually see performance flag with the naked eye and then retag this issue
> for testing.

What stops you from doing the revert?

For the question regarding performances, I don't have a better answer than your feeling of the device :(
Flags: needinfo?(lissyx+mozillians) → needinfo?(jmercado)
It isn't something we are set up to do.  We could attempt to perform the test with a build, but the revert is not something we can do.
Flags: needinfo?(jmercado)
Alexandre, you can maybe point them to an existing build from before the change?

Also, don't forget the v2.1 change :)
Flags: needinfo?(lissyx+mozillians)
(In reply to Alexandre LISSY :gerard-majax from comment #38)
I have my dogfooding phone at [1] with an uptime of:
> up time: 6 days, 20:21:11, idle time: 07:52:14, sleep time: 6 days, 15:32:28

I flashed another device at the same build + your patch (I backported the changes in power_save.js to battery_manager.js). I am able to see this potential improvements (not sure if related):
 * The camera apps take about a second less to load if the device was locked before. It takes less time to get feedback with a 319MB running for a couple of minutes than on a 512MB running for 6+ days.

Apart from that, I don't see any other performance improvements to the naked eye:
 * It takes the same time to have access to all options in the settings app.
 * Same thing when you place a call to the voicemail. 
 * Activating the airplane mode doesn't take less time.
 * Opening a new SMS takes about the same time.

[1] Gaia-Rev        17c7ad2e4919a994f0844239b483116090412dee
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/39dfb662c82a
Build-ID        20141225001203
Version         34.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  40
FW-Date         Tue Oct 21 15:59:42 CST 2014
Bootloader      L1TC10011880
QA Contact: jlorenzo
Alex, can you request uplift approval?
Ok, given the non perceivable performance impact, and given the changes involved for uplifting, I'm not sure it's worth spending time for this.
Flags: needinfo?(lissyx+mozillians)
Hey, I just want to understand: can't we make the phone break without this patch?
Flags: needinfo?(lissyx+mozillians)
Nobody has never proved this: this landed on master on december 11th, people still had the phone breaking. This may improve responsiveness, but so far, the documented figures on this do not motivate me to uplift this, given the risks.
Flags: needinfo?(lissyx+mozillians)
See Also: → 1189004
Blocks: 1193469
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: