Closed Bug 1046828 Opened 6 years ago Closed 6 years ago

[Notifications API] We should never send the "show" event using system messages at a reboot

Categories

(Core :: DOM: Core & HTML, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
mozilla34
blocking-b2g -
Tracking Status
firefox34 --- fixed
b2g-v2.1 --- verified

People

(Reporter: julienw, Assigned: julienw)

References

Details

(Keywords: feature, perf, Whiteboard: [c=boot p= s= u=])

Attachments

(5 files, 2 obsolete files)

Now that we show the remaining notifications after a reboot, we also send a "show" event when they're displayed. When we reboot, the apps are not launched, and as a result, we send a system message for this "show" event, and the app is launched.

The main consequence is that the homescreen takes longer to launch in that case.

So here is a STR:
1. receive a missed call
2. receive a SMS
3. reboot
4. unlock the lockscreen as soon as it appears

=> the homescreen takes a long time to be displayed

5. longpress home

=> we see that both dialer and messages are launched.

Asking for blocking 2.0 because it's a 2.0 feature and has performance implications.


Discussing with Alexandre, we think there are 3 ways of fixing this:

1. in Gecko: never send a system message for the event "show"
2. in Gaia: don't send the "desktop-notification-show" in NotificationScreen.addNotification if this.isResending === true
3. In Gaia, add a "isResending" property to the "desktop-notification-show" event, and in gecko, don't send the system message if this property is true

Or several of them.

I would do 1) because I'd argue that the apps don't care about the "show" event. Actually, they have currently no way to distinguish a "show" event and a "close" event from the system message.

2) makes sense too. So maybe 1) and 2) is the right answer here.

Michael, what are your thoughts?


Gregot, if your team is busy, I can take care of this patch because I know well this code. Just ask :)
Flags: needinfo?(mhenretty)
I think 1) is sufficient, since it will cover 2) (ie, a resending notification will almost certainly result in a system message for the originating app). I think this is a good change to make since I can't imagine a case where an app would want to be launched for a "show" event.

(In reply to Julien Wajsberg [:julienw] from comment #0)
> Gregot, if your team is busy, I can take care of this patch because I know
> well this code. Just ask :)

This would be very nice of you :)
Flags: needinfo?(mhenretty)
Okay, let me take this, I'll do 1).
Assignee: nobody → felash
Not sure it's the most elegant way.

I'd like to see if I can unit test this (not sure I have enough knowledge in gecko tests for this). Also, I can't try it on the device right now because my device build looks busted :/
Attached patch WIP testSplinter Review
Mike - Could this help us solve bug 1037706?
Flags: needinfo?(mhenretty)
Hey Michael,

I admit I don't know how to do a test for this; I tried do some unit test but mocking XPCOM service is out of my knowledge for now. I also don't know how I can "listen" to the system message... So I'm happy to do some tests if you can give me some pointers :)

Also, it seems that we have nearly the same code in AlertsHelper and AlertsService. As a result, I think the "catch" block in AlertsHelper is never executed, because potential errors are already handled in AlertsService. This is quite confusing, but I didn't want to change this here. It's probably a good idea to clean this up in another bug though?
Attachment #8466305 - Attachment is obsolete: true
Attachment #8468054 - Flags: review?(mhenretty)
(In reply to Jason Smith [:jsmith] from comment #5)
> Mike - Could this help us solve bug 1037706?

The bug happens only when some notifications are to be displayed at boot time. It should not affect the homescreen start during normal operation.
NI Bhavana for the blocking status.
Flags: needinfo?(bbajaj)
Here is an example of mocking AlertsService, it could be helpful: http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/notification/MockServices.js
(In reply to Jason Smith [:jsmith] from comment #5)
> Mike - Could this help us solve bug 1037706?

As Julien mentioned, this only applies to rebooting the phone, and is actually a system perf improvement rather than a homescreen one. The case where the homescreens gets OOM'd will not be helped by this patch.
Flags: needinfo?(mhenretty)
Comment on attachment 8468054 [details] [diff] [review]
0001-Bug-1046828-Notifications-API-We-should-never-send-t.patch

Review of attachment 8468054 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Julien Wajsberg [:julienw] from comment #6)
> Also, it seems that we have nearly the same code in AlertsHelper and
> AlertsService. As a result, I think the "catch" block in AlertsHelper is
> never executed, because potential errors are already handled in
> AlertsService. This is quite confusing, but I didn't want to change this
> here. It's probably a good idea to clean this up in another bug though?

Yup this is indeed confusing. To shed a little light, AlertsHelper.jsm lives in the parent process, and is the one responsible for sending the system message and thereby waking up apps. AlertService.js lives in the child on the other hand. It surprises my that the AlertsHelper catch block is never used, because I would've thought it was the other way around (AlertService lives in the child so we should be assured the app is indeed alive).

In any case, the messiness here stems from the fact that we are supporting both the old desktop notification api and the new web notification api with similar but slightly different code paths. Once we have completely removed the old desktop notification API (bug 952453), I think it will be a good time to clean up this logic. As it is, I am afraid to alter these services too drastically as it works but is complex and fragile. So for now, I think your approach to this bug makes the most sense.

As for help writing tests, I'd be happy to do so. As Robert pointed out, there is an example of mocking the AlertService in dom/tests/mochitest/notification/MockServices.js. As for testing the system message itself, the best way to do so would be an gaia integration test like this [1]. Although its not clear to me that calling 'window.close()' after 'new Notification' will close the app before the show event would come back, so we should investigate that.

1.) https://github.com/mozilla-b2g/gaia/blob/cd4b8ea2848b71d54f85c636aefc3b4f7641c0ce/apps/system/test/marionette/notification_launch_app.js

::: b2g/components/AlertsHelper.jsm
@@ +134,5 @@
>            topic: topic,
>            target: listener.target
>          });
>        } catch (e) {
> +        if (detail.type !== kDesktopNotificationShow) {

We should add a comment here as to why we are excluding the show event in the case of the system message.

::: b2g/components/AlertsService.js
@@ +135,5 @@
>        // It seems like there is no callbacks anymore, forward the click on
>        // notification via a system message containing the title/text/icon of
>        // the notification so the app get a change to react.
>        if (data.target) {
> +        if (topic !== kTopicAlertShow) {

Again, let's comment on why we exlude this.
Attachment #8468054 - Flags: review?(mhenretty) → review+
(In reply to Robert Bindar from comment #9)
> Here is an example of mocking AlertsService, it could be helpful:
> http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/
> notification/MockServices.js

Thanks! It's a good thing I didn't try to do it then ;)
I was wondering if it's easy to load "sinon" and use it instead of putting too much logic in the mock (as we do in Gaia).


As for integration tests, I didn't think of this because since it was in Gecko I wanted to a test in Gecko, but this is a good idea. I'll think more of it.
(In reply to Michael Henretty [:mhenretty] from comment #11)
> Yup this is indeed confusing. To shed a little light, AlertsHelper.jsm lives
> in the parent process, and is the one responsible for sending the system
> message and thereby waking up apps. AlertService.js lives in the child on
> the other hand. It surprises my that the AlertsHelper catch block is never
> used, because I would've thought it was the other way around (AlertService
> lives in the child so we should be assured the app is indeed alive).

I'm not sure I understand everything here; is it possible that AlertsService lives in the System app in this case?

This is confusing to follow.

* if there is no listener in AlertsService.receiveMessage, then we don't even try to send a notification
* AlertsService.showAppNotification seems to be the only place where we set listeners
* AlertsService.showAppNotification seems to be called in dom/src/notification/ChromeNotifications.js only.
* this call passes no listener argument
* might actually be the reason of the exception in AlertsService.receiveMessage...

I'm sure I miss something here...
Attached patch patch v2Splinter Review
carrying r=mhenretty

I saw that I forgot to declare the constant in AlertsService.js, which was breaking everything ;)

I'll do a try run later today before requesting checkin.
Attachment #8468054 - Attachment is obsolete: true
Attachment #8468385 - Flags: review+
Thanks Alexandre, much clearer !

I'm quite sure now that AlertsService.js catching the exception lives in the System app. This is the only logical explanation :)
(In reply to Michael Henretty [:mhenretty] from comment #10)
> (In reply to Jason Smith [:jsmith] from comment #5)
> > Mike - Could this help us solve bug 1037706?
> 
> As Julien mentioned, this only applies to rebooting the phone, and is
> actually a system perf improvement rather than a homescreen one. The case
> where the homescreens gets OOM'd will not be helped by this patch.

Mike - Can you weigh in here from a performance perspective if this is a blocker to our 2.0 performance criteria?
Flags: needinfo?(mlee)
(In reply to Julien Wajsberg [:julienw] from comment #17)
> Try is https://tbpl.mozilla.org/?tree=Try&rev=4ac3f86cfe8b

Gij is failing and this looks related. Will look at it tomorrow.
(In reply to Julien Wajsberg [:julienw] from comment #19)
> (In reply to Julien Wajsberg [:julienw] from comment #17)
> > Try is https://tbpl.mozilla.org/?tree=Try&rev=4ac3f86cfe8b
> 
> Gij is failing and this looks related. Will look at it tomorrow.

Can't make it fail locally, I restarted the job on TBPL.
(In reply to Julien Wajsberg [:julienw] from comment #20)
> (In reply to Julien Wajsberg [:julienw] from comment #19)
> > (In reply to Julien Wajsberg [:julienw] from comment #17)
> > > Try is https://tbpl.mozilla.org/?tree=Try&rev=4ac3f86cfe8b
> > 
> > Gij is failing and this looks related. Will look at it tomorrow.
> 
> Can't make it fail locally, I restarted the job on TBPL.

This is well permared on TBPL :(
Rebased try: https://tbpl.mozilla.org/?tree=Try&rev=660014bc9754
(maybe the red was from a separate commit)
(In reply to Jason Smith [:jsmith] from comment #18)
> (In reply to Michael Henretty [:mhenretty] from comment #10)
> > (In reply to Jason Smith [:jsmith] from comment #5)
> > > Mike - Could this help us solve bug 1037706?
> > 
> > As Julien mentioned, this only applies to rebooting the phone, and is
> > actually a system perf improvement rather than a homescreen one. The case
> > where the homescreens gets OOM'd will not be helped by this patch.
> 
> Mike - Can you weigh in here from a performance perspective if this is a
> blocker to our 2.0 performance criteria?

If this only impacts boot/reboot we shouldn't block 2.0 as we haven't defined boot latency criteria for that release. Work is in-progress [1] to add instrumentation that will allow us to monitor boot performance from power-on to homescreen launch. 

[1] https://wiki.mozilla.org/FirefoxOS/Performance/Boot_Timing_Automation
Flags: needinfo?(mlee)
Keywords: feature, perf
Whiteboard: [c=boot p= s= u=]
https://tbpl.mozilla.org/?tree=Try&rev=660014bc9754&showall=1
so it's really red and I can't reproduce locally...
(In reply to Julien Wajsberg [:julienw] from comment #24)
> https://tbpl.mozilla.org/?tree=Try&rev=660014bc9754&showall=1
> so it's really red and I can't reproduce locally...

Is that perma-red? I re-triggered to find out. It could be bug 1037924, and only a coincidence that its a notification test.
No, I'm quite sure it's permared.
(In reply to Mike Lee [:mlee] from comment #23)
> (In reply to Jason Smith [:jsmith] from comment #18)
> > (In reply to Michael Henretty [:mhenretty] from comment #10)
> > > (In reply to Jason Smith [:jsmith] from comment #5)
> > > > Mike - Could this help us solve bug 1037706?
> > > 
> > > As Julien mentioned, this only applies to rebooting the phone, and is
> > > actually a system perf improvement rather than a homescreen one. The case
> > > where the homescreens gets OOM'd will not be helped by this patch.
> > 
> > Mike - Can you weigh in here from a performance perspective if this is a
> > blocker to our 2.0 performance criteria?
> 
> If this only impacts boot/reboot we shouldn't block 2.0 as we haven't
> defined boot latency criteria for that release. Work is in-progress [1] to
> add instrumentation that will allow us to monitor boot performance from
> power-on to homescreen launch. 
> 
> [1] https://wiki.mozilla.org/FirefoxOS/Performance/Boot_Timing_Automation

Given this comment I think its ok if this rides the trains as this is not highlighted a ship blocker by perf team and partners doing perf testing here.
blocking-b2g: 2.0? → -
Flags: needinfo?(bbajaj)
Michael, given this is not a blocker I think I won't be able to look at these failures more than what I already did (especially that I don't have a clue of why it's failing) before my holidays and thus before the 2.1 branching date. So feel free to look at it. I can do the necessary changes if you find the issue though.
I looked into the failing test, and it turns out we work around the issue of the 'show' system message re-opening closed apps here [1]. I believe we can remove this workaround in the test for your patch.

1.) https://github.com/mozilla-b2g/gaia/blob/fbcb01fe39cc2eed7b2860f05908521e4936a70a/apps/system/test/marionette/notification_events_test.js#L208
Flags: needinfo?(felash)
Ah, I get it, I saw these lines but didn't understand that we were working around the specific issue I was fixing here.

Too bad a bug # was not fixed at the time, with a comment here ;)

Thanks then I'll just remove this and it should be fine !
I want to really try to reproduce the issue locally first. I think I had a wrong b2g build. Ideally I'll also have a test that checks that the app is not launched, but I don't really know how to check this.
Flags: needinfo?(felash)
Attachment #8472453 - Attachment description: github PR → github PR with a permissive test, so that we can land and stay green
Comment on attachment 8472453 [details] [review]
github PR with a permissive test, so that we can land and stay green

This should land on Gaia before landing Gecko.

I checked locally that it passes with both the current central and the patch. Hopefully it will on TBPL too ;)
Attachment #8472453 - Flags: review?(mhenretty)
Comment on attachment 8472467 [details] [review]
github PR with a proper test

The first commit is just the "permissive" test. The second commit makes a test that really test for this change. This should land in Gaia some time (let's say 1 day) after the Gecko patch lands.

I checked locally that it fails with current central and passes with the patch.
Attachment #8472467 - Flags: review?(mhenretty)
Priority: -- → P2
Status: NEW → ASSIGNED
Attachment #8472453 - Flags: review?(mhenretty) → review+
github PR with a permissive test, so that we can land and stay green:
master: dd1156ca426e4747f04d79cfda3334101a4e9537
Comment on attachment 8472467 [details] [review]
github PR with a proper test

r+, with a few comments on github.
Attachment #8472467 - Flags: review?(mhenretty) → review+
New try now that the permissive test landed: https://tbpl.mozilla.org/?tree=Try&rev=b94c36234a1d
Gij is green, let's land the gecko patch (attachment 8468385 [details] [diff] [review])!
Comment on attachment 8472467 [details] [review]
github PR with a proper test

Rebased and pushed to trigger a new try run with newer Gecko.

Michael, I could not completely finish this after all, I'll let you land the remaining test if everything is green?
Flags: needinfo?(mhenretty)
Green.

master: https://github.com/mozilla-b2g/gaia/commit/b8f6073e5733f693cc39fdf61f070fcbba9f25f4
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: needinfo?(mhenretty)
Resolution: --- → FIXED
Keywords: leave-open
Target Milestone: --- → mozilla34
This patch seem to result in a Gij error.

I can reproduce this error locally every time, but on TBPL I see it every once in a while, like here:
0) https://tbpl.mozilla.org/php/getParsedLog.php?id=47200832&tree=Gaia-Try&full=1#error0
1) https://tbpl.mozilla.org/php/getParsedLog.php?id=47165720&tree=Gaia-Try&full=1
2) https://tbpl.mozilla.org/php/getParsedLog.php?id=47166295&tree=Gaia-Try&full=1

It seems that my latest checkin (0) may have make the error be more reproducible on tbpl, but the checking (1) and (2) landed a few days before and they have the same error. On my machine I can reproduce it always and bisected it to this changeset.

Steve, can you take a look at this and confirm if my readings are correct?
Flags: needinfo?(schung)
So it means that bug 1061221 is instead a regression from this one.
Depends on: 1061221
Sorry I'm not familiar with this notification changes :-/ but I can not reproduce error related to notification in my local(Since it's intermittent on TBPL, I could not guarantee this patch is totally safe per my brief testing). Hi Michael, could you please check for any possible failure in this patch? Thanks.
Flags: needinfo?(schung) → needinfo?(mhenretty)
A agree with :gerard-majax, the problem :gandalf mentioned in comment 43 should be taken care of in bug 1061221. I'll look into this soon.
Flags: needinfo?(mhenretty)
This issue has been verified successfully on Flame 2.1

See attachment: Verify_video.3gp
Reproducing rate: 0/5
Flame2.1 build:
Gaia-Rev        38e17b0219cbc50a4ad6f51101898f89e513a552
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/8b92c4b8f59a
Build-ID        20141205001201
Version         34.0
Attached video Verify_video.3gp
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.