Open Bug 1252249 Opened 8 years ago Updated 2 years ago

Remove 3 second grace period before reducing the quota

Categories

(Core :: DOM: Push Subscriptions, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: lina, Unassigned)

Details

The quota code currently waits 3 seconds before checking for visible notifications. This isn't ideal for a few reasons:

* If the message isn't carrying data, the `push` event handler will need to fetch data from a server. 3 seconds isn't a lot of time for a network request. Also, if the notification includes an icon (`showNotification("Title", { icon: "http://example.com/icon.png" })`), that's another potential request. FWIW, OS X waits 6 seconds for that request to complete (https://dxr.mozilla.org/mozilla-central/rev/9da51cb4974e03cdd8fa45a34086fe1033abfeaf/widget/cocoa/OSXNotificationCenter.mm#412-416).

* If the user enables "do not disturb" on a platform that uses XUL alerts (like Windows, or Ubuntu with Unity), the alert won't be shown, even if the worker calls `showNotification()`. This doesn't apply to OS X or GNOME, where the alert is still marked as visible, but goes directly into the Notification Center. In this case, the call to `showNotification()` succeeding should be enough to establish intent, since the alert would have been shown otherwise.

* The user might click or dismiss the notification before the timer fires.

Instead of checking for visible notifications after 3 seconds, I think we could do something like this:

* Add an `mNotificationShown` flag to `ServiceWorkerRegistrationInfo`. Set this to true if the call to `ShowPersistentNotification` (https://dxr.mozilla.org/mozilla-central/rev/9da51cb4974e03cdd8fa45a34086fe1033abfeaf/dom/workers/ServiceWorkerRegistration.cpp#731-733) succeeds.

* Append a `PromiseNativeHandler` that observes the promise passed to `pushEvent.waitUntil()`.

* Once the promise resolves, check the state of `mRegistration->mNotificationShown`. If false, or the promise rejects, reduce the quota.

What do you think, William and Martin?
Is this based on feedback?

I have no problem with tightening the implementation so that it tracks open notifications (or notifications that were requested to be open), but we have to watch for sites gaming the system.  The solution you proposed would not deal well with the case of a push message chain where multiple messages arrive, some that clear notifications and others that show them.

How about tracking opens and closes under a single, per-origin timer that is refreshed each time that a push message is delivered.  When this timer lapses, if the total number of notifications shown exceeds the number closed, or there is a notification visible (i.e., there was a pre-existing notification open), then let the quota stand; else dock a point.

I would be OK with letter the timer run longer.  The 3s value was picked when we showed notifications for 4s before automatically closing them.  I agree that 3s might be a little tight, particularly if 2s is taken up with waking the radio.  On the other hand, 6s is an eternity.

I'm not very happy with the idea that we would gate this on the `waitUntil` promise.  That might never be resolved.  If the SW takes more than our timer to do its work, then let it suffer the consequences of that.  We have to provide some incentive to constrain SW activity.
Whiteboard: btpp-followup-2016-03-08
I think part of the reason why we check for visible notification is so that the user has a visible indication of push activity as well as being as being self punishing for the author (if they keep sending push messages with visible notifications, it will annoy the user to distance themselves from the site) so we may not want to relax the quota when an intent to display a notification does not actually show a notification (or not long enough to be significant to the user).

I don't think your proposed solution quite solves the problem because of the reasons Martin points out (the author could also show and close a notifications in the same push message and we don't want that to relax quota).

Another thing we could do is let push messages immediately reduce quota and have web notifications undo the quota penalty when they have been visible for some amount of time or have been explicitly dismissed by the user. Although if a message pushed over the quota limit and caused unsubscription, we can't undo the penalty.
(In reply to William Chen [:wchen] from comment #2)
> I don't think your proposed solution quite solves the problem because of the
> reasons Martin points out (the author could also show and close a
> notifications in the same push message and we don't want that to relax
> quota).

That makes sense. I didn't consider the case of using multiple messages to game the system by showing and immediately closing a notification.

I do like your approach of reducing the quota immediately, and restoring once the user sees or interacts with the notification. We could fire `pushsubscriptionchange` after unsubscribing, but I think that would reset the quota for the new subscription, even if the site didn't deserve it.

The per-origin timer might be the way to go.
Whiteboard: btpp-followup-2016-03-08 → btpp-backlog
For anyone that has kibana permission to prod, this is the data for Firefox 48 and on (where the unregister code indicating why it was unsubscribed is now included):
https://kibana.shared.us-west-2.prod.mozaws.net/index.html#dashboard/temp/AVXskKTKfRWlk2PpmA3j

Some unregister numbers (also, consider this is *only* FF 48+ users) for unregister calls in the past month (30 days):
Code 201 (site went over background quota): 10,420
Code 200 (page called .unsubscribe()): 9,635
Code 202 (permission revoked): 1,070

We've gotten a decent some reports from push developers already about hitting a 410 error code specific to the channel being gone (unsubscribed).

3 seconds is a very short time, especially if a developer doesn't realize how critical it is to display a notification to avoid hitting the background quota. Consider the case where a service-worker might make more than one HTTPS call before displaying a notification, it could easily exceed 3 seconds.

I'd vote for upping it to at least 5 seconds (maybe 10? network latency is not fun), and making sure all documentation on webpush *everywhere* mentions how critical it is to display that pop-up for Firefox to avoid hitting our quotas. Right now, afaik, no WebPush docs on the web indicate this issue in Firefox (Chrome doesn't have it).
Flags: needinfo?(martin.thomson)
How about we start to tune this, but also land some telemetry that will help us understand how often we are running into these problems.  Basically, move it to 5, and setup dud timers at 8/10/15 and see what changes.
Flags: needinfo?(martin.thomson)
:mt: we have server-side telemetry from FF 48+, which is what I reported. We aren't logging total unsubscribes yet though, so we don't know what % of all unsubscribes are from the quota. We will be adding that logging soon.
I'm talking about adding telemetry that would allow us to tune the value, rather than just setting it blindly.  The server, unfortunately, can't help us with that.
Priority: -- → P3
Whiteboard: btpp-backlog
Type: defect → enhancement
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.