1029426 - Decide what to do about handling incoming notifications whilst client is disconnected/offline

Reporter

Description

•

11 years ago

Sometimes the web socket for the push notification might drop, e.g. due to a network issue, restarting the browser for an update, or just the user shutting down the browser for a period of time. We need to decide what to do (if anything) about notifications that come in during the network issue. For example, if we're offline for 1-2 seconds, and we happen to get a push notification in, then we could still pick up the details and handle the call before the caller end times out. If we're offline for longer, then we could still potentially use the information for tracking missed calls. There's a couple of possible ways of doing this: 1) Resend the uaid to the push server as part of the "hello" message. This would get the push server to send us any pending notifications. However, this requires the push server to re-accept the uaid. It would also mean that we need to store more information locally. 2) Query the loop server after each re-connection/re-registration This requires an additional ping to the loop server, but it means that we're using the server as the single authoritative rather than having state split across two servers. Assigning to Adam to look at, as he said he needed to think about it on irc - I filed so that it can be tracked.

Rémy Hubscher (:natim)

Comment 1

•

11 years ago

The SimplePush server is not providing any queuing messages, so if the websocket is down during a push no notification will be send. I don't think we should queue any notification on the server but I do agree that we should ask the server for pending calls on a reconnection. Also we will need to make sure to agree about the call information TTL. The protocol documentation is waiting for 10s before terminating the call if no connection on the websocket.

Mark Banner (:standard8)

Reporter

Comment 2

•

11 years ago

(In reply to Rémy Hubscher (:natim) from comment #1) > The SimplePush server is not providing any queuing messages, so if the > websocket is down during a push no notification will be send. FYI, I actually tested this with the push.services.mozilla.com service (after a quick hack so that Loop stored the uaid), and it did queue up the messages even when the socket was disconnected, it also sent them on the next hello. Granted the specification is actually unclear (and still developing), so it may not remain there in future, but there is some sense of a queue in the documentation about notification and I based this discussion on what the server is currently doing.

Rémy Hubscher (:natim)

Comment 3

•

11 years ago

Oh ok, nice to know. JR do you know for how long the SimplePush server is queuing notifications?

Flags: needinfo?(jrconlin)

Alexis Metaireau (:alexis)

Comment 4

•

11 years ago

Is that behavior part of the specification? If it's not then we shouldn't rely on it. If it is, that would avoid having the client pinging the server without reason to have the list of calls on startup (is it what the client is currently doing, Mark?) Another easy solution is to ignore these queued calls on startup and always have a look at the list of calls just after registration (since a call is valid for a small amount of time, that would provide the same results).

JR Conlin [:jrconlin,:jconlin]

Comment 5

•

11 years ago

SimplePush queues a notification event for 72 hours. Of course, it may be replaced by a newer version, but if a client is unable to immediately receive a notification, the server retains it until the device reconnects.

Flags: needinfo?(jrconlin)

Rémy Hubscher (:natim)

Comment 6

•

11 years ago

Thank you JR. Since the call information are valid for only 30s I guess this is good enough for us.

Alexis Metaireau (:alexis)

Comment 7

•

11 years ago

If these are stored for 72 hours, then we will get notified of calls we don't care about. I believe we should ignore these queued calls on startup and always have a look at the list of calls just after registration. That would provide the same results.

:shell escalante

Updated

•

11 years ago

Target Milestone: mozilla33 → mozilla34

:shell escalante

Comment 8

•

11 years ago

RT: looks like a good scenario to handle, but is it more edge than needed for MVP. can it wait until Fx35 while we get Hello looking good, good call, and the UX good? If yes, please move to Fx35

Flags: needinfo?(rtestard)

:shell escalante

Comment 9

•

11 years ago

triaged together. edge case and risky logic - so want to de-risk.

Flags: needinfo?(rtestard)

Priority: P3 → P2

Target Milestone: mozilla34 → mozilla35

Mark Banner (:standard8)

Reporter

Updated

•

11 years ago

Depends on: 1028871

Maire Reavy [:mreavy]

Updated

•

11 years ago

backlog: --- → Fx38?

Target Milestone: mozilla35 → ---

:shell escalante

Comment 10

•

11 years ago

Hi Adam, this wasn't part of the spec - can you write a straw man possibly - depending on importance based against your other priorities? also totally dependent on impact you think this scenario has and risk posed. not that sure what this bug causes. backlogging and can pull out based on your input.

backlog: Fx38? → backlog

Flags: needinfo?(adam)

Priority: P2 → P3

Adam Roach [:abr]

Assignee

Comment 11

•

11 years ago

I think the only really clean way to handle this is to re-query the Loop server for state whenever the push connection has gone offline for any period of time. Given that this should (presumably) be a relatively infrequent error situation, the increase of load on the Loop server should be imperceptible. This should also be pretty easy to implement. *However*, we should test the premise that this is a rare situation: the "re-poll for state" functionality should land in tandem with a telemetry patch that counts the number of times this happens. In order to allow us to reasonably compare numbers, we should also peg the number of times an initial connection occurs. If the telemetry bit is unclear, ping me on IRC and I can explain. ____ As a sidenote, I know that the Simple Push team has made some optimizations to the Loop Push servers that make several simplifying assumptions based on the behavior that we need. If my memory serves, one of those optimizations was elimination of message queuing.

Flags: needinfo?(adam)

:shell escalante

Updated

•

10 years ago

backlog: backlog+ → backlog-

:shell escalante

Comment 12

•

10 years ago

we are merging some of rooms and calling - so this is more determining how to handle for both. mark is thinking of a few things and will look at this. possibly change title to reflect both.

backlog: backlog- → ---

Rank: 45

Flags: needinfo?(standard8)

Flags: firefox-backlog+

Priority: P3 → P4

Whiteboard: [notification]

Adam Roach [:abr]

Assignee

Updated

•

10 years ago

Summary: Decide what to do about handling incoming call notifications whilst client is disconnected/offline → Decide what to do about handling incoming notifications whilst client is disconnected/offline

Mark Banner (:standard8)

Reporter

Comment 13

•

10 years ago

Currently the Loop push servers don't do queuing of events. When we switch to the new push backend, (bug 1131813), then we'll start getting notifications. I did some testing and analysis ahead of this: - Rooms: On startup, we should be getting the list of rooms, we don't currently. If there's been room changes, we'll presumably get poked by the push server, at which point, because we've not retrieved any rooms yet we'll get a full list which should be the up to date one. (Note: there's probably a bug here that we should be getting the list of rooms at startup, in case someone is in one, although I think having push notifications would solve most of the cases). - Direct calls: I tested altering a client with a delayed handling of the push notification. I delayed it for ~10 seconds, and the call was gone from the list when queried on the server. Delaying the push notification handling for ~8 seconds meant the call still got handled. At the moment, I think we could close this as invalid, but when we swap to the new service, we should do some testing around network connectivity to ensure we get all the right information.

Flags: needinfo?(standard8)

Mark Banner (:standard8)

Reporter

Comment 14

•

10 years ago

I did the testing in comment 13 for the new push server switch that happened recently - even without the new backend, our existing backend resends the same uaid, and the servers now deliver offline notifications. Hence, we're now handling the calls and room notifications as per comment 13. So we can close this as WFM.

Status: NEW → RESOLVED

Closed: 10 years ago

Resolution: --- → WORKSFORME

Bugzilla

Decide what to do about handling incoming notifications whilst client is disconnected/offline

Categories

(Hello (Loop) :: Client, defect, P4)

Tracking

(Not tracked)

People

(Reporter: standard8, Assigned: abr)

References

Details

(Whiteboard: [notification])

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Updated

Comment 8

Comment 9

Updated

Updated

Comment 10

Comment 11

Updated

Comment 12

Updated

Comment 13

Comment 14