Closed Bug 1015087 Opened 10 years ago Closed 10 years ago

Loop server - handle ringback notifications

Categories

(Hello (Loop) :: Server, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED DUPLICATE of bug 1025872
mozilla33

People

(Reporter: RT, Assigned: alexis+bugs)

References

Details

(Whiteboard: [investigation, p=3])

The Loop clients need to be informed when the called party gets notified of an incoming call.
Adam is investigating what's needed for the architecture, and the decision around keeping web sockets to the server open in some scenarios. Will break down the bug.

Ringback: an indication that the called party has been successfully reached and is being alerted. There's some architecture flow work that needs to happen. Today we POST to the ID of the other person, get a 200 back with tokens, and wait for them to show up. No call progress information is shown, as it would require an additional channel between client and server: push or web sockets.
Whiteboard: [investigation, p=3]
We were planning to build a first version based on SimplePush to get the ringing information.
(In reply to Rémy Hubscher (:natim) from comment #2)
> We were planning to build a first version based on SimplePush to get the
> ringing information.

My instinct is that using straight web sockets would be easier, because then we don't have to implement and maintain a push client for content-side applications. Although we could re-use some of what we've already hacked together for the workaround on the gecko side of things, that's likely to go away once we get SimplePush on the desktop, leaving us with a separate impl to maintain.
The reasoning behind the use of simple push over websockets is that we don't need to maintain the web socket connection on the server, as this is hard to scale.

We already have a Simple Push connection, so we could just leverage it and use it rather than using another mechanism. Simple push allows us to register on different topics, and be notified that one of these topics had been updated on the server.

I believe we should use this rather than doing web sockets. Bug 1013382 will also leverage the same mechanism.
Blocks: 1013382
(In reply to Alexis Metaireau (:alexis) from comment #4)
> We already have a Simple Push connection, so we could just leverage it and
> use it rather than using another mechanism. Simple push allows us to
> register on different topics, and be notified that one of these topics had
> been updated on the server.

Just to be clear, we won't have an existing simple push connection for the standalone/link-clicker UI. If we have to create a simple push client for web content, then we'll do so, though I'd rather await the architecture information from Adam, before we make a final call here.
Gotcha. I believe Simple Push clients can be just Websockets directly: in this case, that makes it easy to have the WebApp using websockets, and have the other clients rely on the already implemented SimplePush clients.

Adam, do you think that would be okay?
Flags: needinfo?(adam)
(In reply to Alexis Metaireau (:alexis) from comment #6)
> Gotcha. I believe Simple Push clients can be just Websockets directly: in
> this case, that makes it easy to have the WebApp using websockets, and have
> the other clients rely on the already implemented SimplePush clients.
> 
> Adam, do you think that would be okay?

Not really, no.

The issue here is that the speed of alerting over Push might not be appropriate for in-call signaling. Minimally, I think we're going to end up in a case where ongoing calls maintain a WSS connection to the Loop server to receive in-band information about call progress and call status. When thinking about scaling, keep in mind that this connection only exists when a user is in an active call, not whenever they are registered with the Loop server. From that perspective, this seems perfectly reasonable.
Flags: needinfo?(adam)
One other solution could be to simply let the client poll the server while it's ringing, or to build this with SSE (Server-Sent Events), but I have no experience with that yet.

The goal of Simple Push is to:

- Avoid having to maintain more than one web socket connection between the client and the SP server;
- Avoid having the app maintain sockets at all, because they're a pain to scale correctly.

If we could rely on that, that would be fantastic. If not we'll need to find another mechanism.

It's true that there are no guarantees on the time it will take to wake up the app, but it seems pretty reasonable. I'm ccing JR here since he might have some more information about the delays involved.
Flags: needinfo?(jrconlin)
Even if we have no guarantee of the delay, the `as soon as possible` delay should be enough.

The SimplePush service has been built for this exact purpose. If we say that we can't rely on it, we say that the service is useless.

If the callee is connected it will receive the message fast enough.
If not, it will never receive the message and we will hit the 10s connection timeout.

Which is exactly why we want this `ringing` event: to be able to tell whether or not the user got our call.
SimplePush is a way for servers to remotely ping apps. It was created so that apps don't create unneeded polling or socket connections, which can drain device battery. To that end, once the app wakes, it makes sense for it to create its own socket connection to handle whatever info the app needs.
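A minimal sketch of the caller-side distinction described above, assuming the `ringing` notification can be modelled as an `asyncio.Event` that some notification channel sets. The names `wait_for_ringback` and `demo`, and the way the event gets set, are hypothetical and not part of Loop; only the 10s timeout comes from this comment.

```python
import asyncio

RINGBACK_TIMEOUT_S = 10.0  # the 10s connection timeout mentioned in this comment


async def wait_for_ringback(ringing: asyncio.Event,
                            timeout: float = RINGBACK_TIMEOUT_S) -> str:
    """Return "ringing" if the callee's device reported alerting in time,
    otherwise "unreachable"."""
    try:
        await asyncio.wait_for(ringing.wait(), timeout)
        return "ringing"
    except asyncio.TimeoutError:
        return "unreachable"


async def demo(callee_online: bool, timeout: float = 0.2) -> str:
    # Hypothetical stand-in for the notification channel: the event is set
    # shortly after the call starts, but only if the callee is online.
    ringing = asyncio.Event()
    if callee_online:
        asyncio.get_running_loop().call_later(timeout / 5, ringing.set)
    return await wait_for_ringback(ringing, timeout)
```

Running `asyncio.run(demo(True))` and `asyncio.run(demo(False))` exercises both outcomes with a shortened timeout.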

I'd suggest a "dialing" state, where the app uses SP to wake the remote, then a "ringing" state after the remote app connects back to your server and starts alerting the remote party.
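The two states suggested above could be sketched as a small transition table. Only "dialing" and "ringing" come from this comment; the `connected` and `terminated` states and all event names are assumptions added so the example is complete.

```python
from enum import Enum


class CallState(Enum):
    DIALING = "dialing"        # SimplePush sent to wake the remote app
    RINGING = "ringing"        # remote app connected back and is alerting the user
    CONNECTED = "connected"    # assumed follow-on state, not from the comment
    TERMINATED = "terminated"  # assumed follow-on state, not from the comment


# Allowed transitions; the event names are hypothetical.
TRANSITIONS = {
    (CallState.DIALING, "callee_connected"): CallState.RINGING,
    (CallState.DIALING, "timeout"): CallState.TERMINATED,
    (CallState.RINGING, "accepted"): CallState.CONNECTED,
    (CallState.RINGING, "rejected"): CallState.TERMINATED,
    (CallState.RINGING, "timeout"): CallState.TERMINATED,
}


def next_state(state: CallState, event: str) -> CallState:
    """Return the state reached by `event`, or raise ValueError if the
    event is not allowed in the current state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not valid in state {state.value!r}"
        ) from None
```

Keeping the transitions in one table makes it easy to see that ringback can only be reported after the remote app has connected back.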
Flags: needinfo?(jrconlin)
This isn't ringing; it's ringback: http://en.wikipedia.org/wiki/Ringback_tone
Summary: Loop server - handle "call ringing" notifications → Loop server - handle ringback notifications
I believe one way to solve that would be Server-Sent Events. This means the server will keep a connection open with the client for up to 30s, and I'm not sure how badly that would scale.

ccing Tarek and Romain for more feedback about that.
(In reply to Alexis Metaireau (:alexis) from comment #12)
> I believe one way to solve that would be Server-Sent Events. This means the
> server will keep a connection open with the client for up to 30s, and I'm
> not sure how badly that would scale.

I've been thinking about this and other related issues, and I believe what we want is a WSS connection during call setup that can be used to send progress indications to all of the involved parties. Once the call is set up, this connection can go away.
The problem with this is that EC2 instances have a ~240K concurrent TCP socket limit for all incoming and outgoing.

We need to know how many concurrent calls we will have, so we know if this limit will be reached, and how many boxes we'll need.
(In reply to Alexis Metaireau (:alexis) from comment #14)
> We need to know how many concurrent calls we will have, so we know if this
> limit will be reached, and how many boxes we'll need.

Concurrent call *setups*. These connections won't stay up after the call is established.
I'll work up the engineering estimates for the number of connections this will necessitate.
Yes, call setups, that's completely right. Thanks for the estimates!
Flags: needinfo?(adam)
Also note that it will require a WSS connection for each device, not for each user.
(In reply to Alexis Metaireau (:alexis) from comment #14)
> The problem with this is that EC2 instances have a ~240K concurrent TCP
> socket limit for all incoming and outgoing.
> 
> We need to know how many concurrent calls we will have, so we know if this
> limit will be reached, and how many boxes we'll need.

Given some very conservative estimates, the peak number of simultaneous connections required is approximately one per every 43 system users (where "system users" are those users who actively make use of the Loop service); if we can support 240k connections per server, that requires approximately one server per 10.9 million users.

I do not anticipate Loop approaching 10.9 million active users in the near future; and the protocol I am proposing is specifically designed for distributing these connections among several servers once we do pass that threshold.

See https://wiki.mozilla.org/Loop/Architecture/MVP#Network_Resource_Usage for documentation around assumptions for these numbers.
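For reference, the sizing arithmetic can be sketched as below, using only the rounded figures quoted in this thread (one connection per 43 users, ~240K sockets per server). With these rounded inputs the product comes to about 10.3 million users per server rather than 10.9 million; the difference presumably comes from the unrounded assumptions on the wiki page, which this sketch does not reproduce. The function names are hypothetical.

```python
USERS_PER_CONNECTION = 43         # "one per every 43 system users"
CONNECTIONS_PER_SERVER = 240_000  # the ~240K socket limit quoted in this thread


def users_per_server(connections_per_server: int = CONNECTIONS_PER_SERVER,
                     users_per_connection: int = USERS_PER_CONNECTION) -> int:
    """Active users one server can carry at peak call-setup load."""
    return connections_per_server * users_per_connection


def servers_needed(active_users: int,
                   connections_per_server: int = CONNECTIONS_PER_SERVER,
                   users_per_connection: int = USERS_PER_CONNECTION) -> int:
    """Number of servers required for `active_users`, rounded up."""
    capacity = users_per_server(connections_per_server, users_per_connection)
    return -(-active_users // capacity)  # ceiling division
```

For example, `servers_needed(1_000_000)` stays at a single server under these assumptions.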
Flags: needinfo?(adam)
No longer blocks: 1013382
No longer blocks: loop_mvp_server
Depends on: 1025872
Depends on: 1025876
Assignee: nobody → alexis+bugs
No longer depends on: 1025876
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
OK.
Status: RESOLVED → VERIFIED