Closed Bug 1104733 Opened 10 years ago Closed 8 years ago

Use the existing pref for determining the push server, rather than a roundtrip to loop-server

Categories

(Hello (Loop) :: Client, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreavy, Assigned: fzzzy)

References

()

Details

(Whiteboard: [btpp-fix-later][47])

User Story

In MozLoopPushHandler#_openSocket:

- Remove the code to get the push url endpoint from the server
- Change getting the push server uri from the debug pref (hence dropping the debug pref), to getting it from the dom.push.serverURL preference.
- Remove any redundant code.
- Fix the test_looppush_initialize.js xpcshell test to account for the changes.
- Check all in-loop-repo tests pass
- Check mochitests & xpcshell-test pass when exported to mozilla-central.

Attachments

(1 file)

This is for the desktop client side  - Store the push endpoint in the client and expost it as a pref in about:config for developers


+++ This bug was initially created as a clone of Bug #1101447 +++

This API endpoint is supposedly used by the client to get a list of Push endpoints 
Right now its just a static config file in the Loop server. 

I do not uderstand the benefit for this indirection. It adds deployment complexity with no apparent benefits.

I think we should simply store the endpoint in the client (and also offer a way to configure it in about:config - so developers can point Hello to a specific push server.)

If we're keeping it nevertheless, I propose that we move this API under Push responsibility - so *they* can decide (and make evolve) what to send back to the client.
I see services.push.serverURL in about:config.
Is this already in place or work-in-progress?
Flags: needinfo?(mreavy)
(In reply to Richard Pappalardo [:rpapa][:rpappalardo] from comment #1)
> I see services.push.serverURL in about:config.
> Is this already in place or work-in-progress?

This is older and only does part of what we want.
Flags: needinfo?(mreavy)
Adam -- Do we need this sooner than Fx37?  (Do we need to uplift it to Fx36?)
backlog: --- → Fx37?
Flags: needinfo?(adam)
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #3)
> Adam -- Do we need this sooner than Fx37?  (Do we need to uplift it to Fx36?)

We don't want to do this at all, as doing so would defeat the load distribution mechanism for loop-push servers.

If we want to add a pref to *override* the push server for testing purposes, that would be fine; but we don't want to store a push server in the client for production.

If anything I say above isn't self-evident, let's get a meeting together with Tarek, me, you, and at least one representative from the loop-push team to get to a common understanding.

In any case, I believe we should close this bug as invalid. Needinfo to Maire to do so.
Flags: needinfo?(adam) → needinfo?(mreavy)
Hi Tarek -- Please see Abr's last comment in this bug (Comment 4). I think there may be some difference of opinion on how to test the client with the servers. I think this bug reveals only some of the differences.  Do you want to have a quick meeting to discuss (as Abr suggests/offers)?
Flags: needinfo?(mreavy) → needinfo?(tarek)
I concur with what Adam says. We don't want to bypass the roundrobin of LoopPush. What we want to do instead is to have the clients asking to LoopPush directly rather than asking to the loop server, which would help them handle their load better.
> If we want to add a pref to *override* the push server for testing purposes, that would be fine; 
> but we don't want to store a push server in the client for production.

> If anything I say above isn't self-evident, let's get a meeting together with Tarek, me, you, and at 
> least one representative from the loop-push team to get to a common understanding.

Yeah we should meet. I don't understand the strategy used here: it seems suboptimal to call the server
all the time just to get the url of the SP endpoint.

We usually do load balancing in two ways:
A- the client asks the endpoint and stores it. that client uses that endpoint all the time (==Firefox Sync)
B- the client always use the same endpoint, and the server uses a DNS round robin strategy to load balance the calls. Tjis load balancing may be sticky. (==most services)

My current understanding of the optimal thing to do in our case once we start to have more than one SP cluster is B- which is : have all the clients point to something like "wss://simplepush.loop.services.mozilla.com", and let ops handle on the server side the load balancing by adding/removing push clusters behind that CNAME.

Here you seem to do it differently which, as I understand it now is A- with a Twist. The client call the server everytime it needs to open the websocket to ask for the endpoint to use. I don't see the benefit of doing that extra HTTP round trip. But maybe I missed a key discussion with Ben and/or Travis :)
Flags: needinfo?(tarek)
Note for Travis: both ELB and Nginx supports the Proxy Protocol, which makes it easy to load balance web sockets and deal with several SP stacks without having to maintain a list of servers in some ad-hoc place.
After a dicussion with Ben - we realized the Proxy Protocol is a new addition to AWS. When they tried ELB with Simple Push it did not have it so the current solution built by Ben/Adam/Travis was a good solution.

Let's keep it like this for now, but it's worth a try to experiment with the Proxy Protocol to see how it goes. 

Ben said he'll experiment with the ELB/Nginx stack to see how it behaves on high load with SP and we can revisit then.
The main issue I've seen with proxy protocol is that the proxy is still proxying, so this doesn't alleviate our issues with ELB cutting long-lived connections. However it did come up in my conversation with Tarek that perhaps we could have a single hostname if we dump *all the available cluster machines IP's* into it. This would require some ops scripting, such that all the IP's of a cluster could be added/removed from the DNS hostname easily, but would mean less work client-side.
backlog: Fx37? → Fx38?
backlog: Fx38? → ---
Whiteboard: [triage]
Rank: 35
Flags: needinfo?(standard8)
Priority: -- → P3
Whiteboard: [triage] → [btpp-backlog]
Ben/Adam/Tarek: A while ago, Loop moved to the same push infrastructure as Firefox, though we still have separate code in Firefox.

We still have the code that gets the push server uri from the loop-server.

I can't see a reason for us to keep the additional round-trip to get the uri from the loop-server. Its the same as what Firefox is using, and isn't likely to change quickly.

Worst case, if we did need to change the URI quickly, we could push out a system add-on update to all Loop users now we have that capability (or even a hotfix to Firefox).

So, can we drop this round trip and revert to using the "dom.push.serverURL" that the main push code uses in Firefox?

If not, what should we do and why?
Flags: needinfo?(tarek)
Flags: needinfo?(standard8)
Flags: needinfo?(bbangert)
Flags: needinfo?(adam)
Assignee: nobody → standard8
(In reply to Mark Banner (:standard8) from comment #11)
> Worst case, if we did need to change the URI quickly, we could push out a
> system add-on update to all Loop users now we have that capability

I agree that moving to a system addon changes the dynamics here pretty significantly. The original plan was designed not just for moving the server around, but also for load balancing among different push server clusters, in case the service saw explosive growth. Given our current and historic growth curve, I don't think this is likely to become an issue in the foreseeable future.

> So, can we drop this round trip and revert to using the "dom.push.serverURL"
> that the main push code uses in Firefox?

Given the above, that seems reasonable to me.
Flags: needinfo?(adam)
Given Adam's comment, lets just go with the dom.push.serverURL preference.
Assignee: standard8 → nobody
User Story: (updated)
Flags: needinfo?(tarek)
Flags: needinfo?(bbangert)
QA Contact: anthony.s.hughes
Agreed, no reason for this. We're now able to use Amazon's ELB's, so there's no longer a reason for this even if there's rapid growth.
Summary: Store the push endpoint in the client and expose it as a pref in about:config → Use the existing pref for determining the push server, rather than a roundtrip to loop-server
Assignee: nobody → dpreston
Donovan: I took a brief look at this the other day. Removing the pref from the push part is fine, but we'll also need to look into some sort of fix for starting up when the network/server is offline - that seems very broken without this.

e.g. start up Firefox with the push server online but the loop-server offline.
OS: Mac OS X → All
Hardware: x86 → All
Whiteboard: [btpp-backlog] → [btpp-fix-later][47]
(In reply to Mark Banner (:standard8) from comment #15)
> e.g. start up Firefox with the push server online but the loop-server
> offline.

Missed a bit: Then open the panel (gets a failure), then start the loop-server and try and use it.

Might also be worth trying being completely offline to begin with.
(In reply to Mark Banner (:standard8) from comment #15)
> but we'll also need to look into some sort of fix for
> starting up when the network/server is offline - that seems very broken
> without this.

Ok. I fixed this already, but I'll take a look at what happens when offline.
Ok, when starting offline and opening the panel, plugging in the ethernet cord does not cause the open panel to recover. However, closing the panel and then reopening it works fine. I think we decided in the standup today that this is sufficient.
Attachment #8741461 - Flags: review?(standard8)
Comment on attachment 8741461 [details] [review]
[loop] fzzzy:remove-config-roundtrip > mozilla:master

Looks good, r=Standard8 with the couple of changes mentioned in the PR.
Attachment #8741461 - Flags: review?(standard8) → review+
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: