Closed Bug 353237 Opened 18 years ago Closed 15 years ago

Bouncer needs peer network logic added to redirect script

Categories

(Webtools :: Bouncer, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: morgamic, Assigned: morgamic)

Details

If we are able to utilize peer networks for specific IP ranges in the future, we need to add this logic to Bouncer so that incoming URIs from specific IPs can be sent to the appropriate mirrors.
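A minimal sketch of the lookup described above, assuming peer networks can be expressed as CIDR ranges. The table, the hostnames, and the `mirrors_for_ip` helper are all hypothetical and are not taken from Bouncer's code.

```python
import ipaddress

# Hypothetical table: peer-network CIDR ranges mapped to mirror hostnames.
# The ranges are documentation-only addresses, not real peer networks.
PEER_RANGES = {
    "203.0.113.0/24": ["peer-mirror.example.net"],
    "198.51.100.0/24": ["isp-mirror.example.org"],
}

def mirrors_for_ip(client_ip, default_mirrors):
    """Return the peer-network mirrors for client_ip, or the default pool."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, mirrors in PEER_RANGES.items():
        if addr in ipaddress.ip_network(cidr):
            return mirrors
    # No peer network matched: fall back to the normal mirror pool.
    return default_mirrors
```

The redirect script would call something like `mirrors_for_ip(remote_addr, global_pool)` before applying its usual weighting.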
OS: Mac OS X 10.3 → All
Hardware: PC → All
Whiteboard: [blocking Firefox2]
Discussion is in bug 395241.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → DUPLICATE
This is actually very separate from geo-ip so reopening.  We will focus on general geo->ip mapping first, then implement a peer network patch later.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Thought of something on my way to grab food -

Right now the mirror weight figures into which mirror a client is sent to.  That results in a weighted round-robin (I think).

If I'm best served out of peer-mirror.cn, instead of sending me just that mirror IP (which could be down until sentry notices), artificially raise the weight of all the mirrors in .cn (or maybe raise it to that of the highest-weighted mirror) to give me a pool of mirrors I could be sent to.

I think this would solve the case of that mirror being down - on a browser reload, I'd have a chance of getting an alternate mirror.
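One way to read the suggestion above, as a rough sketch: every mirror in the client's region gets its weight raised above the highest weight in the table, and the pick is then a weighted random choice over all mirrors. The `BOOST` factor, the dict layout, and `pick_mirror` are assumptions for illustration, not Bouncer's actual schema or code.

```python
import random

# Hypothetical multiplier; a real value would need tuning against mirror capacity.
BOOST = 10

def pick_mirror(mirrors, client_region, rng=random):
    """Pick one mirror by weighted random choice, favouring the client's region.

    mirrors: list of dicts with 'host', 'region' and 'weight' keys (assumed shape).
    """
    top = max(m["weight"] for m in mirrors)
    weights = [
        # Regional mirrors are lifted above the highest-weighted mirror;
        # everyone else keeps their configured weight.
        top * BOOST if m["region"] == client_region else m["weight"]
        for m in mirrors
    ]
    return rng.choices(mirrors, weights=weights, k=1)[0]
```

Because the choice stays random, a reload can land on a different mirror in the same region, which is the point of handing out a pool rather than a single host.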
How about using the GeoDNS patch for BIND?  You create views to direct users via DNS to their geographically closest cluster.
http://www.caraytech.com/geodns/

See also implementation study here:
http://hiredgnu.co.za/2007/11/20/bind-geoip-and-python-a-beautiful-soup-doth-make/
We want to be able to weight mirrors differently, and bouncer does redirects, so not sure if DNS will work for this (assuming you want to use dns redirection).  Also we get into weird states with dns caching...
To clarify, within a single geo region you want to be able to weight mirrors?
Downloadee -> download.mozilla.org (single ip) -(http redirect)> mirror that is in their geo-region or peer network in this case.
I'm thinking DNS -> geolocal download site -> weighted http redirect -> preferred mirror.  
Laura's method seems like I'd have to have more machines to host the geolocal download site, unless I'm missing something.

user.sj.ca.us - DNS returns download.sanjose - HTTP redirect
user.ams.nl - DNS returns download.amsterdam - HTTP redirect
user.pek.cn - DNS returns download.beijing - HTTP redirect

In reality, we only have one download.mozilla.org and can't easily replicate that unless the databases live separately.   

Oremj's idea feels better to me.
I like that model too, but it requires us to have download servers running in each of our geolocal/netlocal service areas.  That limits our ability to bring up a new pure-FTP mirror quickly to give good service to a new region or ISP, because we have to deploy a download.m.o instance there, which is a much bigger ask of a mirror provider.
(In reply to comment #10)
> I like that model too, but it requires us to have download servers running in
> each of our geolocal/netlocal service areas.  That limits our ability to bring
> up a new pure-FTP mirror quickly to give good service to a new region or ISP,
> because we have to deploy a download.m.o instance there, which is a much bigger
> ask of a mirror provider.

Not necessarily - ams.nl is going to serve all of .eu unless we have some more specific geographic site.  We don't, so our regions are going to be really large - .eu, .cn and all of the Americas.  Like I doubt we're going to have a download.m.o at the south pole and we wouldn't need to - they'd just fall back to some other download.m.o (because we told DNS to, or whatever).
With my method it would be easier to, for example, send Brazilian users to a Brazilian mirror without having a bouncer instance there.  Yes, they would have to hit sj once to get a 302, but I don't think that is going to be noticeable.
(In reply to comment #11)
> Not necessarily - ams.nl is going to serve all of .eu unless we have some more
> specific geographic site.  We don't so our regions are going to be really large
> - .eu, .cn and all of Americas.  Like I doubt we're going to download.m.o in
> the south pole and we wouldn't need to - they'd just fall back to some other
> download.m.o (because we told DNS to, or whatever).  

Sure, but in that case we're not getting the free locality by piggybacking onto the DNS resolution -- still have to hit a download.m.o -- in which case we're deploying another instance and biting off replication/synchronization work just to cut the latency on the initial redirect, right?  If download.m.o has to be involved, I don't think we gain much by splitting its load, TBH.
So there are a few additional issues to resolve here...

One problem we had during the 3.0.2/3.0.3 release is that the download mirror in Beijing kept getting pulled out of rotation because sentry couldn't reach it to verify it was up.  This is a general limitation of the network connection between San Jose and China, which basically sucks.  That mirror probably worked fine for people local to it (it's one we actually run ourselves).  On the one hand, it's good to pull it from the general pool because it would be a poor download experience for users from elsewhere in the world getting to it if it's behaving poorly at the moment.  On the other hand, there's no reason not to deliver it to people local to it since it's still behaving locally.

For this to be generally useful, we probably also need to come up with a way for Sentry to do distributed monitoring (like monitor the Chinese hosts from China, the European/African ones from Amsterdam, everything else from San Jose)
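A rough sketch of what region-aware health checks could look like, assuming each vantage point reports its own up/down result per mirror. The vantage names, the region mapping, and `is_eligible` are hypothetical; Sentry's real data model is not described in this bug.

```python
# Hypothetical mapping from a client's region to the vantage point whose
# probe results should decide eligibility for that client.
VANTAGE_FOR_REGION = {"cn": "beijing", "eu": "amsterdam", "af": "amsterdam"}
DEFAULT_VANTAGE = "san-jose"

def is_eligible(probe_results, mirror_host, client_region):
    """Return True if the mirror was reachable from the client's vantage point.

    probe_results: {vantage_name: {mirror_host: bool}} (assumed shape).
    A mirror with no result from that vantage is treated as down.
    """
    vantage = VANTAGE_FOR_REGION.get(client_region, DEFAULT_VANTAGE)
    return probe_results.get(vantage, {}).get(mirror_host, False)
```

Under this scheme a Beijing mirror that San Jose cannot reach would drop out of the pool for most of the world but stay available to clients in China, as long as the probe in China still sees it.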

Having any kind of geolocation involved also screws with our "availability" statistics that we look at to decide if we're prepared to release yet, and also with regional weights.  For example, right now we have better mirror coverage in Europe and Asia than we do in the US.  But we have more people trying to download from the US than we do in either Europe or Asia.  If we send everyone in the US to the US mirrors, and only 3 mirrors in the US have picked up the new files yet, that'd be a problem.  Obviously you'd counter this by trying to fall back on the global pool, but trying to figure out when to do that would be complicated I think, and it also makes balancing mirror loads by automatically tampering with the weights as problems are detected a little bit complicated.
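The simplest form of the fallback mentioned above, as a sketch only: use the regional mirrors when enough of them already have the file, otherwise use every ready mirror. The threshold, the `has_file` and `healthy` flags, and `candidate_pool` are assumptions for illustration. A plain count ignores mirror capacity and demand, which is part of why the real decision is harder than this.

```python
# Hypothetical threshold: minimum number of ready regional mirrors
# before we stop falling back to the global pool.
MIN_REGIONAL = 5

def candidate_pool(mirrors, client_region, min_regional=MIN_REGIONAL):
    """Return the regional mirrors if enough are ready, else all ready mirrors.

    mirrors: list of dicts with 'region', 'has_file' and 'healthy' keys
    (assumed shape).
    """
    ready = [m for m in mirrors if m["has_file"] and m["healthy"]]
    regional = [m for m in ready if m["region"] == client_region]
    return regional if len(regional) >= min_regional else ready
```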
Blocks: 459919
1.  I don't know if peer network logic is needed.  We aren't peering anywhere where this becomes important or adds any value.

2. Dave's comment in comment #14 feels like an altogether separate bug, unrelated to "peer network logic" (more of a bug in how sentry assumes global reachability/performance from San Jose).

I recommend WONTFIX'ing this bug and opening a new one to track the sentry issue Dave talks about. 

Thoughts?
No longer blocks: 459919
A year without disagreements, marking WONTFIX.
Status: REOPENED → RESOLVED
Closed: 17 years ago → 15 years ago
Resolution: --- → WONTFIX