Closed
Bug 613620
Opened 14 years ago
Closed 13 years ago
Would like to be able to specify a backup region for a given region
Categories
(Webtools :: Bouncer, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
2.0
People
(Reporter: justdave, Assigned: brandon)
When we set a region to less than 100% throttle, the remaining traffic gets sent to the global pool. I'd like the option to specify a "fallback region" other than the global pool for this use. For example, we have a *lot* of mirrors (probably enough to actually handle the global traffic) just in Europe, whereas North America has far fewer mirrors than the number of users here requires. We'd get a much better user experience if the failover from North America all went to Europe than if it went into the global pool.
Reporter
Comment 1•14 years ago
And as another example for this one, we have quite a number of single-country regions now because of countries that have very good internal infrastructure, but have limited access to the outside world. Those countries would be better served by having their failover traffic go to the parent continent region or in some cases a country region for a neighboring country that they have better connectivity to.
Reporter
Comment 2•14 years ago
This is high on the user experience wishlist. We get complaints when people in North America have to download from a mirror in Cambodia, for example. :)
Severity: normal → major
Priority: -- → P2
Updated•14 years ago
Assignee: nobody → anthony
OS: Mac OS X → All
Hardware: x86 → All
Updated•14 years ago
Whiteboard: [Rik Q3]
Comment 3•14 years ago
This would also fix the problem we created in bug 646076, where we tried to create an internal mirror, and ended up sending global traffic to it.
Comment 5•13 years ago
Any update? I ask because of bug#646076.
Comment 6•13 years ago
QA is also waiting for this bug to be fixed. With a backup region and the bouncer set to internal mirrors we could reduce the time for our update tests during release testing drastically. Thanks for any update on this bug.
Comment 7•13 years ago
I've finally found the time to tackle some Bouncer bugs lately. I'm gonna try to come up with a first pass by the end of this week.
Comment 8•13 years ago
Any update here? This is holding up a cascade of build network isolation bugs for releng/IT.
Comment 9•13 years ago
(In reply to Chris Cooper [:coop] from comment #8)
> Any update here? This is holding up a cascade of build network isolation
> bugs for releng/IT.
ping?
Comment 10•13 years ago
Anthony is looking forward to working on this but still has a few bugs to tie up on his current project before he can get started here. I expect it to be another weekish before he can have a serious look at this. Thanks for bearing with us!
Comment 11•13 years ago
We've been blocking bug 617414 on this for nearly 6 months now. What's the latest?
Comment 12•13 years ago
QA is also blocked on this; with it we could drastically speed up the update tests for releases (external mirrors vs. internal servers). We would appreciate some positive feedback and an indication that it can be implemented in the near future.
Comment 13•13 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #11)
> We've been blocking bug 617414 on this for nearly 6 months now. What's the
> latest?
per offline discussion with LauraT, she will investigate and get back to us with a timeline.
Comment 14•13 years ago
(In reply to John O'Duinn [:joduinn] from comment #13)
> per offline discussion with LauraT, she will investigate and get back to us
> with a timeline.
LauraT, do you have an update here? Would be great to see a timeline.
Comment 15•13 years ago
(In reply to Henrik Skupin (:whimboo) from comment #14)
> (In reply to John O'Duinn [:joduinn] from comment #13)
> > per offline discussion with LauraT, she will investigate and get back to us
> > with a timeline.
>
> LauraT, do you have an update here? Would be great to see a timeline.
rik/LauraT: ping?
Comment 16•13 years ago
All right, sorry for the delay. Sitting down with wenzel and Rik tomorrow to get briefed. After that it's probably a week of work + a week review/test/QA. Let's commit to having this ready to go in about two weeks' time.
Comment 17•13 years ago
Assigning this to Brandon for now, he and Laura got this.
Assignee: anthony → bsavage
Whiteboard: [Rik Q3]
Comment 18•13 years ago
(In reply to Laura Thomson :laura from comment #16)
> All right, sorry for the delay. Sitting down with wenzel and Rik tomorrow
> to get briefed. After that it's probably a week of work + a week
> review/test/QA. Let's commit to having this ready to go in about two weeks'
> time.
(In reply to Fred Wenzel [:wenzel] from comment #17)
> Assigning this to Brandon for now, he and Laura got this.
ping - any revised ETA?
Comment 19•13 years ago
Brandon's been working on this, and is not done yet. He's out until Tuesday but he may see this and give us an ETA before then.
Assignee
Comment 20•13 years ago
I worked on this before I left but didn't get quite finished. It'll be done middle of next week when I return.
Comment 21•13 years ago
(In reply to Brandon Savage [:brandon] from comment #20)
> I worked on this before I left but didn't get quite finished. It'll be done
> middle of next week when I return.
cool, thanks Brandon. We'll plan work on our side accordingly.
Assignee
Comment 22•13 years ago
Just to be clear, do you want a single backup for a region or the ability to specify multiple backups? This affects how I complete my work.
Assignee
Comment 23•13 years ago
Pull request here, for single region fallback. https://github.com/fwenzel/tuxedo/pull/8
Comment 24•13 years ago
(In reply to Brandon Savage [:brandon] from comment #22)
> Just to be clear, do you want a single backup for a region or the ability to
> specify multiple backups? This affects how I complete my work.
Dave, what say you?
Assignee
Comment 25•13 years ago
Not to influence you, but there is already a patch for specifying a single backup region. Multiple backup regions would take longer.
Reporter
Comment 26•13 years ago
Single backup works for 90% of our use cases.
Comment 27•13 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #24)
> (In reply to Brandon Savage [:brandon] from comment #22)
> > Just to be clear, do you want a single backup for a region or the ability to
> > specify multiple backups? This affects how I complete my work.
>
> Dave, what say you?
(In reply to Brandon Savage [:brandon] from comment #25)
> Not to influence you, but there is already a patch for specifying a single
> backup region. Multiple backup regions would take longer.
(In reply to Dave Miller [:justdave] from comment #26)
> Single backup works for 90% of our use cases.
justdave: ok, but is this enough to unblock bug#646076?
Comment 28•13 years ago
(In reply to John O'Duinn [:joduinn] from comment #27)
> (In reply to Ben Hearsum [:bhearsum] from comment #24)
> > (In reply to Brandon Savage [:brandon] from comment #22)
> > > Just to be clear, do you want a single backup for a region or the ability to
> > > specify multiple backups? This affects how I complete my work.
> >
> > Dave, what say you?
>
> (In reply to Brandon Savage [:brandon] from comment #25)
> > Not to influence you, but there is already a patch for specifying a single
> > backup region. Multiple backup regions would take longer.
>
> (In reply to Dave Miller [:justdave] from comment #26)
> > Single backup works for 90% of our use cases.
>
> justdave: ok, but is this enough to unblock bug#646076?
Brandon, does having a single backup mean that IP blocks can be completely restricted from going outside of their primary/backup, even if both are heavily loaded?
Assignee
Comment 29•13 years ago
The way that the system currently works is that when a request is throttled, the request is forwarded to the global pool. Under the new code I've written, if a request is throttled and there exists a backup region, the request is forwarded there. If the request does not have a backup region, or no suitable mirrors for the backup region can be found, the request is forwarded to the global pool as a sanity check.
Bouncer makes no examination or distinction about load when it calculates a mirror. It doesn't test the mirror to find out how loaded it is. While an acceptable backup with a returned mirror would in fact prevent it from going outside the primary/secondary, it wouldn't be related to load.
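A minimal Python sketch of the fallback order described in this comment. This is not Bouncer's actual code: the `Region` class, the `pick_mirror` function, and the field names `throttle` and `backup` are hypothetical and exist only for illustration. It also assumes that a primary region with no mirrors is treated the same way as a throttled request.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Region:
    """Hypothetical region record; field names are illustrative, not Bouncer's schema."""
    name: str
    mirrors: List[str] = field(default_factory=list)
    throttle: int = 100                 # percent of requests kept inside this region
    backup: Optional["Region"] = None   # single backup region, if one is configured


def pick_mirror(region, global_pool, rng=random):
    """Return (pool_name, mirror), trying primary, then backup, then the global pool."""
    # A request is throttled when the random draw exceeds the region's throttle percentage.
    throttled = rng.randint(1, 100) > region.throttle
    if not throttled and region.mirrors:
        return region.name, rng.choice(region.mirrors)
    # Throttled (or nothing usable in the primary): try the backup region first.
    if region.backup is not None and region.backup.mirrors:
        return region.backup.name, rng.choice(region.backup.mirrors)
    # No backup region, or no suitable mirror in it: fall through to the global pool.
    return "global", rng.choice(global_pool)
```

Note that the choice within a pool is random here; as the comment says, mirror load is never measured, so a request stays inside the primary/backup pair only because a mirror was found there, not because of how busy that mirror is.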
Comment 30•13 years ago
(In reply to Brandon Savage [:brandon] from comment #29)
> The way that the system currently works is that when a request is throttled,
> the request is forwarded to the global pool. Under the new code I've
> written, if a request is throttled and there exists a backup region, the
> request is forwarded there. If the request does not have a backup region, or
> no suitable mirrors for the backup region can be found, the request is
> forwarded to the global pool as a sanity check.
>
> Bouncer makes no examination or distinction about load when it calculates a
> mirror. It doesn't test the mirror to find out how loaded it is. While an
> acceptable backup with a returned mirror would in fact prevent it from going
> outside the primary/secondary, it wouldn't be related to load.
I had a long conversation with Brandon about this, and I don't think the current implementation will address the use case we want for bug 646076. According to him, even with a backup specified requests will fall back to the global pool if an acceptable mirror isn't found in the primary or backup region. That's definitely the right behaviour for the real world (IMO), but for internal mirror purposes having some sort of way to make certain IP blocks or regions not fall back to the global region would be good. Maybe that's a follow-up bug, though?
Assignee
Comment 31•13 years ago
Pull request, including the new feature in Comment 30, here: https://github.com/fwenzel/tuxedo/pull/9/commits Waiting for review and testing.
Comment 32•13 years ago
(In reply to Brandon Savage [:brandon] from comment #31)
> Pull request, including the new feature in Comment 30, here:
> https://github.com/fwenzel/tuxedo/pull/9/commits Waiting for review and
> testing.
I don't know the Tuxedo code, but it looks like this will do what we need, whee! Thank you!
Assignee
Comment 33•13 years ago
The code changes (which were r+'ed) include a flag, which when set, prevents global pool failover.
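As a rough illustration of what such a flag changes, the Python sketch below lists the pools a throttled request may be sent to, in order. The function name, the `prevent_global_fallback` parameter, and the `"global"` label are assumptions made for this example and are not taken from the Tuxedo or Bouncer code.

```python
from typing import List, Optional

GLOBAL_POOL = "global"  # hypothetical label for the global mirror pool


def pools_for_throttled_request(
    backup: Optional[str],
    prevent_global_fallback: bool = False,
) -> List[str]:
    """Ordered pools to try once a request has been throttled out of its region."""
    pools: List[str] = []
    if backup is not None:
        pools.append(backup)       # the configured backup region is tried first
    if not prevent_global_fallback:
        pools.append(GLOBAL_POOL)  # global pool stays as the last resort
    return pools
```

With the flag set and a backup region configured, the global pool never appears in the list, which is the behaviour the internal-mirror use case asks for. What should happen when the list is empty or no mirror is found is left to the caller in this sketch.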
Comment 34•13 years ago
(In reply to Brandon Savage [:brandon] from comment #33)
> The code changes (which were r+'ed) include a flag, which when set, prevents
> global pool failover.
Sweet to see that r+'d patch landed in github! Thank you Brandon.
So... of course the next question is: how long will this take to get tested and deployed to production?
Comment 35•13 years ago
Brandon, would you mind giving me a reply if that also helps Mozilla QA with the update testing of Firefox as mentioned in comment 6? I assume I should file a new bug which depends on the resolution of that one.
Comment 36•13 years ago
(In reply to Henrik Skupin (:whimboo) from comment #35)
> Brandon, would you mind giving me a reply if that also helps Mozilla QA with
> the update testing of Firefox as mentioned in comment 6? I assume I should
> file a new bug which depends on the resolution of that one.
AFAICT, your use case is the same as RelEng's.
Comment 37•13 years ago
(In reply to John O'Duinn [:joduinn] from comment #34)
> (In reply to Brandon Savage [:brandon] from comment #33)
> > The code changes (which were r+'ed) include a flag, which when set, prevents
> > global pool failover.
>
> Sweet to see that r+'d patch landed in github! Thank you Brandon.
>
> So... of course the next question is: how long will this take to get tested
> and deployed to production?
ping?
Assignee
Comment 38•13 years ago
Yes, this will fit the use case in Comment 6.
Comment 39•13 years ago
John: I'll merge some code for bug 700482 today and open a bug to get this deployed.
Comment 40•13 years ago
(In reply to Anthony Ricaud (:rik) from comment #39)
> John: I'll merge some code for bug 700482 today and open a bug to get this
> deployed.
:rik, Thanks for that. I see bug#740002 is now closed as FIXED. Anything left to do here?
Assignee
Comment 41•13 years ago
This still needs to be staged, QA'ed, and released. Fred and Rik probably have a better idea than I do about how to get that process going.
Comment 42•13 years ago
This is staged. You can check on https://tuxedo.stage.mozilla.com/. I can create accounts for the admin if you need to test the features.
We need to test this since it strongly impacts the download experience.
After you've tested this, we'll open a bug for pushing to production.
Comment 43•13 years ago
Thanks to Rik I've got an account on tuxedo stage now. I've set up the region/mirror/IP blocks/country the way we want them for bug 646076, and once releng-mirror01 is up and running again (bug 741774) I can verify the disable-global-fallback part of this bug.
Assignee
Updated•13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Assignee
Updated•12 years ago
Target Milestone: --- → 2.0
Web QA has temporarily deferred testing on this functionality, and tested heavily (positive/negative) around the changes that we shipped, tonight, to Bouncer 2.0.
We will revisit this at a later date, TBD, and may need Ben's help :-)