Closed
Bug 406267
Opened 17 years ago
Closed 16 years ago
Releases_Lag nagios test should automatically add/remove servers from DNS
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: justdave, Assigned: reed)
References
Details
We currently have a nagios test which monitors the servers in the releases.mozilla.org round-robin and pages when they get too far behind, become unreachable, or get too slow. It also provides an overview and statistics at https://nagios.mozilla.org/ftplag/. We should fix this test so that it automatically adds/removes the mirrors from the pool instead of paging about them, and only pages when fewer than a certain percentage of them are left in the pool (say half). This would reduce work for the sysadmins, while still letting us know when there are real problems.
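As a rough illustration of the behavior requested above, here is a minimal Python sketch. It is not the existing test: the `MirrorStatus` fields, the lag and response thresholds, and the function names are all assumptions made up for the example; only the "page when fewer than half remain" rule comes from the description.

```python
from dataclasses import dataclass


@dataclass
class MirrorStatus:
    host: str
    reachable: bool
    lag_seconds: float       # how far behind the mirror is (assumed metric)
    response_seconds: float  # how long the probe took (assumed metric)


def is_healthy(m, max_lag=3600.0, max_response=10.0):
    # Thresholds are placeholders, not values taken from the real test.
    return (m.reachable
            and m.lag_seconds <= max_lag
            and m.response_seconds <= max_response)


def plan_changes(in_pool, mirrors, min_fraction=0.5):
    """Return (hosts to add, hosts to remove, whether to page).

    in_pool is the set of hosts currently published in the round-robin.
    """
    healthy = {m.host for m in mirrors if is_healthy(m)}
    to_add = sorted(healthy - in_pool)
    to_remove = sorted(in_pool - healthy)
    # Page only when fewer than min_fraction of all mirrors would remain.
    should_page = not mirrors or len(healthy) < min_fraction * len(mirrors)
    return to_add, to_remove, should_page
```

With four mirrors and one of them lagging, this would drop the lagging mirror from the pool without paging; paging starts only once fewer than two of the four are healthy.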
Reporter
Updated•17 years ago
Assignee: server-ops → nobody
Component: Server Operations → Server Operations: Projects
Comment 1•17 years ago
Can we just put this behind the netscaler?
Reporter
Comment 2•17 years ago
Putting it behind the netscaler would end up routing all the traffic through us, not sure if we have that kind of bandwidth actually. We're talking about roughly 8 Gbit of traffic during releases here, I'm guessing.
Comment 3•17 years ago
Good point.
Comment 4•17 years ago
I wonder if we could use the global load balancing stuff to do this.
Comment 5•17 years ago
GSLB on the Netscaler requires that the IPs returned in a query live behind the Netscaler. You could probably do something with bind to query an external something that had an array of "up" boxes. Actually, I bet something like http://mysql-bind.sourceforge.net/ would be something to look at and would be an interesting candidate for NDB!
Comment 6•17 years ago
Given we only have 4-5 hosts in releases, a far simpler approach might be just to have a script re-generate the mozilla.org-soa file and update the serial # every 15 minutes...adding mysql seems to add a lot of complexity for a 4 element array.
Reporter
Comment 7•17 years ago
There's 14 actually, not 4.
Reporter
Comment 8•17 years ago
And yeah, that's roughly what I was thinking of doing if I ever got time to mess with it: take the releases section of the mozilla.org-ftp zone file, make it an $INCLUDE of a file generated by the monitoring script, and have the script update the serial in the soa file every time it modifies that file.
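To make the $INCLUDE idea concrete, here is a small Python sketch of the two pieces the monitoring script would need: rendering the generated include file and bumping the serial. Everything in it is an assumption for illustration, including the record name, the 60-second TTL, the function names, and the convention that the serial line in the soa file carries a `; serial` comment.

```python
import re


def render_include(name, addresses, ttl=60):
    """Render one A/AAAA record per address as zone-file text."""
    lines = []
    for addr in sorted(addresses):
        rtype = "AAAA" if ":" in addr else "A"  # crude IPv6 detection
        lines.append(f"{name}\t{ttl}\tIN\t{rtype}\t{addr}")
    return "\n".join(lines) + "\n"


def bump_serial(soa_text):
    """Increment the first number followed by a '; serial' comment."""
    pattern = re.compile(r"(\d+)(\s*;\s*serial)", re.IGNORECASE)

    def repl(match):
        return str(int(match.group(1)) + 1) + match.group(2)

    new_text, count = pattern.subn(repl, soa_text, count=1)
    if count == 0:
        raise ValueError("no serial line found")
    return new_text
```

A real script would also need to write both files atomically and reload the zone only when the set of addresses actually changed; that part is left out here.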
Updated•16 years ago
Assignee: nobody → mrz
Whiteboard: pending geodns
Updated•16 years ago
Assignee: mrz → reed
Assignee
Updated•16 years ago
Component: Server Operations: Projects → Server Operations
Whiteboard: pending geodns
Assignee
Updated•16 years ago
Whiteboard: Waiting on geodns.pl mods from xb95
Comment 10•16 years ago
So, the script already allowed enable/disable by IP. I've updated it so it can now operate on hostnames (descriptions) as well:

[root@geodns01 ~]# ./geodns.pl --list | grep sand
45 ) US GLOBAL irc-mozilla-org.geo.mozilla.com A 63.245.208.159 60 sand.mozilla.org
[root@geodns01 ~]# ./geodns.pl --enable sand.mozilla.org
No change necessary.
[root@geodns01 ~]# ./geodns.pl --disable sand.mozilla.org
Disabling 45.
Rebuilding views...
building CN from CC = CN
building US from region = North America
building GLOBAL from global entries
building JP from global entries (no enabled, matching entries found)
building EU from region = Europe
Rebuilt views.

If you try to specify something not distinct enough, it will error:

[root@geodns01 ~]# ./geodns.pl --enable mozilla
Found more than one matching description.

Does this work?
Assignee
Comment 11•16 years ago
(In reply to comment #10)
> Does this work?

Looking good, but what about descriptions with spaces in them?

[root@geodns01 ~]# ./geodns.pl --enable "trillian.gtlib.gatech.edu - ipv6"
Unable to find host by ID, IP, or description.
Comment 12•16 years ago
I assumed your script wouldn't know to put " - ipv6" on there as the IP would resolve to just trillian.gtlib.gatech.edu. In effect it's doing a match against the first word. Do you need it to support matching the entire phrase? That seems like it'd be something a human would type. In that case you presumably already have the id and can use that?
Assignee
Comment 13•16 years ago
(In reply to comment #12)
> Do you need it to support matching the entire phrase? That seems like it'd be
> something a human would type. In that case you presumably already have the id
> and can use that?

90 ) US GLOBAL releases.geo.mozilla.com A 128.61.111.9 60 trillian.gtlib.gatech.edu - ipv4
91 ) US GLOBAL releases.geo.mozilla.com AAAA 2610:148:fd80:3d6f:209:3dff:fe12:7bf9 60 trillian.gtlib.gatech.edu - ipv6

I need to be able to differentiate those two descriptions... "trillian.gtlib.gatech.edu - ipv4" and "trillian.gtlib.gatech.edu - ipv6".
Comment 14•16 years ago
Alright, fixed it so it works as expected now:

[root@geodns01 ~]# ./geodns.pl --disable "trillian.gtlib.gatech.edu - ipv6"
Disabling 91.
Rebuilding views...
building CN from CC = CN
building US from region = North America
building GLOBAL from global entries
building JP from global entries (no enabled, matching entries found)
building EU from region = Europe
Rebuilt views.

I also fixed a bug where it was only selecting A records; it now includes AAAA and CNAME as well.
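For readers following the matching discussion, here is a hypothetical Python sketch of a lookup that accepts an ID, an IP, or a description and refuses ambiguous descriptions. It is not geodns.pl's code (which is Perl and not shown in this bug); the `Entry` type, the `find_entry` name, and the "exact description first, then substring" order are assumptions. Only the two error messages are taken from the transcripts above.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    id: int
    address: str
    description: str


def find_entry(entries, key):
    """Resolve key to exactly one entry, or raise LookupError."""
    # An ID or IP identifies an entry directly.
    for e in entries:
        if key == str(e.id) or key == e.address:
            return e
    # Prefer a full-description match so "host - ipv4" and
    # "host - ipv6" can be told apart; fall back to substring.
    exact = [e for e in entries if e.description == key]
    matches = exact or [e for e in entries if key in e.description]
    if not matches:
        raise LookupError("Unable to find host by ID, IP, or description.")
    if len(matches) > 1:
        raise LookupError("Found more than one matching description.")
    return matches[0]
```

Under these assumptions, "trillian.gtlib.gatech.edu - ipv6" resolves to a single entry, while the bare hostname matches both the ipv4 and ipv6 entries and is rejected as ambiguous.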
Assignee
Updated•16 years ago
Whiteboard: Waiting on geodns.pl mods from xb95 → Working on new script to use xb95's new features
Assignee
Updated•16 years ago
Whiteboard: Working on new script to use xb95's new features → Script made and working; need to add count percentage limit check
Assignee
Comment 15•16 years ago
Pages critical if percentage of active servers goes under 40%.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Whiteboard: Script made and working; need to add count percentage limit check
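The percentage check described in comment 15 could look roughly like the Python sketch below. The actual script is not attached to this bug, so the function name and message wording are assumptions; the 40% threshold is from the comment, and the return codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_pool(active, total, critical_below=40.0):
    """Return (exit_code, message) for the share of active servers."""
    if total <= 0:
        return UNKNOWN, "UNKNOWN: no servers configured"
    percent = 100.0 * active / total
    detail = f"{active}/{total} servers active ({percent:.0f}%)"
    if percent < critical_below:
        return CRITICAL, "CRITICAL: " + detail
    return OK, "OK: " + detail
```

A plugin wrapper would print the message and call `sys.exit()` with the code. With the 14 mirrors mentioned in comment 7, this goes critical once 5 or fewer remain active.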
Updated•9 years ago
Product: mozilla.org → mozilla.org Graveyard