Bug 406267
Opened 18 years ago
Closed 17 years ago
Releases_Lag nagios test should automatically add/remove servers from DNS
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: justdave, Assigned: reed)
We currently have a nagios test which monitors the servers in the releases.mozilla.org round-robin and pages when they get too far behind or become unreachable or too slow. It also provides an overview and statistics at https://nagios.mozilla.org/ftplag/
We should fix this test so that it automatically adds/removes the mirrors from the pool instead of paging about them, and only pages when fewer than a certain percentage of them are left in the pool (say half).
This would reduce work for the sysadmins, while still letting us know when there are real problems.
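A rough sketch of the behaviour requested above, for illustration only. The `MirrorStatus` type, the `plan_pool` function, and the one-hour lag limit are hypothetical names and values, not part of the existing Nagios test; the 50% paging floor is the "say half" figure from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MirrorStatus:
    name: str
    reachable: bool
    lag_seconds: Optional[float]  # None when lag could not be measured


def plan_pool(
    statuses: List[MirrorStatus],
    max_lag_seconds: float = 3600.0,  # assumed limit, not from the bug
    min_fraction: float = 0.5,        # "say half" from the description
) -> Tuple[List[str], List[str], bool]:
    """Return (mirrors to keep in DNS, mirrors to pull, whether to page)."""
    keep: List[str] = []
    pull: List[str] = []
    for s in statuses:
        healthy = (
            s.reachable
            and s.lag_seconds is not None
            and s.lag_seconds <= max_lag_seconds
        )
        (keep if healthy else pull).append(s.name)
    total = len(statuses)
    # Page only when too few mirrors remain, not for each unhealthy mirror.
    should_page = total == 0 or len(keep) / total < min_fraction
    return keep, pull, should_page
```

The point of the design is that an individual slow or unreachable mirror only changes the `pull` list; a page is raised solely by the fraction of mirrors left in the pool.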
| Reporter | ||
Updated•18 years ago
Assignee: server-ops → nobody
Component: Server Operations → Server Operations: Projects
Comment 1•18 years ago
Can we just put this behind the netscaler?
| Reporter | ||
Comment 2•18 years ago
Putting it behind the netscaler would end up routing all the traffic through us, not sure if we have that kind of bandwidth actually. We're talking about roughly 8 Gbit of traffic during releases here, I'm guessing.
Comment 3•18 years ago
Good point.
Comment 4•18 years ago
I wonder if we could use the global load balancing stuff to do this.
Comment 5•18 years ago
GSLB on the Netscaler requires that the IPs returned in a query live behind the Netscaler.
You could probably do something with bind to query an external something that had an array of "up" boxes. Actually, I bet something like http://mysql-bind.sourceforge.net/ would be something to look at and would be an interesting candidate for NDB!
Comment 6•18 years ago
Given we only have 4-5 hosts in releases, a far simpler approach might be just to have a script re-generate the mozilla.org-soa file and update the serial # every 15 minutes...adding mysql seems to add a lot of complexity for a 4 element array.
| Reporter | ||
Comment 7•18 years ago
There's 14 actually, not 4.
| Reporter | ||
Comment 8•18 years ago
And yeah, that's kind of what I was thinking of doing if I ever got time to mess with it: take the releases section of the mozilla.org-ftp zone file and make it an $INCLUDE of a file generated by the monitoring script, and have the script update the serial in the soa file every time it modifies that file.
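A minimal sketch of the $INCLUDE approach discussed in comments 6 and 8, assuming the generated file holds plain round-robin records and that the SOA serial follows the common YYYYMMDDNN convention (the bug does not say which serial format is in use). The `render_include` and `next_serial` helpers are hypothetical.

```python
from datetime import date
from typing import Iterable


def render_include(name: str, addresses: Iterable[str], ttl: int = 60) -> str:
    """Render one zone-file record per enabled mirror address."""
    lines = []
    for addr in addresses:
        # Crude family check for this sketch: IPv6 literals contain ':'.
        rtype = "AAAA" if ":" in addr else "A"
        lines.append(f"{name}\t{ttl}\tIN\t{rtype}\t{addr}")
    return "\n".join(lines) + "\n"


def next_serial(current: int, today: date) -> int:
    """Bump a YYYYMMDDNN-style serial so the result is always larger."""
    base = int(today.strftime("%Y%m%d")) * 100
    return max(base, current + 1)
```

The monitoring script would write the rendered text to the included file only when it differs from what is already on disk, bump the serial with `next_serial`, and then reload the zone.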
Updated•18 years ago
Assignee: nobody → mrz
Whiteboard: pending geodns
Updated•18 years ago
Assignee: mrz → reed
| Assignee | ||
Updated•17 years ago
Component: Server Operations: Projects → Server Operations
Whiteboard: pending geodns
| Assignee | ||
Updated•17 years ago
Whiteboard: Waiting on geodns.pl mods from xb95
Comment 10•17 years ago
So, the script already allowed enable/disable by IP. I've updated it so it can now operate on hostnames (descriptions) as well:
[root@geodns01 ~]# ./geodns.pl --list | grep sand
45 ) US GLOBAL irc-mozilla-org.geo.mozilla.com A 63.245.208.159 60 sand.mozilla.org
[root@geodns01 ~]# ./geodns.pl --enable sand.mozilla.org
No change necessary.
[root@geodns01 ~]# ./geodns.pl --disable sand.mozilla.org
Disabling 45.
Rebuilding views...
building CN from CC = CN
building US from region = North America
building GLOBAL from global entries
building JP from global entries (no enabled, matching entries found)
building EU from region = Europe
Rebuilt views.
If you try to specify something not distinct enough, it will error:
[root@geodns01 ~]# ./geodns.pl --enable mozilla
Found more than one matching description.
Does this work?
| Assignee | ||
Comment 11•17 years ago
(In reply to comment #10)
> Does this work?
Looking good, but what about descriptions with spaces in them?
[root@geodns01 ~]# ./geodns.pl --enable "trillian.gtlib.gatech.edu - ipv6"
Unable to find host by ID, IP, or description.
Comment 12•17 years ago
I assumed your script wouldn't know to put " - ipv6" on there as the IP would resolve to just trillian.gtlib.gatech.edu. In effect it's doing a match against the first word.
Do you need it to support matching the entire phrase? That seems like it'd be something a human would type. In that case you presumably already have the id and can use that?
| Assignee | ||
Comment 13•17 years ago
(In reply to comment #12)
> Do you need it to support matching the entire phrase? That seems like it'd be
> something a human would type. In that case you presumably already have the id
> and can use that?
90 ) US GLOBAL releases.geo.mozilla.com A 128.61.111.9 60 trillian.gtlib.gatech.edu - ipv4
91 ) US GLOBAL releases.geo.mozilla.com AAAA 2610:148:fd80:3d6f:209:3dff:fe12:7bf9 60 trillian.gtlib.gatech.edu - ipv6
I need to be able to differentiate those two descriptions... "trillian.gtlib.gatech.edu - ipv4" and "trillian.gtlib.gatech.edu - ipv6".
Comment 14•17 years ago
Alright, fixed it so it works as expected now:
[root@geodns01 ~]# ./geodns.pl --disable "trillian.gtlib.gatech.edu - ipv6"
Disabling 91.
Rebuilding views...
building CN from CC = CN
building US from region = North America
building GLOBAL from global entries
building JP from global entries (no enabled, matching entries found)
building EU from region = Europe
Rebuilt views.
I also fixed a bug where it was only selecting A records; now it includes AAAA and CNAME as well.
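For readers following the matching discussion in comments 10 to 14, here is an illustrative sketch of one way to resolve a target by ID, IP, or description. This is not the geodns.pl code (which is not shown in this bug); the `Entry` type, the `find_entry` function, and the lookup order are assumptions. Only the two error strings are copied from the output above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    id: int
    rtype: str        # "A", "AAAA" or "CNAME"
    address: str
    description: str


def find_entry(entries: List[Entry], target: str) -> Entry:
    """Resolve target by numeric ID, then address, then description."""
    hits: List[Entry] = []
    if target.isdigit():
        hits = [e for e in entries if e.id == int(target)]
    if not hits:
        hits = [e for e in entries if e.address == target]
    if not hits:
        # Prefer an exact match on the full description (spaces included).
        hits = [e for e in entries if e.description == target]
    if not hits:
        # Fall back to a substring match, which may be ambiguous.
        hits = [e for e in entries if target in e.description]
    if not hits:
        raise ValueError("Unable to find host by ID, IP, or description.")
    if len(hits) > 1:
        raise ValueError("Found more than one matching description.")
    return hits[0]
```

With the two trillian entries from comment 13, the full string "trillian.gtlib.gatech.edu - ipv6" selects entry 91 only, while the bare hostname matches both entries and is rejected as ambiguous.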
| Assignee | ||
Updated•17 years ago
Whiteboard: Waiting on geodns.pl mods from xb95 → Working on new script to use xb95's new features
| Assignee | ||
Updated•17 years ago
Whiteboard: Working on new script to use xb95's new features → Script made and working; need to add count percentage limit check
| Assignee | ||
Comment 15•17 years ago
Pages critical if percentage of active servers goes under 40%.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Whiteboard: Script made and working; need to add count percentage limit check
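The threshold from comment 15 could be expressed roughly as below. This is a sketch rather than the script that was deployed: the `check_pool` function and its message format are hypothetical, the 40% default comes from comment 15, and the return codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from typing import Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_pool(active: int, total: int, critical_pct: float = 40.0) -> Tuple[int, str]:
    """Return (exit code, status line) for the share of servers still in the pool."""
    if total <= 0:
        return UNKNOWN, "UNKNOWN: no servers configured"
    pct = 100.0 * active / total
    summary = f"{active}/{total} servers active ({pct:.0f}%)"
    if pct < critical_pct:
        return CRITICAL, f"CRITICAL: {summary}"
    return OK, f"OK: {summary}"
```

A plugin wrapper would print the status line and call `sys.exit` with the returned code. With the 14 mirrors mentioned in comment 7, the check would go critical once 5 or fewer remain active, since 6 of 14 is still about 43%.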
Updated•11 years ago
Product: mozilla.org → mozilla.org Graveyard