Closed
Bug 789058
Opened 12 years ago
Closed 12 years ago
New script for Nagios checking database checksums
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: scabral, Assigned: ashish)
Attachments
(1 file, 2 obsolete files)
4.47 KB, text/x-perl-script
I have modified the code at:
https://github.com/palominodb/PalominoDB-Public-Code-Repository/blob/master/nagios/table_checksums/check_table_checksums.pl
to work with updated versions of pt-table-checksum, and to ignore system databases.

Specifically, I have:
- taken "host" out of the SELECT statements for the checksum, as host is not captured by default from pt-table-checksum
- added an option, --ignore-sys or -b, to ignore system databases (hard-coded as mysql, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA).

*** Ideally this would be changed to options like --ignore-dbs and --ignore-tbls, which would take comma-separated lists of dbs and tables to ignore, but that's a 'nice-to-have' and perhaps PalominoDB will do it upstream after we send these patches back to them. (But if anyone has time, I'm happy to have those features, in which case --ignore-sys becomes --ignore-dbs mysql,INFORMATION_SCHEMA,PERFORMANCE_SCHEMA.)

If you just want to see the diff, you can download the original script at
https://raw.github.com/palominodb/PalominoDB-Public-Code-Repository/master/nagios/table_checksums/check_table_checksums.pl
to compare.
Comment 1 • 12 years ago
tl;dr: Looks fine to me, nothing glaring, and I'm not going to nitpick style.

Line numbers refer to the upgraded script.

Line 13 says unknown returns 2. It should return 3, but that's an upstream bug.

Nitpicking: Lines 144 and 158 print your and_clause variable, which could make this chattier than necessary, particularly on 144 where it's on a new line. Since it's a nagios check, you might not ever see it, so if it's important, you might need to shuffle the failure case printout around.

If you wanted to do ignore dbs as a customizable variable, the secret sauce is up around line 38, where they say vArNaMe=s, meaning varname expects a 's'tring associated with it. That's easy. The tougher part, of course, is to let Little Bobby Tables check the inputs. They don't seem to be doing that anyway on the --table variable, so maybe they're considering it skippable.
Reporter
Comment 2 • 12 years ago
Thanx about Line 13 :D um, 2 = critical, right?

As for the chattiness, the idea is to make sure that when people see "OK" or "WARNING/CRITICAL" they know that the check didn't include those DBs. So OK might be "OK but we didn't check x,y,z".

Is there a magic letter for a comma-separated list and an easy way to go through each one to quote it? e.g.
--ignore-dbs a,b,c
needs to turn into
AND db not in ('a','b','c')
So that's the hard part IMO.

I'm not worried about SQL injection, because this is a nagios check, and if you can screw around with our nagios, it's over anyway.
Comment 3 • 12 years ago
'2' is critical.

OK, ignore my chattiness concern. Hadn't realized the one-line-only limit was lifted.

As to option processing for DB stuff, stealing from http://perldoc.perl.org/Getopt/Long.html#Options-with-multiple-values :

gcox@fibbsbozza:~$ ./test.pl
gcox@fibbsbozza:~$ ./test.pl --ignore-db foo --ignore-db bar --ignore-db baz
 AND db not in ('foo','bar','baz')
gcox@fibbsbozza:~$ ./test.pl --ignore-db foo,bar --ignore-db baz
 AND db not in ('foo','bar','baz')
gcox@fibbsbozza:~$ ./test.pl --ignore-db foo,bar,baz
 AND db not in ('foo','bar','baz')
gcox@fibbsbozza:~$ cat test.pl
#!/usr/bin/perl -w
use strict;
use Getopt::Long;
my @ignore_dbs = ();
GetOptions ("ignore-dbs=s" => \@ignore_dbs);
@ignore_dbs = split(/,/, join(',',@ignore_dbs));
my $and_clause = @ignore_dbs ? ' AND db not in ('.join(',',map {"'$_'"} @ignore_dbs).')' : '';
print $and_clause."\n" if ($and_clause);
Reporter
Comment 4 • 12 years ago
Well, that's good. It's a bit more work, but it's good work for tomorrow's "no change Friday".
Reporter
Comment 5 • 12 years ago
OK, I took out the ignore-sys and put in an ignore-db option. Anything glaringly wrong here? I'm attaching the new file, and here's the diff from the previous version.

[root@tp-bugs01-master01 bin]# diff working.pl check_table_checksums.pl
26,27d25
< my $ignore_sys = 0;
<
30c28
< my $ignore_dbs="'mysql','INFORMATION_SCHEMA','PERFORMANCE_SCHEMA'";
---
> my $ignore_dbs='';
42c40
< 'ignore-sys|b' => \$ignore_sys,
---
> 'ignore-db|b=s' => \$ignore_dbs,
59c57
< --ignore-sys,-b Ignore system databases/tables like mysql, INFORMATION_SCHEMA, etc.
---
> --ignore-db,-b Ignore databases
76,77c74,78
< $and_clause = $ignore_sys ? " AND db not in ($ignore_dbs) " : "" ;
< #print "$ignore_sys is ignore-sys\n $and_clause = and clause";
---
> if ($ignore_dbs) {
> $and_clause .= "AND db NOT IN ('";
> $and_clause .=join("','",split(/,/,$ignore_dbs));
> $and_clause .= "')";
> }
Reporter
Comment 6 • 12 years ago
Comment 7 • 12 years ago
All I have is nitpicking (Maybe better help on the syntax of the -b option. Maybe a FIXME comment for the future that the input isn't injection-sanitized). I say "Ship it."
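For whoever picks up that FIXME later, here is a minimal sketch of what checking the --ignore-db input could look like. It is not part of the attached script: build_and_clause is a hypothetical helper, and the rule that a database name contains only letters, digits and underscores is an assumption that would need loosening if real names use other characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not taken from check_table_checksums.pl: turn a
# comma-separated --ignore-db value into an "AND db NOT IN (...)" fragment,
# refusing any name that is not a plain identifier.
sub build_and_clause {
    my ($ignore_dbs) = @_;
    return '' unless defined $ignore_dbs && length $ignore_dbs;

    # Drop empty entries such as the one produced by "a,,b".
    my @names = grep { length } split /,/, $ignore_dbs;
    return '' unless @names;

    for my $name (@names) {
        # Assumption: names are limited to letters, digits and underscore.
        die "Invalid database name: $name\n"
            unless $name =~ /\A[A-Za-z0-9_]+\z/;
    }
    return " AND db NOT IN ('" . join("','", @names) . "')";
}

print build_and_clause('mysql,INFORMATION_SCHEMA,PERFORMANCE_SCHEMA'), "\n";
```

In a real Nagios check the die would probably be replaced by printing a message and exiting with the UNKNOWN status, so that a bad option shows up in Nagios instead of as a Perl error.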
Reporter
Comment 8 • 12 years ago
Help now says:
--ignore-db,-b Ignore these databases (comma separated list)

And I changed
use constant UNKNOWN => 2;
to
use constant UNKNOWN => 3;
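For reference, the standard Nagios plugin exit codes are 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN), which is why UNKNOWN had to move from 2 to 3. A small sketch of how they might be declared together follows; status_line is a hypothetical helper for illustration and does not appear in the attached script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard Nagios plugin exit codes.
use constant {
    OK       => 0,
    WARNING  => 1,
    CRITICAL => 2,
    UNKNOWN  => 3,
};

# Status names indexed by exit code.
my @STATUS_NAME = ('OK', 'WARNING', 'CRITICAL', 'UNKNOWN');

# Hypothetical helper: format the status line for a given exit code.
# Anything outside 0..3 is reported as UNKNOWN.
sub status_line {
    my ($code, $message) = @_;
    $code = UNKNOWN unless defined $code && $code =~ /\A[0-3]\z/;
    return "$STATUS_NAME[$code]: $message";
}

print status_line(UNKNOWN, 'could not read the checksum table'), "\n";
# A real check would then hand the same code back to Nagios: exit UNKNOWN;
```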
Attachment #658880 -
Attachment is obsolete: true
Attachment #659263 -
Attachment is obsolete: true
Reporter
Comment 9 • 12 years ago
Please put the attached check into Nagios. It should be run with the following options:

check_table_checksums.pl --user nagiosdaemon --password **ELIDED** -T percona.checksums -I 24 -H $HOSTNAME$ -b mysql,INFORMATION_SCHEMA,PERFORMANCE_SCHEMA

At first this should be run against the production phoenix bugzilla servers:

tp-bugs01-master01.phx.mozilla.com
tp-bugs01-slave01.phx.mozilla.com
tp-bugs01-slave02.phx.mozilla.com
tp-bugs01-slave03.phx.mozilla.com

I would make a group (service group?) called mysql-checksum to put this in, because we're going to be adding more machines in the future.

This check should e-mail infra-dbnotices, but NOT page. It can be run every few hours; there's no need to run it every 5 minutes.

(please don't add this check on a Friday)
Summary: Verify perl script to check checksums on slaves in Nagios → New script for Nagios checking database checksums
Reporter
Comment 10 • 12 years ago
Upping the importance; this is the last step in a Q3 goal for the DB team.
Severity: normal → major
Assignee
Updated • 12 years ago
Assignee: server-ops → ashish
Assignee
Comment 11 • 12 years ago
Added these to Nagios: https://nagios.mozilla.org/phx1/cgi-bin/status.cgi?hostgroup=mysql-checksum&style=detail
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Reporter
Comment 12 • 12 years ago
Please make sure that:

-b mysql,INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,percona

is in the check (this should be configurable, but probably won't change a ton)

and

This check should e-mail infra-dbnotices, but NOT page. It can be run every few hours; there's no need to run it every 5 minutes.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter
Comment 13 • 12 years ago
all better, thanx for fixing ashish!
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Updated • 9 years ago
Product: mozilla.org → mozilla.org Graveyard