Modify the MySQL backup Nagios check to do some sanity checking around file size





(Reporter: scabral, Unassigned)





6 years ago
Interesting... our check script says:

[root@backup2.db.phx1 ~]# /data/backups/bin/do-check-backups
Backups OK: nothing too old.

[root@backup2.db.phx1 sqldumps]# ls -l tbpl_mozilla_org/
total 32
-rwx------ 1 root root 1470 Nov  5 08:32 tbpl_mozilla_org.2012.11.05.slaveinfo
-rwx------ 1 root root  487 Nov  5 08:32 tbpl_mozilla_org.2012.11.05.sql.gz
-rwx------ 1 root root 1470 Nov  6 08:32 tbpl_mozilla_org.2012.11.06.slaveinfo
-rwx------ 1 root root  488 Nov  6 08:32 tbpl_mozilla_org.2012.11.06.sql.gz
-rwx------ 1 root root 1940 Nov  7 08:32 tbpl_mozilla_org.2012.11.07.slaveinfo
-rwx------ 1 root root  487 Nov  7 08:32 tbpl_mozilla_org.2012.11.07.sql.gz
-rwx------ 1 root root 1470 Nov  8 08:33 tbpl_mozilla_org.2012.11.08.slaveinfo
-rwx------ 1 root root  487 Nov  8 08:33 tbpl_mozilla_org.2012.11.08.sql.gz

Ah, we HAVE backups, but they're essentially empty... I get it now.

I ran a manual backup with:

[root@backup2.db.phx1 tbpl_mozilla_org]# /data/backups/bin/db-sqldump generic

and I'm getting "Out of resources" errors:

mysqldump: Error: 'Out of resources when opening file '/tmp/#sql_2edd_2.MYI' (Errcode: 24)' when trying to dump tablespaces
mysqldump: Couldn't execute 'SHOW FUNCTION STATUS WHERE Db = 'xtags_wordpress'': Out of resources when opening file '/tmp/#sql_2edd_0.MYI' (Errcode: 24) (23)
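For reference, Errcode 24 is the OS errno EMFILE, "Too many open files". A minimal way to inspect the limits involved (the tuning value in the comment is illustrative, not what we actually run):

```shell
# Errcode 24 == EMFILE: "Too many open files" (confirm with `perror 24`).
# Show the per-process file-descriptor limit the shell (and any child
# process such as mysqld started from it) would inherit:
ulimit -n
# mysqld also caps itself via its open_files_limit variable; a common fix
# is to raise it in /etc/my.cnf and restart mysqld (8192 is illustrative):
#   [mysqld]
#   open_files_limit = 8192
```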


I'll file a bug and work on the issue.

Comment 1

6 years ago
And then Brian said:

If you would like some help modifying the nrpe script to catch that, I'm totally down to help!

So he's cc'd.

Comment 2

6 years ago
Also cc'ing rtucker, because he's handy with scripts.
Is there a size threshold? If filesize is less than X?

Comment 4

6 years ago
It's different for every backup, unfortunately.

Empty files and empty databases have gzip file sizes like 474 or 486 bytes; databases that have an empty table have gzip file sizes around 893 bytes.

So I guess flag anything under 500 bytes?

But really we should compare to the previous day or days if possible.
I'm not sure of the value, or even the accuracy, of comparing to previous days.

It might be expected that the dump is smaller today than yesterday.

I think basing it on size is probably the way to go. We pick a number, say 1000 bytes, and run with it. We can easily adjust the threshold later based on the file sizes that get logged, and as long as we err on the conservative side, we should only get false positives (never missed failures).
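A minimal sketch of such a check as a shell function (the function name, default threshold, and directory layout are assumptions, not the real script):

```shell
# check_dump_sizes DIR [MIN] — print a CRITICAL line for each *.sql.gz in
# DIR smaller than MIN bytes (default 1000); return 2 if any were found,
# 0 otherwise, matching Nagios exit-code conventions.
check_dump_sizes() {
    local dir=$1 min=${2:-1000} status=0 f size
    for f in "$dir"/*.sql.gz; do
        [ -e "$f" ] || continue
        # GNU stat first, BSD stat as a fallback
        size=$(stat -c %s "$f" 2>/dev/null || stat -f %z "$f")
        if [ "$size" -lt "$min" ]; then
            echo "CRITICAL: $f is only ${size} bytes (< ${min})"
            status=2
        fi
    done
    [ "$status" -eq 0 ] && echo "Backups OK: no undersized dumps."
    return "$status"
}
```

The threshold is an argument rather than a constant, so it stays easy to adjust as we learn which sizes are normal.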

Comment 6

6 years ago
Sure, let's do that. Make the size threshold easily configurable, so we can change it when needed.

When we put the check into place, we'll make it e-mail only (no paging) while we tweak it.
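Wiring the check into NRPE might look like the fragment below (the command name, script path, and threshold are hypothetical; the Nagios service definition would then call this command via check_nrpe, with notifications restricted to e-mail while we tune it):

```shell
# nrpe.cfg fragment (names, path, and threshold are illustrative only):
command[check_backup_size]=/data/backups/bin/check-backup-size /data/backups/sqldumps 1000
```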

Comment 7

6 years ago
Let's ponder what this means with the new backups, too (backing up to Data Domain).
Product: → Data & BI Services Team


Last Resolved: 4 years ago
Resolution: --- → WONTFIX