Closed Bug 786220 Opened 12 years ago Closed 12 years ago

stop the dbnotices about missing slow logs

Categories

(Data & BI Services Team :: DB: MySQL, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

I strongly suspect that these are due to MySQL not having logged any slow queries since the last run of the script - in which case we can simply not try to copy it.

If that's the case, this is just adding a conditional to the script.

Sheeri, can you confirm my suspicion?
I think if we want to copy the slow logs, they should be on, so I'd look as to why the slow logs aren't on first......(e.g. for addons we want the slow query logs on)
They are on, but does MySQL create an empty file immediately, or wait until a slow query comes along?  I guess I can answer that myself :)
Yes, it does create the file immediately.  So these hosts must have slow logs off, or some other issue.
Here's a summary of last night's cronspam:

ENOENT EPERM  LOST CONN  NO MYSQL TIME (PST)  HOST(S)
              Y                   4:00        addons1.db.phx1
              Y                   4:00        addons2.db.phx1
Y                                 4:00        addons3.db.phx1
              Y                   4:00        addons4.db.phx1
              Y                   4:00        addons5.db.phx1
              Y                   4:00        addons6.db.phx1
              Y                   4:00        addons2.stage.db.phx1
              Y                   4:00        addons3.stage.db.phx1
              Y                   4:00        addons4.stage.db.phx1
              Y                   4:00        addons5.stage.db.phx1
Y                                 4:00        b1-db2.db.scl3
Y                                 4:00        b1-db1.db.scl3
Y                        Y        4:00        backup2.db.phx1
       Y                          4:00        backup1.db.scl3
Y                                 21:00       bugzilla1.db.scl3
Y                                 21:00       bugzilla2.stage.db.scl3
Y                                 4:00        bugzilla2.stage.db.scl3
Y                                 21:00       bugzilla3.db.scl3
Y                                 21:00       bugzilla4.db.scl3
Y      Y                          4:00        builder-addons2.db.scl3
Y                                 4:00        db1.stage.bugs.scl3
Y                                 4:00        db2.stage.bugs.scl3
Y                                 4:00        desktop-disk-encrypt.dmz.scl3
Y                                 4:00        generic1.db.phx1
Y                                 4:00        generic2.db.phx1
Y                                 4:00        mysql1.metrics.scl3
Y                                 4:00        node36.seamicro.phx1
Y                                 4:00        node37.seamicro.phx1
Y                                 4:00        node38.seamicro.phx1
Y                                 4:00        node39.seamicro.phx1

ENOENT:
  mv: cannot stat `/var/log/mysql/mysql-slow.log': No such file or directory

EPERM:
  ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

LOST CONN:
  lost connection

NO MYSQL:
  ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
Assignee: server-ops-database → dustin
Y                                 4:00        addons3.db.phx1
 was probably my fault - most like slow logs weren't enabled when I restarted it in bug 717638.  Fixed now.

Confirmed my.cnf settings and ran 'set global slow_query_log=ON' on the following:

b1-db2.db.scl3
b1-db1.db.scl3
db1.stage.bugs.scl3 (+)
db2.stage.bugs.scl3
bugzilla2.stage.db.scl3 (+)
bugzilla1.db.scl3 (+)
bugzilla2.db.scl3 (+)
bugzilla3.db.scl3 (+)
bugzilla4.db.scl3 (+)
desktop-disk-encrypt.dmz.scl3 (+) (*)
generic1.db.phx1
generic2.db.phx1
mysql1.metrics.scl3
node36.seamicro.phx1
node37.seamicro.phx1
node38.seamicro.phx1
node39.seamicro.phx1

(+) created and chown'd /var/log/mysql
(*) ran 'flush logs' to create logfile

NOTE: the 21:00's are 4am UTC, and only on the bugzilla prod db hosts.

I strongly suspect something odd is going on here -- all of these had slow_query_log=ON in their my.cnf, but the runtime variable was OFF.  I'd like to pull one from prod, restart mysql, and see if the variable "sticks".

Still todo:
 * examine lost connection errors on addons hosts (is that from SSH?)
 * look at permissions error on builder-addons2.db.scl3
 * talk to sheeri about backup*, which have other work going on too
The lost connections are due, I think, to webdev1.db.scl3 using the known-to-be-crappy LDAP extension for SSH.  So, open questions for sheeri:

 * Why would slow_query_log=OFF on all of the hosts above, where my.cnf says ON?  None of the staging/dev hosts I changed above seem to be in n+1 clusters where I can safely remove one to restart and see what happens to the variable.

 * Who should I talk to about changing the SSH config on webdev1? or sending those files elsewhere?

 * How should I address the permissions problem on builder-addons2?  It doesn't have a ~/.my.cnf.  Should I set a random root password and add the .my.cnf?

 * What's the situation with backup*?
On addons3.stage.db.phx1:

mysql> show variables like 'slow%';
+---------------------+-------------------------------+
| Variable_name       | Value                         |
+---------------------+-------------------------------+
| slow_launch_time    | 2                             |
| slow_query_log      | OFF                           |
| slow_query_log_file | /var/log/mysql/mysql-slow.log |
+---------------------+-------------------------------+
3 rows in set (0.00 sec)

..run puppet, with diffs to my.cnf:
 # Here you can see queries with especially long duration
+log_output = FILE
+slow_query_log = ON
 slow_query_log_file        = /var/log/mysql/mysql-slow.log
 long_query_time=2
-slow_query_log = ON
-log_output = FILE

..stop (move files around) and start mysql..

mysql> show variables like 'slow%';
+---------------------+-------------------------------+
| Variable_name       | Value                         |
+---------------------+-------------------------------+
| slow_launch_time    | 2                             |
| slow_query_log      | OFF                           |
| slow_query_log_file | /var/log/mysql/mysql-slow.log |
+---------------------+-------------------------------+
3 rows in set (0.00 sec)

so the value in my.cnf isn't being respected, which is why all of the query logs were off in the previous comment.  What's going on here?
Does /var/log/mysql exist? Is it chown mysql:mysql? 

Anything in the error log? If this is MySQL 5.1 or higher (and IIRC it is), tail -f the error log and login to mysql and do "set global slow_query_log=ON;" and see if there's any warnings, etc.
Also do a "FLUSH LOGS;" so the slow query log will be created. If you really want to test it, check out the value of long_query_time (probably 2) and do a sleep like this:

mysql> do sleep(3);

And that query should be put in the slow query log when it's done.
(In reply to Sheeri Cabral [:sheeri] from comment #8)
> Does /var/log/mysql exist? Is it chown mysql:mysql? 

[root@addons3.stage.db.phx1 mysql]# ls -ald /var/log/mysql
drwxr-xr-x 2 mysql mysql 4096 Aug 22 12:18 /var/log/mysql

> Anything in the error log?

Nothing in the error log on startup.

(In reply to Sheeri Cabral [:sheeri] from comment #9)
> Also do a "FLUSH LOGS;" so the slow query log will be created. If you really
> want to test it, check out the value of long_query_time (probably 2) and do
> a sleep like this:

FLUSH LOGS didn't make the log appear (which isn't surprising since it's disabled).

> If this is MySQL 5.1 or higher (and IIRC it is),
> tail -f the error log and login to mysql and do "set global
> slow_query_log=ON;" and see if there's any warnings, etc.

This works fine, creates the logfile, and even complains to the console if the directory doesn't exist.  So the issue is that it doesn't get enabled properly at MySQL startup.

mysql> set global slow_query_log=ON;
Query OK, 0 rows affected (0.01 sec)

mysql> Bye
[root@addons3.stage.db.phx1 mysql]# ls -al /var/log/mysql
total 12
drwxr-xr-x   2 mysql mysql 4096 Aug 30 09:01 .
drwxr-xr-x. 11 root  root  4096 Aug 26 03:22 ..
-rw-rw----   1 mysql mysql  177 Aug 30 09:01 mysql-slow.log

Is there some error in my.cnf that's causing this setting to be ignored?
Make sure we're setting ALL the associated variables, including log_output, slow_query_log_file and slow_query_log.

CAn you copy/paste from the my.cnf files what those are? maybe I can spot a typo or something.
They're all in comment 7.  I'm hoping it's whitespace around = or something equally stupid.
ah, yeah, they all look right to me. I guess check bugs.mysql.com? Make a bug if one doesn't exist.....
I didn't find anything, so I filed http://bugs.mysql.com/bug.php?id=66713.

I'll look into adding something for this to puppet.
OK, puppet now enforces this variable's value (and that of slow_query_log_file and log_output) at runtime.

I'll re-evaluate the cronspam tomorrow morning.
The MySQL bug indicates the value has to be 0 or 1 in the config file.  Le Sigh.
I committed a fix to my.cnf, but left the mysql2::variable call in.
OK, I'm going to call this fixed - the remainder is in fixing the sync script, which is another bug entirely.  Note that not all of these will go away until everything uses mysql2::server.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to Dustin J. Mitchell [:dustin] from comment #16)
> The MySQL bug indicates the value has to be 0 or 1 in the config file.  Le
> Sigh.

Instead of ON, you mean? or a PATH? and which particular variable is that?
slow_query_log=ON
 is ignored
slow_query_log=1
 turns the slow query log on.
Good to know! Obviously, I didn't know that.
Product: mozilla.org → Data & BI Services Team
You need to log in before you can comment on or make changes to this bug.