So, this might be filed in the wrong place, but tonight I saw a handful of _mysql_exceptions.OperationalError: (1040, 'Too many connections') errors from a few masters. I suspect this is something we can alleviate on the DB side.. it *could* also be that many queries are getting blocked behind a long-running query or two, given the nature of these DBs, but I wanted to file it before I forgot.
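For reference, the usual client-side mitigation for a transient 1040 is to retry the connect with backoff instead of dying. A minimal sketch of that idea (the `connect_with_retry` helper and the simulated flaky server are hypothetical, not anything in the buildbot code):

```python
import time

# MySQL error code behind the OperationalError above.
TOO_MANY_CONNECTIONS = 1040

class OperationalError(Exception):
    """Stand-in for _mysql_exceptions.OperationalError; args are (errno, message)."""

def connect_with_retry(connect, retries=4, delay=0.01):
    """Call connect(), retrying with exponential backoff on error 1040 only.

    Any other error, or exhausting the retry budget, re-raises immediately.
    """
    for attempt in range(retries):
        try:
            return connect()
        except OperationalError as e:
            if e.args[0] != TOO_MANY_CONNECTIONS or attempt == retries - 1:
                raise
            time.sleep(delay * (2 ** attempt))

# Simulated server that rejects the first two connection attempts.
attempts = []
def flaky_connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise OperationalError(1040, 'Too many connections')
    return 'connection'
```

This only papers over brief spikes, of course; if the pool is saturated by long-running queries, retries just queue up more waiters.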
Hrm, we didn't get paged about that, so if we hit too many connections it was transient. We do need to build a new buildbot slave, though, so let's turn this bug into that.
I have built the new buildbot slave. Hopefully this helps. However, I did check the server variables, and it looks like the master server hit max connections at some point. Obviously adding a slave won't help with that... But I also noticed that the master server is serving connections from buildbot_reader, too... shouldn't those be sent to the slave only?
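For the record, the check described above can be done on the master with standard MySQL statements (assuming SUPER or PROCESS access; exact values here are illustrative):

```sql
-- Configured ceiling on simultaneous client connections:
SHOW VARIABLES LIKE 'max_connections';
-- High-water mark of simultaneous connections since server start:
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
```

If `Max_used_connections` has reached `max_connections`, the server refused connections with error 1040 at some point since it last restarted.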
hit it on 12 separate masters tonight (was ~3 or 4 yesterday) as well. re: "But I also noticed that the master server is serving connections from buildbot_reader, too... shouldn't those be sent to the slave only?" I forget the bug number, but *you* made the master part of the read-only pool as well, back when we were load constrained. So now that we have a new slave, we should drop the master from the reader pool.
(In reply to Justin Wood (:Callek) from comment #3) > hit it on 12 separate masters tonight (was ~3 or 4 yesterday) as well. Did I say 12? I meant something like ~24 (new mail flew in while I typed that; apparently Google is delaying mail from our relays by up to 10 min again).
OK, the buildbot master is no longer in the read pool, and the 2 buildbot slaves are. There are no connections from buildbot_reader. :D I think there's nothing left in this bug? Can we resolve?
Will file a new bug if this hits again.