Closed Bug 698314 Opened 8 years ago Closed 8 years ago
ganglia claims that l10n VMs are down, though they're not
https://ganglia.mozilla.org/sjc1/?c=Localization&m=load_one&r=hour&s=descending&hc=4&mc=2 reports that bm-l10n-dashboard01 and bm-l10n-db1 are down, though they're up.

There has been a db connection error from bm-l10n-dashboard01 to bm-l10n-db1, though. The relevant snippet from my logs would be

2011-10-30 18:30:15+0000 [-] Unhandled error in Deferred:
2011-10-30 18:30:15+0000 [-] Unhandled Error
    Traceback (most recent call last):
      File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1170, in run
        self.mainLoop()
      File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 1179, in mainLoop
        self.runUntilCurrent()
      File "/usr/lib/python2.6/dist-packages/twisted/internet/base.py", line 778, in runUntilCurrent
        call.func(*call.args, **call.kw)
      File "/usr/lib/python2.6/dist-packages/twisted/internet/task.py", line 194, in __call__
        d = defer.maybeDeferred(self.f, *self.a, **self.kw)
    --- <exception caught here> ---
      File "/usr/lib/python2.6/dist-packages/twisted/internet/defer.py", line 117, in maybeDeferred
        result = f(*args, **kw)
      File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/transaction.py", line 265, in _commit_manually
        return func(*args, **kw)
      File "/home/dashboard/site/locale-inspector/l10ninsp/changes.py", line 40, in poll
        transaction.commit()
      File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/transaction.py", line 167, in commit
        connection._commit()
      File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/backends/__init__.py", line 38, in _commit
        return self.connection.commit()
    _mysql_exceptions.OperationalError: (2013, 'Lost connection to MySQL server during query')

... which maps the "5 hours ago" I see in ganglia, time-wise.

Filing this for tracking and investigation. If there's a need to stop either of the VMs, let's coordinate on that with a real downtime.
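For anyone wanting to repeat the time-wise comparison, here is a minimal sketch of the arithmetic. It assumes the timestamp format shown in the log snippet; the function name `hours_between` and the observation time are hypothetical, chosen only so the example lands on the "5 hours ago" figure, since the actual time ganglia was checked is not recorded here.

```python
from datetime import datetime, timezone

# Timestamp layout as it appears in the Twisted log lines above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S%z"


def hours_between(log_stamp, observed_at):
    """Return the hours elapsed from a log timestamp to an observation time."""
    logged = datetime.strptime(log_stamp, LOG_TIME_FORMAT)
    return (observed_at - logged).total_seconds() / 3600.0


# Hypothetical observation time (an assumption, not taken from the bug).
observed = datetime(2011, 10, 30, 23, 30, 15, tzinfo=timezone.utc)
print(hours_between("2011-10-30 18:30:15+0000", observed))
```

If the result is close to the "last seen" age ganglia shows for a host, the log entry and the ganglia gap plausibly describe the same event, though that alone does not establish a cause.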
Any update on this? It'd be nice to get ganglia to report on these again.
This started reporting again Friday afternoon, without any action from the IT team. I've searched through the logs on both machines, but I can't find any mention of ganglia complaining of a network failure. If the error arises again, please reopen and I'll look at it straight away.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard