Closed Bug 844912 Opened 12 years ago Closed 12 years ago

SUMO database inaccessible for a few minutes this morning.

Categories

(Data & BI Services Team :: DB: MySQL, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: mythmon, Unassigned)

Details

I'm not sure if this is the right place to file this.

I got 11 emails this morning between 05:42 PST and 05:46 PST that indicated one of the SUMO servers wasn't able to communicate with its database. The site does not appear to be negatively affected, and all these errors come from a cron job that runs once a minute. It looks to me like every cron job failed for a few minutes.

Sheeri checked mysql logs and nagios, and didn't notice anything, so it sounds like it might have been a network glitch.

Here is one of the emails I got. It is pretty uninteresting except for two things: the last line, which says it failed, and the fact that it appears to have been doing a write operation, not a read.

Traceback (most recent call last):
  File "manage.py", line 49, in <module>
    execute_manager(settings)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/core/management/__init__.py", line 459, in execute_manager
    utility.execute()
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django-cronjobs/cronjobs/management/commands/cron.py", line 38, in handle
    registered[script](*args)
  File "/data/support-stage/www/support.allizom.org/kitsune/apps/customercare/cron.py", line 80, in collect_tweets
    tweet.save()
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/db/models/base.py", line 463, in save
    self.save_base(using=using, force_insert=force_insert, force_update=force_update)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/db/models/base.py", line 551, in save_base
    result = manager._insert([self], fields=fields, return_id=update_pk, using=using, raw=raw)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/db/models/manager.py", line 203, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/db/models/query.py", line 1593, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/db/models/sql/compiler.py", line 912, in execute_sql
    cursor.execute(sql, params)
  File "/data/support-stage/www/support.allizom.org/kitsune/vendor/src/django/django/db/backends/mysql/base.py", line 114, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 173, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.6/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
django.db.utils.DatabaseError: (2013, 'Lost connection to MySQL server during query')
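A minimal sketch, assuming one wanted a once-a-minute cron job to ride out a short blip like this instead of emailing on every failure. It is not the kitsune code: `save_with_retry` and `is_lost_connection` are hypothetical helpers, and the `DatabaseError` class below is only a stand-in for `django.db.utils.DatabaseError` so the example runs on its own. The one detail taken from the traceback is that the exception's first argument is the MySQL client error code 2013.

```python
import time


class DatabaseError(Exception):
    """Stand-in for django.db.utils.DatabaseError (hypothetical, for illustration)."""


# Error code shown in the traceback: 'Lost connection to MySQL server during query'.
LOST_CONNECTION = 2013


def is_lost_connection(exc):
    """Return True if the exception's first argument is error code 2013."""
    return bool(exc.args) and exc.args[0] == LOST_CONNECTION


def save_with_retry(save, attempts=3, delay=1.0, sleep=time.sleep):
    """Call save(); retry only when the connection was lost mid-query.

    Any other database error, or a lost connection on the final attempt,
    is re-raised unchanged so the usual cron email still goes out.
    """
    for attempt in range(1, attempts + 1):
        try:
            return save()
        except DatabaseError as exc:
            if not is_lost_connection(exc) or attempt == attempts:
                raise
            sleep(delay)
```

With something like this, the failing call would become `save_with_retry(tweet.save)`. Two caveats apply: a write that lost its connection mid-query may or may not have been committed, so a blind retry could insert a duplicate row, and in a real Django job the stale connection would probably need to be closed before the retry.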
:uberj reported getting a 'Lost connection to MySQL server during query' error at 5:50 AM when trying to connect to dev-zeus-rw.db.phx1.mozilla.com.
cc'ing Jake
Just to clarify, this was the support.allizom.org server which is -stage and not -prod.
-dev is also represented in the emails, but I didn't receive any for -prod. This isn't happening any more, so it is certainly not a fire, just a curiosity.
Checked the stage database server (same as in comment 1) and there are no MySQL errors there, and Nagios has no problems, not even "soft" states, reported today. MySQL error logs for today have nothing unusual.
That path (/data/support-stage/www/...) is the path on the admin node (supportadm.private.phx1), rather than the web nodes. Therefore this probably won't be in Sentry. Indeed I see nothing there. Just throwing this out as another data point... it seems likely to have been cron- or deploy-related, because that should be all that happens on the admin node.
This was over a month ago, and we haven't seen this recur, so I'm going to close this. If this is in fact still an issue, please re-open.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
Product: mozilla.org → Data & BI Services Team