Closed Bug 1542820 Opened 3 years ago Closed 3 years ago

Mission Control dev instance has problems since Friday's deploy


(Cloud Services :: Mission Control, defect)

Not set


(Not tracked)



(Reporter: wlach, Unassigned)



Since Friday's updates, I've been seeing a large number of problems with the Mission Control dev server. It is quite strange: I'm seeing a lot of errors about atomic transactions not completing in the update_builds tasks, and when I SSH'ed into the dev environment I saw that this query was hung:

21088 | 2 days 21:04:51.222387 | read_write | SELECT "django_migrations"."app", "django_migrations"."name" FROM "django_migrations"

I'm not sure exactly what this is about; I think it's something Django just runs on startup. The suspicious part is that it seems to have been hung for almost 3 days.

Looking at New Relic, it appears the errors come from the fact that more recent versions of Django expect you to wrap create operations that may fail inside transaction.atomic.
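For reference, the general shape of this pattern is sketched below. In Django, nesting transaction.atomic inside an outer atomic block creates a savepoint, so a failed create can be rolled back without breaking the outer transaction. This is not the code from the PR: to stay self-contained it uses Python's standard-library sqlite3 with explicit SAVEPOINT statements, and the `build` table, the `savepoint` helper, and `insert_builds` are hypothetical names chosen for illustration. Note that SQLite, unlike PostgreSQL, does not abort the whole transaction when one statement fails, so this only shows the savepoint structure and does not reproduce the original error.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def savepoint(conn, name):
    """Run a block inside a savepoint, undoing only that block on error."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except Exception:
        # Undo the work done since the savepoint, then discard the savepoint.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        conn.execute(f"RELEASE SAVEPOINT {name}")


def insert_builds(conn, build_ids):
    """Insert build ids in one outer transaction, skipping duplicates."""
    skipped = []
    conn.execute("BEGIN")
    for build_id in build_ids:
        try:
            # Each create that may fail gets its own savepoint.
            with savepoint(conn, "create_build"):
                conn.execute(
                    "INSERT INTO build (build_id) VALUES (?)", (build_id,)
                )
        except sqlite3.IntegrityError:
            skipped.append(build_id)
    conn.execute("COMMIT")
    return skipped


def demo():
    # isolation_level=None disables implicit transactions, so the BEGIN,
    # SAVEPOINT and COMMIT statements above are the only ones issued.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE build (build_id TEXT PRIMARY KEY)")
    skipped = insert_builds(conn, ["20190405", "20190406", "20190405"])
    stored = [
        row[0]
        for row in conn.execute("SELECT build_id FROM build ORDER BY build_id")
    ]
    return stored, skipped
```

In Django itself, the rough equivalent is to put `with transaction.atomic():` around the create call, inside the `try` that catches `IntegrityError`.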

I'm not sure if this explains the problems we're seeing, but I have a PR to fix this issue. We'll see how it goes after that lands.

Flags: needinfo?(whd)

Oops, didn't mean to set needinfo on whd.

Flags: needinfo?(whd)

Hi Will, Marcia pointed me to this bug (thanks!) and I am wondering if this is a change that is external and could not be tested on staging. Or did we test this on staging before deploying it to MC stable? vs

I assume the latter is staging and the former is stable. Thoughts?

Flags: needinfo?(wlachance)

This change has not been deployed to stable; as stated, these problems are on the development instance.

Unfortunately there are currently some problems we're seeing on the "stable" version, so this bug has an unusually high priority. Fortunately it looks like my PR above worked, as the dev instance is displaying data again. I'll do a backfill, and if everything goes smoothly for the next 24 hours, I'll deploy this new, fixed version to production.

Flags: needinfo?(wlachance)
Blocks: 1541033

Yup, that pull request fixed things nicely, so the dev instance should now be up to date again. I'll try to get :whd to do a deploy to the production site tomorrow.

Closed: 3 years ago
Resolution: --- → FIXED