

Infrastructure & Operations
5 years ago
5 years ago


(Reporter: dustin, Assigned: dustin)




(1 attachment)

We have two mozpools right now, and it's no fun.

One is running on the staging db, and one is running on the prod db.  So I'll need to merge those, and land the puppet configs to make the servers all use the prod DB.
I've verified that device IDs don't overlap.  pxe_configs is identical in the two environments.  I fixed things so that the imaging_servers row for mobile-imaging-001.p1 has the same id in both tables.
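The overlap check mentioned above can be sketched roughly as follows. This is only an illustration, assuming the device IDs from each database have already been fetched into plain lists; the function name `overlapping_ids` and the sample IDs are hypothetical and do not come from mozpool.

```python
def overlapping_ids(staging_ids, prod_ids):
    """Return the sorted list of IDs that appear in both environments."""
    return sorted(set(staging_ids) & set(prod_ids))


# Hypothetical sample data: ID 103 exists in both lists, so a wholesale
# copy of staging rows into production would collide on it.
staging = [101, 102, 103]
prod = [201, 202, 103]
collisions = overlapping_ids(staging, prod)
print(collisions)
```

An empty result means the staging rows can be inserted into production without renumbering their primary keys.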

Merging logs will be hard.  I think I should just delete the production logs and copy the staging logs over, resetting the auto_increment ID while I do so.

All of the requests in production are closed, so I'll just delete those and copy staging over wholesale.
Assignee: server-ops-releng → dustin
Created attachment 689304
Attachment #689304 - Flags: review?(bugspam.Callek)
Comment on attachment 689304

Review of attachment 689304:

r+ on syntactically correct

Per :dustin, armen and jhopkins signed off on doing this in the first place, including the temporary lack of an official admin host.
Attachment #689304 - Flags: review?(bugspam.Callek) → review+
This is done, but one of the cron tasks ran and deleted the staging devices.  I've restored them, but I'm still working on getting MySQL back into shape afterward.
So, what had happened was the following:

I imported a bunch of data from staging to production using a hand-edited mysqldump.  This went fine.

A puppet bug (turns out 0 is true in puppet - good to know!) left the inventory-sync crontab running on *all* mobile-imaging servers, not just the one where I explicitly disabled it.  So that ran, with the old configuration excluding rack 1 from production.  As a result, it deleted the rack 1 devices from the production DB.
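The truthiness surprise can be modelled in a few lines. This is a rough Python sketch, not Puppet code: the helper `puppet_truthy` is hypothetical, it treats only `false` and `undef` (represented here as `False` and `None`) as false, and it deliberately ignores empty strings, whose handling differs between Puppet versions.

```python
def puppet_truthy(value):
    """Approximate Puppet's boolean conversion for non-string values.

    Only False (Puppet `false`) and None (Puppet `undef`) count as false.
    Identity checks are used because in Python `0 == False` is true.
    """
    return not (value is False or value is None)


# A flag set to 0, intended to mean "disabled", still enables the resource.
flag = 0
python_says = bool(flag)             # Python treats 0 as false
puppet_says = puppet_truthy(flag)    # the Puppet-like rule treats 0 as true
print(python_says, puppet_says)
```

Under this rule, disabling something requires passing `false` explicitly rather than a numeric 0.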

Seeing this, I tried to re-import those devices.  It did not go so well this time.  The import begins with
 .. some inserts ..

The last statement hung.  There are <600 rows in this table, so it shouldn't take but a short time.  I interrupted it eventually, when I saw that it apparently had a write lock on the table, and was thus preventing changes.  The ALTER TABLE was waiting on a metadata lock, oddly, so it wasn't even running slowly or something.

After some casting about for a solution, I renamed devices to something I won't repeat in a bug, created a new devices table, copied the data into it, and dropped the bogus one.  Problem solved (I think.. MySQL doesn't give you a way to see if keys are enabled on a table, but I assume they are on a new table..)
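The rename, recreate, copy, and drop sequence can be sketched as below. This is only an illustration: it uses Python's built-in SQLite module as a stand-in for MySQL so that it runs anywhere, and the two-column `devices` schema, the `_old` suffix, and the device names are all assumptions rather than the real mozpool schema.

```python
import sqlite3


def rebuild_table(conn, table, create_sql):
    """Replace `table` with a freshly created copy holding the same rows."""
    old = f"{table}_old"  # hypothetical temporary name for the stuck table
    cur = conn.cursor()
    cur.execute(f"ALTER TABLE {table} RENAME TO {old}")       # move the old table aside
    cur.execute(create_sql)                                    # create a clean table
    cur.execute(f"INSERT INTO {table} SELECT * FROM {old}")   # copy the data across
    cur.execute(f"DROP TABLE {old}")                           # drop the old table
    conn.commit()


# Assumed minimal schema, for illustration only.
CREATE_DEVICES = "CREATE TABLE devices (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"

conn = sqlite3.connect(":memory:")
conn.execute(CREATE_DEVICES)
conn.executemany(
    "INSERT INTO devices (id, name) VALUES (?, ?)",
    [(1, "panda-0001"), (2, "panda-0002")],
)
conn.commit()

rebuild_table(conn, "devices", CREATE_DEVICES)
rows = conn.execute("SELECT id, name FROM devices ORDER BY id").fetchall()
print(rows)
```

Because the new table is created from scratch, its indexes start out in their default state, which is the assumption made in the comment above.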
Last Resolved: 5 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations