Closed Bug 627255 Opened 15 years ago Closed 15 years ago

input.stage does not seem to auto-update

Categories

(mozilla.org Graveyard :: Server Operations, task)

All
Other
task
Not set
blocker

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: wenzel, Assigned: justdave)

References

Details

input.stage does not seem to run its "auto-update from git" cron job anymore. We get tracebacks indicating that database tables are missing which would exist if migrations were still being run automatically. This might -- and I am just guessing -- have been caused by work on bug 625965. Please ensure the update_staging.sh cron job is still running. You might want to run it yourself once and see if it outputs any errors. Thanks.
Blocks: 627247
Assignee: server-ops → justdave
[root@mrapp-stage02 cron.d]# cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh
Running migration 4:
BEGIN;
CREATE TABLE `theme` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `pivot_id` integer NOT NULL,
    `num_opinions` integer NOT NULL,
    `feeling` varchar(20) NOT NULL,
    `platform` varchar(255) NOT NULL,
    `created` datetime NOT NULL
);
ALTER TABLE `theme` ADD CONSTRAINT `pivot_id_refs_id_7c9e77b1` FOREIGN KEY (`pivot_id`) REFERENCES `feedback_opinion` (`id`);
CREATE TABLE `theme_item` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `theme_id` integer NOT NULL,
    `opinion_id` integer NOT NULL,
    `score` double precision NOT NULL,
    `created` datetime NOT NULL
);
ALTER TABLE `theme_item` ADD CONSTRAINT FOREIGN KEY (`theme_id`) REFERENCES `theme` (`id`);
ALTER TABLE `theme_item` ADD CONSTRAINT FOREIGN KEY (`opinion_id`) REFERENCES `feedback_opinion` (`id`);
CREATE INDEX `theme_c360d361` ON `theme` (`pivot_id`);
CREATE INDEX `theme_af507caf` ON `theme` (`num_opinions`);
CREATE INDEX `theme_97bf82c4` ON `theme` (`feeling`);
CREATE INDEX `theme_eab31616` ON `theme` (`platform`);
CREATE INDEX `theme_3216ff68` ON `theme` (`created`);
CREATE INDEX `theme_item_1079d5be` ON `theme_item` (`theme_id`);
CREATE INDEX `theme_item_ac81e047` ON `theme_item` (`opinion_id`);
COMMIT;

Error: Had trouble running this:
BEGIN; BEGIN;
CREATE TABLE `theme` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `pivot_id` integer NOT NULL,
    `num_opinions` integer NOT NULL,
    `feeling` varchar(20) NOT NULL,
    `platform` varchar(255) NOT NULL,
    `created` datetime NOT NULL
);
ALTER TABLE `theme` ADD CONSTRAINT `pivot_id_refs_id_7c9e77b1` FOREIGN KEY (`pivot_id`) REFERENCES `feedback_opinion` (`id`);
CREATE TABLE `theme_item` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `theme_id` integer NOT NULL,
    `opinion_id` integer NOT NULL,
    `score` double precision NOT NULL,
    `created` datetime NOT NULL
);
ALTER TABLE `theme_item` ADD CONSTRAINT FOREIGN KEY (`theme_id`) REFERENCES `theme` (`id`);
ALTER TABLE `theme_item` ADD CONSTRAINT FOREIGN KEY (`opinion_id`) REFERENCES `feedback_opinion` (`id`);
CREATE INDEX `theme_c360d361` ON `theme` (`pivot_id`);
CREATE INDEX `theme_af507caf` ON `theme` (`num_opinions`);
CREATE INDEX `theme_97bf82c4` ON `theme` (`feeling`);
CREATE INDEX `theme_eab31616` ON `theme` (`platform`);
CREATE INDEX `theme_3216ff68` ON `theme` (`created`);
CREATE INDEX `theme_item_1079d5be` ON `theme_item` (`theme_id`);
CREATE INDEX `theme_item_ac81e047` ON `theme_item` (`opinion_id`);
COMMIT;
UPDATE schema_version SET version = 4;
COMMIT;
stdout:
stderr: ERROR 1050 (42S01) at line 3: Table 'theme' already exists
returncode: 1
I've noticed that same error before when setting up a new dev environment.
I don't see why our staging database would possibly be as old as migration 3.
Ryan: What did you do to mitigate it, just increment the schema version?
Yeah, I just manually set the schema version in the db. With the initial db sync, the db was already up to the most recent version.
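Setting the version manually presumably comes down to a single UPDATE against the schema_version table shown in the error output above. A minimal sketch -- the database name, the mysql invocation, and the dry-run wrapper are all assumptions for illustration:

```shell
# Hypothetical helper: bump the migration tracker so the updater stops
# re-running migrations that already exist in the freshly imported dump.
bump_schema_version() {
    # $1 is the version to record; the database name below is an assumption.
    sql="UPDATE schema_version SET version = ${1};"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$sql"          # default to a dry run: just print the statement
    else
        mysql input_stage_mozilla_com -e "$sql"
    fi
}

bump_schema_version 13   # prints: UPDATE schema_version SET version = 13;
```

Defaulting to a dry run keeps an accidental invocation from touching the database; set DRY_RUN=0 to actually execute the statement.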
Isn't the schema version stored in the database? If so, shouldn't that have come with the import when it got imported from production?
(In reply to comment #6)
> Isn't the schema version stored in the database? If so, shouldn't that have
> come with the import when it got imported from production?

Yes and yes. I don't know why this happened. Will you do us a favor and tell us what the content of the schema_version table is, both on prod and stage? Prod should be at approximately 10, and stage, well, should be at 13 by now.
Blocks: 625411
I have to make this a blocker, since it keeps QA from properly verifying the Input 3.0 bugs for two days now. Let's step back a bit: The latest results on input.stage are 4 days old. That indicates the cron job copying the prod DB over to stage every night might have died. CCing cshields, who set it up. If that worked, auto-updating would probably recommence. Please still tell me what schema_version prod thinks it has (comment 7), so I can fix it if there is an additional bug there.
Severity: critical → blocker
Blocks: 626705
mysql> select * from schema_version;
+---------+
| version |
+---------+
|       3 |
+---------+
1 row in set (0.00 sec)

Poking at the updater cron job now.
The cron job is set to only run on Tuesday, want me to make it daily?
(In reply to comment #9)
> mysql> select * from schema_version;

That's on the production database? Woah.

(In reply to comment #10)
> The cron job is set to only run on Tuesday, want me to make it daily?

Yes, please run that daily.
(In reply to comment #11)
> (In reply to comment #9)
> > mysql> select * from schema_version;
>
> That's on the production database? Woah.

That was on the staging database. Here's production:

mysql> select * from schema_version;
+---------+
| version |
+---------+
|      10 |
+---------+
1 row in set (0.00 sec)
(In reply to comment #11)
> (In reply to comment #9)
> > mysql> select * from schema_version;
>
> That's on the production database? Woah.
>
> (In reply to comment #10)
> > The cron job is set to only run on Tuesday, want me to make it daily?
>
> Yes, please run that daily.

We just changed this database cron job to run on a weekly basis a couple of weeks ago per request; see https://bugzilla.mozilla.org/show_bug.cgi?id=614675#c13. Please come to a consensus and decide which way it needs to go.
Yes, that was my mistake. I thought we had changed it from daily at 9am to daily at midnight, but I was wrong. I talked to davedash and justdave: we'll keep running the import at the same time on Tuesdays, and justdave will make sure the code-updating cron job gives the import enough time to run before attempting to run code updates and migrations on it.
ok, per discussion on IRC, we're going to keep this once per week. The backup restore currently takes 8 minutes. It took about 3 minutes back when we first set this up, so the "cron cluster" cron job on mrapp-stage02 that needs to run right after the import had been set to run 5 minutes afterwards; I'm now leaving 15 minutes for that.

We suspect part of the problem is that update_staging.sh runs every 5 minutes and attempts to run migrations when it's done. If the DB import now takes longer than 5 minutes, update_staging.sh ends up running migrations against an incomplete database and probably corrupting it. So update_staging.sh has been mangled to not interfere. The cron jobs have been re-arranged as follows:

on tm-stage01-master01:
> 0 0 * * 2 root /root/bin/import-input-db

on mrapp-stage02:
> 15 0 * * 2 apache /data/virtualenvs/input/bin/python26 /data/www/input.stage.mozilla.com/reporter/manage.py cron cluster &> /dev/null
> */5 1-23 * * * root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh > /dev/null
> */5 0 * * 0,1,3-6 root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh > /dev/null
> 20-55/5 0 * * 2 root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh > /dev/null

That makes the only gap in update_staging.sh coverage midnight to 00:20 on Tuesday, right during the backup restore.
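The bug doesn't say exactly how update_staging.sh was changed to "not interfere", but one common way to keep the 5-minute updater from running migrations mid-import is a shared non-blocking lock. A sketch under that assumption (the lock path is hypothetical, and the import job would have to take the same lock for its duration):

```shell
#!/bin/sh
# Sketch of a lock-based guard (an assumption -- not necessarily what was
# deployed): skip this update run if the DB import or another update run
# still holds the lock, instead of migrating a half-imported database.
LOCKFILE="${LOCKFILE:-/tmp/input-staging-update.lock}"   # path is hypothetical

# Open the lockfile on fd 9 and try to take an exclusive lock without waiting.
exec 9>"$LOCKFILE"
if ! flock -n 9; then
    echo "lock held (import or another update in progress); skipping"
    exit 0
fi

# ... git pull and migrations would go here ...
echo "update complete"
```

Because `flock -n` fails immediately instead of blocking, a run that collides with the weekly import simply exits and the next 5-minute invocation picks up the work.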
The import I ran manually a little bit ago was apparently still corrupted because update_staging ran while it was importing. I disabled the update_staging job and ran it again. Here's the output, as requested by wenzel:

[root@tm-stage01-master01 ~]# import-input-db
Importing /data/backup-drop/a01/mysql/input_mozilla_com/input_mozilla_com.2011.01.21.sql.gz to input_stage_mozilla_com...

real    6m26.053s
user    0m10.416s
sys     0m0.788s

[root@mrapp-stage02 ~]# cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh
Running migration 11:
CREATE TABLE `feedback_rating` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `opinion_id` integer NOT NULL,
    `type` smallint UNSIGNED NOT NULL,
    `value` smallint UNSIGNED
) TYPE=innodb; ;
ALTER TABLE `feedback_rating` ADD CONSTRAINT `opinion_id_refs_id_36fb0fb1` FOREIGN KEY (`opinion_id`) REFERENCES `feedback_opinion` (`id`) ON DELETE CASCADE;
CREATE INDEX `opinion` ON `feedback_rating` (`opinion_id`);
CREATE INDEX `type` ON `feedback_rating` (`type`);
That took 1.44 seconds
##################################################
Running migration 12:
CREATE INDEX `type` ON `feedback_opinion` (type);
That took 125.41 seconds
##################################################
Running migration 13:
ALTER TABLE feedback_opinion ADD INDEX (`created`);
ALTER TABLE feedback_opinion ADD INDEX (`product`);
ALTER TABLE feedback_opinion ADD INDEX (`version`);
ALTER TABLE feedback_opinion ADD INDEX (`os`);
ALTER TABLE feedback_opinion ADD INDEX (`locale`);
That took 678.95 seconds
##################################################
[root@mrapp-stage02 ~]# su - apache -s /bin/bash -c "/data/virtualenvs/input/bin/python26 /data/www/input.stage.mozilla.com/reporter/manage.py cron cluster"
[root@mrapp-stage02 ~]#
OK, based on what happened while running it manually, I've made the following changes to the cron jobs on mrapp-stage02:

> */5 1-23 * * * root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh > /dev/null
> */5 0 * * 0,1,3-6 root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh > /dev/null
> 15 0 * * 2 root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh; su - apache -s /bin/bash -c "/data/virtualenvs/input/bin/python26 /data/www/input.stage.mozilla.com/reporter/manage.py cron cluster"
> 30-55/5 0 * * 2 root cd /data/www/input.stage.mozilla.com/reporter/; ./bin/update_staging.sh > /dev/null

So the automated/devnulled ones skip midnight to 00:30 on Tuesday morning. A manual one that's not devnulled (so you get its output in the cron mail) runs at 00:15 and runs the "cron cluster" job serially as soon as it completes. Since "cron cluster" takes 10 minutes or so to run, and migrations could take a bit, I gave that a 15-minute window before the automated update_staging resumes.
Having not heard anything (I think everyone's busy with mfbt ;), I'm going to assume this is resolved. If that turns out not to be the case, please reopen.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
I tested it and staging is working as expected now. I assume you reenabled the cron jobs that you disabled temporarily. Based on this assumption, marking this verified. Thanks a lot for helping us figure this out and fix it!
Status: RESOLVED → VERIFIED
(In reply to comment #19) > I assume you reenabled the cron jobs that you disabled temporarily. I did.
Blocks: 630550
Product: mozilla.org → mozilla.org Graveyard