Closed Bug 614675 Opened 14 years ago Closed 14 years ago

Move prod -> stage DB sync cron job to the nighttime

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: stephend, Assigned: cshields)

Details

It's during business hours when I'm filing this, so hopefully someone can take a look :-)

http://input.stage.mozilla.com/en-US/ is down frequently today, contains stale data on its homepage, and queries for individual Opinions often time out while loading.

See the screencast for examples:

http://screencast.com/t/XumSQbr5
Assignee: server-ops → jeremy.orem+bugs
Is this still happening?
(In reply to comment #1)
> Is this still happening?

While it's not timing out, I still see "1 day ago" on the dashboard/homepage -- Michael/Dave, any ideas?
Which cron populates this?
That's what we'd like to know. We assume it is a cron job on one of the database machines. It is not part of the application's crons on the app node.
I am not sure whether the sites data is okay on stage. It may well be broken because of bug 612610.

I just can’t seem to even connect to stage to verify this.
Okay, I was able to browse staging now. Sites data is definitely not up to date, and (from the logs on the metrics machine) bug 612610 is to blame for that. I hope I can attend to that tonight. This should not have anything to do with the timeout problems, though.
Dave, did you set up the DB sync for input? I'm not sure what is going on here.
Assignee: jeremy.orem+bugs → justdave
Downgrading this from critical; this, in fact, probably isn't a bug.  In our last Input meeting, we talked through it, and, iirc, here's what's happening:

* because we have a dump from prod -> staging, the timestamp reflects when that data was submitted on prod, not staging
* since we don't yet have any automated tests that submit to *staging*, it appears the data is stale
I have no idea how the site itself is set up.  This needs debugging from the app side if it's still an issue.  I concur with comment 9 about the reason for stale data.  That's populated from production daily.
Assignee: justdave → server-ops
Severity: critical → normal
To summarize: We would like to know what cron job is run to copy the DB from prod to stage at about 9am Pacific every day. Once we've identified the cron job, we will add it to the code repository for reference, and change its running time to the nighttime as opposed to the morning.
Summary: Input staging is timing out and showing stale data → Move prod -> stage DB sync cron job to the nighttime
This has been a fun one to trace down..

The dump happens on tm-backup02 at 0700 every day

From there, a cron job runs every 3 hours at 17 minutes past the hour (so the next chance is 9:17 every morning) to copy the dump tarball to dracula.

From there, a cron job that runs every hour at :25 pushes the dump to tm-stage01-master01 (so this occurs at 9:25).

On tm-stage01-master01 there is a cron job that runs every 30 mins (/etc/cron.d/import-databases which fires /root/bin/import-input-db) that eventually imports the data.

Does that help at all? :) Now, most of these scripts handle a multitude of databases in each step (not just input), so moving any times may have a ripple effect.
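To make the timing of that chain concrete, here is a rough Python sketch that walks the steps described above and works out when the data would land on stage. The helper names (`next_fire`, `chain`) are made up for illustration. It assumes the 3-hourly copy fires at hours 0, 3, 6, ..., that the import's next run after 9:25 is at :30, and that each step finishes before the next one fires. The date is arbitrary.

```python
from datetime import datetime, timedelta


def next_fire(after, minute, hours=range(24)):
    """First time at or after `after` with the given minute and an hour in `hours`."""
    t = after.replace(second=0, microsecond=0)
    if t < after:
        t += timedelta(minutes=1)
    # Step minute by minute; this is slow but easy to check by eye.
    while t.minute != minute or t.hour not in hours:
        t += timedelta(minutes=1)
    return t


def chain(dump_time):
    """Follow the dump through the three downstream cron jobs."""
    # Copy to dracula: every 3 hours at :17 (hours 0, 3, 6, ... is an assumption).
    copy = next_fire(dump_time, 17, range(0, 24, 3))
    # Push to tm-stage01-master01: every hour at :25.
    push = next_fire(copy, 25)
    # Import on tm-stage01-master01: next run at :30.
    imp = next_fire(push, 30)
    return copy, push, imp


if __name__ == "__main__":
    dump = datetime(2010, 1, 1, 7, 0)  # arbitrary date, dump at 0700
    for label, when in zip(("copy", "push", "import"), chain(dump)):
        print(label, when.strftime("%H:%M"))
```

Under those assumptions the steps come out at 9:17, 9:25 and 9:30, which lines up with the "about 9am Pacific" refresh mentioned earlier in this bug. It also shows why moving only one of the earlier jobs would shift everything downstream of it.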
Assignee: server-ops → cshields
What the hell?

Here's what we want:

The last cron that puts the data into the staging database - can that be done once a week?  Like on Monday at midnight?
Ok, this is done per comment 13:

in tm-stage01-master01:/etc/cron.d/import-databases I've changed the following

30 * * * * root /root/bin/import-input-db

to

0 0 * * 2 root /root/bin/import-input-db
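For anyone reading the two schedules later, here is a small Python sketch that checks whether a given time matches the five time fields of a cron line (minute, hour, day of month, month, day of week). It is a deliberately simplified matcher, not a cron implementation: it only understands `*` and single integers, which is all the two lines above use, and the function names are invented for this example.

```python
from datetime import datetime


def field_matches(field, value):
    """True if a cron field ('*' or a single integer) matches `value`."""
    return field == "*" or int(field) == value


def cron_matches(expr, when):
    """Check `when` against the five time fields of a cron expression."""
    minute, hour, dom, month, dow = expr.split()
    # cron numbers weekdays 0 = Sunday ... 6 = Saturday;
    # isoweekday() gives Monday = 1 ... Sunday = 7, so fold Sunday to 0.
    cron_dow = when.isoweekday() % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day)
        and field_matches(month, when.month)
        and field_matches(dow, cron_dow)
    )


if __name__ == "__main__":
    old, new = "30 * * * *", "0 0 * * 2"
    tuesday_midnight = datetime(2024, 1, 2, 0, 0)  # 2024-01-02 was a Tuesday
    print(cron_matches(old, datetime(2024, 1, 1, 9, 30)))
    print(cron_matches(new, tuesday_midnight))
```

Read this way, the old line fires once an hour at minute 30, and the new line fires at 00:00 on day-of-week 2, which is Tuesday, i.e. the midnight that ends Monday.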
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard