Closed Bug 901986 Opened 11 years ago Closed 11 years ago

bouncer entries not updated for FF 23.0 as normal

Categories

(Infrastructure & Operations Graveyard :: WebOps: Product Delivery, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hwine, Assigned: jd)

Attachments

(1 file)

While publishing Firefox 23, bouncer did not update within the normal 5-minute interval after the entries were updated in the admin app.

:jd found a cron job that hadn't run and resolved the issue after about 45 minutes total.

This bug is to capture details and do the RFO.
On bounceradm.private.phx1 the cron job for updating product details apparently did not run. It looks like:

# Update product details *only*
0 0 * * * root /usr/bin/flock -w 10 /var/lock/bouncer-prod /data/bouncer/src/download.mozilla.org/update -p > /dev/null 2>&1

As this job is scheduled to run constantly, I went ahead and ran it manually and got:

# cd $CODE_DIR
# ./update -p
    Updating product_details...
  ~ snip ~
    [localhost] out: 10 files changed, 10 insertions(+), 10 deletions(-)
  ~ snip ~

Full details will be attached.

I am not sure at this point why this job did not run, but perhaps someone will see something in the detailed output.

For completeness I will mention that I also busted the Zeus cache several times (before and after the cron job execution) and issued an Apache reload on all the web servers (one at a time) prior to manually running the cron job. I do not think any of this had any effect on the outcome; I only include it here in case other errors crop up as a result of these actions.
Attached file Cron manual output
Well, the obvious potential causes are:

1) It uses flock ... perhaps a previous instance was still running?

2) It pulls from SVN ... perhaps 'svn up' wasn't working?

3) That cron runs *daily*, not constantly. Are we sure the changes to product-details made it into the repo before midnight?

4) It attempts to do an automated deploy (automated ssh login from admin node to web nodes)... perhaps this is broken?
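
To check cause 1 by hand, something like the following might help. It is only a sketch: `lock_is_busy` and the demo `LOCK` path are made up for illustration, and on the admin node you would point `LOCK` at the lock file named in the crontab entry. It relies on `flock -w` exiting non-zero (1 by default) when the wait times out, which is also how the cron job would skip `update -p` without any visible trace while its output goes to /dev/null.

```shell
#!/bin/sh
# Sketch: report whether another process currently holds a flock lock.
# LOCK is a throwaway demo path (assumption), not the production lock file.
LOCK="${LOCK:-/tmp/flock-demo.lock}"

lock_is_busy() {
    # Try for up to 1 second to take the lock and run a no-op command.
    # flock exits non-zero if the wait times out; note that other flock
    # errors (e.g. an unwritable lock path) are also non-zero.
    if flock -w 1 "$LOCK" true; then
        return 1   # lock was free, so no earlier instance is holding it
    fi
    return 0       # lock could not be taken within the timeout
}
```

Running `lock_is_busy && echo "lock is held"` while a long `update -p` is in progress would show whether a stuck earlier run could have blocked the midnight job.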


If I read the output right:
Found last update timestamp: Tue, 30 Jul 2013 21:29:42 GMT
It's been broken for days? Or have there just not been any changes? I don't really know how to interpret this.

Observation: the cron spits out a lot of information on every run. We /dev/null it to keep from getting spammed. We should check if there's anything we can do to make it quieter so we don't have to do that.

/var/log/cron does report this job as running at midnight. Unfortunately since we discard output, I can't say what actually happened.
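
One way to stop losing that information, sketched here under assumptions: wrap the job so its output is appended to a log file along with the exit status, instead of being sent to /dev/null. The `run_logged` name and the `LOG_FILE` default are hypothetical, not anything that exists on bounceradm today.

```shell
#!/bin/sh
# Sketch: run a command, keep its output, and record how it exited.
# LOG_FILE is an assumed location; pick whatever suits the host.
LOG_FILE="${LOG_FILE:-/tmp/bouncer-update.log}"

run_logged() {
    # Timestamp the start so the log shows when each run began.
    printf '%s START %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*" >> "$LOG_FILE"
    # Append stdout and stderr to the log rather than discarding them.
    "$@" >> "$LOG_FILE" 2>&1
    status=$?
    # Record the exit status; a flock timeout would show up here as non-zero.
    printf '%s END status=%s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$status" >> "$LOG_FILE"
    return "$status"
}
```

The cron entry would then call a small script that sources this and runs `run_logged /usr/bin/flock -w 10 /var/lock/bouncer-prod /data/bouncer/src/download.mozilla.org/update -p`. The log would still need rotating, since the job is verbose.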
I don't think we have an RFO yet, but we did ship FF23!

I'm going to close this bug, and open a new one for the RFO.
Assignee: server-ops-webops → jcrowe
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Blocks: 902260
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard