bouncer entries not updated for FF 23.0 as normal

RESOLVED FIXED

Status

Infrastructure & Operations Graveyard
WebOps: Product Delivery
RESOLVED FIXED
4 years ago
a year ago

People

(Reporter: hwine, Assigned: jd)

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
While publishing Firefox 23, bouncer did not pick up the new entries within the normal 5-minute interval after they were updated in the admin app.

:jd found a cron job that hadn't run and resolved the issue after about 45 minutes in total.

This bug is to capture the details and produce an RFO.
(Assignee)

Comment 1

4 years ago
On bounceradm.private.phx1 the cron job for updating product details apparently did not run. It looks like:

# Update product details *only*
0 0 * * * root /usr/bin/flock -w 10 /var/lock/bouncer-prod /data/bouncer/src/download.mozilla.org/update -p > /dev/null 2>&1

As this job is scheduled to run constantly, I went ahead and ran it manually and got:

# cd $CODE_DIR
# ./update -p
    Updating product_details...
  ~ snip ~
    [localhost] out: 10 files changed, 10 insertions(+), 10 deletions(-)
  ~ snip ~

Full details will be attached.

I am not sure at this point why this job did not run, but perhaps someone will see something in the detailed output.

For completeness, I will mention that I also busted the Zeus cache several times (before and after the cron job execution) and issued an Apache reload for all the web servers (one at a time) prior to manually running the cron job. I do not think any of this had any effect on the outcome; I only include it here in case other errors crop up as a result of these actions.
(Assignee)

Comment 2

4 years ago
Created attachment 786343 [details]
Cron manual output

Comment 3

4 years ago
Well, the obvious potential causes are:

1) It uses flock ... perhaps a previous instance was still running?

2) It pulls from SVN ... perhaps 'svn up' wasn't working?

3) That cron runs *daily*, not constantly. Are we sure the changes to product-details made it into the repo before midnight?

4) It attempts to do an automated deploy (automated ssh login from admin node to web nodes)... perhaps this is broken?
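
To illustrate cause 1: a minimal sketch of how `flock -w` behaves when another process already holds the lock. The lock file here is a throwaway temp file, not the real /var/lock/bouncer-prod, and the 3-second `sleep` is only a stand-in for a previous run that has not finished. It assumes the util-linux `flock` that the cron entry already uses.

```shell
#!/bin/sh
# Sketch only: the lock path and timings are illustrative assumptions.
lock=$(mktemp)

# Hold the lock in the background, standing in for a still-running earlier job.
flock "$lock" sleep 3 &
holder=$!
sleep 1   # give the background holder time to acquire the lock

# Try to take the same lock, waiting at most 1 second (the cron entry waits 10).
# On timeout, flock does not run the command and exits non-zero.
flock -w 1 "$lock" true
status=$?

echo "flock exit status while lock is held: $status"

wait "$holder"
rm -f "$lock"
```

If the real job had hit this timeout, the update script would never have started, and with output sent to /dev/null nothing would have been recorded beyond the cron log entry.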


If I read the output right:
Found last update timestamp: Tue, 30 Jul 2013 21:29:42 GMT
It's been broken for days? Or have there just not been any changes? I don't really know how to interpret this.

Observation: the cron spits out a lot of information on every run. We /dev/null it to keep from getting spammed. We should check if there's anything we can do to make it quieter so we don't have to do that.

/var/log/cron does report this job as running at midnight. Unfortunately, since we discard output, I can't say what actually happened.
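
One possible way to keep the job quiet without losing the output on failure is a small wrapper that buffers everything and prints it only when the command exits non-zero. This is a sketch, not something bouncer ships: `run_quiet` is a hypothetical helper name, and how it would be installed on the admin node is left open.

```shell
#!/bin/sh
# run_quiet: hypothetical helper, not part of bouncer.
# Runs the given command, captures stdout and stderr, and prints them
# only if the command fails. Returns the command's exit status.
run_quiet() {
    log=$(mktemp)
    "$@" >"$log" 2>&1
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "command failed with exit status $rc: $*"
        cat "$log"
    fi
    rm -f "$log"
    return "$rc"
}

# Example: a successful command prints nothing, a failing one prints its output.
run_quiet true
run_quiet sh -c 'echo "svn up failed"; exit 3'
```

Because cron mails whatever a job prints, a wrapper along these lines would stay silent on normal runs and only produce mail when the update (or the flock wait) fails.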
(Reporter)

Comment 4

4 years ago
I don't think we have an RFO yet, but we did ship FF23!

I'm going to close this bug, and open a new one for the RFO.
Assignee: server-ops-webops → jcrowe
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
(Reporter)

Updated

4 years ago
Blocks: 902260
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard