AUS2 snippets are not mirroring into production

RESOLVED FIXED

Status

Toolkit
Application Update
--
major
RESOLVED FIXED
11 years ago
10 years ago

People

(Reporter: nthomas, Assigned: chizu)

Details

(URL)

(Reporter)

Description

11 years ago
All but 2 of the builds that should be offered a nightly update didn't get one today. The URL has the details - only Thunderbird/Trunk/Linux & Thunderbird/Mozilla1.8.0/Linux are green (non-null update info), the others are red with no update returned.

The two working builds were the earliest of today's, finishing by 03:43 PST. The first build not to get an update finished at 04:34. prometheus-vm is still pushing partial MARs for all the builds to stage, and is cycling on the Mozilla1.8 tinderbox. Some possibilities: something is preventing prometheus-vm from sending snippets to aus2-staging, or the snippets are making AUS misbehave. Perhaps there is an aus2-staging -> aus2 sync as well?

Comment 1

11 years ago
These two URLs should match:
https://aus2-staging.mozilla.org/update/1/Firefox/3.0a1/2006121404/WINNT_x86-msvc/en-US/nightly/update.xml
https://aus2.mozilla.org/update/1/Firefox/3.0a1/2006121404/WINNT_x86-msvc/en-US/nightly/update.xml

This seems to indicate that aus2 is not syncing from aus2-staging; morgamic, what do you think?
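
The comparison above can be scripted. This is a minimal sketch, not part of the AUS2 tooling: it derives the production URL from a staging URL by swapping the host, and compares two response bodies while ignoring trailing whitespace. The function names are hypothetical, and it assumes the two hosts are expected to return byte-identical update.xml content apart from whitespace.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names taken from the two URLs quoted in this bug.
STAGING_HOST = "aus2-staging.mozilla.org"
PRODUCTION_HOST = "aus2.mozilla.org"


def production_url(staging_url: str) -> str:
    """Swap the staging host for the production host, keeping the path."""
    parts = urlsplit(staging_url)
    if parts.netloc != STAGING_HOST:
        raise ValueError(f"not a staging URL: {staging_url}")
    return urlunsplit(parts._replace(netloc=PRODUCTION_HOST))


def normalize(body: bytes) -> bytes:
    """Drop leading/trailing blank space and per-line trailing whitespace."""
    return b"\n".join(line.rstrip() for line in body.strip().splitlines())


def in_sync(staging_body: bytes, production_body: bytes) -> bool:
    """True when both hosts served the same update information."""
    return normalize(staging_body) == normalize(production_body)
```

To use it, fetch each URL (for example with `urllib.request.urlopen(url).read()`) and pass both bodies to `in_sync`; a False result for a build that has a snippet on staging points at the sync rather than at the AUS code.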
Assignee: preed → morgamic

Comment 2

11 years ago
There is no diff between the PROD and STAGING tags, so the code should be identical. That means the divergence is most likely caused by a failure to sync the data.

So, I would recommend:
* checking to make sure PROD is indeed AUS2_PRODUCTION
* checking to make sure the file syncs are happening correctly
* verifying that any local config files are up-to-date (they should be symlinked to config-dist, but check it anyway)
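
The symlink check in the last item could look like the following. This is only a sketch: the helper name is hypothetical, and the actual locations of the local config files and of config-dist on the AUS2 hosts are not stated in this bug, so both paths are passed in by the caller.

```python
import os


def points_to(link_path: str, expected_target: str) -> bool:
    """True if link_path is a symlink that resolves to expected_target."""
    if not os.path.islink(link_path):
        # A regular file here means a local copy that may have drifted.
        return False
    return os.path.realpath(link_path) == os.path.realpath(expected_target)
```

A False result means either a plain file sits where the symlink should be, or the link points somewhere other than the config-dist copy.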
Depends on: 364015

Comment 3

11 years ago
I would bump this up to critical, since it halts nightly updates, but that's going to page people, and I think this is already known.

Hopefully this won't affect Tuesday's release.
Assignee: morgamic → server-ops

Updated

11 years ago
Summary: No nightly update offered → AUS2 snippets are not mirroring into production

Comment 4

11 years ago
This is probably due to the hardware failure at the OSL (osadm01 - see the dep bug).  We have someone onsite who is working to get the machine back up ASAP.
(Reporter)

Comment 5

11 years ago
Confirming that justdave's manual rsyncing of data to the webheads has given us a green update-status page. Leaving this open to see what happens with today's nightlies.

Comment 6

11 years ago
osadm01 is up and running again, so this should all be working now. If someone could verify, that'd be appreciated.
Assignee: server-ops → justdave
(Reporter)

Comment 7

11 years ago
Updates are working fine now (update status page is green, spot checked Firefox/Trunk/Windows). Thanks for the fix.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED

Comment 8

11 years ago
This is affecting release items.  

Staging:
https://aus2-staging.mozilla.org/update/1/Firefox/2.0/2006101022/Linux_x86-gcc3/en-US/release/update.xml

Production:
https://aus2.mozilla.org/update/1/Firefox/2.0/2006101022/Linux_x86-gcc3/en-US/release/update.xml

They should match if the rsync was happening correctly.
Severity: major → blocker
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 9

11 years ago
Logs on aus2-staging.mozilla.org report:

Dec 19 17:53:04 do-stage01 rsyncd[1058]: rsync on aus2 from mradm01.mozilla.org (63.245.208.161)
Dec 19 17:54:23 do-stage01 rsyncd[1063]: rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
Dec 19 17:54:23 do-stage01 rsyncd[1063]: rsync error: error in rsync protocol data stream (code 12) at io.c(359)

every few minutes or so.
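
To see how often this happens, the rsyncd entries can be tallied by exit code. The sketch below is illustrative only: the regular expressions are written against the three log lines quoted above, and the assumption that other rsyncd entries on do-stage01 follow the same syslog layout is mine. Code 12 is rsync's "error in rsync protocol data stream" exit code, as the log message itself says.

```python
import re
from collections import Counter

# Syslog layout as seen in the excerpt:
#   "Dec 19 17:54:23 do-stage01 rsyncd[1063]: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"rsyncd\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)
# Matches e.g. "rsync error: ... (code 12) at io.c(359)".
ERROR_RE = re.compile(r"rsync error: .*\(code (?P<code>\d+)\)")


def error_codes(lines):
    """Count rsync error exit codes found in rsyncd syslog lines."""
    counts = Counter()
    for line in lines:
        entry = LINE_RE.match(line.strip())
        if entry is None:
            continue  # not an rsyncd line in the expected layout
        err = ERROR_RE.search(entry.group("msg"))
        if err:
            counts[int(err.group("code"))] += 1
    return counts
```

Feeding it the full log would show whether code 12 is the only failure mode.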
(Assignee)

Comment 10

11 years ago
Looks like some rsync processes were stacking up, delaying the rsyncing for longer periods of time, and long running ones are eventually erroring out. The files have been synced, but some of the syncing process probably needs to be reorganized to prevent this in the future.

I'm lowering the severity, but we'll work on smoothing out the sync process before closing this bug.
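
One common way to keep sync runs from stacking up is to take an exclusive, non-blocking lock before starting, and to skip the run if a previous one still holds it. The sketch below shows that technique in general terms; it is not the change that was made on these hosts. The helper name and lock path are hypothetical, and it relies on `fcntl.flock`, so it is Unix-only.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this caller holds the lock, False if another run does."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of queueing,
            # so a slow run cannot cause later runs to pile up behind it.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A scheduled job would wrap its rsync invocation in `with single_instance(path) as have_lock:` and return early when `have_lock` is False, so at most one sync runs at a time.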
Severity: blocker → major
(Assignee)

Updated

11 years ago
Assignee: justdave → thardcastle
Status: REOPENED → NEW
(Assignee)

Comment 11

11 years ago
The sync process was simplified for the purposes of AUS2 during the 2.0.0.1 and 1.5.0.9 releases. It was syncing from OSL -> MPT -> OSL. The MPT hop is now skipped, limiting the scope of this issue.

I'm resolving this bug; the final solution will be the migration of AUS2 production to MPT, which is a different issue.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
Product: Firefox → Toolkit