Closed Bug 1308756 Opened 9 years ago Closed 9 years ago

Trees closed: Linux m-c nightlies, esr45 builds, and thus beta and release Linux builds timing out uploading symbols

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Type: task
Priority: Not set
Severity: blocker

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: philor, Assigned: nthomas)

Details

Attachments

(1 file)

The failures I've seen so far are the linux32 and linux64 nightlies on m-c. They didn't manage to run before the TCW, so they ran afterward, and thus after whatever part of it broke them. Each has failed twice with "2400 seconds without output" while uploading symbols to crash-stats.mozilla.com, like https://treeherder.mozilla.org/logviewer.html#?job_id=5200853&repo=mozilla-central. The linux32 and linux64 opt builds on mozilla-esr45 are failing the same way. According to logs, we also upload symbols for on-push builds on mozilla-beta and mozilla-release (hello, release-promotion), so I'm closing them as well as esr45. I'm leaving mozilla-central open despite it not having Linux nightlies, because closing it wouldn't make any difference: I don't have anything to speak of to merge there, and nightlies will run and fail on the same push either way.
The uploads go to https://crash-stats.mozilla.com/symbols/upload. Here's what an AWS instance sees:

[root@bld-linux64-spot-017.build.releng.use1.mozilla.com ~]# host crash-stats.mozilla.com
crash-stats.mozilla.com is an alias for crash-stats.mocotoolsprod.net.
crash-stats.mocotoolsprod.net has address 52.41.151.208
crash-stats.mocotoolsprod.net has address 52.35.146.173
crash-stats.mocotoolsprod.net has address 52.33.197.113

All of those IPs are routed back to SCL3 then back to AWS, which slows down the upload by increasing the latency. I'm going to rerun the routing script, so that those IPs route directly across AWS (ie via the internet gateway, IGW).
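For anyone repeating this check, here is a minimal sketch of pulling the A records out of `host` output so they can be compared against the routing tables. The function name `parse_host_addresses` and the regular expression are illustrative assumptions and are not taken from the routing script mentioned above; the sample text is the output quoted in this comment.

```python
import re

# Output of `host crash-stats.mozilla.com` as quoted in this bug.
HOST_OUTPUT = """\
crash-stats.mozilla.com is an alias for crash-stats.mocotoolsprod.net.
crash-stats.mocotoolsprod.net has address 52.41.151.208
crash-stats.mocotoolsprod.net has address 52.35.146.173
crash-stats.mocotoolsprod.net has address 52.33.197.113
"""

# Matches lines of the form "<name> has address <IPv4>".
ADDRESS_LINE = re.compile(r"^(\S+) has address (\d{1,3}(?:\.\d{1,3}){3})$")


def parse_host_addresses(output):
    """Return the IPv4 addresses listed in `host` output, in order (hypothetical helper)."""
    addresses = []
    for line in output.splitlines():
        match = ADDRESS_LINE.match(line.strip())
        if match:
            addresses.append(match.group(2))
    return addresses


upload_ips = parse_host_addresses(HOST_OUTPUT)
print(upload_ips)
```

The alias (CNAME) line is skipped on purpose, since only the addresses matter for routing.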
Attached file route changes.txt
Attaching the log because I removed a bunch of stale routes, and there's a risk something else might break. Some of the 63.245 routes will be for aus4.mozilla.org (now in CloudOps AWS), and some are because *carbon.hostedgraphite.com hosts are known to hop around (DNS round-robin or something). Anyway, I can openssl s_client to each of the IPs on 443 now.
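The per-IP check on port 443 can be approximated in a script. This is only a sketch under assumptions: `tls_handshake_ok` is a hypothetical helper, it uses Python's standard `ssl` module instead of `openssl s_client`, and unlike a bare `s_client` run it also verifies the certificate against the hostname passed as `server_name`.

```python
import socket
import ssl


def tls_handshake_ok(ip, server_name, port=443, timeout=10.0):
    """Try a TLS handshake with one IP (hypothetical helper).

    Returns (True, negotiated TLS version) on success,
    or (False, error text) if the connection or handshake fails.
    """
    context = ssl.create_default_context()
    try:
        # Connect to the specific IP, but send the real hostname for SNI
        # and certificate verification.
        with socket.create_connection((ip, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=server_name) as tls:
                return True, tls.version()
    except OSError as exc:
        # Covers refused connections, timeouts, and ssl.SSLError
        # (which is a subclass of OSError).
        return False, str(exc)
```

Calling `tls_handshake_ok("52.41.151.208", "crash-stats.mozilla.com")` for each of the three addresses would show whether every backend is reachable. Whether those addresses still serve that hostname today is not something this bug can confirm.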
We have a green linux64 on m-esr45, but that was from before the fix. There's some packet loss in SCL3 which may have been a factor too. Pre-emptively closing this based on tests on instances.
Assignee: nobody → nthomas
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
FTR, reopened m-b, m-r, m-esr45.
Checked some logs and symbols uploads are succeeding.
Status: RESOLVED → VERIFIED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard