High packet loss from usw2 causing frequent timeouts

RESOLVED FIXED

Status

Priority: --
Severity: blocker
Status: RESOLVED FIXED
Opened: 5 years ago
Updated: 9 months ago

People

(Reporter: RyanVM, Unassigned)

Attachments

(2 attachments)

Description

5 years ago
All trees closed.
Depends on: 1060416
Created attachment 8481439 [details] [diff] [review]
bug_1060407_disable_usw2.patch

For landing: I think aws-manager's cloud-tools checkout now updates via cron.

I would assume that zeroing this would have no effect after this patch: http://mxr.mozilla.org/build/source/cloud-tools/configs/watch_pending.cfg#3 ?
Attachment #8481439 - Flags: review?(catlee)
Attachment #8481439 - Flags: review?(catlee) → review+
Comment on attachment 8481439 [details] [diff] [review]
bug_1060407_disable_usw2.patch

checked in: https://hg.mozilla.org/build/cloud-tools/rev/3d7663029c69

I think we will need to keep an eye on pending and limits.

These dials might need to be adjusted: http://mxr.mozilla.org/build/source/cloud-tools/configs/watch_pending.cfg#115
Attachment #8481439 - Flags: checked-in+
Created attachment 8481573 [details] [diff] [review]
bug_1060407_limit_usw2_to_750.patch

We have not experienced the packet loss since we stopped bringing up new usw2 slaves. The usw2 slave count went from ~1100 before disabling to 781.

This patch will keep us around that limit (750) and apportion the slaves in proportion to what we currently consider our 'usw2 limit capacity', which was:

We can gradually increase this number as needed until we discover the root cause of the network woes.
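
A rough sketch of the proportional scaling idea, for illustration only. The slave type names and per-type limits below are hypothetical placeholders, not the real watch_pending.cfg values, and `scale_limits` is an assumed helper, not part of cloud-tools:

```python
def scale_limits(limits, cap):
    """Scale per-type instance limits proportionally so their sum stays within cap."""
    total = sum(limits.values())
    if total <= cap:
        # Already within the regional cap; leave the limits untouched.
        return dict(limits)
    # Floor each share so the scaled total can never exceed the cap.
    return {name: (count * cap) // total for name, count in limits.items()}


# Hypothetical per-type limits summing to 1100 (placeholder values).
current = {"type_a": 400, "type_b": 400, "type_c": 300}
scaled = scale_limits(current, 750)
print(scaled, sum(scaled.values()))
```

Because each share is rounded down, the scaled total can land slightly under the cap, which seems acceptable while the goal is to stay at or below 750.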
Attachment #8481573 - Flags: review?(catlee)
Attachment #8481573 - Flags: review?(catlee) → review?(rail)

Updated

5 years ago
Attachment #8481573 - Flags: review?(rail) → review+
Backed out all the above changes with https://hg.mozilla.org/build/cloud-tools/rev/e47bf4c569dd

Resolving for now.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED

Updated

9 months ago
Product: Release Engineering → Infrastructure & Operations