increase disk size for AWS buildbot masters

RESOLVED FIXED

Status

Opened 5 years ago
Updated 9 months ago

People

(Reporter: arich, Assigned: coop)

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
I've noticed that buildbot masters in AWS frequently alert on low disk space (bm115 and bm116 in particular). Can we increase the disk size to stop the checks from alerting?
These are alerting because of the increase in the number of 2.3 emulator jobs being thrown at them.

I've fixed the alerting for now by removing all the double-digit twistd.log files from bm115 and bm116.
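
For anyone repeating that cleanup, here is a minimal sketch of removing the double-digit twistd.log files from one master directory. It is not the command that was actually run on bm115 and bm116. The function names and the dry-run default are assumptions made for illustration. It relies on Twisted's rotation scheme, where the newest rotated log is twistd.log.1 and higher suffixes are older.

```python
import re
from pathlib import Path

# Matches rotated Twisted logs such as "twistd.log.7" or "twistd.log.42".
ROTATED_LOG = re.compile(r"^twistd\.log\.(\d+)$")


def rotated_logs_above(master_dir, max_suffix=9):
    """Return rotated twistd.log files whose numeric suffix exceeds max_suffix."""
    matches = []
    for path in Path(master_dir).iterdir():
        m = ROTATED_LOG.match(path.name)
        if m and path.is_file() and int(m.group(1)) > max_suffix:
            matches.append(path)
    # Sort by numeric suffix so the result is stable and easy to review.
    return sorted(matches, key=lambda p: int(p.name.rsplit(".", 1)[1]))


def prune_rotated_logs(master_dir, max_suffix=9, dry_run=True):
    """Delete (or, with dry_run, only list) rotated logs above max_suffix."""
    victims = rotated_logs_above(master_dir, max_suffix)
    if not dry_run:
        for path in victims:
            path.unlink()
    return [p.name for p in victims]
```

Calling prune_rotated_logs("/path/to/master") only lists what would be removed; pass dry_run=False to delete. The live twistd.log and the single-digit rotations are left alone.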

As an alternative to increasing the disk space, perhaps we should consider limiting the number of logs retained by these masters to 50 instead of 100?
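
For context, a rough sketch of where such a limit usually lives. A master's buildbot.tac typically builds its log file with Twisted's LogFile, whose maxRotatedFiles argument caps how many rotated twistd.log.N files are kept. The fragment below shows only the logging part of such a file (a real buildbot.tac also attaches the BuildMaster service), and the basedir and rotateLength values are placeholders, not the values used on these masters.

```python
import os

from twisted.application import service
from twisted.python.log import FileLogObserver, ILogObserver
from twisted.python.logfile import LogFile

basedir = "."                      # placeholder master directory
rotateLength = 50 * 1000 * 1000    # placeholder: rotate after roughly 50 MB
maxRotatedFiles = 50               # proposed cap; this bug says 100 were kept before

application = service.Application("buildmaster")

# LogFile drops the oldest rotated file once maxRotatedFiles is reached.
logfile = LogFile.fromFullPath(
    os.path.join(basedir, "twistd.log"),
    rotateLength=rotateLength,
    maxRotatedFiles=maxRotatedFiles,
)
application.setComponent(ILogObserver, FileLogObserver(logfile).emit)
```

With this setting the worst-case disk use for rotated logs is roughly maxRotatedFiles times rotateLength, so halving the file count halves that bound.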
Duplicate of this bug: 1030881
Created attachment 8450246
Keep only 50 twistd.log files per master
Assignee: nobody → coop
Status: NEW → ASSIGNED
Attachment #8450246 - Flags: review?(bugspam.Callek)
Attachment #8450246 - Flags: review?(bugspam.Callek) → review+
Comment on attachment 8450246
Keep only 50 twistd.log files per master

https://hg.mozilla.org/build/puppet/rev/13e690097f26
Attachment #8450246 - Flags: checked-in+
See action_set_logging in https://github.com/catlee/tools/compare/master...fabric if you are looking for a way to deploy this change without a master restart.
(In reply to Nick Thomas [:nthomas] from comment #5)
> See action_set_logging in
> https://github.com/catlee/tools/compare/master...fabric if you are looking
> for a way to deploy this change without a master restart.

Fantastic. Thanks, Nick.
The script failed on a bunch of slow-running masters where it couldn't establish a timely connection, but I patched those up by hand via the manhole.
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
merged to production

Updated

9 months ago
Product: Release Engineering → Infrastructure & Operations