Open Bug 1385106 Opened 7 years ago Updated 2 years ago

Investigate using io1 volumes instead of gp2, and monitor BurstBalance

Categories

(Firefox Build System :: Task Configuration, task)


Tracking

(Not tracked)

People

(Reporter: dustin, Unassigned)

Details

https://jeremyeder.com/2017/07/25/docker-operations-slowing-down-on-aws-this-time-its-not-dns/
https://aws.amazon.com/blogs/aws/new-burst-balance-metric-for-ec2s-general-purpose-ssd-gp2-volumes/

I don't know all of the background, but I do know that Windows instances are large in order to get more IOPS, and this has us requesting a total of 800 TiB in us-east-1, which is a little extreme. Maybe it would be more cost-efficient to use smaller io1 volumes?
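
For context on why size buys IOPS: per the AWS EBS documentation, a gp2 volume earns a baseline of 3 IOPS per GiB (minimum 100), can burst to 3,000 IOPS, and starts with a bucket of 5.4 million I/O credits. A rough Python sketch of that arithmetic, assuming "large" above refers to volume size. The helper names are made up for illustration, and the per-volume baseline maximum is ignored:

```python
# gp2 figures as documented by AWS for EBS General Purpose SSD volumes.
GP2_IOPS_PER_GIB = 3
GP2_MIN_BASELINE_IOPS = 100
GP2_BURST_IOPS = 3000
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS a gp2 volume of this size earns (per-volume cap ignored)."""
    return max(GP2_MIN_BASELINE_IOPS, GP2_IOPS_PER_GIB * size_gib)


def gp2_burst_seconds(size_gib: int) -> float:
    """Seconds a full credit bucket lasts under sustained 3,000 IOPS load.

    Returns infinity when the baseline already meets the burst rate,
    because such a volume never drains its bucket.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return float("inf")
    # Credits drain at (burst - baseline) per second while bursting.
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)
```

By this arithmetic a 120 GiB gp2 volume has a 360 IOPS baseline and can hold the 3,000 IOPS burst for roughly 34 minutes from a full bucket, whereas an io1 volume gets whatever IOPS it is provisioned with regardless of size (up to the documented IOPS-to-size ratio).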

If we stay with gp2, we should set up monitoring of the BurstBalance so that we know how often instances run out of IOPS.
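
A minimal sketch of what that monitoring could look like, assuming boto3 and the CloudWatch `BurstBalance` metric in the `AWS/EBS` namespace (reported as a percentage per volume). The function names, the 10% threshold, and the 24-hour window are illustrative assumptions, not anything we have deployed:

```python
from datetime import datetime, timedelta, timezone


def burst_balance_query(volume_id, hours=24, period=300):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EBS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,           # seconds per datapoint
        "Statistics": ["Minimum"],  # lowest balance seen in each period
    }


def depleted_periods(datapoints, threshold=10.0):
    """Timestamps (ascending) of periods whose minimum balance was below threshold percent."""
    low = [d for d in datapoints if d["Minimum"] < threshold]
    return [d["Timestamp"] for d in sorted(low, key=lambda d: d["Timestamp"])]


def check_volume(volume_id, threshold=10.0):
    """Query CloudWatch for one volume; needs AWS credentials and boto3."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**burst_balance_query(volume_id))
    return depleted_periods(response["Datapoints"], threshold)
```

Running `check_volume` over the volumes attached to worker instances would show how often they come close to exhausting their burst credits. A CloudWatch alarm on the same metric would be the alternative if we want notifications rather than a report.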
Found in triage.

Pete: do any of the recent OCC/generic-worker improvements deal with this already?
Flags: needinfo?(pmoore)
Flags: needinfo?(pmoore) → needinfo?(rthijssen)
In the AWS provisioner configuration there is a section like this:

  "instanceTypes": [
    {
      ...
      "launchSpec": {
        "BlockDeviceMappings": [
          {
            "DeviceName": "/dev/sda1",
            "Ebs": {
              "DeleteOnTermination": true,
              "VolumeSize": 40,
              "VolumeType": "gp2"
            }
          },
          {
            "DeviceName": "/dev/sdb",
            "Ebs": {
              "DeleteOnTermination": true,
              "VolumeSize": 120,
              "VolumeType": "gp2"
            }
          }
        ],
        ...
      },
      ...
    }
  ],
  ...

Modify the "Ebs" section of each "BlockDeviceMappings" entry: change "VolumeType" from "gp2" to "io1" and add an "Iops" property set to the required IOPS.
See: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-blockdev-template.html
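
If the change needs to be applied to many instance types, something like the following Python sketch could do it. The `use_io1` helper and the IOPS values passed to it are hypothetical; the only AWS-specific rule it checks is the documented io1 limit of 50 provisioned IOPS per GiB:

```python
import copy

IO1_MAX_IOPS_PER_GIB = 50  # documented io1 IOPS-to-size ratio limit


def use_io1(launch_spec, iops_by_device):
    """Return a copy of launch_spec with the listed devices switched to io1.

    iops_by_device maps a DeviceName to the IOPS to provision for it;
    devices that are not listed keep their current volume type.
    """
    updated = copy.deepcopy(launch_spec)
    for mapping in updated.get("BlockDeviceMappings", []):
        iops = iops_by_device.get(mapping.get("DeviceName"))
        ebs = mapping.get("Ebs")
        if iops is None or ebs is None:
            continue
        if iops > IO1_MAX_IOPS_PER_GIB * ebs["VolumeSize"]:
            raise ValueError(
                f"{mapping['DeviceName']}: {iops} IOPS exceeds "
                f"{IO1_MAX_IOPS_PER_GIB} IOPS/GiB for {ebs['VolumeSize']} GiB"
            )
        ebs["VolumeType"] = "io1"
        ebs["Iops"] = iops
    return updated
```

For example, `use_io1(launch_spec, {"/dev/sdb": 1000})` would leave `/dev/sda1` on gp2 and switch only the 120 GiB task volume to io1 with 1,000 provisioned IOPS. The right IOPS figure is exactly what still needs testing.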
Flags: needinfo?(rthijssen)
This is still valid, and should be investigated if we don't end up moving Windows to GCP.
Now that we know that at least some Windows traffic will remain in AWS (Windows 7), we should plan to test this.
Component: General → Task Configuration
Product: Taskcluster → Firefox Build System
Severity: normal → S3