Closed Bug 1827719 Opened 2 years ago Closed 2 years ago

load test Eliot GCP prod

Categories

(Eliot :: General, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Attachments

(1 file)

We should do a load test of the Eliot GCP prod environment to help us size everything and suss out any performance degradation between the Eliot AWS and Eliot GCP prod environments.

We can use bug #1826764 and the tools we built there.

Assignee: nobody → willkg
Status: NEW → ASSIGNED
Depends on: 1828542

I worked out how I wanted to do a load test that gives us confidence in how we've sized things and susses out any performance differences between the AWS and GCP environments. The methodology and the beginnings of results are in this doc:

https://docs.google.com/document/d/1oKVhvs2DMd28dhj3RWpNjY6pLgJm1CWPkrVTjw38564/edit#

While working on that, I got an awful lot of HTTP 502 errors from requests that took longer than 60s to complete. I wrote up DSRE-1245 to cover that.

Jira doesn't support markdown, so I wrote up bug #1828542 to work in.

Bug #1828542 (DSRE-1245) was fixed.

Then I did more load testing and timing work and determined that Eliot in GCP was much slower than Eliot in AWS. I opened DSRE-1298 to look into that.
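For illustration only, here is a minimal sketch of how request timings from the two environments could be compared. It is not the script from this bug or its PR: the function names `summarize` and `compare` and the sample numbers are hypothetical, and the only value taken from the comments above is the 60s request limit.

```python
import statistics

# The 60s limit mentioned earlier in this bug; everything else here is assumed.
TIMEOUT_SECONDS = 60.0


def summarize(durations):
    """Summarize request durations (in seconds) from one environment.

    Needs at least two data points because of statistics.quantiles.
    """
    ordered = sorted(durations)
    # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[18]
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": p95,
        # Requests that would have run past the 60s limit.
        "over_timeout": sum(1 for d in ordered if d > TIMEOUT_SECONDS),
    }


def compare(aws_durations, gcp_durations):
    """Return GCP/AWS ratios; a value above 1.0 means GCP was slower."""
    aws = summarize(aws_durations)
    gcp = summarize(gcp_durations)
    return {
        "median_ratio": gcp["median"] / aws["median"],
        "p95_ratio": gcp["p95"] / aws["p95"],
    }


if __name__ == "__main__":
    # Hypothetical sample timings, not measurements from either environment.
    aws_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
    gcp_sample = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(compare(aws_sample, gcp_sample))
```

Comparing the median and the 95th percentile separately keeps a handful of very slow requests from hiding behind an average, and the `over_timeout` count tracks how many requests would have hit the 60s limit.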

They switched Eliot to c2 nodes and also changed the disk from HDD to SSD.

I did another round of timings and load tests on 2023-04-24 and things look a lot better.

I updated the document. I think Eliot GCP in prod looks good. Once we cut over to Eliot GCP, we can tune it further.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

willkg merged PR #27: "bug 1827719: loadtest eliot gcp prod" in c76995e.

This lands all the load test scripts and supporting files I used.

