1026800 - loan jgriffin an AWS linux64 test box, and then bump up the instance cpu/ram til tests pass

Reporter

Description

•

11 years ago

This is partially a loan request, and partially an AWS bump for the instance once it's loaned. We don't have the capacity to run all of our tests on ix, so let's find what test VM resources are needed for our tests to pass. Once we do that, let's create a new pool of that type of node.

Aki Sasaki (not active)

Reporter

Comment 1

•

11 years ago

Buildduty: feel free to kick this bug out of the loan requests queue once the loan's done... maybe to General Automation? Dunno.

Jordan Lund (:jlund)

Comment 3

•

11 years ago

I will resolve the loan request by tomorrow morning PT.

Jordan Lund (:jlund)

Comment 4

•

11 years ago

Bug 1007211 - loan linux64 ec2 slave to jmaher <- this loan was only stopped and never terminated.

Joel, can you try using this again? You are still on the VPN access list for it, and I just started it back up again.

Full FQDN: tst-linux64-ec2-jmaher.test.releng.use1.mozilla.com

Passwords have not changed.

Assignee: nobody → jmaher

Jordan Lund (:jlund)

Comment 5

•

11 years ago

I am on buildduty. I must be losing it. jgriffin != jmaher. /me terminates jmaher's old instance that was never fully reclaimed and creates one for jgriffin.

Jordan Lund (:jlund)

Comment 6

•

11 years ago

Email sent to jgriffin for further instructions.

Loaning slaves:
- tst-linux64-ec2-jgriffin.test.releng.use1.mozilla.com

Hi j<ateam mozillian>,

I am going to assign this to you to keep track of the loan(s). When you are finished with the loan(s) forever, please comment stating so and mark this bug as resolved.

By the way, now that this aws instance has been created, starting and stopping it can happen in a flash! If you are not going to be using this machine for multiple hours, let us know in this bug and we can stop it. Comment again when you want it started back up.

*For really fast turnaround, ping #releng (look for nick with 'buildduty')

Assignee: jmaher → jgriffin

Jordan Lund (:jlund)

Updated

•

11 years ago

Blocks: 1027473

Jordan Lund (:jlund)

Updated

•

11 years ago

Component: Loan Requests → Tools

QA Contact: coop → hwine

Aki Sasaki (not active)

Reporter

Comment 7

•

11 years ago

Rail, Catlee, if you can either expedite changing the type of AWS node this loaner runs on at jgriffin's request, or document it well enough for others to do so, that would be much appreciated.

Jonathan Griffin (:jgriffin)

Assignee

Comment 8

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #7)
> Rail, Catlee, if you can either expedite changing the type of AWS node this
> loaner runs on at jgriffin's request, or document it well enough for others
> to do so, that would be much appreciated.

Specifically, can I get this loaner rebooted as an m1.large instance?

Chris AtLee [:catlee]

Comment 9

•

11 years ago

It's very simple. Go to the AWS console: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instancesFilter=all-instances;instanceTypeFilter=all-instance-types;search=tst-linux64-ec2-jgriffin

* Select the instance
* Select Actions -> Stop
* Wait for it to stop
* Select Actions -> Change Instance Type
* Select the type you want.
* Select Actions -> Start
* ???
* Profit!

I've done this to change it to an m1.large instance. I've also enabled termination protection on this instance to prevent accidental deletion since we're going to be doing lots of manual work in the AWS console for this instance.
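For anyone who would rather script the console steps above, here is a rough sketch of the same stop / change type / start sequence. It is not what was actually run for this loaner (that was done by hand in the console). It assumes `ec2` behaves like a boto3 EC2 client, for example `boto3.client("ec2", region_name="us-east-1")`; the function name `change_instance_type` and the `protect` flag are made up for illustration.

```python
def change_instance_type(ec2, instance_id, new_type, protect=True):
    """Stop an instance, change its type, then start it again.

    `ec2` is assumed to behave like a boto3 EC2 client. The function name
    and the `protect` flag are hypothetical, not part of any releng tool.
    """
    # Actions -> Stop, then wait until the instance is fully stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Actions -> Change Instance Type (only possible while stopped).
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Optionally enable termination protection to guard against
    # accidental deletion during manual work.
    if protect:
        ec2.modify_instance_attribute(
            InstanceId=instance_id,
            DisableApiTermination={"Value": True},
        )

    # Actions -> Start, then wait until the instance is running again.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

Usage would look something like `change_instance_type(client, "<instance id>", "m1.large")`, where the instance id is whatever the console shows for the loaner. Passing the client in as an argument keeps the sketch easy to dry-run against a stub before pointing it at a real account.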

Chris AtLee [:catlee]

Comment 10

•

11 years ago

jgriffin, please keep bug 966070 in mind when testing new instance types!

Jonathan Griffin (:jgriffin)

Assignee

Comment 11

•

11 years ago

Update: the m1.large instance allows the tests to progress just a little further into startup than the m1.medium, but they still die. If I launch the emulator manually, I can see gaia come up eventually, but it's excruciatingly slow. Taras suggested switching to a c3.xlarge instance. Catlee, can you do this or redirect to someone else to do it? Thanks!

Flags: needinfo?(catlee)

Jordan Lund (:jlund)

Comment 12

•

11 years ago

I will do this now.

Flags: needinfo?(catlee)

Jordan Lund (:jlund)

Comment 13

•

11 years ago

done. tst-linux64-ec2-jgriffin is now running and is a c3.xlarge instance

Jonathan Griffin (:jgriffin)

Assignee

Comment 14

•

11 years ago

Good news. The tests run as well on the c3.xlarge instance as they do for me locally. Based on my experiment with m1.large, I doubt c3.large would be adequate, but I'm willing to try it if you'd like me to.

Flags: needinfo?(catlee)

Chris AtLee [:catlee]

Comment 15

•

11 years ago

Worth a shot. The c3 family has a slightly more powerful processor than m3, I believe. I've converted your instance to c3.large.

Flags: needinfo?(catlee)

Jonathan Griffin (:jgriffin)

Assignee

Comment 16

•

11 years ago

gaia-ui-tests do run on the c3.large, about 20% slower on average. For the media mochitests (another chunk we want to run here), I haven't seen the tests time out in a few runs, but the timeouts are intermittent, so it's hard to be sure whether c3.large is enough to avoid them. Probably the best thing to do is stand up a new platform based on c3.large so we can play with the tests on cedar and see how stable they are there, and consider switching to c3.xlarge if intermittent CPU-related problems are frequent.

Chris AtLee [:catlee]

Comment 17

•

11 years ago

If you have time, can you look at some of our desktop unittest suites? I'm particularly interested to see how much faster they run.

Jonathan Griffin (:jgriffin)

Assignee

Comment 18

•

11 years ago

I ran mochitest-plain chunk 1 on the c3.large, which seemed to go about 20% faster than on the current m1.medium. The average time for the test harness part (excluding setup and teardown in mozharness) is about 30 minutes on TBPL; it took 24 minutes on the c3.large.
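To make the "about 20% faster" figure concrete, here is a small sketch of the arithmetic using the two numbers quoted above (30 minutes on TBPL, 24 minutes on the c3.large). The helper names are made up for illustration. Note that a 20% cut in wall-clock time is the same thing as a 1.25x speedup.

```python
def time_reduction(baseline_min, new_min):
    """Fraction of wall-clock time saved relative to the baseline."""
    return (baseline_min - new_min) / baseline_min


def speedup(baseline_min, new_min):
    """How many times faster the new run is than the baseline."""
    return baseline_min / new_min


# Harness time only, excluding mozharness setup and teardown.
m1_medium_min = 30  # average on TBPL
c3_large_min = 24   # single run on the loaner

print(f"time saved: {time_reduction(m1_medium_min, c3_large_min):.0%}")  # time saved: 20%
print(f"speedup: {speedup(m1_medium_min, c3_large_min):.2f}x")           # speedup: 1.25x
```

The c3.large number comes from a single run, so treat the result as a rough estimate.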

Jonathan Griffin (:jgriffin)

Assignee

Comment 19

•

11 years ago

I don't need this slave any longer, so we can terminate; see bug 1031083 for follow-up.

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

8 years ago

Component: Tools → General


Categories

(Release Engineering :: General, defect)

Tracking

(Not tracked)

People

(Reporter: mozilla, Assigned: jgriffin)


Security

(public)
