Closed
Bug 1159384
Opened 10 years ago
Closed 9 years ago
Capture available build time comparisons for build slaves in AWS versus Colo hardware
Categories
(Infrastructure & Operations :: RelOps: General, task)
x86_64
Windows Server 2008
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: q, Assigned: q)
Details
(Whiteboard: [windows])
Attachments
(1 file)
1.61 KB,
text/plain
No description provided.
Numbers for total build times worked out during the recent Portland work week.
95th-percentile end-to-end try build times on Windows 2008:
AWS region USE1 in play (slow network issues) - 7000 seconds
AWS region USW2 only - 5403 seconds
AWS region USW2 r3.2xlarge instance types with SSDs - 5100 seconds
IX hardware in scl3 - 5611 seconds
Updated•10 years ago
Assignee: relops → q
Based on this first pass, AWS instance types with no known network issues appear to be faster than our colo hardware. However, these jobs need a deeper breakdown by type. As a proof of concept, I believe the AWS 2008 builders are a success. However, while troubleshooting the AWS regional network issues we lost our sterile environment. I believe the numbers are encouraging and that a controlled staging environment should be built in both the colo and AWS USW2, focusing on instance types r3.xlarge and higher. Based on conversations with Releng management, there is no set standard for these metrics and no prescribed method for retrieving them.
I have been referred to Nick for some analysis training and to https://secure.pub.build.mozilla.org/builddata/buildjson/ to start pulling data after the test environments are settled.
Comment 4•10 years ago
Let's re-instantiate the USW2 instances, put them in staging, and run direct comparison jobs there and in SCL3.
Can we please also test out using local storage (I'm not sure if we'll have enough) for builds and get numbers on that?
Flags: needinfo?(q)
Comment 5•10 years ago
Q -> AWS
--------
4/16/2015
Traces captured during a slow upload
Comment 6•10 years ago
sorry :(
Comment 7•10 years ago
Q: with the stack tuning that we've now done, what do our numbers look like?
Updated•9 years ago
Whiteboard: [windows]
Here are the current 95th-percentile build times, in seconds, for successful try builds:
IX machines: 11648
EC2 r3.xlarge: 6594
EC2 r3.2xlarge: need more data (being collected now)
These numbers are based on a smaller sample size than is ideal, but they show a promising trend. The median shows a less dramatic difference. I have created a Python script that parses the data and calculates these figures, so we can keep tabs on the measurements as our sample size grows.
Data gathered from:
https://secure.pub.build.mozilla.org/buildapi/recent/
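For readers following along, here is a minimal sketch of how a 95th-percentile figure and a median can be computed from a list of build durations. It is not the script mentioned above. The nearest-rank method is only one of several percentile definitions and is an assumption here, as are the function name `percentile_nearest_rank` and the sample durations, which are made up to show why a few slow builds move the 95th percentile much more than the median.

```python
import math
import statistics


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # 1-based rank of the smallest value with at least pct% of samples at or below it
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical durations in seconds (illustrative only, not measured data).
sample = [4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, 12000]

p95 = percentile_nearest_rank(sample, 95)
median = statistics.median(sample)
print("95th percentile:", p95)
print("median:", median)
```

With a small sample the 95th percentile is effectively the slowest build or close to it, which is one reason to watch both figures as the sample grows.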
Flags: needinfo?(q)
Comment 9•9 years ago
What specific build type is this for?
Comment 10•9 years ago (Assignee)
All successful builds. I can narrow the scope during parsing. Do you have any suggestions for which type I should focus on?
Comment 11•9 years ago (Assignee)
By build type, IX (seconds):
b2g-inbound-win32  4367
b2g-inbound-win32-debug  5669
b2g-inbound-win32-mulet  4934
b2g-inbound-win32-pgo  15769
b2g-inbound-win32_gecko  6564
b2g-inbound-win32_gecko-debug  6564
b2g-inbound-win64  6813
b2g-inbound-win64-debug  6695
b2g-inbound-win64-pgo  15932
comm-aurora-win32-l10n-dep  586
comm-beta-win32  5347
comm-beta-win32-debug  6514
comm-central-win32  6251
comm-central-win32-debug  6327
comm-esr38-win32  10066
comm-esr38-win32-debug  9301
cypress-win64-debug  6922
fuzzer-win64-rev2  2042
fx-team-win32  6360
fx-team-win32-debug  5693
fx-team-win32-mulet  3144
fx-team-win32-pgo  15745
fx-team-win32_gecko-debug  6799
fx-team-win64  5396
fx-team-win64-debug  6860
fx-team-win64-pgo  15750
fx-team_win32-debug_spidermonkey-compacting  8696
jamun_win32-debug_spidermonkey-compacting  8940
mozilla-aurora-win32  11861
mozilla-aurora-win32-l10n-dep  1299
mozilla-aurora-win64  16934
mozilla-aurora-win64-l10n-dep  485
mozilla-b2g32_v2_0-win32_gecko-nightly  11877
mozilla-b2g34_v2_1-win32_gecko-nightly  14482
mozilla-b2g37_v2_2-win32-mulet-nightly  5868
mozilla-beta-win32  13712
mozilla-beta-win64  19285
mozilla-central-win32-mulet-nightly  4954
mozilla-central-win32-pgo  19254
mozilla-central-win32_gecko-nightly  7276
mozilla-central-win64-pgo  19260
mozilla-inbound-win32  6170
mozilla-inbound-win32-debug  7552
mozilla-inbound-win32-mulet  5265
mozilla-inbound-win32-pgo  13499
mozilla-inbound-win32_gecko  7127
mozilla-inbound-win32_gecko-debug  8401
mozilla-inbound-win64  7302
mozilla-inbound-win64-debug  7076
mozilla-inbound-win64-pgo  15608
mozilla-inbound_win32-debug_spidermonkey-compacting  7718
mozilla-inbound_win32-debug_spidermonkey-plaindebug  4954
mozilla-inbound_win32_spidermonkey-plain  4482
try-comm-central-win32  5269
try-win32  4030
try-win32-debug  5602
try-win32-mulet  3939
try-win32_gecko  5619
try-win32_gecko-debug  7148
try-win64  4127
try-win64-debug  4809
try_win32-debug_spidermonkey-compacting  9028
try_win32_spidermonkey-compacting  7302
try_win64-debug_spidermonkey-compacting  8717
Comment 12•9 years ago (Assignee)
By build type, EC2 r3.xlarge (seconds):
try-comm-central-win32  5320
try-win32  6130
try-win32-debug  5606
try-win32-mulet  4170
try-win32_gecko  5849
try-win32_gecko-debug  6038
try-win64  5649
try-win64-debug  5970
try_win32-debug_spidermonkey-compacting  11369
try_win32-debug_spidermonkey-plaindebug  7491
try_win32_spidermonkey-compacting  5957
try_win32_spidermonkey-plain  3083
try_win64-debug_spidermonkey-compacting  6089
try_win64-debug_spidermonkey-plaindebug  7499
try_win64_spidermonkey-plain  5277
Comment 13•9 years ago (Assignee)
Relevant tests compared (seconds):
Build type                               EC2 r3.xlarge  IX
try_win32_spidermonkey-compacting        5957           7302
try_win32-debug_spidermonkey-compacting  11369          9028
try_win64-debug_spidermonkey-compacting  6089           8717
try-comm-central-win32                   5320           5269
try-win32                                6130           4030
try-win32_gecko                          5849           5619
try-win32_gecko-debug                    6038           7148
try-win32-debug                          5606           5602
try-win32-mulet                          4170           3939
try-win64                                5649           4127
try-win64-debug                          5970           4809
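To make the comparison easier to read, the relative difference per build type can be worked out as follows. This is only an illustrative sketch: the figures are copied from the comparison above, while the helper name `relative_diff` and the choice of IX as the baseline are assumptions.

```python
# (build type, EC2 r3.xlarge seconds, IX seconds), copied from the comparison above
rows = [
    ("try_win32_spidermonkey-compacting", 5957, 7302),
    ("try_win32-debug_spidermonkey-compacting", 11369, 9028),
    ("try_win64-debug_spidermonkey-compacting", 6089, 8717),
    ("try-comm-central-win32", 5320, 5269),
    ("try-win32", 6130, 4030),
    ("try-win32_gecko", 5849, 5619),
    ("try-win32_gecko-debug", 6038, 7148),
    ("try-win32-debug", 5606, 5602),
    ("try-win32-mulet", 4170, 3939),
    ("try-win64", 5649, 4127),
    ("try-win64-debug", 5970, 4809),
]


def relative_diff(ec2, ix):
    """Percent by which the EC2 time differs from IX; negative means EC2 was faster."""
    return (ec2 - ix) / ix * 100


diffs = {name: relative_diff(ec2, ix) for name, ec2, ix in rows}
ec2_faster = [name for name, diff in diffs.items() if diff < 0]

# Print from the largest EC2 advantage to the largest IX advantage.
for name, diff in sorted(diffs.items(), key=lambda item: item[1]):
    print(f"{name:45s} {diff:+6.1f}%")
print(len(ec2_faster), "of", len(rows), "build types were faster on EC2")
```

On these figures EC2 is ahead on three of the eleven build types, all spidermonkey or gecko-debug variants, which supports the earlier point that the jobs need a breakdown by type.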
Comment 14•9 years ago
That is screaming for a spreadsheet (and I corrected your r2 to r3 in comment 12, since that seems to be what the instance type actually is):
https://docs.google.com/spreadsheets/d/1QQ2U13rmqo7OSTUmrFrwC_DvW31tScXX25yF_eRLIss
When you get the numbers for r3.2xlarge, please put them in there so we can do some easy comparisons.
Flags: needinfo?(q)
Comment 15•9 years ago
Some times were added for r3.2xl, but we probably need more data. We are also going to investigate r3.4xl and the c3 types again, since our previous problems with them were network constraints that might have been mitigated by the network patches Q deployed.
Comment 16•9 years ago (Assignee)
Using the following script to slice and dice data:
https://github.com/mozilla/buildapi-recent-stats-read
Flags: needinfo?(q)
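As a rough illustration of the kind of slicing involved, the sketch below groups successful builds by builder name and summarizes their durations. It is not the code from the linked repository: the record fields (`builder`, `start`, `end`, `success`) and the sample records are hypothetical and do not reflect the actual buildapi data format.

```python
from collections import defaultdict
import statistics

# Hypothetical records; field names are assumptions, not the buildapi schema.
records = [
    {"builder": "try-win32", "start": 100, "end": 4100, "success": True},
    {"builder": "try-win32", "start": 200, "end": 4400, "success": True},
    {"builder": "try-win32", "start": 300, "end": 900, "success": False},
    {"builder": "try-win64", "start": 0, "end": 4200, "success": True},
]


def durations_by_builder(records):
    """Group durations (end - start, in seconds) of successful builds by builder."""
    groups = defaultdict(list)
    for rec in records:
        if rec["success"]:
            groups[rec["builder"]].append(rec["end"] - rec["start"])
    return dict(groups)


def summarize(groups):
    """Return sample count, median and maximum duration for each builder."""
    return {
        name: {
            "count": len(durations),
            "median": statistics.median(durations),
            "max": max(durations),
        }
        for name, durations in groups.items()
    }


summary = summarize(durations_by_builder(records))
for name in sorted(summary):
    print(name, summary[name])
```

Reporting the sample count next to each figure makes it obvious which build types still have too little data to compare.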
Comment 17•9 years ago (Assignee)
Performance of the C-type instances looks good so far. There was an error with subnets and signing. I am having to redeploy my c3.xlarge.
Comment 18•9 years ago (Assignee)
No redeploy yet, as work on the Puppet configs has taken precedence. I will get the c4.2xl numbers uploaded; however, because average build times are going up across the board, this data will be skewed.
Comment 19•9 years ago (Assignee)
Tables updated in the original spreadsheet. I will be stopping instances today to save cost.
Comment 20•9 years ago (Assignee)
Numbers have now been captured for all instance types during a comparable period of performance. I think we have a good rough picture of speeds, and this can now be re-evaluated in a repeatable way. I am closing this bug; future comparisons should live in their own bugs.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED