Merge talos suites that finish in less than 10 minutes to improve wait times

RESOLVED WORKSFORME

Status

Release Engineering
General
P4
normal
RESOLVED WORKSFORME
7 years ago
4 years ago

People

(Reporter: armenzg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [buildfaster:p1]merged a11y & scroll into chrome - in production since Aug. 16th, URL)

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

7 years ago
There are some test suites that take a very short time to run (a few minutes) and then reboot, which means the machine is not running jobs while it is rebooting.

We should determine which suites could be merged together to strike the right balance between wait times and end time.
(Reporter)

Updated

7 years ago
Priority: -- → P2
Bug 586418 already did some analysis here, but got blocked on bug 594415.

Comment 2

7 years ago
How much time does it take from "reboot now" to "start next test suite"?
reboot time varies a bit between platforms...the fedora and windows boxes take around 4 minutes to reboot, osx boxes are a bit faster
(Reporter)

Comment 4

7 years ago
I have taken the times for each test run and done some analysis.

My goal is not to have jobs that take less than 10 minutes to run, but instead to group them so that each job runs for between 10 and 30 minutes.
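The grouping goal above can be sketched as a small packing problem. This is only an illustration: the function name `group_short_suites`, the first-fit strategy, and every duration used with it are assumptions, not data from the spreadsheet or the approach taken in the patches.

```python
def group_short_suites(durations, min_minutes=10.0, max_minutes=30.0):
    """Group suites that run for less than `min_minutes` into merged jobs.

    `durations` maps a suite name to its average test time in minutes.
    Suites at or above `min_minutes` stay as their own job. Short suites are
    packed longest-first into the first merged job that stays within
    `max_minutes`. Returns a list of (suite_names, total_minutes) tuples.
    """
    # Suites that already run long enough are left alone.
    long_jobs = [
        ([name], minutes)
        for name, minutes in sorted(durations.items())
        if minutes >= min_minutes
    ]

    # Short suites, longest first; ties are broken by name for stable output.
    short = sorted(
        ((name, minutes) for name, minutes in durations.items()
         if minutes < min_minutes),
        key=lambda item: (-item[1], item[0]),
    )

    merged = []  # each entry is [suite_names, total_minutes]
    for name, minutes in short:
        for job in merged:
            if job[1] + minutes <= max_minutes:
                job[0].append(name)
                job[1] += minutes
                break
        else:
            # No existing merged job has room, so start a new one.
            merged.append([[name], minutes])

    return long_jobs + [(names, total) for names, total in merged]
```

Note that a merged job can still end up below 10 minutes if there are not enough short suites left to fill it, so the result would need a manual check.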

I have added conditional formatting to mark test suites taking less than 10 minutes ("Average Time spent running Tests") in purple.

There is a sheet showing a summary of which suites take a short time (purple) and which ones have a bad ratio (setup time VS test run time). The latter I would like to investigate in a separate bug (bug 661585).

What do we gain when joining suites?
We save a "setup time" + "reboot time" per push. This is not a substantial improvement, but it is worth doing.
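As a rough illustration of that saving: each job pays one setup and one reboot, so merging N suites into one job saves N - 1 of each. The numbers below are assumptions for the sake of the arithmetic. The 4-minute reboot matches the rough figure quoted earlier in this bug for the fedora and windows boxes; the setup time and the test durations are made up.

```python
# Illustrative assumptions, not measurements from the spreadsheet.
REBOOT_MINUTES = 4.0  # rough reboot figure quoted earlier in this bug
SETUP_MINUTES = 3.0   # hypothetical per-job setup time


def machine_minutes(test_minutes, jobs):
    """Machine time per push when the given tests are spread over `jobs` jobs."""
    overhead_per_job = SETUP_MINUTES + REBOOT_MINUTES
    return sum(test_minutes) + jobs * overhead_per_job


def saved_by_merging(test_minutes):
    """Minutes saved per push by running all the given suites as one job."""
    separate = machine_minutes(test_minutes, jobs=len(test_minutes))
    merged = machine_minutes(test_minutes, jobs=1)
    return separate - merged


if __name__ == "__main__":
    # Two hypothetical short suites with 5 and 6 minutes of actual test time.
    print(saved_by_merging([5.0, 6.0]))
```

With these assumed numbers, merging two suites saves one setup plus one reboot (7 minutes) per push, per platform.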
										
On one side (from catlee's suggestion), we would like to move a11y and scroll into 'chrome', remove ts, twinopen from 'chrome', and add tp5.		

From my analysis, these are some changes I propose doing to begin with:
* merge crashtest and jsreftest
* merge a11y & scroll into 'chrome'
* merge svg and nochrome (or paint)
* suites m3, m4 & m5 finish significantly earlier - could we just have m3 & m4? (merge m5 into the other two)
** btw debug mochitests can take up to 10 times more time than optimized; is that expected?

How does all of this sound?

I would like to start with just tackling anything below 10 minutes and then do a second pass once we have new data and discover how much work is involved for these few changes (I am thinking of tbpl).

This is from the spreadsheet's summary. These are suites that take on average less than 10mins to run.
This normally means that the ratio of setup time VS test run is also low
							
	    fed	f64	leo	sno	w7	xp					
a11	     x	 x	n/a	n/a	x	x
scroll	     x	 x	x	x	x	x
crashtest    x	 x	x	x	x	x
m5	     x	 x	x	x	x	x
jsreftest    x	 x	x	x	x	x					
nochrome     x	 x	x	x	x	x
m4	     x	 x			x	x					
m3	     x	 x	x	x		x					
debug_crash  x	 x	x	x	x					
m2	     x	 x	x	x	x	x					
svg	     x	 x	x	x	x	x					
paint	     x	 x	x	x	x	x					
reftest	     		x								
xpcshell			x
(Reporter)

Comment 5

7 years ago
Putting on the side for now.
Bug 661585 has higher priority as it will give more noticeable improvements.
Assignee: armenzg → nobody
Blocks: 661585
Priority: P2 → P4
(Reporter)

Comment 6

7 years ago
(In reply to comment #4)
> From my analysis, these are some changes I propose doing to begin with:
> * merge crashtest and jsreftest
> * merge a11y & scroll into 'chrome'

Let's start with these 2 items first.
Assignee: nobody → armenzg
Priority: P4 → P3
Summary: Determine suites to merge into one job → Merge test and talos suites that finish in less than 10 minutes to improve wait times
Whiteboard: [waittimes]
(Reporter)

Comment 7

7 years ago
Also to disable ipcplugin in m-o.
You do mean "disable ipcplugins on 10.5 only," right?
(Reporter)

Comment 9

7 years ago
Yes indeed! (mind -> bug dump fail!)
I'm in favour of combining short-running suites, but we still need to sort out the reporting side of that on TBPL.
Depends on: 594415
Is there actually a *third* bug, besides just this and bug 586418 (of which this is a no-question straight-up duplicate)? I know that somewhere I typed a comment explaining that if you want to get unblocked and gain some machine time, you just need to do the Talos combining separately, because absolutely nobody cares where you put a particular Talos test, because absolutely nobody knows where they are now. Even TryChooser just shrugs and says "go look at SUITES in config.py if you have to know." No need to post to newsgroups asking for permission, no need to patch tbpl, just push-and-reconfig and post that it's already done. You could even claim no need for review, since there's already a reviewed patch to do it in bug 586418, if not for the way that review's 9 months old.
(Reporter)

Comment 12

7 years ago
To confirm what philor says: for talos suites, no extra work is required unless it is a new suite (e.g. tp5).

Perhaps some TryChooser website/syntax needs to be adjusted.

You are right, it is a straight dupe of bug 586418, but let me take care of duping it myself when I start working on it.
(Reporter)

Comment 13

7 years ago
To get this untangled, I will take care of just the talos suite merges on this bug.

Merging unit test suites on bug 586418 will require a lot of work on bug 594415.

Formatting sucks in comment 4. Let me adjust it.
(In reply to comment #4)
>
>            fed  f64  leo  sno  w7   xp
> a11        x    x    n/a  n/a  x    x
> scroll     x    x    x    x    x    x
> nochrome   x    x    x    x    x    x

> crashtest  x    x    x    x    x    x
> m5         x    x    x    x    x    x
> jsreftest  x    x    x    x    x    x
> m4         x    x    ok   ok   x    x
> m3         x    x    x    x    ok   x
> d_crash    ok   x    x    x    x    x
> m2         x    x    x    x    x    x
> svg        x    x    x    x    x    x
> paint      x    x    x    x    x    x
> reftest    ok   ok   x    ok   ok   ok
> xpcshell   ok   ok   ok   x    ok   ok
No longer depends on: 594415
Priority: P3 → P2
Summary: Merge test and talos suites that finish in less than 10 minutes to improve wait times → Merge talos suites that finish in less than 10 minutes to improve wait times
(Reporter)

Comment 14

7 years ago
Created attachment 544315 [details] [diff] [review]
merge a11y and scroll into chrome suite

This is an unbitrotten version of attachment 473040 [details] [diff] [review].

This has not yet been tested. I had to make some adjustments for OLD_BRANCH support for 1.9.2.

Updated

7 years ago
Whiteboard: [waittimes] → [waittimes][buildfaster:p1]
(Reporter)

Comment 15

6 years ago
Created attachment 552199 [details] [diff] [review]
merge a11y and scroll into chrome suite & create chrome_mac for mac only
Attachment #544315 - Attachment is obsolete: true
Attachment #552199 - Flags: review?(catlee)
(Reporter)

Comment 16

6 years ago
Created attachment 552468 [details] [diff] [review]
merge a11y and scroll into chrome suite & create chrome_mac for mac only & remove twinopen except 1.9.2

I have decided to put together in this patch the work from bug 660124 as well.
Attachment #552199 - Attachment is obsolete: true
Attachment #552199 - Flags: review?(catlee)
Attachment #552468 - Flags: review?(jmaher)
Attachment #552468 - Flags: review?(catlee)
(Reporter)

Updated

6 years ago
Duplicate of this bug: 660124
Comment on attachment 552468 [details] [diff] [review]
merge a11y and scroll into chrome suite & create chrome_mac for mac only & remove twinopen except 1.9.2

Review of attachment 552468 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty good. Just a simple question below (probably from my lack of understanding of all the scripts). Also, will there be other patches on this bug to remove ts and txul (twinopen)?

::: mozilla/project_branches.py
@@ +12,5 @@
>              'tp': 0,
>              'chrome': 0,
>              'nochrome': 0,
>              'dromaeo': 0,
>              'svg': 0,

do we need to add chrome_twinopen and chrome_mac here?
Attachment #552468 - Flags: review?(jmaher) → review+
Comment on attachment 552468 [details] [diff] [review]
merge a11y and scroll into chrome suite & create chrome_mac for mac only & remove twinopen except 1.9.2

Review of attachment 552468 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-tests/config.py
@@ +137,5 @@
>  
>  SUITES = {
>      'chrome': {
>          'enable_by_default': True,
> +        'suites': GRAPH_CONFIG + ['--activeTests', 'tsscroll:a11y:ts:tdhtml:tsspider'],

This should be 'tscroll', I believe, not 'tsscroll'. Same below.
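A typo like this could be caught before landing with a small sanity check over the `--activeTests` value. The following is only a sketch: `KNOWN_TESTS` and `unknown_active_tests` are hypothetical names, not part of buildbot-configs or Talos, and the set lists only test names mentioned in this bug.

```python
# Hypothetical list of valid Talos test names; only names that appear
# in this bug are included, so a real check would need the full list.
KNOWN_TESTS = {"tscroll", "a11y", "ts", "tdhtml", "tsspider", "twinopen", "tp5"}


def unknown_active_tests(suite_args, known=KNOWN_TESTS):
    """Return the names following '--activeTests' that are not in `known`.

    `suite_args` is the argument list stored under a suite's 'suites' key,
    with the test names given as one colon-separated string.
    """
    if "--activeTests" not in suite_args:
        return []
    index = suite_args.index("--activeTests")
    if index + 1 >= len(suite_args):
        # The flag is present but has no value after it.
        return []
    names = suite_args[index + 1].split(":")
    return [name for name in names if name not in known]
```

For the line under review, `unknown_active_tests(['--activeTests', 'tsscroll:a11y:ts:tdhtml:tsspider'])` would report `['tsscroll']`.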
Attachment #552468 - Flags: review?(catlee) → review+
(Reporter)

Comment 20

6 years ago
Comment on attachment 552468 [details] [diff] [review]
merge a11y and scroll into chrome suite & create chrome_mac for mac only & remove twinopen except 1.9.2

Landed on default:
http://hg.mozilla.org/build/buildbot-configs/rev/dba3f0b5e54e

Addressed all issues including not disabling "chrome" for accessibility branch which needed to run the "a11y" suite.
Attachment #552468 - Flags: checked-in+
(Reporter)

Comment 21

6 years ago
Landed in production yesterday:
http://hg.mozilla.org/build/buildbot-configs/rev/3f14385485dc

I will get new numbers tomorrow and measure what's next.
Whiteboard: [waittimes][buildfaster:p1] → [buildfaster:p1]merged a11y & scroll into chrome - in production since Aug. 16th
Depends on: 682686
Depends on: 682601
(Reporter)

Updated

6 years ago
Priority: P2 → P3
(Reporter)

Updated

6 years ago
Priority: P3 → P4
(Reporter)

Updated

6 years ago
Assignee: armenzg → nobody
(Assignee)

Updated

4 years ago
Product: mozilla.org → Release Engineering

Updated

4 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WORKSFORME