Bug 1608837 Comment 18 Edit History

A few more updates since my discussion with :ahal this past Monday (04/13):

1. it was agreed to split the patches into smaller sizes.
2. it was agreed to find the simplest solution that works reasonably well, then iterate.

For item 2, the perpetual problem has been that if tests are grouped at the highest level (wpt terminology: _groups_) then we lose the ability to provide some semblance of balanced chunks. On the other hand, using any additional granularity beyond _groups_ has the unintended effect of double or triple-scheduling tests depending on how many subdirectories exist.

Since the discussion on Monday I have tried a few approaches:

2a. split groups simply by the number of chunks
2b. split groups using test count
2c. split groups using test count but isolate larger groups

For 2a, the approach showed some promise but could not overcome wildly imbalanced chunks:
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=pending%2Crunning%2Csuperseded%2Cusercancel%2Cretry%2Csuccess%2Ctestfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=61c9779cb5bead66d4161276445cc7b512b21835

Note that `linux1804-64/debug` chunk 6 has a runtime of 137min.
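To illustrate why 2a ends up imbalanced, here is a minimal sketch of splitting groups purely by the number of chunks. This is not the code from the patch; the function name `chunk_by_group_count` and the group names in the usage note are hypothetical.

```python
def chunk_by_group_count(groups, total_chunks):
    """Split groups into `total_chunks` contiguous slices of near-equal length.

    Hypothetical helper: only the *number* of groups per chunk is balanced,
    not the amount of work inside each group.
    """
    groups = sorted(groups)
    base, extra = divmod(len(groups), total_chunks)
    chunks, start = [], 0
    for i in range(total_chunks):
        # The first `extra` chunks each take one additional group.
        size = base + (1 if i < extra else 0)
        chunks.append(groups[start:start + size])
        start += size
    return chunks
```

With `["css", "html", "portals", "dom", "fetch"]` and two chunks, one chunk gets three groups and the other gets two, regardless of how many tests each group contains. A chunk that happens to receive several large groups will run far longer than the rest.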

For 2b, the basic premise is that larger groups (e.g. _html_, _css_) contain many more tests than smaller groups (e.g. _portals_), so the test count can stand in for how much work a group represents.

The base idea appeared sound, and produced more balanced chunks than before:
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=superseded%2Cusercancel%2Cretry%2Cpending%2Crunning%2Csuccess%2Ctestfailed%2Cbusted%2Cexception&classifiedState=unclassified&revision=5d12dcb7942d4a59063167e695a7d11b7e2f0633

Note that `linux1804-64/debug` chunk 18 has a runtime of 103min, which is an improvement.

However, platform combinations like `linux1804-64-asan/opt` or `linux1804-64-qr/debug` _still_ produce consistent timeouts, so this was not going to work.
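The test-count idea behind 2b can be sketched as a greedy balancer: take groups from largest to smallest and always place the next one in the chunk with the fewest tests so far. The greedy strategy is an assumption for illustration (the comment does not say which strategy the patch used), and `chunk_by_test_count` and the counts in the usage note are made up.

```python
import heapq


def chunk_by_test_count(test_counts, total_chunks):
    """Assign groups to chunks so that per-chunk test totals stay close.

    `test_counts` maps a group name to its number of tests (hypothetical input).
    """
    # Heap of (tests assigned so far, chunk index); the lightest chunk pops first.
    heap = [(0, i) for i in range(total_chunks)]
    chunks = [[] for _ in range(total_chunks)]
    # Largest groups first; ties broken by name so the result is deterministic.
    ordered = sorted(test_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for group, count in ordered:
        total, idx = heapq.heappop(heap)
        chunks[idx].append(group)
        heapq.heappush(heap, (total + count, idx))
    return chunks
```

For example, `chunk_by_test_count({"html": 100, "css": 90, "fetch": 20, "portals": 5}, 2)` puts _html_ and _portals_ in one chunk and _css_ and _fetch_ in the other. Test count is only a proxy for runtime, which is consistent with the slower platform combinations still timing out.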

The last and current approach (2c) is to chunk tests using groups, but to isolate the longest-running groups (as reported by runtime information) and split them into smaller lists of subgroups where permissible.
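A rough sketch of the isolation step, under stated assumptions: `expand_long_groups`, the `threshold` parameter, and the shape of the `runtimes` and `subgroups` mappings are all hypothetical, and the sketch assumes a group's subgroups together cover every test in that group.

```python
def expand_long_groups(runtimes, subgroups, threshold):
    """Return the list of units to schedule.

    A group whose recorded runtime exceeds `threshold` is *replaced* by its
    subgroups when it has any; otherwise it stays whole. Replacing (rather
    than adding) keeps each test from being scheduled more than once.
    """
    units = []
    for group in sorted(runtimes):
        children = subgroups.get(group)
        if runtimes[group] > threshold and children:
            units.extend(sorted(children))
        else:
            units.append(group)
    return units
```

The resulting units could then be distributed across chunks by recorded runtime instead of test count. A long-running group without usable subgroups stays a single unit, which is the "if permissible" caveat above.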