Closed Bug 1548160 Opened 6 months ago Closed 5 months ago

revisit chunking of xpcshelltests

Categories

(Testing :: General, enhancement, P3)

Tracking

(firefox68 fixed)

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: egao, Assigned: egao)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Summary

Similar to bug #1548106, xpcshelltest is currently run in many chunks, which I feel may be unnecessary, as each chunk introduces additional overhead to set up the environment.

An example push is seen here.

Data

Using X1 as standardized example (where comparable in chunk count), in the mozilla-central revision e8aebe488b2f2e567940577de25013d00e818f7c (linked above):

linux64-shippable: 6 minutes total, of which setup took 17:59:24 - 18:01:06 = 00:01:42
linux64-asan: 16 minutes total, of which setup took 16:49:21 - 16:51:38 = 00:02:17

Thoughts

Each chunk is running very quickly and requires approximately 1-2 minutes to set up. If we can reduce the number of chunks required to 50% of current values for linux64-debug for example, it is possible to save 12 minutes of overhead per push.
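
The estimate above can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: the starting count of 12 chunks and the 2 minute setup cost are assumptions picked from the upper end of the ranges mentioned in this bug, not measured values, and `overhead_saved` is a hypothetical helper.

```python
def overhead_saved(current_chunks, new_chunks, setup_minutes_per_chunk):
    """Minutes of per-chunk setup overhead avoided by running fewer chunks."""
    removed = current_chunks - new_chunks
    return removed * setup_minutes_per_chunk


# Assumed inputs: 12 chunks halved to 6, about 2 minutes of setup each.
saved = overhead_saved(current_chunks=12, new_chunks=6, setup_minutes_per_chunk=2)
print(saved)  # 12
```

With a 1 minute setup cost the same reduction would save 6 minutes, so the figure scales directly with whatever the real setup time turns out to be.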

Type: defect → enhancement

Baseline: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&resultStatus=pending%2Crunning%2Csuperseded%2Cusercancel%2Cretry%2Csuccess%2Ctestfailed%2Cbusted%2Cexception&classifiedState=unclassified&tier=1%2C2%2C3&group_state=expanded&revision=e8aebe488b2f2e567940577de25013d00e818f7c&searchStr=xpcshell%2Cccov&selectedJob=243356572

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=8707c76c267fec042a9842321562d42482cdd0bb

The above try push shows the results of reducing chunks across the board:

            android-em-4.3-arm7-api-16/debug: 12
            macosx.*[^ccov]/.*: 1
            windows.*[^ccov]/.*: 1
            .*-ccov/.*: 5
            default: 4
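
To sanity check which platforms each pattern above picks up, the keys can be tried against a few platform names. This is a minimal sketch, assuming each key is a regular expression that must match the whole platform name and that the first matching key wins; the real task configuration resolver may behave differently, and `chunks_for` is a hypothetical helper.

```python
import re

# Chunk counts from the try push, in the order listed above.
CHUNKS_BY_PLATFORM = [
    (r"android-em-4.3-arm7-api-16/debug", 12),
    (r"macosx.*[^ccov]/.*", 1),
    (r"windows.*[^ccov]/.*", 1),
    (r".*-ccov/.*", 5),
]
DEFAULT_CHUNKS = 4


def chunks_for(test_platform):
    """Return the chunk count of the first pattern matching the whole name."""
    for pattern, chunks in CHUNKS_BY_PLATFORM:
        if re.fullmatch(pattern, test_platform):
            return chunks
    return DEFAULT_CHUNKS


print(chunks_for("linux64/opt"))          # 4 (default)
print(chunks_for("macosx64/opt"))         # 1
print(chunks_for("macosx64-ccov/debug"))  # 5
```

Note that `[^ccov]` is a character class, not a negated word: it rejects any platform whose last character before the `/` is `c`, `o` or `v`. That excludes the `-ccov` platforms as intended, but a made-up name such as `windows-foo/opt` would also be skipped and fall through to the default.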

ccov
in the baseline push, ccov builds appear to have uneven chunking with some chunks going over the 30 minute soft rule, and other chunks running for only 10 minutes.

the goal was to reduce the chunks from 8 to a manageable 5, in the hopes that some faster chunks would be merged together. The shorter chunks are combined, but at the same time the longer chunks also take correspondingly longer to run.

windows, macosx
chunk count remains at 1, as it currently stands. Runtime of these single-chunk xpcshelltest jobs is sometimes over 30 minutes and sometimes under. Considering the current baseline also sees the same behavior, this is not a concern.

linux
linux32 and linux64 had their chunk counts reduced greatly, from 8 or 12 to 4 across the board.

For most of the linux runs the X4 chunk appears to take the longest, in some cases (linux64 opt) exceeding 45 minutes. Despite this I don't think this is cause for concern since even on the baseline, linux64 opt takes 40 minutes, so the extra 5 minutes consumed in the reduced chunk is easily made up by savings from reduced overhead.

40 minutes is long, but it does simplify our chunks. What runtime do we get with 6? closer to 30?

:jmaher - yesterday I ran a try push with 6 chunks for a bunch of platforms including linux variants. It is available here.

First impression is that chunk runtimes don't decrease much between 4 and 6 chunks.
Using linux64/opt as example:

4 chunks: 15, 14, 20, 45
6 chunks: 11, 8, 7, 14, 14, 38

So it looks like reducing chunks from 6 -> 4 had the effect of redistributing the tests from the very short chunks (8, 7) into the moderately long chunks (11, 14), with some also spilling over into the last chunk, which takes the longest.
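
For a quick side-by-side of the two configurations, the per-chunk runtimes quoted above for linux64/opt can be summed up. This is just arithmetic on those numbers (in minutes); `summarize` is a hypothetical helper.

```python
# Per-chunk runtimes in minutes for linux64/opt, as quoted above.
runtimes = {
    "4 chunks": [15, 14, 20, 45],
    "6 chunks": [11, 8, 7, 14, 14, 38],
}


def summarize(minutes):
    """Return (total machine minutes, longest chunk, shortest chunk)."""
    return sum(minutes), max(minutes), min(minutes)


for label, minutes in runtimes.items():
    total, longest, shortest = summarize(minutes)
    print(f"{label}: total={total} longest={longest} shortest={shortest}")
# 4 chunks: total=94 longest=45 shortest=14
# 6 chunks: total=92 longest=38 shortest=7
```

The totals are nearly identical, and in both cases the last chunk dominates the wall-clock time.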

An idea for another task efficiencies project could be to investigate the chunking mechanism, to better distribute the load for situations like this. Since the chunking mechanism is not related to overhead, it will be outside the scope of this project.

overall I don't like the 45, but 6 chunks has a 38. Chunking is basically taking the list of tests we have and dividing them up- we are using chunk_by_slice for xpcshell:
https://searchfox.org/mozilla-central/source/testing/xpcshell/runxpcshelltests.py#900

and the definition is here:
https://searchfox.org/mozilla-central/source/testing/mozbase/manifestparser/manifestparser/filters.py#153

xpcshell runs tests in parallel and any failures are repeated at the end in series. Unlike mochitest and reftest it doesn't chunk_by_dir.

We could look at the runtimes of the tests and find the individual tests which run longer than normal, then either split them up, limit them to select platforms, isolate them in another job, or chunk_by_runtime and include test weights (as we do for mochitest).
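
To illustrate why count-based slicing can leave one chunk much longer than the rest, and what weighting by runtime would change, here is a simplified sketch. It is not the manifestparser implementation: `chunk_by_count` is a stand-in for splitting a test list into contiguous, equally sized slices, `chunk_by_weight` is a plain greedy balancer, and the test names and runtimes are made up.

```python
import heapq


def chunk_by_count(tests, total_chunks):
    """Split tests into contiguous slices of near-equal length (ignores runtime)."""
    size, extra = divmod(len(tests), total_chunks)
    chunks, start = [], 0
    for i in range(total_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(tests[start:end])
        start = end
    return chunks


def chunk_by_weight(tests, total_chunks):
    """Greedy balancing: longest test first, always into the lightest chunk."""
    heap = [(0, i) for i in range(total_chunks)]  # (minutes so far, chunk index)
    chunks = [[] for _ in range(total_chunks)]
    for name, minutes in sorted(tests, key=lambda t: t[1], reverse=True):
        load, i = heapq.heappop(heap)
        chunks[i].append((name, minutes))
        heapq.heappush(heap, (load + minutes, i))
    return chunks


def loads(chunks):
    """Total minutes per chunk."""
    return [sum(minutes for _, minutes in chunk) for chunk in chunks]


# Hypothetical runtimes in minutes; the slow tests sit at the end of the list.
tests = [("a", 2), ("b", 3), ("c", 2), ("d", 3),
         ("e", 4), ("f", 3), ("g", 12), ("h", 15)]
print(loads(chunk_by_count(tests, 4)))   # [5, 5, 7, 27]
print(loads(chunk_by_weight(tests, 4)))  # [15, 12, 9, 8]
```

Both splits run the same 44 minutes of tests, but the weighted one cuts the longest chunk from 27 to 15 minutes. A single very long test still sets a floor on the longest chunk, which is why splitting such tests up is listed as a separate option.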

I agree this is out of scope, but good to have an understanding of what we are doing and what we could easily do.

As for chunks, it seems we save about 2-3 minutes per chunk we remove, and the runtimes are better except for the single 45 minute job. That is just one outlier, and we already have an outlier. I give this a thumbs up.

Priority: -- → P3
  • ccov chunks are set to 6, with exception of macosx64-ccov at 8
  • various linux platforms saw reduction in chunks from 8 to 4

linux

4 chunks
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&searchStr=xpcshell&revision=f725d3c1c16b612b9748813d6c4d0bb4844ac575&selectedJob=244078851

observations

  • chunk runtimes are either uneven, or generally under the 30 minute mark but just so.

5 chunks
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=ae20fb489e94cf6a9df348cc3c9e9f932f3d5408

observations

  • similar to above

6 chunks
https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&searchStr=xpcshell&revision=787eb0c2a1b1ded6589c4f5d32cff54bbc957b5a&selectedJob=243786868

observations

  • too many small chunks for not much gain

conclusion

  • linux chunk sizes of 4 or 5 are preferable; if uneven chunks can be resolved, 4 chunks or even 3 chunks may be preferable.

windows10-aarch64

1 chunk

  • runtime is too long (> 60 min)

2 chunks

  • runtime is too long
  • runtime is uneven

3 chunks

  • chunk runtime is uneven
  • currently used in mozilla-central

conclusion

  • windows10-aarch64 chunk size of 3 (current) or 4 is preferable.
Attachment #9062279 - Attachment description: Bug 1548160 - task efficiency - review and reduce chunk count for various platforms → Bug 1548160 - task efficiency: review and reduce chunk count of xpcshell for various platforms
No longer depends on: 1548106
Pushed by egao@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b447fc4d689d
task efficiency: review and reduce chunk count of xpcshell for various platforms r=gbrown,jmaher
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Regressions: 1552580

I see that osx debug is 2 chunks and opt is 5 chunks:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa906ac6a62cb0d8e9d8e73b5804183cffc720cd

opt runs fast, 4 chunks in <=10 minutes and 5th chunk is 20+ minutes;
debug runs slow ~17 and ~37 minutes

I suspect there are one or two manifests which are longer. Maybe a follow-up here is to split large manifests into smaller ones so we can load balance better?
