Better propagation of test paths
Categories
(Firefox Build System :: Task Configuration, enhancement)
Tracking
(Not tracked)
People
(Reporter: egao, Unassigned)
References
(Blocks 1 open bug)
Details
Currently, all test paths scheduled for the chunk are passed to mozharness using environment variables.
They are structured like so:
"MOZHARNESS_TEST_PATHS": {
'web-platform-tests': [
'test',
'test-2'
]
}
This is problematic for two reasons:
- it is unwieldy as the test paths grow.
- environment variables have a finite character limit and this is rather quickly reached for windows platforms.
Addressing this issue with some sort of an artifact generated by the decision task that can be downloaded then read by the chunks would result in significantly better chunk balance as it would unlock ability to specific individual tests to be run.
Reporter | ||
Comment 1•5 years ago
|
||
:ahal - I am looking to making this change as the panacea for my web-platform-test chunk imbalance issues.
My work on web-platform-tests-reftest so far leads me to believe that best permanent solution for chunking issues for web-platform-tests is to be able to specify individual tests, thus avoiding both problems: double-scheduling and imbalanced chunks.
Specifically, what led to this bug is that web-platform-tests-reftest
on android-em experience permanent chunk timeout anywhere the /css/CSS2
directory was run, since this is quite a large directory. In the current paradigm, web-platform-tests harness splits the /css/CSS2
directory across two or more chunks. I could not replicate this behavior with the current chunking algorithm.
Am I going in the right direction for this? Is this what you had in mind in https://bugzilla.mozilla.org/show_bug.cgi?id=1634554#c4?
Reporter | ||
Comment 2•5 years ago
•
|
||
Some pushes to back up the statements:
try push with timeout extended to 180min: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&selectedTaskRun=YQoe1jTYTv2G4_8wGy_1tw-0&revision=9148eed0e7c1cc8732afb01502a2fd0f2633c77d
Both pushes see chunk 6 of web-platform-tests-reftest
time out on android-em
.
As an experiment I resurrected the path splitting algorithm that I devised for web-platform-tests
, but android-em
still times out after 90min.
push with path split: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=c843568b85140061cb2e2037b1c38fc91cbc2bcd
This leads me to believe that web-platform-tests-reftest is irreconcilable with the current chunking algorithm.
Reporter | ||
Comment 3•5 years ago
|
||
Instead of passing in individual tests, maybe start in conjunction with Bug 1634554 so continue passing in groups.
Comment 4•5 years ago
|
||
Yes, we'll want to keep passing in groups. I do agree that MOZHARNESS_TEST_PATHS feels like a hack and a more robust method of passing down tests would be valuable.
That said I can think of two cons to the artifact approach:
- The task definitions for each task would be identical (other than the label name). I don't know if this is a problem or not.
- Setting an env is "free", whereas downloading an artifact adds overhead.
I think this is fairly tangential to bug 1634554 (i.e, we can solve that without this). Though it's definitely worth keeping this one on file too.
Comment 5•5 years ago
|
||
Note: Taskcluster does have the ability to "mount" files on the workers:
https://docs.taskcluster.net/docs/reference/workers/generic-worker/docker-posix-payload
Though looks like if we're writing the contents explicitly there's a 64Kb limit. So would have a similar limitation in size as the env. It's also possible to do a url
or artifact
mount, so could directly mount the file on the worker so mozharness wouldn't need to handle it.
Comment 6•5 years ago
|
||
Comment 7•5 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #5)
Note: Taskcluster does have the ability to "mount" files on the workers:
https://docs.taskcluster.net/docs/reference/workers/generic-worker/docker-posix-payload
Note that while I'd really like to have this feature on docker workers, generic-worker's docker support is not complete (and certainly not deployed everywhere.
Updated•2 years ago
|
Description
•