Expose platform/configuration in tasks
Categories
(Firefox Build System :: Task Configuration, enhancement)
Tracking
(firefox96 fixed)
Tracking | Status | |
---|---|---|
firefox96 | --- | fixed |
People
(Reporter: marco, Assigned: ahal)
References
(Blocks 1 open bug)
Details
Attachments
(6 files, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
Currently I'm resorting to parsing the label of the task to figure out its platform and configuration (windows/linux/..., debug/opt, normal/fis/whatever). It'd be nice if this information was available in AD, in a way that could be easily used later as input to taskgraph.
Assignee | ||
Comment 1•5 years ago
|
||
Yes, we should generate this value in the transforms such that it is a unique configuration that tests can run on. I.e, takes all platform and runtime information into account.
If possible, I'd suggest we try to avoid the "label" approach and store this as a dict. Though I certainly see the value in passing strings around. Maybe we could do both.
Updated•5 years ago
|
Comment 2•5 years ago
|
||
In Treeherder we would take a combination of values and turn them into a signature/string:
https://github.com/mozilla/treeherder/blob/368c112266f4f276251a4886a5337fcb17b3a1e9/treeherder/etl/jobs.py#L147-L171
In Perfherder that looks more like a series of values separated by a comma.
Could we generate such artifact at the taskgraph generation level and somehow make the info available via Treeherder/AD?
Assignee | ||
Comment 3•5 years ago
|
||
Kyle: fwiw I think there is nothing for you to do here yet. We need to expose the data in each task first.
Armen: Yes. I think we would just make it a part of each task definition. Possibly a tag (in which case there might be nothing to do on the AD side, unsure about treeherder).
Reporter | ||
Comment 4•5 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #1)
If possible, I'd suggest we try to avoid the "label" approach and store this as a dict. Though I certainly see the value in passing strings around. Maybe we could do both.
I agree! It could even be a string, as long as it's structured, but making it a dict enforces it and is way cleaner and self-explanatory. Users can then decide if they want to somehow stringify it and how.
I imagine we have several attributes to take into account, and some of them might have multiple possible values:
{
"os": "linux",
"arch": "x86",
"config": ["fission", "webrender"],
}
Comment 5•5 years ago
|
||
I am glad there is interest in this. I have been using something like this for my own dashboards; to accelerate searches. There is a subroutine in the ActiveData-ETL pipeline that "parses" the run.key
into properties: It translates string segments into dicts, which are then merged into the ETL output.
Here are the lookup tables: They map strings to properties.
https://github.com/klahnakoski/ActiveData-ETL/blob/2c6b8a7f7a6cbc3864e869828e3befd4c5faa43a/activedata_etl/imports/task.py#L458
Here is the test suite:
https://raw.githubusercontent.com/klahnakoski/ActiveData-ETL/dev/tests/resources/metadata_names.json
These mappings have never been reviewed; so reviews and requests from domain experts are appreciated.
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Updated•4 years ago
|
Comment 6•4 years ago
|
||
This has been on my wishlist for a while as well, I'm happy to lend a hand if needed.
Assignee | ||
Comment 7•3 years ago
|
||
I believe this is safe to remove because:
- There are only 3 talos tasks that use this platform
- They don't run anywhere by default (try only)
- It is using reference hardware from 2017
Dropping this platform simplifies the logic that parses the 'test-platform' key
(in future commit) as this platform has a non-standard format compared to all
the others.
Assignee | ||
Comment 8•3 years ago
|
||
This was dumping the entire output of the generation after the pytest
failure. Which basically makes these tests impossible to work with (as you need
to scroll up forever to find the error). If you had a previous test run in
your shell buffer prior, it was really difficult to find the start of the
current run.
Ideally we'd still have a way of seeing this ouput when requested.. maybe we
could save it to a file? But for now this is a much saner default. Plus if
additional context is needed, we can focus on re-writing the tests to provide
it in their error messages.
Depends on D131279
Assignee | ||
Comment 9•3 years ago
|
||
This will:
-
Give consumers of test tasks a way to reliable determine the configuration
it is running under without needing to parse magic labels. -
Consolidate much of the logic around parsing the build and test platform
strings.
Depends on D131280
Assignee | ||
Comment 10•3 years ago
|
||
This also adds way to bypass the 'check_schema' call since this schema's keys
contains strings from variants.yml
which sometimes have underscoes in them.
Depends on D131281
Assignee | ||
Comment 11•3 years ago
|
||
This avoids re-parsing the build/test platform a second time by re-using what
is already defined in the test setting.
This does result in a few differences in the taskgraph, but in all cases that
I've been able to detect, it's actually fixing errors that were previously
going uncaught.
Depends on D131282
Assignee | ||
Comment 12•3 years ago
|
||
The copy and deepcopy operations currently use 'setitem' which obviously fails
for 'ReadOnlyDict'. But copying the dict yields a new instance so there should
be no reason this is prevented.
Specifically, I'd like to make certain subsets of task configuration read-only.
But copying is needed when we split the task into multiple tasks (e.g for
chunking).
Depends on D131283
Assignee | ||
Comment 13•3 years ago
|
||
This will make it easy for consumers to check whether a set of tasks are on the
same test setting or not by comparing their hash.
This also uses a 'ReadOnlyDict' to ensure the setting isn't modified later.
Depends on D131284
Assignee | ||
Comment 14•3 years ago
|
||
Hey Dave,
Do you know if the test-windows10-64-ref-hw-2017-qr/opt-talos-webgl-gli-e10s tasks are still needed?
It's fine if they are (I'll just need to revisit my parser logic here). Just checking as I accidentally thought they were unused until Joel pointed out my error. Figured I'd confirm with you before re-doing my stack.
Comment 15•3 years ago
|
||
We're discussing this in bug 1691561 comment 79, and may decide to retire this hardware from our CI infrastructure. I don't know how soon we will come to a decision here, so I suggest moving forward with the assumption that these tasks are still needed and to follow the referenced bug for further updates.
Assignee | ||
Comment 16•3 years ago
|
||
Thanks, sounds good.
Updated•3 years ago
|
Comment 17•3 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #16)
Thanks, sounds good.
We've decided to retire this hardware pool. We're planning to identify a replacement reference device, but this shouldn't block your work on this bug.
Comment 18•3 years ago
|
||
Comment 19•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/145c441094fb
https://hg.mozilla.org/mozilla-central/rev/78f0e18b4a79
https://hg.mozilla.org/mozilla-central/rev/b46d18801724
https://hg.mozilla.org/mozilla-central/rev/6456f575421f
https://hg.mozilla.org/mozilla-central/rev/0b71677b32d1
https://hg.mozilla.org/mozilla-central/rev/5363cdd650bc
Description
•