Closed Bug 1390169 Opened 7 years ago Closed 7 years ago

Make linux64-qr talos tests show up properly in TreeHerder and PerfHerder

Categories

(Tree Management :: Treeherder: Data Ingestion, enhancement, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kats, Assigned: emorley)

Attachments

(1 file)

Currently, we have some test suites (e.g. reftests) already running on the linux64-qr test platform. I'm trying to add talos tests to run as well; bug 1383149 is the main bug for this.

My latest attempt is in the try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ed2a904ec830c8ebce84c241b072358a366de02

The jobs themselves seem to be running correctly (the buildbot configs are set up and running the talos commands with webrender enabled). However, the jobs show up in TreeHerder under "Linux x64 opt" instead of "Linux x64 QuantumRender opt". Similarly PerfHerder seems to be mixing the talos results from the linux64-qr talos jobs with the talos results from regular linux64 talos jobs, which is not good.

Note that for all of the QR tests (including these talos tests), the *build platform* is linux64 but the *test platform* is linux64-qr. I'm not sure if that's relevant here.

I believe I'm missing some mapping somewhere in the TreeHerder codebase but it's not clear to me what needs updating or where. The ui/js/values.js already contains a mapping from "linux64-qr" to "Linux x64 QuantumRender" but I'm not sure what else is needed.

[1] https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=df9beb781895fcd0493c21e95ad313e0044515ec&filter-searchStr=talos%20stylo&group_state=expanded&selectedJob=122962916
(Since my "latest attempt" try push is still going, here is an older one with completed jobs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5ead161cf5549d9ca2ea5084cf4a9175c30c4099)
Ah, I was about to say that try push doesn't have any buildbot jobs, only taskcluster jobs (which use in-tree configs rather than buildbot.py). In the new try push, the buildbot buildername (from the "job details" tab) is:

Ubuntu HW 12.04 x64 qr try talos dromaeojs-e10s
For this to show correctly in perfherder, we should somehow incorporate qr into the platform name. This file needs to be modified for that to happen:

https://github.com/mozilla/treeherder/blob/master/treeherder/etl/buildbot.py
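
To make the idea concrete, here is a minimal sketch of how a buildername could be mapped to a platform name. The table, the regexes, and the function name are hypothetical and are not copied from buildbot.py; the non-qr buildername in the usage note is assumed by analogy with the qr one quoted above.

```python
import re

# Hypothetical ordered table: the more specific "qr" pattern must come
# first, otherwise the plain linux64 pattern would match qr builders too.
PLATFORM_PATTERNS = [
    (re.compile(r"^Ubuntu HW 12\.04 x64 qr\b"), "linux64-qr"),
    (re.compile(r"^Ubuntu HW 12\.04 x64\b"), "linux64"),
]


def platform_from_buildername(buildername):
    """Return the platform for the first matching pattern, else "unknown"."""
    for pattern, platform in PLATFORM_PATTERNS:
        if pattern.search(buildername):
            return platform
    return "unknown"
```

With this ordering, "Ubuntu HW 12.04 x64 qr try talos dromaeojs-e10s" resolves to linux64-qr, while a buildername without the "qr" token falls through to linux64.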
Ah ok the issue here is that two different approaches have been taken.

The reftest QR jobs add the QR part to the job/group name, whereas the talos QR jobs that are already running on taskcluster have added it to the platform name. Given it's the same builds being used, it would seem the job name is more appropriate?
Flags: needinfo?(bugmail)
Sadly at the moment we have a complete mishmash of conventions being used.

Historically a job would be of form:
* platform: {linux32, windows 10, ...}
* build type: {opt, debug, pgo}
* job/group name: ...

However, all three are now being used to denote variants of different types. For example, "addon" builds use "build type", yet "stylo" builds use "platform" to differentiate.
(In reply to Ed Morley [:emorley] from comment #5)
> Historically a job would be of form:
> * platform: {linux32, windows 10, ...}
> * build type: {opt, debug, pgo}
> * job/group name: ...

In this description, what is "platform" supposed to represent? The build platform? Or the test platform?

I'm ok with making whatever changes you think are best/most consistent. As long as the tests run and the results show up in a useful place, that's really all I care about.
Flags: needinfo?(bugmail)
So jobs have both a build_platform and a machine_platform, which was originally intended to differentiate between the machine that ran the job and the target platform for the build. However in reality other build_platform is used in the UI. At least from the Treeherder side, there is no such property of `test_platform` (that might be an internal buildbot term?).

Will, can Perfherder still do the performance comparisons against different job groups, or does it have to be a different platform name?
Flags: needinfo?(wlachance)
The notion of "test_platform" is something I've seen in the taskcluster code. For example the "linux64-qr/opt" test platform is defined at http://searchfox.org/mozilla-central/rev/6482c8a5fa5c7446e82ef187d1a1faff49e3379e/taskcluster/ci/test/test-platforms.yml#144 and basically says "the linux64-qr/opt test platform uses the linux64/opt build platform and we want to run the qr-tests test set on it"
(In reply to Ed Morley [:emorley] from comment #7)
> So jobs have both a build_platform and a machine_platform, which was
> originally intended to differentiate between the machine that ran the job
> and the target platform for the build. However in reality other
> build_platform is used in the UI. At least from the Treeherder side, there
> is no such property of `test_platform` (that might be an internal buildbot
> term?).
> 
> Will, can Perfherder still do the performance comparisons against different
> job groups, or does it have to be a different platform name?

Perfherder ignores the job group, so you'll need a different platform name (it uses machine_platform). All the etl code is here:

https://github.com/mozilla/treeherder/blob/master/treeherder/etl/perf.py
Flags: needinfo?(wlachance)
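
The consequence of ignoring the job group can be sketched as follows. This is a simplified illustration, not the logic in perf.py: the record layout, the grouping key, and the numeric values are all made up for the example.

```python
from collections import defaultdict


def group_results(results):
    """Bucket values by (machine_platform, suite); job_group is ignored."""
    series = defaultdict(list)
    for result in results:
        key = (result["machine_platform"], result["suite"])
        series[key].append(result["value"])
    return dict(series)


# QR encoded only in the job group: both results land in one series.
mixed = group_results([
    {"machine_platform": "linux64", "job_group": "Talos",
     "suite": "dromaeojs", "value": 100.0},
    {"machine_platform": "linux64", "job_group": "Talos QR",
     "suite": "dromaeojs", "value": 80.0},
])

# QR encoded in the platform name: the results stay in separate series.
separate = group_results([
    {"machine_platform": "linux64", "job_group": "Talos",
     "suite": "dromaeojs", "value": 100.0},
    {"machine_platform": "linux64-qr", "job_group": "Talos",
     "suite": "dromaeojs", "value": 80.0},
])
```

In the first case the QR and non-QR values share a single key, which matches the mixing described in the original report; in the second they are kept apart and can be compared against each other.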
Comment on attachment 8897039 [details] [review]
[treeherder] mozilla:linux-qr-talos > mozilla:master

Kartikaya, Is this what you were after? :-)

(In reply to William Lachance (:wlach) (use needinfo!) from comment #9)
> Perfherder ignores the job group, so you'll need a different platform name

Eugh. Guess we'll have to put up with the inconsistencies with QR reftest then.
Attachment #8897039 - Flags: review?(bugmail)
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Component: Treeherder → Treeherder: Data Ingestion
Priority: -- → P1
Comment on attachment 8897039 [details] [review]
[treeherder] mozilla:linux-qr-talos > mozilla:master

Review of attachment 8897039 [details] [review]:
-----------------------------------------------------------------

It seems reasonable but I have no way of testing it until it's landed. So r+ I guess.
Attachment #8897039 - Attachment is patch: true
Attachment #8897039 - Attachment mime type: text/x-github-pull-request → text/plain
Attachment #8897039 - Flags: review?(bugmail) → review+
Comment on attachment 8897039 [details] [review]
[treeherder] mozilla:linux-qr-talos > mozilla:master

Not sure why Bugzilla changed the attachment type...
Attachment #8897039 - Attachment is patch: false
Attachment #8897039 - Attachment mime type: text/plain → text/x-github-pull-request
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #13)
> Not sure why Bugzilla changed the attachment type...

Another case of bug 1384310 :-)
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/2e1f1a24076611a9897dced59bd779b5d711d7e3
Bug 1390169 - Add support for linux64-qr talos

It's having to be added as a platform rather than a new job/group
name, since otherwise comparisons can't be made in Perfherder with
the existing tests.
I've pushed master to the production branch, so this should deploy within the next 12 or so minutes. Existing jobs won't have their metadata fixed, but any ingested after that point will reflect the changes here.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
This seems to have done the job, thanks!