Activate performance alerting on multi commit Fenix performance tests
Categories
(Testing :: Performance, task, P2)
Tracking
(Not tracked)
People
(Reporter: Bebe, Assigned: Bebe)
References
()
Details
Attachments
(2 files)
To be able to start performance sheriffing on Fenix, we need to turn on alerting on these tests.
Currently the app link test is running on Mozilla-Central as a try job.
As we have multiple test suites on m-c and we don't want to activate alerting on this branch, we need to figure out a way to activate alerting for the Fenix perf tests.
From Riot discussions we decided one option is to move the cron job to the fenix repo and activate perfherder alerting in there.
Assignee | ||
Comment 1•4 years ago
:sparky, can you sync with :jlorenzo and identify the best solution to run and alert on the app link test?
Comment 2•4 years ago
:jlorenzo, based on our conversations on riot, it sounds like we'll have to temporarily duplicate the tasks while you add a way for us to run our m-c tasks on fenix? I'm wondering if it would be possible to get a new option in run-on-projects to include the fenix and reference-browser branch there for this de-duplication solution.
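As a rough illustration of the proposal above, the sketch below filters task definitions by a run-on-projects list that also accepts fenix and reference-browser. It is a simplified, hypothetical model: the task labels and the tasks_for_project helper are made up for illustration, a missing run-on-projects key is assumed to mean "run nowhere", and the real taskgraph matching logic is more involved.

```python
# Hypothetical, simplified model of run-on-projects filtering.
# Labels and the helper name are illustrative, not taken from taskgraph.


def tasks_for_project(tasks, project):
    """Return the sorted labels of tasks whose run-on-projects covers `project`."""
    selected = []
    for label, definition in tasks.items():
        # Assumption: a task without the key runs on no project.
        projects = definition.get("run-on-projects", [])
        if "all" in projects or project in projects:
            selected.append(label)
    return sorted(selected)


TASKS = {
    "perftest-android-view": {"run-on-projects": ["fenix", "reference-browser"]},
    "perftest-android-main": {"run-on-projects": ["fenix"]},
    "perftest-linux-example": {"run-on-projects": ["mozilla-central"]},
}

if __name__ == "__main__":
    print(tasks_for_project(TASKS, "fenix"))
```

With such an option, a single definition kept in m-c could declare which mobile projects should schedule it, which is the de-duplication being asked for.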
:bebe, you'll have to port these 4 tasks to the Fenix taskcluster definitions - you'll also need to figure out how to get the mozperftest package into the fenix repo from mozilla-central (:jlorenzo can help with this):
VIEW test: https://searchfox.org/mozilla-central/source/taskcluster/ci/perftest/android.yml#35-100
MAIN test: https://searchfox.org/mozilla-central/source/taskcluster/ci/perftest/android.yml#227-296
You'll need to also make these run on a cron task (running at around 4:00AM). There are also these job-defaults to integrate: https://searchfox.org/mozilla-central/source/taskcluster/ci/perftest/android.yml#6-13
Comment 3•4 years ago
Sorry for the delay.
I agree, duplication is our best solution if we want quick results.
I'm wondering if it would be possible to get a new option in run-on-projects to include the fenix and reference-browser branch there for this de-duplication solution.
That could be a nice way to tell a job has to run on the fenix repo, for instance. That said, I'm not sure how we could get this working under the hood. I currently see 2 potential ways:
- The fenix decision task runs taskgraph against the fenix repo (which is what currently happens), then runs taskgraph against mozilla-central. That doesn't seem achievable in the near future because taskgraph is built with the assumption that it runs a single time.
- The fenix decision task checks out parts of mozilla-central and taskgraph runs once and reads task definitions from 2 different places. This is a different kind of complexity, but I'm guessing it's as complex as solution 1. :tomprince likely has a more educated view on this.
If we forget about run-on-projects, I see a 3rd solution. As a matter of fact, Releng has also struggled with taskgraph duplication in mobile projects. For instance, most of the changes we make in Fenix have to be done in Reference-Browser. Sometimes they have to be done to 2 more mobile repos. An idea we thought about was to centralize the mobile common config to a single repo. We haven't done anything yet, but maybe the mobile perf tests could belong to this (future) repo. How does that sound to you, :sparky?
you'll also need to figure out how to get the mozperftest package into the fenix repo from mozilla-central
May I have a link to that package? That'll help me find a good way to fetch it in Fenix.
Comment 4•4 years ago
(In reply to Johan Lorenzo [:jlorenzo] from comment #3)
If we forget about run-on-projects, I see a 3rd solution. As a matter of fact, Releng has also struggled with taskgraph duplication in mobile projects. For instance, most of the changes we make in Fenix have to be done in Reference-Browser. Sometimes they have to be done to 2 more mobile repos. An idea we thought about was to centralize the mobile common config to a single repo. We haven't done anything yet, but maybe the mobile perf tests could belong to this (future) repo. How does that sound to you, :sparky?
Good to know we're not the only ones hitting this problem. Hmm, the issue for me with the mobile common config is that it seems like there will still be duplication for us since we'll be defining them in m-c and the common mobile config right? If we could centralize all of taskgraph somehow that would be the best solution for me. Thinking about it further, even if we only define the mobile tests in this common config, we'd still be duplicating transforms, and then there's the issue of being able to run these tests on try so we'd have to still duplicate them completely with this common config.
The single-run assumption is interesting. How do cron-decision tasks fit into this since they also trigger tasks? I wonder if there's a way we could make use of that behaviour in the mobile repos to run things from m-c.
you'll also need to figure out how to get the mozperftest package into the fenix repo from mozilla-central
May I have a link to that package? That'll help me find a good way to fetch it in Fenix.
The android tests use sparse profiles rather than packages actually which you can see in the logs for this test: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=perftest&selectedTaskRun=VmhLvN5qRDe1AZDtITZvcg.0
[vcs 2020-08-05T04:05:40.977Z] executing ['hg', 'robustcheckout', '--sharebase', '/builds/worker/checkouts/hg-store', '--purge', '--upstream', 'https://hg.mozilla.org/mozilla-unified', '--sparseprofile', 'build/sparse-profiles/perftest', '--revision', '451800aa75df8b442154aa120806d77a2b5dc8b0', 'https://hg.mozilla.org/mozilla-central', '/builds/worker/checkouts/gecko']
[vcs 2020-08-05T04:05:41.026Z] (using Mercurial 4.5.3)
[vcs 2020-08-05T04:05:41.026Z] ensuring https://hg.mozilla.org/mozilla-central@451800aa75df8b442154aa120806d77a2b5dc8b0 is available at /builds/worker/checkouts/gecko
[vcs 2020-08-05T04:05:41.170Z] (cloning from upstream repo https://hg.mozilla.org/mozilla-unified)
[vcs 2020-08-05T04:05:41.225Z] (sharing from new pooled repository 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29)
[vcs 2020-08-05T04:05:41.460Z] applying clone bundle from https://hg.cdn.mozilla.net/mozilla-unified/7cb90fa4f485fc9dda5c1fef3ae09a826f83774a.zstd-max.hg
[vcs 2020-08-05T04:05:41.502Z] adding changesets
[vcs 2020-08-05T04:05:43.504Z]
[vcs 2020-08-05T04:05:44.503Z] changesets [> ] 21370/607105 55s
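For reference, the checkout command in the log above can be reassembled as an argument list. This is only a sketch: the robustcheckout_command helper is hypothetical, and every flag and value in it is copied from the logged command rather than from the extension's documentation.

```python
# Hypothetical helper that rebuilds the argv shown in the [vcs] log above.
# All flags and default values are copied from that log line.


def robustcheckout_command(
    revision,
    sparse_profile="build/sparse-profiles/perftest",
    repo="https://hg.mozilla.org/mozilla-central",
    upstream="https://hg.mozilla.org/mozilla-unified",
    sharebase="/builds/worker/checkouts/hg-store",
    dest="/builds/worker/checkouts/gecko",
):
    """Return the hg robustcheckout argv for a sparse checkout of `revision`."""
    return [
        "hg", "robustcheckout",
        "--sharebase", sharebase,
        "--purge",
        "--upstream", upstream,
        "--sparseprofile", sparse_profile,
        "--revision", revision,
        repo,
        dest,
    ]


if __name__ == "__main__":
    print(robustcheckout_command("451800aa75df8b442154aa120806d77a2b5dc8b0"))
```

The part that matters for Fenix is the --sparseprofile argument: only the files listed in build/sparse-profiles/perftest are checked out, which is why no separate mozperftest package is involved.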
Assignee | ||
Comment 5•4 years ago
Comment 6•4 years ago
Thinking about it further, even if we only define the mobile tests in this common config, we'd still be duplicating transforms
Okay, I was wondering if we move them off mozilla-central but there would still be some bits in m-c. This is not a suitable solution, then.
The single-run assumption is interesting. How do cron-decision tasks fit into this since they also trigger tasks? I wonder if there's a way we could make use of that behaviour in the mobile repos to run things from m-c.
I might have said something ambiguous. Just to clarify: we run taskgraph once per decision task. A cron task is a new decision task, so we call taskgraph once more. My proposal #1 assumed we ran taskgraph twice in the same task.
To expand on cron, it basically generates the same full graph of tasks, but ends up selecting the ones we really want. I'm not sure how this could solve our case, but I might be missing something. Feel free to expand your thinking!
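The "generate the full graph, then select" behaviour can be sketched as follows. This is a toy model under stated assumptions: the graph, the labels, and the select_targets helper are hypothetical, and pulling in dependencies of the selected tasks is a simplification of what taskgraph does.

```python
# Toy model of a cron-triggered decision task: the full graph is generated
# as usual, then only the wanted tasks (plus their dependencies) are kept.
# Graph contents and helper name are hypothetical.


def select_targets(full_graph, is_target):
    """Return sorted labels matched by `is_target` plus their dependencies.

    `full_graph` maps each task label to the list of labels it depends on.
    """
    selected = set()
    pending = [label for label in full_graph if is_target(label)]
    while pending:
        label = pending.pop()
        if label in selected:
            continue
        selected.add(label)
        pending.extend(full_graph[label])
    return sorted(selected)


FULL_GRAPH = {
    "build-fenix": [],
    "perftest-view": ["build-fenix"],
    "perftest-main": ["build-fenix"],
    "lint": [],
}

if __name__ == "__main__":
    print(select_targets(FULL_GRAPH, lambda label: label.startswith("perftest-")))
```

Note that the selection only narrows a graph built from one repository's definitions, which is why cron on its own does not pull in tasks defined elsewhere.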
Tom, is there a long-term solution that you think we can apply, here?
The android tests use sparse profiles rather than packages actually which you can see in the logs for this test: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=perftest&selectedTaskRun=VmhLvN5qRDe1AZDtITZvcg.0
Got it, so we somehow need a local checkout of this folder[1], right? As a simple solution, we can download a zip archive from [2]. This hg.m.o endpoint is still supported but will eventually go away (bug 1596135).
[1] https://hg.mozilla.org/mozilla-central/file/297a47c209fae11e550bd0c52513fed00d9b0b2d/python/mozperftest/
[2] https://hg.mozilla.org/mozilla-central/archive/297a47c209fae11e550bd0c52513fed00d9b0b2d.zip/python/mozperftest/
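A minimal sketch of the zip-archive approach, assuming the URL layout shown in [2] stays valid. The archive_url and fetch_archive helpers are hypothetical names; only the URL pattern comes from the link above, and the download step is not run by default.

```python
import io
import urllib.request
import zipfile

# Hypothetical helpers; only the URL layout is taken from link [2] above.


def archive_url(repo, revision, path):
    """Build the hg.mozilla.org zip-archive URL for `path` at `revision`."""
    return f"https://hg.mozilla.org/{repo}/archive/{revision}.zip/{path.strip('/')}/"


def fetch_archive(url, dest):
    """Download the zip archive at `url` and extract it into `dest`."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(dest)


MOZPERFTEST_URL = archive_url(
    "mozilla-central",
    "297a47c209fae11e550bd0c52513fed00d9b0b2d",
    "python/mozperftest",
)

if __name__ == "__main__":
    print(MOZPERFTEST_URL)
    # fetch_archive(MOZPERFTEST_URL, "mozperftest-checkout")  # network access
```

Pinning the revision in the URL keeps the Fenix tasks reproducible, at the cost of bumping it by hand whenever mozperftest changes.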
Comment 7•4 years ago
(In reply to Johan Lorenzo [:jlorenzo] from comment #6)
Thinking about it further, even if we only define the mobile tests in this common config, we'd still be duplicating transforms
Okay, I was wondering if we move them off mozilla-central but there would still be some bits in m-c. This is not a suitable solution, then.
The single-run assumption is interesting. How do cron-decision tasks fit into this since they also trigger tasks? I wonder if there's a way we could make use of that behaviour in the mobile repos to run things from m-c.
I might have said something ambiguous. Just to clarify: we run taskgraph once per decision task. A cron task is a new decision task, so we call taskgraph once more. My proposal #1 assumed we ran taskgraph twice in the same task.
To expand on cron, it basically generates the same full graph of tasks, but ends up selecting the ones we really want. I'm not sure how this could solve our case, but I might be missing something. Feel free to expand your thinking!
Oh ok, thanks for the clarification. What I was trying to get at is whether we could run two decision tasks: one for the mobile-specific common config you suggested, and one for the mozilla-central configs.
The android tests use sparse profiles rather than packages actually which you can see in the logs for this test: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=perftest&selectedTaskRun=VmhLvN5qRDe1AZDtITZvcg.0
Got it, so we somehow need a local checkout of this folder[1], right? As a simple solution, we can download a zip archive from [2]. This hg.m.o endpoint is still supported but will eventually go away (bug 1596135).
[1] https://hg.mozilla.org/mozilla-central/file/297a47c209fae11e550bd0c52513fed00d9b0b2d/python/mozperftest/
[2] https://hg.mozilla.org/mozilla-central/archive/297a47c209fae11e550bd0c52513fed00d9b0b2d.zip/python/mozperftest/
It also pulls the tests from these areas: https://searchfox.org/mozilla-central/source/build/sparse-profiles/perftest
Comment 8•4 years ago
Just a small note to everyone to say that we are going to see if we can run these tests on autoland. If it works, that will get us sheriffing much faster than duplicating. We'll post an update here to mention how it goes.
Comment 10•4 years ago
bugherder
Assignee | ||
Comment 11•4 years ago
@Sparky, as this did not work, any suggestions on how to proceed?
Comment 13•4 years ago
Comment 14•4 years ago
Comment 15•4 years ago
We now have cron set to run on autoland. I tested it by manually triggering it[1]. The graph looks great[2] (I saw 2 intermittent failures - which were unrelated). The next step is to wait for tomorrow 4am UTC[3]. Then we can likely close this bug.
Clearing NI because we went the cron way, for now.
[1] https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=97220a9e8b698e9e0e63d1cd9044f52cdb215fd8&selectedTaskRun=EAXa4GjgST2GwvDNbP_gCg.0
[2] https://firefox-ci-tc.services.mozilla.com/tasks/groups/EAXa4GjgST2GwvDNbP_gCg
[3] https://searchfox.org/mozilla-central/rev/0c682c4f01442c3de0fa6cd286e9cadc8276b45f/.cron.yml#325
Comment 16•4 years ago
This is blocked by the multi-ingestion changes on the perfherder side atm.
Assignee | ||
Comment 17•4 years ago
(In reply to Greg Mierzwinski [:sparky] from comment #16)
This is blocked by the multi-ingestion changes on the perfherder side atm.
This is now unblocked. Is there anything we can do here?
Comment 18•4 years ago
Nope, we're done now. Fenix results are on autoland, have alerting enabled, and the multi-commit ingestion is working.
We should leave this bug open though since the ideal solution for us would be bug 1672250.
Comment 19•4 years ago
The leave-open keyword is there and there has been no activity for 6 months.
:davehunt, maybe it's time to close this bug?
Comment 20•4 years ago
I've opened bug 1709626 for future work so we can close this now.