Open Bug 1656475 Opened 6 months ago Updated 3 months ago

Activate performance alerting on multi commit Fenix performance tests

Categories

(Testing :: Performance, task, P2)

task

Tracking

(Not tracked)

People

(Reporter: Bebe, Unassigned)

References

(Depends on 1 open bug, )

Details

(Keywords: leave-open)

Attachments

(2 files)

To be able to start performance sheriffing on Fenix, we need to turn on alerting on these tests.

Currently the app link test is running on Mozilla-Central as a try job.
Since m-c has multiple test suites and we don't want to activate alerting on this branch, we need to figure out a way to activate alerting for Fenix perf tests.

From Riot discussions we decided one option is to move the cron job to the fenix repo and activate perfherder alerting in there.

:sparky, can you sync with :jlorenzo and identify the best solution to run and alert on the app link test?

Flags: needinfo?(jlorenzo)
Flags: needinfo?(gmierz2)

:jlorenzo, based on our conversations on riot, it sounds like we'll have to temporarily duplicate the tasks while you add a way for us to run our m-c tasks on fenix? I'm wondering if it would be possible to get a new option in run-on-projects to include the fenix and reference-browser branch there for this de-duplication solution.
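
To make the run-on-projects idea concrete, here is a minimal sketch of how such a project filter could behave. It is not taskgraph's implementation: the function name `runs_on_project`, the task dictionaries and labels, the assumed default of "all", and treating "fenix" as an accepted project name are all assumptions for illustration.

```python
# Hypothetical sketch, NOT taskgraph's real run-on-projects handling.
def runs_on_project(task, project):
    """Return True if `task` should be scheduled for `project`."""
    # Assumed default: a task without the key runs everywhere.
    projects = task.get("run-on-projects", ["all"])
    return "all" in projects or project in projects


# Made-up task definitions; only the perftest opts in to "fenix".
tasks = [
    {"label": "perftest-android-view", "run-on-projects": ["mozilla-central", "fenix"]},
    {"label": "some-desktop-test", "run-on-projects": ["mozilla-central"]},
]

fenix_labels = [t["label"] for t in tasks if runs_on_project(t, "fenix")]
```

With a filter like this, the task would be defined once in m-c and merely opted in to the extra projects, which is the de-duplication being asked for.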

:bebe, you'll have to port these 4 tasks to the Fenix taskcluster definitions - you'll also need to figure out how to get the mozperftest package into the fenix repo from mozilla-central (:jlorenzo can help with this):
VIEW test: https://searchfox.org/mozilla-central/source/taskcluster/ci/perftest/android.yml#35-100
MAIN test: https://searchfox.org/mozilla-central/source/taskcluster/ci/perftest/android.yml#227-296

You'll need to also make these run on a cron task (running at around 4:00AM). There are also these job-defaults to integrate: https://searchfox.org/mozilla-central/source/taskcluster/ci/perftest/android.yml#6-13

Flags: needinfo?(gmierz2)
Severity: -- → S3
Priority: -- → P2

Sorry for the delay.

I agree, duplication is our best solution if we want quick results.

I'm wondering if it would be possible to get a new option in run-on-projects to include the fenix and reference-browser branch there for this de-duplication solution.

That could be a nice way to tell a job has to run on the fenix repo, for instance. That said, I'm not sure how we could get this working under the hood. I currently see 2 potential ways:

  1. The fenix decision task runs taskgraph against the fenix repo (which is what currently happens), then runs taskgraph against the mozilla-central. That doesn't seem achievable in the near future because taskgraph is built with the assumption that it runs a single time.
  2. The fenix decision task checks out parts of mozilla-central and taskgraph runs once and reads task definitions from 2 different places. This is a different kind of complexity, but I'm guessing it's as complex as solution 1. :tomprince likely has a more educated view on this.
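
As a rough illustration of option 2 (one taskgraph run reading task definitions from two places), the sketch below merges two label-to-definition mappings in a single pass. Everything in it is hypothetical: `merge_task_definitions`, the labels, and the dictionary shape are stand-ins and do not reflect how taskgraph loads kinds.

```python
# Hypothetical sketch of option 2: a single pass over two definition sources.
def merge_task_definitions(*sources):
    """Combine {label: definition} mappings, refusing duplicate labels."""
    merged = {}
    for source in sources:
        for label, definition in source.items():
            if label in merged:
                # Two repos defining the same label is treated as an error here.
                raise ValueError(f"duplicate task label: {label}")
            merged[label] = definition
    return merged


# Made-up definitions standing in for the fenix repo and the m-c checkout.
fenix_tasks = {"build-debug": {"kind": "build"}}
mc_perf_tasks = {"perftest-applink": {"kind": "perftest"}}

graph = merge_task_definitions(fenix_tasks, mc_perf_tasks)
```

The merge itself is trivial; the complexity mentioned above lies in checking out the right parts of mozilla-central and in the transforms that the m-c definitions depend on.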

If we forget about run-on-projects, I see a 3rd solution. As a matter of fact, Releng has also struggled with taskgraph duplication in mobile projects. For instance, most of the changes we make in Fenix have to be done in Reference-Browser. Sometimes they have to be done to 2 more mobile repos. An idea we thought about was to centralize the mobile common config to a single repo. We haven't done anything yet, but maybe the mobile perf tests could belong to this (future) repo. How does that sound to you, :sparky?

you'll also need to figure out how to get the mozperftest package into the fenix repo from mozilla-central

May I have a link to that package? That'll help me find a good way to fetch it in Fenix.

Flags: needinfo?(jlorenzo) → needinfo?(gmierz2)

(In reply to Johan Lorenzo [:jlorenzo] from comment #3)

If we forget about run-on-projects, I see a 3rd solution. As a matter of fact, Releng has also struggled with taskgraph duplication in mobile projects. For instance, most of the changes we make in Fenix have to be done in Reference-Browser. Sometimes they have to be done to 2 more mobile repos. An idea we thought about was to centralize the mobile common config to a single repo. We haven't done anything yet, but maybe the mobile perf tests could belong to this (future) repo. How does that sound to you, :sparky?

Good to know we're not the only ones hitting this problem. Hmm, the issue for me with the mobile common config is that it seems like there will still be duplication for us, since we'll be defining them in m-c and the common mobile config, right? If we could centralize all of taskgraph somehow, that would be the best solution for me. Thinking about it further, even if we only define the mobile tests in this common config, we'd still be duplicating transforms, and then there's the issue of being able to run these tests on try, so we'd still have to duplicate them completely with this common config.

The single-run assumption is interesting. How do cron-decision tasks fit into this since they also trigger tasks? I wonder if there's a way we could make use of that behaviour in the mobile repos to run things from m-c.

you'll also need to figure out how to get the mozperftest package into the fenix repo from mozilla-central

May I have a link to that package? That'll help me find a good way to fetch it in Fenix.

The android tests use sparse profiles rather than packages actually which you can see in the logs for this test: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=perftest&selectedTaskRun=VmhLvN5qRDe1AZDtITZvcg.0

[vcs 2020-08-05T04:05:40.977Z] executing ['hg', 'robustcheckout', '--sharebase', '/builds/worker/checkouts/hg-store', '--purge', '--upstream', 'https://hg.mozilla.org/mozilla-unified', '--sparseprofile', 'build/sparse-profiles/perftest', '--revision', '451800aa75df8b442154aa120806d77a2b5dc8b0', 'https://hg.mozilla.org/mozilla-central', '/builds/worker/checkouts/gecko']
[vcs 2020-08-05T04:05:41.026Z] (using Mercurial 4.5.3)
[vcs 2020-08-05T04:05:41.026Z] ensuring https://hg.mozilla.org/mozilla-central@451800aa75df8b442154aa120806d77a2b5dc8b0 is available at /builds/worker/checkouts/gecko
[vcs 2020-08-05T04:05:41.170Z] (cloning from upstream repo https://hg.mozilla.org/mozilla-unified)
[vcs 2020-08-05T04:05:41.225Z] (sharing from new pooled repository 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29)
[vcs 2020-08-05T04:05:41.460Z] applying clone bundle from https://hg.cdn.mozilla.net/mozilla-unified/7cb90fa4f485fc9dda5c1fef3ae09a826f83774a.zstd-max.hg
[vcs 2020-08-05T04:05:41.502Z] adding changesets
[vcs 2020-08-05T04:05:43.504Z] 
[vcs 2020-08-05T04:05:44.503Z] changesets [>                                               ]  21370/607105 55s
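
For reference, the sparse checkout in the log above can be expressed as a small helper that rebuilds the same argument list. This is only a sketch: `robustcheckout_args` is a made-up name, and it uses only the flags that appear in the log line, so it says nothing about other options the extension may accept.

```python
# Sketch: rebuild the `hg robustcheckout` argument list seen in the log.
# The helper name is hypothetical; every flag is copied from the log line.
def robustcheckout_args(upstream, sparse_profile, revision, repo, dest,
                        sharebase="/builds/worker/checkouts/hg-store"):
    return [
        "hg", "robustcheckout",
        "--sharebase", sharebase,
        "--purge",
        "--upstream", upstream,
        "--sparseprofile", sparse_profile,  # limits the checkout to the profile's paths
        "--revision", revision,
        repo,
        dest,
    ]


args = robustcheckout_args(
    upstream="https://hg.mozilla.org/mozilla-unified",
    sparse_profile="build/sparse-profiles/perftest",
    revision="451800aa75df8b442154aa120806d77a2b5dc8b0",
    repo="https://hg.mozilla.org/mozilla-central",
    dest="/builds/worker/checkouts/gecko",
)
```
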
Flags: needinfo?(gmierz2) → needinfo?(jlorenzo)
Assignee: nobody → fstrugariu
Status: NEW → ASSIGNED

Thinking about it further, even if we only define the mobile tests in this common config, we'd still be duplicating transforms

Okay, I was wondering if we move them off mozilla-central but there would still be some bits in m-c. This is not a suitable solution, then.

The single-run assumption is interesting. How do cron-decision tasks fit into this since they also trigger tasks? I wonder if there's a way we could make use of that behaviour in the mobile repos to run things from m-c.

I might have said something ambiguous. Just to clarify: we run taskgraph once per decision task. A cron task is a new decision task, so we call taskgraph once more. My proposal #1 assumed we ran taskgraph twice in the same task.

To expand on cron, it basically generates the same full graph of tasks, but ends up selecting the ones we really want. I'm not sure how this could solve our case, but I might be missing something. Feel free to expand your thinking!
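
A toy version of that "generate everything, then select" behaviour might look like the sketch below. The names `select_target_tasks` and the `cron-perftest` attribute are invented for illustration and are not taken from taskgraph.

```python
# Hypothetical sketch: the full graph is built first, then a cron job keeps a subset.
def select_target_tasks(full_graph, wanted):
    """Return the tasks from `full_graph` for which `wanted(task)` is true."""
    return {label: task for label, task in full_graph.items() if wanted(task)}


# Made-up full graph; "cron-perftest" is an invented attribute.
full_graph = {
    "build-debug": {"attributes": {}},
    "perftest-applink": {"attributes": {"cron-perftest": True}},
}

targets = select_target_tasks(
    full_graph,
    lambda task: task["attributes"].get("cron-perftest", False),
)
```

The selection step only narrows a graph that was already generated from one repository, which is why it does not obviously help with pulling in definitions from a second one.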

Tom, is there a long-term solution that you think we can apply, here?


The android tests use sparse profiles rather than packages actually which you can see in the logs for this test: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=perftest&selectedTaskRun=VmhLvN5qRDe1AZDtITZvcg.0

Got it, so we somehow need a local checkout of this folder[1], right? As a simple solution, we can download a zip archive from [2]. This hg.m.o endpoint is still supported but will eventually go away (bug 1596135).

[1] https://hg.mozilla.org/mozilla-central/file/297a47c209fae11e550bd0c52513fed00d9b0b2d/python/mozperftest/
[2] https://hg.mozilla.org/mozilla-central/archive/297a47c209fae11e550bd0c52513fed00d9b0b2d.zip/python/mozperftest/
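
A minimal sketch of building that archive URL for an arbitrary revision follows, assuming the URL layout shown in [2] stays as it is. The helper name `archive_url` is made up, and the actual download step is left out.

```python
# Sketch: build an hg.mozilla.org zip-archive URL limited to one sub-directory.
# The URL layout is copied from link [2]; the helper name is hypothetical.
def archive_url(revision, subdir, repo="https://hg.mozilla.org/mozilla-central"):
    return f"{repo}/archive/{revision}.zip/{subdir.strip('/')}/"


url = archive_url(
    "297a47c209fae11e550bd0c52513fed00d9b0b2d",
    "python/mozperftest",
)
```

Pinning the revision keeps the fetched copy of mozperftest reproducible, at the cost of having to bump it by hand in the Fenix repo.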

Flags: needinfo?(jlorenzo) → needinfo?(mozilla)
See Also: → 1596135

(In reply to Johan Lorenzo [:jlorenzo] from comment #6)

Thinking about it further, even if we only define the mobile tests in this common config, we'd still be duplicating transforms

Okay, I was wondering if we move them off mozilla-central but there would still be some bits in m-c. This is not a suitable solution, then.

The single-run assumption is interesting. How do cron-decision tasks fit into this since they also trigger tasks? I wonder if there's a way we could make use of that behaviour in the mobile repos to run things from m-c.

I might have said something ambiguous. Just to clarify: we run taskgraph once per decision task. A cron task is a new decision task, so we call taskgraph once more. My proposal #1 assumed we ran taskgraph twice in the same task.

To expand on cron, it basically generates the same full graph of tasks, but ends up selecting the ones we really want. I'm not sure how this could solve our case, but I might be missing something. Feel free to expand your thinking!

Oh ok, thanks for the clarification. What I was trying to get at is whether we could run two decision tasks: one for the mobile-specific common config you suggested, and one for the mozilla-central configs.

The android tests use sparse profiles rather than packages actually which you can see in the logs for this test: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&tier=1%2C2%2C3&searchStr=perftest&selectedTaskRun=VmhLvN5qRDe1AZDtITZvcg.0

Got it, so we somehow need a local checkout of this folder[1], right? As a simple solution, we can download a zip archive from [2]. This hg.m.o endpoint is still supported but will eventually go away (bug 1596135).

[1] https://hg.mozilla.org/mozilla-central/file/297a47c209fae11e550bd0c52513fed00d9b0b2d/python/mozperftest/
[2] https://hg.mozilla.org/mozilla-central/archive/297a47c209fae11e550bd0c52513fed00d9b0b2d.zip/python/mozperftest/

It also pulls the tests from these areas: https://searchfox.org/mozilla-central/source/build/sparse-profiles/perftest

Just a small note to everyone to say that we are going to see if we can run these tests on autoland - if it works, that will get us sheriffing much faster than duplicating. We'll post an update here to mention how it goes.

Pushed by fstrugariu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b9306bc41394
Activate performance alerting on multi commit Fenix performance tests r=perftest-reviewers,sparky

@Sparky, as this did not work, any suggestions on how to proceed?

Flags: needinfo?(gmierz2)

Checking with jlorenzo on riot about this.

Flags: needinfo?(gmierz2)
Priority: P2 → P1
Pushed by jlorenzo@mozilla.com:
https://hg.mozilla.org/ci/ci-configuration/rev/f01622053410
Enable cron jobs on autoland r=aki

We now have cron set to run on autoland. I tested it by manually triggering it[1]. The graph looks great[2] (I saw 2 intermittent failures - which were unrelated). The next step is to wait for tomorrow 4am UTC[3]. Then we can likely close this bug.

Clearing NI because we went the cron way, for now.

[1] https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=97220a9e8b698e9e0e63d1cd9044f52cdb215fd8&selectedTaskRun=EAXa4GjgST2GwvDNbP_gCg.0
[2] https://firefox-ci-tc.services.mozilla.com/tasks/groups/EAXa4GjgST2GwvDNbP_gCg
[3] https://searchfox.org/mozilla-central/rev/0c682c4f01442c3de0fa6cd286e9cadc8276b45f/.cron.yml#325
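
To work out when that next scheduled run happens, a small helper like the one below is enough. It assumes a single daily run at 04:00 UTC, as described above; the function name `next_daily_run` is made up, and it does not read the real `.cron.yml` schedule.

```python
from datetime import datetime, time, timedelta, timezone


# Sketch: next occurrence of a daily 04:00 UTC run (assumed schedule).
def next_daily_run(now, hour=4):
    """Return the first `hour`:00 UTC strictly after the aware datetime `now`."""
    now = now.astimezone(timezone.utc)
    candidate = datetime.combine(now.date(), time(hour, tzinfo=timezone.utc))
    if candidate <= now:
        # Today's run has already passed, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate


# Arbitrary example timestamp, not taken from the bug.
example = next_daily_run(datetime(2020, 10, 20, 12, 0, tzinfo=timezone.utc))
```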

Flags: needinfo?(mozilla)

This is blocked by the multi-ingestion changes on the perfherder side atm.

Depends on: 1672250

(In reply to Greg Mierzwinski [:sparky] from comment #16)

This is blocked by the multi-ingestion changes on the perfherder side atm.

This is now unblocked. Is there anything we can do here?

Flags: needinfo?(gmierz2)

Nope, we're done now. Fenix results are on autoland, have alerting enabled, and the multi-commit ingestion is working.

We should leave this bug open though since the ideal solution for us would be bug 1672250.

Flags: needinfo?(gmierz2)
Assignee: fstrugariu → nobody
Status: ASSIGNED → NEW
Priority: P1 → P2