Improve user experience of triggering performance tests via try perf selector
Categories
(Testing :: Performance, task, P2)
Tracking
(firefox108 fixed)
People
(Reporter: sparky, Assigned: sparky)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [fxp])
Attachments
(9 files, 5 obsolete files)
Identifying suitable performance tests and running them in our CI is currently confusing, which means this capability is underutilised. We see frustration from engineers attempting to run tests in CI, and there are likely others who gave up early or were put off by past experience. This is harmful to promoting a culture of performance at Mozilla.
One suggestion is that instead of listing all tasks in try chooser, we could have logical collections (page load, benchmark, essential, etc.).
I recently used try to get page load perf results for essentially the first time – since my perspective is fairly fresh, feel free to ask me for feedback on the design. From my history as a product developer, I think the experience should be simple enough that I come with a question like, "How did my change impact cold start up on fenix?" and I'll be able to get the results without asking for help.
At a high level, there are two major roadblocks I hit:
- figuring out which tests to select is hard. "logical collections (page load, benchmark, essential, etc.)" seems like it'd get most of the way there, at least for page load
- I don't know how to interpret the results, e.g. what's the difference between `SpeedIndex`, `ContentfulSpeedIndex`, and `PerceptualSpeedIndex`? Do I need to retrigger to get confident results? etc. I provided this feedback to the perfherder compare google survey so I won't go into details here.
Comment 2•3 years ago
(In reply to Michael Comella (:mcomella) [needinfo or I won't see it] from comment #1)
- I don't know how to interpret the results, e.g. what's the difference between `SpeedIndex`, `ContentfulSpeedIndex`, and `PerceptualSpeedIndex`? Do I need to retrigger to get confident results? etc. I provided this feedback to the perfherder compare google survey so I won't go into details here.
Thanks for your input :mcomella!
Regarding your second point, I think we can target a good portion of that with some new perftest guidelines we are thinking of making. There are no bugs or documentation available about this yet.
Comment 3•2 years ago
This patch adds a new try selector specifically for performance tests. We make use of the compare selector, as well as the fuzzy and again selectors (used within the compare one), to do this.
The code starts off by building the variant combinations (e.g. fission+live-sites+profiling), then uses these to build up the rest of the categories. The variants are hidden by default and only available with something like `--variants profiling`. The same is true for apps, where chrome tests are hidden by default, and for platforms, where android is hidden by default. There are flags developers can use to enable these if needed. All of this is done to prevent excessive try runs, since the categories make it much easier to select large numbers of tasks.
The fuzzy selector was also modified to handle displaying the categories, and then parsing those categories for the tasks requested.
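A rough sketch of that variant-combination step (the variant names and the helper are hypothetical illustrations, not the actual selector code):

```python
from itertools import combinations

# Hypothetical variant list; the real selector reads these from its config.
VARIANTS = ["fission", "live-sites", "profiling"]

def build_variant_combinations(variants):
    """Return every non-empty combination, joined like 'fission+live-sites'."""
    combos = []
    for size in range(1, len(variants) + 1):
        for combo in combinations(variants, size):
            combos.append("+".join(combo))
    return combos

# 3 variants yield 2^3 - 1 = 7 non-empty combinations.
print(build_variant_combinations(VARIANTS))
```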
Comment 4•2 years ago
This patch adds a `capture_log` flag that will allow a user to capture the logs produced when running the `push_to_try` methods. Currently, we use `subprocess.check_call`, which causes hg to hang when we attempt to redirect and gather stdout. Using `subprocess.run` is better as it lets us capture the logs, but the log output is very slow and can easily lead people to believe that hg is hanging when it's not, which results in corrupted repos when the push is interrupted. Using `Popen`, the logs are output slower than with `check_call`, but faster than with `run`, so you know something is happening. The speed at which these logs are printed is also why I have this log capturing behind a flag.
This functionality will be used in the `./mach try perf` selector to capture the child revisions produced for the try task config file changes, and allow us to produce a PerfCompare link to provide to the user.
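A minimal sketch of the `Popen` approach described above (a generic helper under assumed requirements, not the actual mach code):

```python
import subprocess

def stream_and_capture(cmd):
    """Run cmd, echoing each output line as it arrives while keeping a copy.

    Streaming line by line gives the user immediate feedback (so hg doesn't
    look hung), while the full log stays available for later parsing.
    """
    captured = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so nothing is lost
        text=True,
    )
    for line in proc.stdout:
        print(line, end="")  # show progress as it happens
        captured.append(line)
    proc.wait()
    return proc.returncode, "".join(captured)
```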
Comment 5•2 years ago
Depends on D155980
Comment 6•2 years ago
This patch modifies the Compare selector to gather the logs when `perf_categories` are supplied. This argument will hold the categories to display in the Fuzzy selector, and is currently the only thing that dictates whether logs should be collected. It also simplifies the process of using `--show-all` in the perf selector, since we still want to capture logs but we don't want to display any categories. A log processor was added to capture the logs and parse the revisions. Furthermore, the run method was moved into the CompareParser to make its usage clearer in the perf selector.
Depends on D155981
Comment 7•2 years ago
This patch changes the fuzzy selector to show perf-specific categories when they are available (provided from the perf selector). When these are provided, only they will be shown, and the actual tasks will be hidden. After the user selects some categories, we run a set of queries specified by that category to select the tasks to run.
Depends on D154515
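The category-to-query flow might look something like this sketch. The category names, task labels, and the substring matcher are all illustrative; the real selector hands the queries to fzf for matching:

```python
# Hypothetical mapping from a displayed category to the queries it expands to.
CATEGORIES = {
    "Pageload desktop": ["'browsertime 'tp6"],
    "Benchmarks desktop": ["'browsertime 'benchmark"],
}

TASKS = [
    "test-linux1804-64/opt-browsertime-tp6-firefox-amazon",
    "test-windows10-64/opt-browsertime-tp6-firefox-amazon",
    "test-linux1804-64/opt-browsertime-benchmark-firefox-speedometer",
]

def run_query(query, tasks):
    """Toy stand-in for fzf filtering: every term must appear as a substring."""
    terms = [term.lstrip("'") for term in query.split()]
    return {task for task in tasks if all(term in task for term in terms)}

def select(category):
    """Union the results of every query behind the chosen category."""
    selected = set()
    for query in CATEGORIES[category]:
        selected |= run_query(query, TASKS)
    return selected
```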
Comment 8•2 years ago
:ahal, regarding this patch (https://bugzilla.mozilla.org/attachment.cgi?id=9292284), the suggestion of using `iter` to get the logs didn't work either. I've been trying a few other things and combinations, but I can't get it to work any better/faster than what I already have here.
I spent some time looking at the mercurial source code to see what is going on there when the `ui.*` methods are called for logging, and I noticed that the logs become "protected". I think this is why I'm having a hard time capturing them: https://www.mercurial-scm.org/repo/hg-stable/file/tip/mercurial/utils/procutil.py#l426
The code suggests that I can get this output from stderr, but there's nothing being output there (or nothing that I can catch).
What do you think about this?
Comment 9•2 years ago
Sorry, I have to admit.. this is going a bit over my head. I guess your original patch is good enough then :)
Comment 10•2 years ago
No problem, thanks! I'll reduce the duplication as you requested.
Comment 11•2 years ago
This patch moves the fzf utility methods to a separate file so that they can be used by other try choosers, such as the perf chooser. At the same time, helper methods for two aspects of the `mach try fuzzy` run method (setting up the tasks, and building the base command) are added, but the code in the run method itself is left untouched.
Comment 12•2 years ago
This patch moves the fzf utility methods to a separate file so that they can be used by other try choosers, such as the perf chooser. At the same time, helper methods for two aspects of the `mach try fuzzy` run method (setting up the tasks, and building the base command) are added, but the code in the run method itself is left untouched.
Comment 13•2 years ago
Comment 14•2 years ago
bugherder
Comment 15•2 years ago
This patch adds the basics for the perf selector: file, categorizations, selector configuration, and CLI arguments.
Depends on D155980
Comment 16•2 years ago
This patch adds a method for expanding the categories. This method takes all the configuration details from the PerfParser (variants, apps, platforms, etc.) and combines them to create the categories that people will be seeing and selecting from. It creates the queries, and checks for any restrictions among them to provide selections that work. See the method for more details on all of this.
Depends on D160413
Comment 17•2 years ago
This patch adds two methods for getting/selecting tasks. The first one (get_tasks) is used to either query fzf directly, or to use interactively for user selections. The second (get_perf_tasks) makes use of the first to query for categories (through the user) and then query for the selected tasks. It also performs all the necessary intersections/unions for the queries.
These two are separated because we need a simple way to allow the user to look at and select the raw tasks instead of categories if needed. This way the user can still get a comparison link at the end instead of going to `mach try fuzzy`. See the next patches.
Depends on D160414
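One plausible shape for the intersection/union logic (hypothetical; the real get_perf_tasks drives fzf rather than this toy matcher): queries within one selected category are ANDed together, and the results across categories are ORed.

```python
def substring_matcher(query, tasks):
    """Toy matcher: a task matches if the query is a substring of its label."""
    return {task for task in tasks if query in task}

def combine_queries(per_category_queries, all_tasks, matcher=substring_matcher):
    """Intersect queries within a category, then union across categories."""
    selected = set()
    for queries in per_category_queries:  # one entry per chosen category
        result = set(all_tasks)
        for query in queries:             # AND within a category
            result &= matcher(query, all_tasks)
        selected |= result                # OR across categories
    return selected
```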
Comment 18•2 years ago
Depends on D160415
Comment 19•2 years ago
This patch adds a method that allows us to push to try. It performs two pushes similar to the compare selector, except in this case we're also capturing the logs in all cases so that we can provide a Perfherder URL afterwards. At the same time, this patch adds a LogProcessor that can capture these logs and parse them for the revisions.
Depends on D160416
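The revision parsing could look roughly like this; the log line format shown is an assumption based on typical try push output, and the class is a sketch, not the landed LogProcessor:

```python
import re

# Assumed format: the push log contains a Treeherder link such as
# "remote: ... https://treeherder.mozilla.org/jobs?repo=try&revision=<hash>"
REVISION_RE = re.compile(r"revision=([0-9a-f]{12,40})")

class LogProcessor:
    """Accumulates log lines and extracts any try revisions seen so far."""

    def __init__(self):
        self.lines = []

    def ingest(self, line):
        self.lines.append(line)

    @property
    def revisions(self):
        found = []
        for line in self.lines:
            match = REVISION_RE.search(line)
            if match:
                found.append(match.group(1))
        return found
```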
Comment 20•2 years ago
This patch adds the run methods for the perf selector as well as the entry point for the mach command. It also produces the Perfherder URL at the end.
Depends on D160417
Comment 21•2 years ago
This patch adds two parametrized tests (totalling 18 test cases) for the perf selector. The first set of 15 tests covers the category expansions under different configurations. The last 3 tests cover the run methods in the three configurations that we offer: standard, dry runs, and uncategorized test selection (or `--show-all`).
Depends on D160418
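For illustration, a parametrized test of that shape might look like the sketch below; `expand_categories` is a hypothetical stub standing in for the real category expansion, and the names differ from the landed tests:

```python
import pytest

def expand_categories(variants=()):
    """Hypothetical stub: the real code expands categories from PerfParser config."""
    categories = ["Pageload desktop", "Benchmarks desktop"]
    if variants:
        # Add a variant-qualified row for each base category.
        categories += [f"{c} {'+'.join(variants)}" for c in categories]
    return categories

@pytest.mark.parametrize(
    "variants,expected_substring",
    [
        ((), "Pageload"),               # default configuration
        (("profiling",), "profiling"),  # e.g. --variants profiling
    ],
)
def test_category_expansion(variants, expected_substring):
    assert any(expected_substring in c for c in expand_categories(variants))
```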
Comment 22•2 years ago
Comment 23•2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/df816f6bad3f
https://hg.mozilla.org/mozilla-central/rev/ff5e60523368
https://hg.mozilla.org/mozilla-central/rev/6e94007a5fe8
https://hg.mozilla.org/mozilla-central/rev/3c18a3f66f89
https://hg.mozilla.org/mozilla-central/rev/d57d61c31656
https://hg.mozilla.org/mozilla-central/rev/cac9a015e794
https://hg.mozilla.org/mozilla-central/rev/461ee4c95c53
https://hg.mozilla.org/mozilla-central/rev/0a6ce361ed41