Closed Bug 1667866 Opened 4 years ago Closed 1 year ago

Add mach try command to rerun performance tests from a specific alert

Categories

(Testing :: Performance, enhancement, P1)

enhancement

Tracking

(firefox118 fixed)

RESOLVED FIXED
118 Branch
Tracking Status
firefox118 --- fixed

People

(Reporter: kimberlythegeek, Assigned: myeongjun.ko)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [fxp])

Attachments

(1 file)

Add a ./mach try command that will re-run the failing tests from a performance alert. Something like mach try perfalert 88348 or just mach try 88348 where 88348 is the bug number for that alert, or instead use the alert number from Perfherder

I guess the difference here would be that with the alert number, I would assume tests would re-run as soon as an alert is generated. If using a bug number, the tests would re-run when a bug is filed for that alert.

This came up in a brainstorming session, further input is welcome

Flags: needinfo?(dave.hunt)
Flags: needinfo?(jmaher)
Severity: -- → S3
Priority: -- → P5

I think there are good ideas here. I would like clarification.

  1. ./mach try perfalert https://treeherder.mozilla.org/perf.html#/alerts?id=27035 - this would push to try server, run the builds, and then run the tests; the comment mentions automatically rerunning tests as soon as the alert is generated and I would like clarification if this is designed as a developer tool for reproducing tests, or for automation?

  2. ./mach try perfalert https://bugzilla.mozilla.org/show_bug.cgi?id=1656546, if this is a bug, it would be nice to run the tests necessary- I am not sure how to get the exact set of tests to run. Would the intention here be to query the perfherder database for all alert summaries associated with the referenced bug and run the tests?

In order to make this work, we will need a clear mapping of alert name -> scheduled test name. This would be more straightforward for AWSY and Raptor, Talos would need a mapping file.

I guess the question is what specific problem/use case are you trying to solve with this, and ensure that the final technical solution meets that.

Overall, I assume this would take a few days to hack together. I would want to make sure it could work easily for:

  • bisecting a failure
  • testing a new fix
  • running on a specific platform (if 200 tests need to run, but you can reproduce it on windows, just run tests there only)
Flags: needinfo?(jmaher)

Good questions, Joel. I think #1 would be a better implementation, with the use case of a developer manually re-running tests

I see this being valuable for confirming/bisecting a regression and for testing a potential fix. By referencing the alert summary and automatically selecting the jobs containing the affected tests, the developer doesn't need to understand the different harnesses or tests. I feel that this could significantly improve the user experience for handling regressions.

Joel brings up a good point on filtering, which could be a valuable advanced feature. If we're concerned about running a large number of tests we could build in a limit, or default to only running jobs for alerts that have been marked as 'important'.

Flags: needinfo?(dave.hunt)

I am concerned about volume of tests as many people do --rebuild 20, this tool should rebuild the correct amount.

a redash query to generate this given a summary id:

select
  distinct jt.name
from
  performance_signature ps,
  performance_datum pd,
  performance_alert_summary pas,
  performance_alert pa,
  job j,
  job_type jt
where
  ps.id=pa.series_signature_id and
  ps.repository_id=pd.repository_id and
  ps.id=pd.signature_id and
  pd.job_id=j.id and
  pas.id=31099 and
  (pa.summary_id=pas.id or pa.related_summary_id=pas.id) and
  pd.push_id=pas.push_id and
  j.job_type_id=jt.id

theoretically we just need to hook this up to an API, maybe https://treeherder.mozilla.org/api/performance/alertsummary/?id=31099 could be extended to https://treeherder.mozilla.org/api/performance/alertsummary/?id=31099&format=try_tasks

or instead we could do:
https://treeherder.mozilla.org/api/performance/alertsummary_tasks/?id=31099

Assignee: nobody → myeongjun.ko
Status: NEW → ASSIGNED
Priority: P5 → P2
Priority: P2 → P1

I doubt that one wants to push to try by the summary alert id. There's a lot of reassignments, and what is an initial alert summary with a couple of items, could end up having a dozen more items reassigned to it. And usually developers use profiles and code to fix regressions, and end up needing more control of the try pushes than just passing the summary alert id.

BUT this would be useful to be implemented at alert item level and would benefit to have the task id associated with the alert item. This would actually be a step forward to filling the data rupture between alerts view and job view, and would ultimately help us having the triggered side-by-side task id directly in alerts view.

Flags: needinfo?(gmierz2)

(In reply to Alexandru Ionescu (needinfo me) [:alexandrui] from comment #8)

I doubt that one wants to push to try by the summary alert id. There's a lot of reassignments, and what is an initial alert summary with a couple of items, could end up having a dozen more items reassigned to it. And usually developers use profiles and code to fix regressions, and end up needing more control of the try pushes than just passing the summary alert id.

I believe this came out of developer surveys. I don't think this needs to be perfect to start, and we can improve it with other work around alert reassignments (or maybe we can make sure the API endpoint handles this?). We'll also be implementing a --alert option in mach try perf to do this, so they'll still be able to do everything else. I just realized there haven't been any comments about this so the command will look like this: ./mach try perf --alert <ALERT-ID>

Flags: needinfo?(gmierz2)

I remember this coming up as early as 2019 when we were brainstorming improvements for performance testing. The idea was that an engineer identified as the author of a regression patch could easily confirm and iterate on fixing the issue without needing to work out what tasks to run in CI.

It's worth keeping in mind that the alert/summary distinction is not one that's exposed in the Perfherder UI, so for most users, the summary is the alert. I don't think we want to complicate things by exposing this implementation detail. I would include any reassigned alerts, and even though we have some big alert summaries, I imagine this is mostly due to the many metrics we have and the number of actual tasks will be relatively few.

I also don't think we need to worry too much about volume of tests now that we're storing replicates, but perhaps we could warn users trying to run with --rebuild as it will take a while to break this habit. Rather than interrupt their flow, we could log a clear warning and reduce any --rebuild value to a hardcoded maximum, optionally with a command line override.

I'm very excited about this. Being able to say in comment 0 that you can just run ./mach try perf --alert XXXXXX to reproduce the regression is a huge win for the user experience!

Whiteboard: [fxp]
Duplicate of this bug: 1605748
Pushed by gmierz2@outlook.com: https://hg.mozilla.org/integration/autoland/rev/98924afbd295 Add mach try command to rerun performance tests from a specific alert r=sparky,perftest-reviewers
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 118 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: