
[raptor] run gaia raptor suite 6x right off, and use median results

RESOLVED FIXED

Status

Firefox OS
Gaia::PerformanceTest
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: rwood, Assigned: rwood)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Assignee)

Description

2 years ago
Once the raptor emulator tests are running and reporting to treeherder, we want to be sure that a flagged regression is a valid regression. Running the test once on the reference build and once on the tester build and then comparing the results is not always definitive: an apparent regression can be caused by environmental factors rather than by the change under test.

In order to make the results more reliable, implement raptor suite repeats. Run the suite once (launch test of 30 runs, on reference build and on tester build). If the results indicate no regression, finish and mark the result task green.

If the first suite results indicate a possible regression, then repeat the suite 6 more times concurrently (on the same builds). Then calculate the median results for the reference build and the tester build from all six runs, and compare. If this result indicates a regression, flag the result task as red, otherwise green. This way an indicated regression is well proven before being flagged (thanks for the input :jmaher).
(Assignee)

Comment 1

2 years ago
As discussed with James, instead of waiting for the first suite to finish and then kicking off more runs if a regression is detected, just launch the suite 6x right from the start. Use the median results to check for a performance regression. Since the suites run concurrently, the testing time won't be extended, yet we will still be confident that a flagged regression is valid.
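The median-based check described above can be sketched roughly as follows. This is only an illustration in Python, not the raptor-results code: the function name `is_regression`, the 15% threshold, and the sample launch times are all assumptions made for the example.

```python
from statistics import median


def is_regression(base_runs, patch_runs, threshold=0.15):
    """Compare the median of the base suite runs with the median of the patch runs.

    base_runs / patch_runs: one summary value per suite run (e.g. launch time in ms).
    threshold: hypothetical allowed relative slowdown; the real value is not
    stated in this bug.
    """
    if not base_runs or not patch_runs:
        raise ValueError("need at least one suite run per build")
    base_median = median(base_runs)
    patch_median = median(patch_runs)
    # Flag only when the patch median is slower than the base median by more
    # than the allowed margin.
    return patch_median > base_median * (1 + threshold)


# Six suite runs per build (made-up numbers); the patch set has one noisy outlier.
base = [1000, 1010, 990, 1005, 995, 1002]
patch = [1008, 1400, 1001, 995, 1012, 1003]
print(is_regression(base, patch))
```

With a single run per build, the 1400 ms outlier alone could have been reported as a regression; with six runs the median is barely affected by it.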
Summary: [raptor] repeat raptor suite several times if a regression is indicated → [raptor] run gaia raptor suite 6x right off, and use median results
(Assignee)

Comment 2

2 years ago
Created attachment 8607728 [details] [review]
https://github.com/rwood-moz/raptor-results/pull/2

Update raptor-results docker image code to:
- receive results for a variable number of suite runs (variable number of base and patch taskIds)
- use the median result in each set (base and patch results) to check for the performance regression
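
One way the two points above could fit together is sketched below in Python. It is not taken from the pull request: the result layout (one value per app per task), the helper names `median_by_app` and `compare_sets`, and the task IDs and app names in the sample data are all hypothetical.

```python
from statistics import median


def median_by_app(results_by_task, task_ids):
    """Gather one value per app from each listed task, then take each app's median."""
    per_app = {}
    for task_id in task_ids:
        for app, value in results_by_task[task_id].items():
            per_app.setdefault(app, []).append(value)
    return {app: median(values) for app, values in per_app.items()}


def compare_sets(results_by_task, base_task_ids, patch_task_ids):
    """Return {app: (base_median, patch_median)} for apps present in both sets."""
    base = median_by_app(results_by_task, base_task_ids)
    patch = median_by_app(results_by_task, patch_task_ids)
    return {app: (base[app], patch[app]) for app in base if app in patch}


# Hypothetical results keyed by task ID; any number of IDs per set is accepted.
results = {
    "base-1": {"clock": 1000, "settings": 2000},
    "base-2": {"clock": 1020, "settings": 2100},
    "base-3": {"clock": 980, "settings": 1900},
    "patch-1": {"clock": 1010, "settings": 2600},
    "patch-2": {"clock": 1500, "settings": 2500},
    "patch-3": {"clock": 990, "settings": 2700},
}
summary = compare_sets(
    results,
    ["base-1", "base-2", "base-3"],
    ["patch-1", "patch-2", "patch-3"],
)
print(summary)
```

Because the task ID lists are plain sequences, the same code works whether the suite was run once, three times, or six times per build.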
Attachment #8607728 - Flags: review?(eperelman)

Comment 3

2 years ago
Comment on attachment 8607728 [details] [review]
https://github.com/rwood-moz/raptor-results/pull/2

r+ with comments addressed.
Attachment #8607728 - Flags: review?(eperelman) → review+
(Assignee)

Comment 4

2 years ago
Thanks Eli. The raptor-results docker image portion of this bug has been merged:

https://github.com/rwood-moz/raptor-results/commit/e8191e782cc24160e1456267b535177285280403

Comment 5

2 years ago
Created attachment 8608772 [details] [review]
[gaia] rwood-moz:bug1163088 > mozilla-b2g:master
(Assignee)

Comment 6

2 years ago
For the raptor-results image, I hadn't committed the post-review updates; that is done now:

https://github.com/rwood-moz/raptor-results/commit/12a286f039c85417368e7df226269ecc812c5ca7
(Assignee)

Comment 7

2 years ago
Created attachment 8608826 [details] [review]
https://github.com/mozilla-b2g/gaia/pull/30179

Taskcluster raptor decision task code, updated to generate the appropriate raptor tasks for N suite iterations
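
As a rough illustration of generating tasks for N suite iterations, here is a Python sketch. It does not reflect the actual decision task code or the Taskcluster task schema: the function `build_raptor_tasks`, the task labels, and the dictionary fields are all assumptions for the example.

```python
def build_raptor_tasks(iterations, base_rev, patch_rev):
    """Build one base and one patch suite task per iteration, plus one results task.

    The dictionaries are simplified placeholders, not real Taskcluster task
    definitions.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")

    tasks = []
    base_labels = []
    patch_labels = []
    for i in range(1, iterations + 1):
        for kind, rev, labels in (
            ("base", base_rev, base_labels),
            ("patch", patch_rev, patch_labels),
        ):
            label = f"raptor-{kind}-{i}"
            # Suite tasks have no dependencies, so they can all run concurrently.
            tasks.append({"label": label, "revision": rev, "requires": []})
            labels.append(label)

    # The results task waits for every suite run and is told which runs belong
    # to the base set and which to the patch set.
    tasks.append({
        "label": "raptor-results",
        "requires": base_labels + patch_labels,
        "base_tasks": base_labels,
        "patch_tasks": patch_labels,
    })
    return tasks


tasks = build_raptor_tasks(6, "base-revision", "patch-revision")
print(len(tasks))
```

Keeping the iteration count as a parameter means the number of suite runs can be changed without touching the results step, which only sees two lists of task labels.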
Attachment #8608826 - Flags: review?(garndt)
(Assignee)

Updated

2 years ago
Blocks: 1166776

Updated

2 years ago
Attachment #8608826 - Flags: review?(garndt) → review+
(Assignee)

Updated

2 years ago
Keywords: checkin-needed
(Assignee)

Comment 8

2 years ago
Landed it myself; I don't know why autolander doesn't seem to work for my raptor patches.

https://github.com/mozilla-b2g/gaia/commit/ea07a22e99979fbce2df447df73caf76c351909e
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Keywords: checkin-needed