There is a bug in the raptor docker container when calculating results. If the p95 value for the base task run has 4 digits and the p95 value for the patch task has 5 digits, the comparison script reports that the patch result did not increase.

For example, from: https://s3-us-west-2.amazonaws.com/taskcluster-public-artifacts/-4bvY3RqRRqt9WwItvcNEw/0/public/logs/live_backing.log

Base task value: 9774.000
Patch task value: 11008.000
Raptor launch test results summary for sms: No regression detected, p95 median value has not increased

In reality, the patch task value increased (regressed) by 12.6%.
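The symptom is consistent with the two values being compared as text rather than as numbers. That is an assumption, since the log does not show the comparison itself. A minimal Python sketch of the failure mode, using the values from the log (the variable names are illustrative only):

```python
# Values as they appear in the log, held as strings.
base = "9774.000"
patch = "11008.000"

# String comparison is lexicographic: it stops at the first differing
# character, and "1" sorts before "9", so the 5-digit value looks smaller.
string_says_increased = patch > base

# Converting to numbers first gives the expected answer.
numeric_says_increased = float(patch) > float(base)

print(string_says_increased, numeric_says_increased)
```

Any 4-digit base paired with a 5-digit patch value starting with a lower digit would be misreported the same way.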
Created attachment 8646507
https://github.com/rwood-moz/raptor-gaia/pull/5

Ensure results are compared explicitly as numeric values. With this change, the new output for the same log files above:

Base task value: 9774
Patch task value: 11008
Raptor launch test results summary for appname: coldlaunch.visuallyLoaded p95 has regressed by 12.625%, which is within the 15% threshold
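For illustration, the numeric comparison and the 12.625% figure can be reproduced with a short Python sketch. This is not the code from the pull request: `percent_change` and `summarize` are hypothetical helpers, and treating a change exactly equal to the threshold as "within" it is an assumption.

```python
def percent_change(base, patch):
    """Return the change from base to patch as a percentage of base."""
    # Convert explicitly so string inputs are never compared as text.
    base_value = float(base)
    patch_value = float(patch)
    return (patch_value - base_value) / base_value * 100.0


def summarize(base, patch, threshold_pct=15.0):
    """Describe the change relative to a regression threshold (hypothetical wording)."""
    change = percent_change(base, patch)
    if change <= 0:
        return "no regression"
    if change <= threshold_pct:
        return "regressed by %.3f%%, within the %g%% threshold" % (change, threshold_pct)
    return "regressed by %.3f%%, exceeds the %g%% threshold" % (change, threshold_pct)


result = summarize("9774.000", "11008.000")
print(result)
```

With the values from the log, (11008 - 9774) / 9774 is about 0.12625, which matches the percentage in the new output.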
Attachment #8646507 - Flags: review?(eperelman)
Comment on attachment 8646507
https://github.com/rwood-moz/raptor-gaia/pull/5

Looks good, hate it when that happens.
Attachment #8646507 - Flags: review?(eperelman) → review+
Landed: https://github.com/rwood-moz/raptor-gaia/commit/223d5626aa2501eda33a6768836f53e24087d508

Raptor image rebuilt and pushed to Docker Hub: taskcluster/raptor-gaia:0.0.3
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED