Open Bug 1931536 Opened 1 year ago Updated 1 year ago

Add additional columns that show confidence results of different statistical tests

Categories

(Testing :: PerfCompare, enhancement)

Tracking

(Not tracked)

People

(Reporter: kshampur, Unassigned)

References

Details

(Whiteboard: [pcf])

Attachments

(1 file)

Since the data is already present for the t-test calculation, along with the column showing its confidence, it would be cool to have an (optionally visible?) set of columns showing the confidence results of a few other statistical tests, e.g. MWU (Mann-Whitney U) or Pearson's chi-squared. No need to alert on them at this time.

I think this would be very valuable, since Student's t-test is ideal when the data is normally distributed but many of our tests are actually multi-modal.
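
To illustrate the kind of rank-based test mentioned above, here is a minimal sketch of a two-sided Mann-Whitney U test in plain Python. It assumes the normal approximation with tie correction and no continuity correction; the function name `mann_whitney_u` and the sample values are hypothetical and are not taken from PerfCompare or from any real job.

```python
import math
from statistics import NormalDist


def mann_whitney_u(base, new):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for `base`, p-value). Tied values receive their
    average rank and the variance is tie-corrected. No continuity
    correction is applied, so p-values for tiny samples are approximate.
    """
    n1, n2 = len(base), len(new)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")

    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in base] + [(v, 1) for v in new])
    n = n1 + n2

    rank_sum_base = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # 0-based positions i..j-1 hold 1-based ranks i+1..j.
        avg_rank = (i + 1 + j) / 2
        run_length = j - i
        tie_term += run_length**3 - run_length
        base_in_run = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_base += avg_rank * base_in_run
        i = j

    u_base = rank_sum_base - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # Every pooled value is identical: no evidence of a difference.
        return u_base, 1.0

    z = (u_base - mean_u) / math.sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_base, p_value


# Hypothetical bimodal timings (ms): mostly fast runs plus a few slow ones.
base_runs = [101, 99, 103, 100, 240, 250, 102, 245, 98, 260]
new_runs = [100, 98, 102, 101, 99, 97, 103, 242, 100, 96]
u, p = mann_whitney_u(base_runs, new_runs)
print(f"U = {u}, p = {p:.3f}")
```

Because the test works on ranks rather than raw values, a handful of very slow runs does not inflate a variance estimate the way it does for the t-test, which is why it may suit multi-modal data better.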

Here is an example of how Denis has presented various statistical tests in the Perf Telemetry Reports:
https://protosaur.dev/perf-reports/post-quantum-key-exchange-for-tls.html#All-http_page_tls_handshake-mean

Attached image compare.png

The perf compare results will expire shortly so I'm posting a screenshot:

Here's an example of an 8.5% improvement in the mean that is reported as low confidence even though there has been a very high number of retriggers.
Part of the problem is that the patch affects the number of very slow results, which is hard to pick up with Student's t-test because the standard deviation is so high (20/27%).

Summary: Add aditional columns that show confidence results of different statistical tests → Add additional columns that show confidence results of different statistical tests

I was chatting about something related to this in the perfcompare channel with :julienw. I think it would be better if we had a new options menu that lets us choose which method to use.

It would be easier to add new methods that way, and it would be lighter on the backend than computing multiple statistics for all the tests/metrics at the same time.
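
One way the selectable-method idea could look is a small registry keyed by method name, so that only the chosen statistic is computed per request. This is only a sketch under assumptions: `METHODS`, `compare`, and the method names `welch-t` and `median-shift` are hypothetical and do not reflect the actual PerfCompare or backend code.

```python
import math
from statistics import mean, median, stdev


def welch_t_statistic(base, new):
    """Welch's t statistic (unequal variances); needs >= 2 values per sample."""
    se = math.sqrt(stdev(base) ** 2 / len(base) + stdev(new) ** 2 / len(new))
    if se == 0:
        raise ValueError("both samples have zero variance")
    return (mean(new) - mean(base)) / se


def median_shift(base, new):
    """Difference between the medians of the two samples (new - base)."""
    return median(new) - median(base)


# Hypothetical registry: adding a method means adding one entry here.
METHODS = {
    "welch-t": welch_t_statistic,
    "median-shift": median_shift,
}


def compare(base, new, method="welch-t"):
    """Compute only the statistic selected by `method`."""
    try:
        fn = METHODS[method]
    except KeyError:
        raise ValueError(f"unknown method: {method!r}") from None
    return fn(base, new)
```

With this shape, a menu in the UI would only need to pass the selected key, and the other statistics are never evaluated.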
