Open Bug 2017544 Opened 9 hours ago Updated 9 hours ago

The interface does not explain Mann-Whitney-U

Categories

(Testing :: PerfCompare, defect)

Tracking

(Not tracked)

People

(Reporter: pbone, Unassigned)

Details

(Whiteboard: [pcf])

The user interface doesn't do enough to explain what the various Mann-Whitney-U results are. It suggests what they mean, but I need something more concrete. Partly, what I want to answer is "by what degree did my result get better or worse?". I realise I could "just google it", but it'd be ideal if the user interface could explain it, or at least give me a column with the magnitude of the difference.

I'm looking at this page: https://perf.compare/compare-results?baseRev=f718ed4839fe185875d6843aaecda056636c71d2&baseRepo=try&newRev=9f6623d51e773645a85cfd563232d548a74a2625&newRepo=try&framework=13&test_version=mann-whitney-u

"Cliff’s Delta quantifies the magnitude of the difference between Base and New values."

I understand "magnitude of the difference", but what does it mean to quantify it? I'm pretty sure that what I actually want is the magnitude of the difference itself. My result is 2.07% better (the magnitude of the difference between means, from the t-test?) and has a Cliff's Delta of 0.8. I don't know what to do with Cliff's Delta or whether it can help.
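For reference, here is a minimal sketch of how Cliff's Delta is commonly defined, next to the percentage change between means. The function names and sample values are hypothetical and are not taken from PerfCompare or from the revisions linked above. The sign convention (negative when New tends to be lower than Base) is also an assumption, since tools differ on it.

```python
from statistics import mean


def cliffs_delta(base, new):
    """Share of (base, new) pairs where new is larger, minus the share where it is smaller."""
    greater = sum(1 for b in base for n in new if n > b)
    less = sum(1 for b in base for n in new if n < b)
    return (greater - less) / (len(base) * len(new))


def percent_change_in_means(base, new):
    """Relative change of the New mean with respect to the Base mean, in percent."""
    return (mean(new) - mean(base)) / mean(base) * 100.0


# Made-up timings (lower is better); New is shifted down by 2 units.
base = [100.0, 101.0, 102.0, 103.0, 104.0]
new = [98.0, 99.0, 100.0, 101.0, 102.0]

delta = cliffs_delta(base, new)               # -0.64
change = percent_change_in_means(base, new)   # about -1.96

print(f"Cliff's Delta: {delta:.2f}")
print(f"Change in means: {change:.2f}%")
```

In this sketch Cliff's Delta says how consistently New values fall below Base values (it ranges from -1 to 1), not how far below they are. That is why a fairly large delta can sit next to a change in means of only about 2%.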

I'm happy with "Significant" and "Not significant", and I think I understand p-value, although I would be happier if the UI put this in common terms like "a p-value of less than 0.05 means that a significant result..." Actually, I'm not sure I can explain it. Something about there being a 5% chance that the difference between the means is not true (but I'm fuzzy on the "not true" part).
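One common plain-language reading of a p-value is: if Base and New really came from the same distribution, how often would a difference at least this large show up by chance? The sketch below illustrates that reading with a simple permutation test on the difference between means. This is a swapped-in technique chosen for illustration, not the Mann-Whitney-U computation (which works on ranks) and not PerfCompare's implementation. The function name, sample values, and round count are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(base, new, rounds=2000, seed=0):
    """Estimate how often a random relabelling gives a gap in means at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = abs(mean(new) - mean(base))
    pooled = list(base) + list(new)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        fake_base = pooled[:len(base)]
        fake_new = pooled[len(base):]
        if abs(mean(fake_new) - mean(fake_base)) >= observed:
            hits += 1
    return hits / rounds


# Made-up timings: New is clearly separated from Base.
base = [100.0, 101.0, 102.0, 103.0, 104.0]
new = [90.0, 91.0, 92.0, 93.0, 94.0]

p_separated = permutation_p_value(base, new)
p_identical = permutation_p_value(base, base)

print(f"p-value, clearly separated samples: {p_separated}")
print(f"p-value, identical samples: {p_identical}")
```

Under this reading, a p-value below 0.05 means that a gap this large would turn up in fewer than 5% of random relabellings, so chance alone is an unlikely explanation for it.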

I'm lost again when it comes to Effect Size, and this is why I filed the bug: I looked at this and thought "Woah, I made Firefox 10% faster!", but when I switch to t-test it's 2.07%. So I was expecting it to have something to do with the difference between the distributions. When I hover over it, it says "An improvement or regression being shown here means that the effect size is meaningful, and the difference has a significant p-value." I'm afraid it's not very meaningful to me.

In the UI, I'd like some more common-language descriptions of what these things mean, and probably the reintroduction of something that shows the magnitude of the difference. Outside of the UI, in documentation and/or an announcement, it might be helpful to explain why we're moving from Student's t-test to Mann-Whitney-U.

Thanks.
