Open Bug 1255001 Opened 4 years ago Updated 2 years ago

The 'important changes' and 'uncertain results' checkboxes don't behave as advertised

Categories

(Tree Management :: Perfherder, defect, P5)

Tracking

(Not tracked)

People

(Reporter: kats, Unassigned)

Details

(Whiteboard: perfherder-docs)

It's not really clear to me what the "show only important changes" and "hide uncertain results" checkboxes on Perfherder do. There are tooltips that claim to explain them, but they don't match the actual behaviour. For example, go to [1]. Observe that most of the results listed there have 6+ datapoints for both base and new. So according to the tooltip on "hide uncertain results", those should all be "certain", yet if you check that box all the results go away.

The "important changes" tooltip says it filters out anything less than 2%. On [1] there is a tart opt e10s osx-10-10 result with a 3.09% change, and yet if you check the "show only important changes" box, that gets hidden as well.

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=a4929411c0aa&newProject=try&newRevision=35ef70a10445&framework=1&filter=e10s&showOnlyImportant=0

Having 6+ datapoints does not imply "certain"; it's just a precondition for having enough comparison points to run the t-test. It really isn't a very large sample, and especially if there's a large standard deviation, it's hard to be confident that a small difference is meaningful.
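
To illustrate the point about standard deviation, here is a minimal sketch of a t statistic for two samples of six datapoints each. It uses Welch's form of the t-test, which is an assumption: the thread only says "t-test", and this is not Perfherder's actual code. The function name `welch_t` and all sample values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(base, new):
    """Absolute Welch t statistic for two independent samples.

    Hypothetical helper, not Perfherder's implementation.
    """
    # Standard error of the difference between the two sample means.
    se = sqrt(variance(base) / len(base) + variance(new) / len(new))
    return abs(mean(new) - mean(base)) / se


# Both pairs have six datapoints per side and the same 3% shift in the
# mean (100 -> 103); only the spread of the datapoints differs.
quiet_base = [99, 100, 101, 100, 99, 101]
quiet_new = [102, 103, 104, 103, 102, 104]
noisy_base = [80, 120, 90, 110, 95, 105]
noisy_new = [83, 123, 93, 113, 98, 108]

print(welch_t(quiet_base, quiet_new))  # about 5.8
print(welch_t(noisy_base, noisy_new))  # about 0.36
```

With the same number of datapoints and the same change in the mean, the noisy pair scores far lower, which is why a count of 6+ alone cannot make a result "certain".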

The tart example you gave is right on the border of being meaningful (a prerequisite for "important"): a score of 2.0 implies a certain degree of certainty that something's going on. The problem is that random test noise might also be an explanation. More retriggers here might help, if you think there's something to your patch.
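
A rough sketch of how a result above 2% can still be hidden by "show only important changes", assuming that "important" requires both a large enough percentage change and a large enough t-test score. The function `is_important`, its parameters, and the cut-off values are hypothetical; the thread mentions 2% and a score of 2.0 but does not give Perfherder's exact rule, and the score of 1.9 below is invented for the example.

```python
def is_important(pct_change, t_score, min_pct=2.0, min_t=2.0):
    """Return True if a change is both meaningful and large enough.

    min_pct and min_t are assumed cut-offs, not Perfherder's real ones.
    """
    # "Meaningful" here means the t-test score clears the assumed cut-off.
    meaningful = t_score > min_t
    return meaningful and abs(pct_change) > min_pct


# A 3.09% change with a borderline score is filtered out ...
print(is_important(3.09, 1.9))  # False
# ... while the same change with a strong score is kept.
print(is_important(3.09, 6.0))  # True
```

Under these assumptions, the percentage alone never decides the outcome, so more retriggers (which can raise the score if the change is real) can turn a hidden result into a visible one.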

I totally get that this isn't the most intuitive thing in the world to understand (even for me, and I wrote or reviewed all of it). Part of the problem is that this thing has to play many roles and deal with data with a wide variety of properties. It's hard to come up with a visual representation of this stuff that works well for everything. If you have any suggestions on how to improve the wording in the user interface, I'd be grateful.

Ok, that's fair. I suggest then just updating the tooltip to say "based on results of t-test; more datapoints should increase certainty" or something along those lines for the uncertainty checkbox. My main annoyance is that the tooltip says something very precise that doesn't match the behaviour. Changing the tooltip to be less precise is fine by me, as long as it is consistent with the observed behaviour.
Priority: -- → P5