(In reply to Greg Mierzwinski [:sparky] from comment #9)
> The only time you would hit this issue is when you retrigger the
> *-vismet tasks in Treeherder - the correct way of doing this is described
> here: https://wiki.mozilla.org/TestEngineering/Performance/Raptor/Browsertime/VisualMetrics
Right, but that's the common case if you're using mach try fuzzy --rebuild N.
> The perf sheriffs should know how to retrigger the vismet tasks correctly,
> so you shouldn't worry about this happening in regression/improvement bugs.
> When you look at the Perfherder Compare view, if you see a metric with
> multiple runs but they only report a single value, then you know that it
> wasn't retriggered correctly.
Right, so in my example I'd have to know either what "± 0" means, or to recognize that a blank (?) or "Infinity" confidence is suspicious despite "Total Runs" being > 1.
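For what it's worth, here's my guess at the mechanics (a rough sketch, not Perfherder's actual formula): the confidence number looks like some t-style statistic, and when a bad retrigger just duplicates a single value, the variance is 0, so the statistic either divides by zero ("Infinity") or is undefined entirely (blank). The function name and formula below are illustrative only.

```python
import math

def confidence(base_vals, new_vals):
    """Sketch of a two-sample t-like statistic -- NOT Perfherder's exact
    formula, just illustrating the zero-variance failure mode."""
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):
        # Sample variance; degenerates to 0 when all values are identical.
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / max(len(xs) - 1, 1)

    pooled = math.sqrt(var(base_vals) / len(base_vals)
                       + var(new_vals) / len(new_vals))
    delta = abs(mean(new_vals) - mean(base_vals))
    if pooled == 0:
        # Duplicated values: either "Infinity" (means differ) or
        # undefined/blank (means identical).
        return float("inf") if delta > 0 else None
    return delta / pooled

# A bad retrigger copies one value N times, so variance is 0:
print(confidence([100.0, 100.0, 100.0], [110.0, 110.0, 110.0]))  # inf
print(confidence([100.0, 100.0], [100.0, 100.0]))                # None
```

A correctly retriggered task would contribute distinct replicate values, giving a nonzero pooled variance and a finite confidence number.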
And as a result of looking at this, I'm noticing that my --rebuild 7 pushes are reporting some useful data -- specifically, the non-vismet metrics. Apparently I can pattern-match these because the vismet metrics have spelled-out names like FirstVisualChange and the non-vismet ones don't; in the results from individual tasks, the non-vismet ones have abbreviated names like "fcp". Ah, I see now that those show up in the comparison view as "subtests". It would be nice if the vismet ones were grouped the same way, or if neither was grouped. (Or at least, with my limited understanding I think that would be nice. I don't really understand the overall picture, so I could be way off base.)
> We're looking into getting visual-metrics running in the test tasks
> themselves to get around this issue, but we need to get FFmpeg and
> ImageMagick installed on the machines first.
Right, I saw that in the dependent bug, though it seems a little unfortunate to need that. It seems like the taskgraph should have the necessary smarts added for this. Still, whatever works. Expediency is good.