Closed Bug 1294798 Opened 8 years ago Closed 8 years ago

Order of alerts in textual summary should be alphanumeric

Categories

(Tree Management :: Perfherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: ShrutiJ)

Details

Attachments

(1 file)

If you click on the "Investigating" portion of alert #2348 (https://treeherder.mozilla.org/perf.html#/alerts?id=2348) and copypaste the alert summary somewhere, you'll note that textual summary is not sorted alphabetically. It would be better if it was. Here's the current behavior (on a subset of items): tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 (3.67% worse) tscrollx summary linux64 opt e10s: 3.4 -> 3.77 (10.95% worse) cart summary linux64 opt e10s: 30.78 -> 32.15 (4.44% worse) cart summary linux64 opt: 28.66 -> 29.99 (4.63% worse) And here's the desired ordering (also on a subset): cart summary linux64 opt e10s: 30.78 -> 32.15 (4.44% worse) cart summary linux64 opt: 28.66 -> 29.99 (4.63% worse) tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 (3.67% worse) tscrollx summary linux64 opt e10s: 3.4 -> 3.77 (10.95% worse) Shruti, do you want to take a quick look at this? :rwood's patch in bug 1285368 should give you enough hint on which parts of the code you need to change.
(In reply to William Lachance (:wlach) from comment #0) > ... you'll note that textual summary is not sorted alphabetically. > It would be better if it was. Why would it be better? also, what's the current ordering? effectively random? I actually think that ordering by severity would be more useful. The order doesn't really matter for short lists of few items, but for longer lists, it's hard to identify what are the bigger (and more important) regressions to focus on first, as well as hard to identify what are the biggest improvements, if there are such. Order by severity is therefore more developer friendly IMO than random or alphabetical.
Yeah the current order is just random, so any kind of sorting would be an improvement over that. I feel like it might be easier to pick out *what* changed (and on which platforms) if we sort by name. It should be easy enough to pick out which regressions are "big" just by scanning down the right of the list, no?
I agree alphabetically is better than random, especially since it will naturally group the lines into tests and platforms, and I also think there's a value in it. My hunch though is that severity ordering would still be overall more useful than such grouping. Maybe take the last few regression bug reports, sort them both ways, and ask devs what they prefer?
Here's what severity ordering looks like for the original case, which seems pretty chaotic to me. I'm not a developer but I think it's pretty hard to divine the meaning of what happened from this ordering. Summary of tests that regressed: tresize linux64 opt: 21.69 -> 25.27 (16.52% worse) tresize linux64 pgo: 20.61 -> 23.3 (13.03% worse) tscrollx summary linux64 opt e10s: 3.4 -> 3.77 (10.95% worse) tsvgr_opacity summary linux64 pgo e10s: 343.78 -> 374.69 (8.99% worse) tsvgr_opacity summary linux64 pgo: 381.8 -> 414.35 (8.53% worse) cart summary linux64 pgo e10s: 22.91 -> 24.45 (6.73% worse) cart summary linux64 pgo: 21.27 -> 22.72 (6.81% worse) cart summary linux64 opt: 28.66 -> 29.99 (4.63% worse) cart summary linux64 opt e10s: 30.78 -> 32.15 (4.44% worse) tscrollx summary linux64 pgo e10s: 3.39 -> 3.54 (4.2% worse) tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 (3.67% worse) tart summary linux64 opt: 5.44 -> 5.57 (2.43% worse) tp5o_scroll summary linux64 pgo e10s: 4.87 -> 4.99 (2.63% worse) damp summary linux64 opt e10s: 257.83 -> 263.87 (2.34% worse) Summary of tests that improved: tsvgx summary linux64 pgo: 305.18 -> 232.95 (23.67% better) tsvgx summary linux64 opt: 358.25 -> 285.42 (20.33% better) tsvgx summary linux64 pgo e10s: 188.22 -> 152.58 (18.94% better) tsvgx summary linux64 opt e10s: 227.13 -> 192.83 (15.1% better) tps summary linux64 pgo e10s: 44.65 -> 38.79 (13.13% better) tp5o_scroll summary linux64 opt: 8.59 -> 7.65 (10.93% better) tsvgr_opacity summary linux64 opt e10s: 487.17 -> 434.14 (10.88% better) tps summary linux64 opt e10s: 51.14 -> 45.84 (10.37% better) tsvgr_opacity summary linux64 opt: 551.16 -> 495.01 (10.19% better) tscrollx summary linux64 opt: 7.44 -> 6.7 (10% better) tps summary linux64 pgo: 57.22 -> 52.14 (8.87% better) tcanvasmark summary linux64 pgo e10s: 10468.08 -> 11370.62 (8.62% better) tcanvasmark summary linux64 pgo: 10177.58 -> 11023.88 (8.32% better) tps summary linux64 opt: 68.49 -> 63.69 (7.01% better) tp5o_scroll summary linux64 pgo: 7.41 -> 7.24 (2.21% better) -- vs. alphanumeric Summary of tests that regressed: cart summary linux64 opt e10s: 30.78 -> 32.15 (4.44% worse) cart summary linux64 opt: 28.66 -> 29.99 (4.63% worse) cart summary linux64 pgo e10s: 22.91 -> 24.45 (6.73% worse) cart summary linux64 pgo: 21.27 -> 22.72 (6.81% worse) damp summary linux64 opt e10s: 257.83 -> 263.87 (2.34% worse) tart summary linux64 opt: 5.44 -> 5.57 (2.43% worse) tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 (3.67% worse) tp5o_scroll summary linux64 pgo e10s: 4.87 -> 4.99 (2.63% worse) tresize linux64 opt: 21.69 -> 25.27 (16.52% worse) tresize linux64 pgo: 20.61 -> 23.3 (13.03% worse) tscrollx summary linux64 opt e10s: 3.4 -> 3.77 (10.95% worse) tscrollx summary linux64 pgo e10s: 3.39 -> 3.54 (4.2% worse) tsvgr_opacity summary linux64 pgo e10s: 343.78 -> 374.69 (8.99% worse) tsvgr_opacity summary linux64 pgo: 381.8 -> 414.35 (8.53% worse) Summary of tests that improved: tcanvasmark summary linux64 pgo e10s: 10468.08 -> 11370.62 (8.62% better) tcanvasmark summary linux64 pgo: 10177.58 -> 11023.88 (8.32% better) tp5o_scroll summary linux64 opt: 8.59 -> 7.65 (10.93% better) tp5o_scroll summary linux64 pgo: 7.41 -> 7.24 (2.21% better) tps summary linux64 opt e10s: 51.14 -> 45.84 (10.37% better) tps summary linux64 opt: 68.49 -> 63.69 (7.01% better) tps summary linux64 pgo e10s: 44.65 -> 38.79 (13.13% better) tps summary linux64 pgo: 57.22 -> 52.14 (8.87% better) tscrollx summary linux64 opt: 7.44 -> 6.7 (10% better) tsvgr_opacity summary linux64 opt e10s: 487.17 -> 434.14 (10.88% better) tsvgr_opacity summary linux64 opt: 551.16 -> 495.01 (10.19% better) tsvgx summary linux64 opt e10s: 227.13 -> 192.83 (15.1% better) tsvgx summary linux64 opt: 358.25 -> 285.42 (20.33% better) tsvgx summary linux64 pgo e10s: 188.22 -> 152.58 (18.94% better) tsvgx summary linux64 pgo: 305.18 -> 232.95 (23.67% better) Glandium, BenWa: which of the above do you find easier to understand?
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(bgirard)
Sorting by overall change might be good too (largest delta to smallest delta). But I think alphabetical might be easier. I agree that no ordering is disorienting and that either would be a big improvement over random.
Flags: needinfo?(bgirard)
TBH, to me both look quite cluttered. True that the alphabetic one looks less cluttered because the beginnings of the lines correlate better, but associating them with the actual regression/improvement percentage is hard to do in a glance. What if, regardless of ordering, we'll make the actual regression/improvement value easier to notice and grasp? Here's what I had in mind: - Move the number and "better"/"worse" to the beginning of the line - Round the percentage value (decimal digits are more clutter than value here IMO). - Align the numbers correctly (and the "Worse" too, since "Better" is one char more and we'd probably want to keep that aligned too) Worse 17% tresize linux64 opt: 21.69 -> 25.27 Worse 13% tresize linux64 pgo: 20.61 -> 23.3 Worse 11% tscrollx summary linux64 opt e10s: 3.4 -> 3.77 Worse 9% tsvgr_opacity summary linux64 pgo e10s: 343.78 -> 374.69 Worse 9% tsvgr_opacity summary linux64 pgo: 381.8 -> 414.35 Worse 7% cart summary linux64 pgo e10s: 22.91 -> 24.45 Worse 7% cart summary linux64 pgo: 21.27 -> 22.72 Worse 5% cart summary linux64 opt: 28.66 -> 29.99 Worse 4% cart summary linux64 opt e10s: 30.78 -> 32.15 Worse 4% tscrollx summary linux64 pgo e10s: 3.39 -> 3.54 Worse 4% tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 Worse 3% tp5o_scroll summary linux64 pgo e10s: 4.87 -> 4.99 Worse 2% tart summary linux64 opt: 5.44 -> 5.57 Worse 2% damp summary linux64 opt e10s: 257.83 -> 263.87
Also, do notice that even when ordered by value they tend to group also by test/platform. At least with this example.
That looks even better! If you sort by severity then you can merge regressed/improved into a single list.
Yeah, I like this suggestion -- funny how moving the number to the beginning of the line makes things look so much better. I think I prefer having two lists though, in that case maybe we can just leave off the better/worse part entirely (since it's implied)
Agreed, two lists where each has a title of "Regressions"/"Improvements" would make this prefix redundant on each line. I don't feel strongly about it either way as long as it's clear.
So, should I make the changes such that two lists titled "Regressions" and "Improvements" are made with the summary sorted in order of severity?
Flags: needinfo?(wlachance)
(In reply to Shruti Jasoria [:ShrutiJ] from comment #11) > So, should I make the changes such that two lists titled "Regressions" and > "Improvements" are made with the summary sorted in order of severity? Let's wait a day or two for Glandium to weigh in if he wants to, but that's what I'm leaning towards.
Flags: needinfo?(wlachance)
I can see how I'd use the alphanumeric sorting, but I can also see myself preferring something sorted by platform (where linux64 opt and linux64 pgo would be considered as different platforms) in other cases. What I'm saying is that there's probably no "one size fits all" for the order.
Flags: needinfo?(mh+mozilla)
(In reply to William Lachance (:wlach) from comment #12) > (In reply to Shruti Jasoria [:ShrutiJ] from comment #11) > > So, should I make the changes such that two lists titled "Regressions" and > > "Improvements" are made with the summary sorted in order of severity? > > Let's wait a day or two for Glandium to weigh in if he wants to, but that's > what I'm leaning towards. Ok, looks like that approach is the winner. Let's do it! (whenever you have time) :)
Comment on attachment 8783038 [details] [review] [treeherder] SJasoria:bug_1294798 > mozilla:master I have made changes such that two lists are made for regression and improvement, similar to the one in comment 6 with regression/improvement percentage in the beginning. Should I add the words 'Worse' and 'Better' in the list?
Attachment #8783038 - Flags: feedback?(wlachance)
Thanks for working on it. Would you mind pasting a sample textual output, preferably with at least several improvements and regressions? It would be better to suggest improvements now than after it lands, IMO.
This is how the summary looks after the changes which I have made: == Change summary for alert #2348 (as of August 08 2016 11:36 UTC) == Summary of tests that regressed: 16.52% tresize linux64 opt: 21.69 -> 25.27 13.03% tresize linux64 pgo: 20.61 -> 23.3 10.95% tscrollx summary linux64 opt e10s: 3.4 -> 3.77 8.99% tsvgr_opacity summary linux64 pgo e10s: 343.78 -> 374.69 8.53% tsvgr_opacity summary linux64 pgo: 381.8 -> 414.35 6.81% cart summary linux64 pgo: 21.27 -> 22.72 6.73% cart summary linux64 pgo e10s: 22.91 -> 24.45 4.63% cart summary linux64 opt: 28.66 -> 29.99 4.44% cart summary linux64 opt e10s: 30.78 -> 32.15 4.20% tscrollx summary linux64 pgo e10s: 3.39 -> 3.54 3.67% tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 2.63% tp5o_scroll summary linux64 pgo e10s: 4.87 -> 4.99 2.43% tart summary linux64 opt: 5.44 -> 5.57 2.34% damp summary linux64 opt e10s: 257.83 -> 263.87 Summary of tests that improved: 23.67% tsvgx summary linux64 pgo: 305.18 -> 232.95 20.33% tsvgx summary linux64 opt: 358.25 -> 285.42 18.94% tsvgx summary linux64 pgo e10s: 188.22 -> 152.58 15.10% tsvgx summary linux64 opt e10s: 227.13 -> 192.83 13.13% tps summary linux64 pgo e10s: 44.65 -> 38.79 10.93% tp5o_scroll summary linux64 opt: 8.59 -> 7.65 10.88% tsvgr_opacity summary linux64 opt e10s: 487.17 -> 434.14 10.37% tps summary linux64 opt e10s: 51.14 -> 45.84 10.19% tsvgr_opacity summary linux64 opt: 551.16 -> 495.01 10.00% tscrollx summary linux64 opt: 7.44 -> 6.7 8.87% tps summary linux64 pgo: 57.22 -> 52.14 8.62% tcanvasmark summary linux64 pgo e10s: 10468.08 -> 11370.62 8.32% tcanvasmark summary linux64 pgo: 10177.58 -> 11023.88 7.01% tps summary linux64 opt: 68.49 -> 63.69 2.21% tp5o_scroll summary linux64 pgo: 7.41 -> 7.24 For up to date results, see: http://localhost:8000/perf.html#/alerts?id=2348 Is it good or should I add the words 'Worse' and 'Better' as you suggested in comment 6?
Flags: needinfo?(avihpit)
(In reply to Shruti Jasoria [:ShrutiJ] from comment #18) > Is it good or should I add the words 'Worse' and 'Better' as you suggested > in comment 6? I think it's good, and IMO it's fine without the better/worse on each line. Few other things which caught my eye but which I'm not sure of, so I'd prefer wlach to have the final say on it: - For the titles, instead of "Summary of tests that regressed" [/improved], I think it would be nicer as just "Regressions" and "Improvements" (possibly with some underline like --------- or =========). - Maybe add some spaces (like, maybe 5 spaces?) after the test name and before the AA -> BB. Without that spacing it's harder to identify where the name ends, and the suffix of the name is actually important, since it has things like PGO and e10s, which would be useful if they were more noticeable IMO. - Maybe reverse the order of the improvements? such that it's a continuous severity values from worst (top of the regressions) to best (bottom of the improvements)? I'll let wlach decide on those :) worst case they can be improved as a followup. Thanks again for working on it!
Flags: needinfo?(avihpit)
(In reply to Avi Halachmi (:avih) from comment #19) > (In reply to Shruti Jasoria [:ShrutiJ] from comment #18) > > Is it good or should I add the words 'Worse' and 'Better' as you suggested > > in comment 6? > > I think it's good, and IMO it's fine without the better/worse on each line. > > Few other things which caught my eye but which I'm not sure of, so I'd > prefer wlach to have the final say on it: > > - For the titles, instead of "Summary of tests that regressed" [/improved], > I think it would be nicer as just "Regressions" and "Improvements" (possibly > with some underline like --------- or =========). Yeah, let's just shorten this to regressions / improvements. I don't think we gain anything from the more verbose title. > - Maybe add some spaces (like, maybe 5 spaces?) after the test name and > before the AA -> BB. Without that spacing it's harder to identify where the > name ends, and the suffix of the name is actually important, since it has > things like PGO and e10s, which would be useful if they were more noticeable > IMO. This sounds like a good idea, let's use _.padRight to pad entries shorter than the longest test name (similar to my suggestion in the bug of using _.padLeft for the percentages): https://github.com/lodash/lodash/blob/3.10.1/doc/README.md#_padrightstring-length0-chars- > - Maybe reverse the order of the improvements? such that it's a continuous > severity values from worst (top of the regressions) to best (bottom of the > improvements)? Hmm, unclear to me whether this would be an improvement. I'd say we leave the actual ordering the way it is for now.
Comment on attachment 8783038 [details] [review] [treeherder] SJasoria:bug_1294798 > mozilla:master A few adjustments needed, see this bug and comments in PR. But it's looking good!
Attachment #8783038 - Flags: feedback?(wlachance) → feedback+
Comment on attachment 8783038 [details] [review] [treeherder] SJasoria:bug_1294798 > mozilla:master I have made the changes you had asked for in PR. I hope its good now. :) Sample output: == Change summary for alert #2348 (as of August 08 2016 11:36 UTC) == == Regressions == 17% tresize linux64 opt: 21.69 -> 25.27 13% tresize linux64 pgo: 20.61 -> 23.3 11% tscrollx summary linux64 opt e10s: 3.4 -> 3.77 9% tsvgr_opacity summary linux64 pgo e10s: 343.78 -> 374.69 9% tsvgr_opacity summary linux64 pgo: 381.8 -> 414.35 7% cart summary linux64 pgo: 21.27 -> 22.72 7% cart summary linux64 pgo e10s: 22.91 -> 24.45 5% cart summary linux64 opt: 28.66 -> 29.99 4% cart summary linux64 opt e10s: 30.78 -> 32.15 4% tscrollx summary linux64 pgo e10s: 3.39 -> 3.54 4% tp5o_scroll summary linux64 opt e10s: 5.02 -> 5.2 3% tp5o_scroll summary linux64 pgo e10s: 4.87 -> 4.99 2% tart summary linux64 opt: 5.44 -> 5.57 2% damp summary linux64 opt e10s: 257.83 -> 263.87 == Improvements == 24% tsvgx summary linux64 pgo: 305.18 -> 232.95 20% tsvgx summary linux64 opt: 358.25 -> 285.42 19% tsvgx summary linux64 pgo e10s: 188.22 -> 152.58 15% tsvgx summary linux64 opt e10s: 227.13 -> 192.83 13% tps summary linux64 pgo e10s: 44.65 -> 38.79 11% tp5o_scroll summary linux64 opt: 8.59 -> 7.65 11% tsvgr_opacity summary linux64 opt e10s: 487.17 -> 434.14 10% tps summary linux64 opt e10s: 51.14 -> 45.84 10% tsvgr_opacity summary linux64 opt: 551.16 -> 495.01 10% tscrollx summary linux64 opt: 7.44 -> 6.7 9% tps summary linux64 pgo: 57.22 -> 52.14 9% tcanvasmark summary linux64 pgo e10s: 10468.08 -> 11370.62 8% tcanvasmark summary linux64 pgo: 10177.58 -> 11023.88 7% tps summary linux64 opt: 68.49 -> 63.69 2% tp5o_scroll summary linux64 pgo: 7.41 -> 7.24 For up to date results, see: http://localhost:8000/perf.html#/alerts?id=2348
Attachment #8783038 - Flags: review?(wlachance)
Three thumbs up from me :)
(In reply to Shruti Jasoria [:ShrutiJ] from comment #22) > == Change summary for alert #2348 (as of August 08 2016 11:36 UTC) == > > ... That alert #2348 would end up as a link, right? or would that link appear elsewhere at the report?
(In reply to Shruti Jasoria [:ShrutiJ] > 2% tp5o_scroll summary linux64 pgo: 7.41 -> 7.24 > > For up to date results, see: http://localhost:8000/perf.html#/alerts?id=2348 The link will appear at the end of the report.
Comment on attachment 8783038 [details] [review] [treeherder] SJasoria:bug_1294798 > mozilla:master Very close, just a few more minor changes requested. :)
Attachment #8783038 - Flags: review?(wlachance)
Comment on attachment 8783038 [details] [review] [treeherder] SJasoria:bug_1294798 > mozilla:master I have updated the PR. I hope its good now.
Attachment #8783038 - Flags: review?(wlachance)
Comment on attachment 8783038 [details] [review] [treeherder] SJasoria:bug_1294798 > mozilla:master Looks great now, thanks! Merged.
Attachment #8783038 - Flags: review?(wlachance) → review+
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: