Create a UI for tracking perfherder alerts

RESOLVED FIXED

Status

Tree Management
Perfherder
RESOLVED FIXED
2 years ago
a year ago

People

(Reporter: wlach, Assigned: wlach)

Tracking

Details

Attachments

(2 attachments)

This is a bug that should cover creating the first iteration of a UI for tracking performance alerts.

Prior art:

Alert Manager: alertmanager.allizom.org:8080/alerts.html#
Chrome's perf sheriffing dashboard (view in Chrome): https://chromeperf.appspot.com/

This depends on both bug 1150152 and a specification for what the UI should like, which I will try to develop in the next few weeks.

Comment 1

2 years ago
hey will,

I'm exciting about this bug! Are there anything I can help about this bug?

thank you!

MikeLing
(In reply to MikeLing from comment #1)
> hey will,
> 
> I'm exciting about this bug! Are there anything I can help about this bug?

Hi Mike, I'm planning on doing some kind of UI spec first, but for sure there will be lots and lots of fun Angular/JS/Bootstrap work to do here. For now, I'd look at the links above as well as Joel's documents and videos on performance sheriffing for inspiration. https://wiki.mozilla.org/Buildbot/Talos/Sheriffing

Comment 3

2 years ago
(In reply to William Lachance (:wlach) from comment #2)
> (In reply to MikeLing from comment #1)
> > hey will,
> > 
> > I'm exciting about this bug! Are there anything I can help about this bug?
> 
> Hi Mike, I'm planning on doing some kind of UI spec first, but for sure
> there will be lots and lots of fun Angular/JS/Bootstrap work to do here. For
> now, I'd look at the links above as well as Joel's documents and videos on
> performance sheriffing for inspiration.
> https://wiki.mozilla.org/Buildbot/Talos/Sheriffing

cool! we can discuss that on irc tomorrow!:)
Depends on: 1153956
Created attachment 8666283 [details]
Screenshot of alert sheriffing UI prototype w/ "alert summaries"

Hey, so I have been working a bit on UI ideas for this after chatting a bit with :jmaher.

I think we need to seperate out the concept of "alerts" (which are indications that the numbers in a particular performance series changed) from "root causes". The sheriffing interface should really be about dealing with root causes, which are tied to an individual commit (push). 

So I've been thinking about a UI that highlights that has something I'm tentatively calling an "Alert Summary". We would populate the alert summary initially based on the commit that an alert comes from, but the user would have an option to tie a performance alert to another summary than the one it came with (e.g. an alert generated from a merge commit should be associated with the root commit). Thus an alert summary has both "generated alerts" (alerts associated with that summary in particular) and "downstream alerts" (follow-on alerts generated from merging, etc.)

I envision the alerts dashboard just being a list of the most recent alert summaries. By default we'll probably just show everything, though we might want to provide the option of filtering by branch. Just like treeherder, we'll allow the user to load more by scrolling to the bottom and clicking "next 10", "next 50", etc.

Anyway, there are a bunch of other details which I still need to sort out, but I thought I'd throw this out there for the perf sheriffs and :mikeling to look at and comment on.
Attachment #8666283 - Flags: feedback?(vaibhavmagarwal)
Attachment #8666283 - Flags: feedback?(sabergeass)
Attachment #8666283 - Flags: feedback?(jmaher)
Nice screenshot of the prototype :wlach!
A couple of thoughts, feel free to oppose/neglect some at this stage: 


1) In the screenshot I see that we are listing the (new) commits for a particular alert. However, in case of merges there are generally a lot of revisions (>50), so it would not be a good experience to list all of them. I think a link to pushlog (as currently in alert manager) is probably a better way.
2) There would need to be links for alert row to perfherder url, Treeherder url, bugs in which the alert is tracked (these links are necessary for speeding up sheriffing).
3) There will be a need to have more filters in the UI like platforms, branch, tests etc. 
4) Personally I don't like using the phrase "downstream alerts" since it requires context, and someone not familiar with the terminology cannot understand. I would prefer a generic descriptive name like "related alerts"
5) I am not sure how "related alerts" will work in the UI. Will it only show when filters are applied? Will it also show with untriaged alerts? I think that one should be able to see all untriaged alerts in one place, and related alerts should only show the alerts that have been triaged (either manually or automatically). This is because "merges" also happen on mozilla-inbound, fx-team and b2g-inbound.
6) An important piece missing in Alert Manager is authentication. Anyone can change/alter the alerts in alert manager right now. I would like to see different priviledges for users, hopefully it would be possible in new alert UI since it is closely tied to treeherder.

Updated

2 years ago
Attachment #8666283 - Flags: feedback?(vaibhavmagarwal)
Thanks for the quick feedback Vaibhav!

(In reply to Vaibhav (:vaibhav1994) from comment #5)
> Nice screenshot of the prototype :wlach!
> 1) In the screenshot I see that we are listing the (new) commits for a
> particular alert. However, in case of merges there are generally a lot of
> revisions (>50), so it would not be a good experience to list all of them. I
> think a link to pushlog (as currently in alert manager) is probably a better
> way.

Yes, my plan was to only show only the top 4 (or so) commits per push, linking to the pushlog if people want to see more.

> 2) There would need to be links for alert row to perfherder url, Treeherder
> url, bugs in which the alert is tracked (these links are necessary for
> speeding up sheriffing).

Yup, I was planning to include all of these somewhere.

> 3) There will be a need to have more filters in the UI like platforms,
> branch, tests etc. 

Yup.

> 4) Personally I don't like using the phrase "downstream alerts" since it
> requires context, and someone not familiar with the terminology cannot
> understand. I would prefer a generic descriptive name like "related alerts"

That probably is better. I was also thinking of putting a help icon beside to explain what these are, as I suspect there is no terminology which will not be confusing.

> 5) I am not sure how "related alerts" will work in the UI. Will it only show
> when filters are applied? Will it also show with untriaged alerts? I think
> that one should be able to see all untriaged alerts in one place, and
> related alerts should only show the alerts that have been triaged (either
> manually or automatically). This is because "merges" also happen on
> mozilla-inbound, fx-team and b2g-inbound.

An alert can only ever be associated with one particular summary. By definition, if an alert is in "related alerts" it has already been triaged, since by default an alert will be associated with a particular merge commit where it was detected (at least this is my plan for the first version -- in the future we might want to have automatic handling of merge commits, somehow).

> 6) An important piece missing in Alert Manager is authentication. Anyone can
> change/alter the alerts in alert manager right now. I would like to see
> different priviledges for users, hopefully it would be possible in new alert
> UI since it is closely tied to treeherder.

Yes, I was planning to hook into Treeherder's persona-based authentication. I was planning on restricting write actions to people with the "sheriff" bit on their profiles.
Comment on attachment 8666283 [details]
Screenshot of alert sheriffing UI prototype w/ "alert summaries"

Vaibhav had a lot of great feedback.  One thing I just spent an hour doing is triaging 17 random revisions which were mostly duplicates of 3 real issues on the tree.

In many cases we have already triaged alerts for a specific revision and then come back in with a new alert to find out there are previous alerts which have been triaged.  My biggest fault here with alert manager is they are marked 'duplicate' in the UI, but I don't know what they are duplicates of- it would be really nice to have a better method for 'bumping' alerts so the 'related alerts' are useful.

Another big feature which is lacking in alert manager is the ability to select all alerts in the list.  Right now selecting one alert at a time is really inefficient and I tend to write sql queries to save time.

Lastly if we want to add features, it would be nice to determine which branch we are on (trunk*, aurora, beta) and have a list of active bugs on that branch so we can quickly determine which revision/bug/regressions are active- therefore providing a shortcut to marking as a duplicate/related.

What I don't like about the screenshot is the fact that we fill up so much real estate on the screen for just 2 alerts- It is nice to see a list of all recent alerts and it becomes obvious where the real problems are as patterns emerge.  This could be useful if the related alerts was reliable.  It is nice to see the data of what was checked in, but that seems to be useful after the fact- once we have determined this isn't a duplicate alert.

I would say the workflow falls into:
* determine which branch the alert really started on (viewing graphs of different branches)
* pinpoint the revision which is the root cause (related to collecting more data)
* then look at the code for the root cause, related bugs, etc.

This specific view looks to be great for helping find duplicates and providing meta info when we have the specific alert identified.

Comment 8

2 years ago
Sorry for the late reply, I just working on other bug of perfherder this weekend. 

Hmm, those feedback from Joel and Vaibhav are awesome and I totally agree with them. But, you know, I don't have use alter manager too much like Joel and Vaibhav. So I would like to tell my feeling about this screenshot and compare it with alter manager, as an inexperienced user. :)

When I first meet this screenshot, I just thought the dropdown menu which include "Generated Alerts" and "Downstream Alerts" was the main body rather than the series itself. Just like Vaibhav said, there are a lot of redundant spaces in there which make it doesn't looks like a whole.

The other thing I would like to mention is, if I'm looking for some basic information of a series,I need to find out the information piece by piece and it may need to trigger and untrigger very frequently. On the other hand, when we look at the alert manager, we can find the information of series very easily and comfortable, signature, name, related project name, related branch, almost anything you need can be found exactly by one glimpse.

At last, maybe we could add some useful button for it? like file bug, merge bug, mark as duplicate, unmark duplicate bug, etc. Maybe a funvtion plane would be handful. I know it just a simple mock up for init stage, so I just point it out and I totally defer to you guys' opinion. :)
Comment on attachment 8666283 [details]
Screenshot of alert sheriffing UI prototype w/ "alert summaries"

didn't use the proper feedback flag- commented earlier.  I think this is a great start and it is awesome that it gives us something to talk about.
Attachment #8666283 - Flags: feedback?(jmaher) → feedback-
Created attachment 8687425 [details] [review]
Basic UI for managing performance alerts

Ok, so here's a first pass at a "read only" view of performance "alerts" (read: observed discontinuities in performance data). I figure we can extend this model to read/write and other things over time.

Mauro: Could I ask you a huge favor and look over this for any glaring Django/Angular problems? I know this is out of your area, but the PR is fortunately not gigantic and I think relatively well isolated. I'd be happy to go over the functionality too if it helps.
Assignee: nobody → wlachance
Attachment #8687425 - Flags: review?(mdoglio)
Attachment #8687425 - Flags: feedback?(jmaher)
Comment on attachment 8687425 [details] [review]
Basic UI for managing performance alerts

a handful of comments on the PR- my only concerning issue was answered on irc- that in the future when things are editable we will be able to move individual alerts to a summary !!
Attachment #8687425 - Flags: feedback?(jmaher) → feedback+
Comment on attachment 8687425 [details] [review]
Basic UI for managing performance alerts

Left some comments and shared some thoughts on the data model with wlach. r? me again once ready for a second round.
Attachment #8687425 - Flags: review?(mdoglio)
Comment on attachment 8687425 [details] [review]
Basic UI for managing performance alerts

Hey Mauro, I rewrote stuff and made it use rest_framework's pagination and serializer classes. Could you have a look if you have time? I probably won't have time to address feedback until Friday morning, so no rush.
Attachment #8687425 - Flags: review?(mdoglio)
Comment on attachment 8687425 [details] [review]
Basic UI for managing performance alerts

Please r? me again when ready for a final round of review
Attachment #8687425 - Flags: review?(mdoglio)
Comment on attachment 8687425 [details] [review]
Basic UI for managing performance alerts

I created a seperate commit to address your review comments. Hopefully a bit easier to read?
Attachment #8687425 - Flags: review?(mdoglio)
Comment on attachment 8687425 [details] [review]
Basic UI for managing performance alerts

I added very few comments. This is a "FIX IT, THEN SHIP IT" kind of approval :)
Attachment #8687425 - Flags: review?(mdoglio) → review+

Comment 17

2 years ago
Commits pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/4d901e28fdd233de72a1da3b15f30dd3ae830620
Bug 1201154 - Simplify client side API for getting list of resultsets

This functionality is kind of mashed in with just getting a range, which
is rather confusing. The functionality isn't actually being used for
anything right now, but is useful for handling performance alerts,
so let's fix it.

https://github.com/mozilla/treeherder/commit/3bfa8b7415012558cbb8603b2cb5c38bbfedce5d
Bug 1201154 - Basic read-only UI for performance alerts
Ok, addressed Mauro's comments and merged!
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED

Comment 19

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/d9b5b958281b23c1b0f66d123c9e1aaa103539a4
Bug 1201154 - Fix display display of a single alert summary

This got accidentally broken when addressing review comments.

Comment 20

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/d556b18760a39344aa9d5705bc53bec7e1196df6
Bug 1201154 - Fix missing semicolon
Actually let's reopen this, as it kind of encompasses all the functionality, and much of the discussion there is still relevant. 

I probably should have filed a seperate bug for the readonly view. Live and learn...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 22

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/ac8ea015d5826dfaa0352304413299f109a768ac
Bug 1201154 - Adding missing migrations

Comment 23

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/b52f5d441ac4d372dedb1237ac2823e78a407763
Bug 1201154 - Fix print method for performance alert

Comment 24

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/651c5d90d0fed062e357ffb1ba6391f820c5268c
Bug 1201154 - Add django-filter to list of required packages

Comment 25

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/9ddb21d521bcbe3fd51e8d97e53b5bc1cb2f588e
Bug 1201154 - Fix getting more alert summaries
Depends on: 1228156
Depends on: 1228158
Depends on: 1246774

Comment 26

a year ago
Comment on attachment 8666283 [details]
Screenshot of alert sheriffing UI prototype w/ "alert summaries"

I gonna remove this feedback tag :)
Attachment #8666283 - Flags: feedback?(sabergeass)
what is left to do here?  My understanding is this is done!
This is done!
Status: REOPENED → RESOLVED
Last Resolved: 2 years agoa year ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.