Closed Bug 1220445 Opened 9 years ago Closed 9 years ago

Please back out autoclassification in the UI on production

Categories

(Tree Management :: Treeherder, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: mdoglio)

References

Details

Attachments

(1 file)

47 bytes, text/x-github-pull-request
emorley
: review+
Details | Review
Something, I presume bug 1215107, has given production treeherder the feature of having failures which autoclassification believes it recognizes a hollow star. As near as I can tell, it does not associate the failures with the bug which it recognizes the failure as matching, it merely takes the failure completely out of any sheriff's workflow, so that the only way we can continue to have failures associated with bugs, which is the only way we have to know how frequently failures happen or which runs had a particular failure, is to alter our workflow to show all classified failures, and manually and continually comb through them looking for hollow stars to star over the top of autoclassification. Please back it out of production, before the start of the working week Sunday night. I can live with it and leave trees open on a weekend, but not during the week.
Blocks: 1215107
Flags: needinfo?(mdoglio)
Flags: needinfo?(james)
Priority: -- → P1
To summarise the issues that I believe Phil is describing: 1) autoclassify isn't linking a bug number with a failure (eg see the annotations tab), only the classification "autoclassified intermittent" 2) There's no visual change on the "failure summary" tab, so you have to look at both that and the annotations tab to figure out what is going on 3) Without a bug number or failure line string, there's no easy way to tell if autoclassify has made a mistake 4) This wouldn't be a problem if the "autoclassified intermittent" jobs still behaved as unclassified (ie could be selected using the j/k keyboard shortcut or showed up in "only unclassified mode" - eg press "u") 5) It's auto-classifying green jobs too (which mostly just have false positives in the output) I didn't realise we'd actually turned this on for real - it's quite a big change and IMO ideally would have had sheriffs test it on stage first (or even we could have done a Heroku deploy off a main branch and had them test there before it landed on master). James/Mauro - would you mind taking a look? If there's a way we can disable this without backing out (eg disabling in set_classification()) I'm open to trying, otherwise I'll need to revert this before tomorrow. Additional context: 04:28 <philor> huh, we just... ship autostarring? 04:29 — philor grapples with tenses 04:35 <philor> or is that jgraham running a separate autoclassifier script? 04:37 — philor looks for and fails to find a classified-by filter 04:37 <philor> which makes me deeply, deeply disturbed by this 04:45 <philor> oh, hollow stars 04:45 <philor> meh 04:49 <philor> but at least it's willing to misstar things the exact same way we do, that's a nice touch 04:52 <philor> wait, is it classifying without associating with bugs? 04:53 <philor> I guess the War on Orange is over, and we lost completely 04:57 <philor> camd / emorley|away : please to be not-away 05:27 <philor> so this must be bug 1215107 which we shipped? 05:27 <philor> please unship it 07:01 <philor> boy, having to continuously look over every classified failure in every push with running jobs really points out how handy it would be if the # in progress indication actually worked so you could tell what pushes had running jobs 08:32 <philor> fun question: should autoclassifier be classifying the failures in green runs? it is. 08:46 ⇐ mikeling quit • philor → philor|away 09:42 <jmaher|afk> philor|away: are the failures in green jobs real failures?
Attached file PR 1107
Assignee: nobody → mdoglio
Status: NEW → ASSIGNED
Flags: needinfo?(mdoglio)
Attachment #8681689 - Flags: review?(emorley)
Attachment #8681689 - Flags: review?(emorley) → review+
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/ceeb27c2d077a48790a361fa8427a9c26e6d58e0 Bug 1220445 - hide automatic classification We want to find a better ui for the integration between manual and automatic classification.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Not sure there's any direct way to tell (just from the UI) that this is what happens, but I think it is: 1. Be fast at starring, very fast, star something that autoclassifier will do but before it does. 2. It autoclassifies, which it appears to do destructively, removing the existing manual classification and only leaving its classification. 3. Because of the patch from this bug, at the next refresh that failure which was classified and associated with a bug, and then had that classification overwritten with a autoclassification, will show as unclassified with no associated bugs. 4. Star it again, because there are thousands per day and there's no way you will recognize a single one as something you've done before, and you'll be told that there was an error associating it with the bug (because it already is associated with that bug, even though the UI doesn't show it), and it will be annotated as though you just chose to star it as intermittent without any bug.
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/9bef82adddaded4155166f57729c9a654db8ba5d Revert "Bug 1220445 - hide automatic classification" This reverts commit ceeb27c2d077a48790a361fa8427a9c26e6d58e0.
OK, fixing these issues is obviously a priority. Some of the issues could use more feedback so that I understand the workflow fully. (In reply to Ed Morley [:emorley] from comment #1) > To summarise the issues that I believe Phil is describing: > 1) autoclassify isn't linking a bug number with a failure (eg see the > annotations tab), only the classification "autoclassified intermittent" So there are two cases here: a) The autoclassification came from a classification that the sheriff originally made. In this case the autoclassifier is supposed to pick up the original bug. If it's not doing that it's an error. b) The autoclassification came from noticing a failure line that previously appeared in a set of retriggers on the same push where one job was green and one job was not. In this case there might not be a sheriff-marked bug for the error. Does it sound sensible to have some way to indicate this case so that a bug can be added, and then make the bug propogate to any other classifications that were made in the same way? I think a little thought will be required to make this work in a sensible way, but I'm sure something is possible. > 2) There's no visual change on the "failure summary" tab, so you have to > look at both that and the annotations tab to figure out what is going on What sort of visual change do you think is necessary here? > 3) Without a bug number or failure line string, there's no easy way to tell > if autoclassify has made a mistake Yeah, I guess we need to prioritize getting that information into the UI. camd has a PR [1] for a more detailed UI for the autoclassifier. There is an impedance mismatch between the per-line classification of autoclassifier and the per-job classification that we currently do. > 4) This wouldn't be a problem if the "autoclassified intermittent" jobs > still behaved as unclassified (ie could be selected using the j/k keyboard > shortcut or showed up in "only unclassified mode" - eg press "u") Yep, that sounds sensible at least until we trust the classifier more. > 5) It's auto-classifying green jobs too (which mostly just have false > positives in the output) Sounds like a bug. We're having a work week on this stuff this week, so now is a great time to get feedback. [1] https://github.com/mozilla/treeherder/pull/1096
Flags: needinfo?(james)
maybe it would be worth to test this somehow on a stage or so enviroment to play with it a little bit before pushing to production ?
(In reply to Carsten Book [:Tomcat] from comment #7) > maybe it would be worth to test this somehow on a stage or so enviroment to > play with it a little bit before pushing to production ? James said this was the intention (to test it more thoroughly somewhere other than prod first), I think there was a miscommunication about whether it was meant to land or not (anything that lands on master needs to be ready to ship) - either way it wasn't meant for production. It's been backed out in https://github.com/mozilla/treeherder/commit/bc43f09637ac09e5c24d2760fd6408f147966e15
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: