add method to collect metrics of "new" failures and how they get annotated
Categories
(Tree Management :: Treeherder: Frontend, task, P2)
Tracking
(Not tracked)
People
(Reporter: jmaher, Assigned: jmaher)
Every time we display a "new" button for the sheriffs to indicate that the failure line has not been seen in the last 21 days, we would like to track it, so that we know the button is not misleading or missing too many things.
In addition, we should track:
- a new button -> a new bug (includes classification::intermittent)
- a new button -> classification (infra, intermittent, fixed by commit, expected fail)
Updated•3 years ago
Comment 1•3 years ago
I was focused on Glean and have a working PR up that is almost all good except for data opt-out/opt-in. Given the low frequency of data points we have, an opt-out option could skew our results too much, so I am exploring other options.
Right now I am thinking of adding a new_failure: boolean field to the job_note table. Then, when saving a classification, I just add the new_failure data point.
There are 207K job_notes, so adding the field would have a pretty low impact on storage and queries.
That is a simplistic view of a solution; there might be more interesting cases.
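A minimal sketch of the new_failure idea, using an in-memory SQLite table as a stand-in. The column list and the save_classification helper are assumptions made for illustration; the real job_note schema and save path in Treeherder will differ.

```python
import sqlite3

# Hypothetical, simplified stand-in for the job_note table. The real
# table has more columns and is managed through migrations.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE job_note (
        id INTEGER PRIMARY KEY,
        job_id INTEGER NOT NULL,
        failure_classification TEXT NOT NULL
    )
    """
)

# The proposed change: one extra boolean column. Defaulting to false
# means the existing rows need no backfill.
conn.execute(
    "ALTER TABLE job_note ADD COLUMN new_failure BOOLEAN NOT NULL DEFAULT 0"
)


def save_classification(job_id, classification, new_failure):
    """Store a classification together with the new_failure data point."""
    conn.execute(
        "INSERT INTO job_note (job_id, failure_classification, new_failure) "
        "VALUES (?, ?, ?)",
        (job_id, classification, int(new_failure)),
    )


save_classification(101, "intermittent", True)
save_classification(102, "fixed by commit", False)

# Pull back only the notes that were saved while the "new" button showed.
new_failure_rows = conn.execute(
    "SELECT job_id, failure_classification FROM job_note WHERE new_failure = 1"
).fetchall()
```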
Here is what to track:
For every failing task that has a matching suggestion.search (matching == <error_tag> | <test_name> | <failure_message>), we would look at the data from the bug_suggestion, specifically data from the 21 day cache:
- counter (1 == first time seeing the line, etc.)
- new_failure_in_rev (the failure might have counter > 1, but we want to show that all of the same failures in the rev are new and expected)
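The two data points above could be derived roughly as follows. This is only a sketch: the Suggestion class and its first_seen_rev field are hypothetical, and it assumes the 21 day cache can tell us on which revision a line was first seen.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    search: str          # the matched failure line
    counter: int         # times the line was seen in the 21 day cache
    first_seen_rev: str  # hypothetical: revision where the line first appeared


def annotate(suggestion, current_rev):
    """Return the two tracked data points for one matching suggestion."""
    return {
        "counter": suggestion.counter,
        # counter can be > 1 when several tasks on the same revision hit
        # the same new line; those should all still count as new.
        "new_failure_in_rev": (
            suggestion.counter == 1
            or suggestion.first_seen_rev == current_rev
        ),
    }


first_hit = annotate(Suggestion("TEST-UNEXPECTED-FAIL | a.js", 1, "abc123"), "abc123")
same_rev = annotate(Suggestion("TEST-UNEXPECTED-FAIL | a.js", 3, "abc123"), "abc123")
older = annotate(Suggestion("TEST-UNEXPECTED-FAIL | a.js", 3, "abc123"), "def456")
```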
To answer the question of how accurate the NEW failure notification is, we could compare:
- newBug vs newBugNewFailure (ideally all new bugs with matching summary should have a newfailure notification or we are missing the NEW notification)
- fixedByCommit vs fixedByCommitNewFailure (same as above, with matching summary these should be equal or we are missing NEW notifications)
- otherNewFailure: any new failure that isn't related to a newBug or FBC will be associated with an existing intermittent bug. These would be false positives, and we shouldn't be showing them.
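The comparison above can be sketched as a simple tally. The shape of each note (a "kind" string plus a "new_failure" flag) is an assumption for illustration, not the stored format.

```python
from collections import Counter


def new_failure_metrics(notes):
    """Tally the counters used to judge the NEW notification."""
    counts = Counter()
    for note in notes:
        kind = note["kind"]  # assumed values: "new_bug", "fixed_by_commit", "other"
        if kind == "new_bug":
            counts["newBug"] += 1
            if note["new_failure"]:
                counts["newBugNewFailure"] += 1
        elif kind == "fixed_by_commit":
            counts["fixedByCommit"] += 1
            if note["new_failure"]:
                counts["fixedByCommitNewFailure"] += 1
        elif note["new_failure"]:
            # NEW was shown, but the failure was filed against an existing bug.
            counts["otherNewFailure"] += 1
    return counts


sample_notes = [
    {"kind": "new_bug", "new_failure": True},
    {"kind": "new_bug", "new_failure": False},
    {"kind": "fixed_by_commit", "new_failure": True},
    {"kind": "other", "new_failure": True},
    {"kind": "other", "new_failure": False},
]
sample_counts = new_failure_metrics(sample_notes)

# Missed notifications: new bugs or FBCs that never showed the NEW button.
missed = (
    sample_counts["newBug"] - sample_counts["newBugNewFailure"]
    + sample_counts["fixedByCommit"] - sample_counts["fixedByCommitNewFailure"]
)
false_positives = sample_counts["otherNewFailure"]
```

Here missed covers the first two comparisons and false_positives covers the third.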
Given ~200 new bugs/week and probably 75+ fixed_by_commits/week, I imagine that 30 days will give us enough information. I think anything >90% accurate is ok; >95% accuracy is good and we should consider opening up to try, >99% accuracy and big bonuses for all!
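As a rough check of those numbers, scaling the weekly rates to 30 days gives on the order of 860 new bugs and 320 fixed_by_commits. The helper names below are made up for illustration, and the label for anything at or below 90% is my own wording.

```python
def expected_samples(per_week, days=30):
    """Scale a weekly rate to the observation window."""
    return per_week * days / 7


def rate_accuracy(accuracy):
    """Map an accuracy fraction onto the thresholds from the comment."""
    if accuracy > 0.99:
        return "big bonuses for all"
    if accuracy > 0.95:
        return "good, consider opening up to try"
    if accuracy > 0.90:
        return "ok"
    return "below the 90% bar"  # assumed label, not from the comment


new_bugs = expected_samples(200)         # ~200 new bugs/week
fixed_by_commits = expected_samples(75)  # 75+ fixed_by_commits/week
```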
The advantage of adding a field in a database is that for the otherNewFailure cases we can look at the job.id and dig into what was showing. Honestly, this field should be an int, or we should have a second boolean, so that we can filter out failures that don't qualify (such as infra errors).
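One way the int variant might look, as a sketch only. The NewFailureState enum and its values are invented here to illustrate the filtering and are not part of any existing schema.

```python
from enum import IntEnum


class NewFailureState(IntEnum):
    """Hypothetical values for an integer new_failure field."""
    NOT_NEW = 0
    NEW = 1
    NEW_NOT_QUALIFYING = 2  # NEW was shown, but e.g. the failure is an infra error


def qualifying_new_failures(rows):
    """Return the job ids whose NEW notification should count in the metrics."""
    return [job_id for job_id, state in rows if state == NewFailureState.NEW]


# (job_id, new_failure) pairs as they might come back from the database.
rows = [(101, 1), (102, 0), (103, 2), (104, 1)]
to_review = qualifying_new_failures(rows)
```

Because IntEnum members compare equal to plain integers, the raw column values can be filtered without converting them first.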
Comment 2•3 years ago
Comment 3•3 years ago
The GitHub PR was deployed to production this morning.
Comment 4•2 years ago
While this is done, the Glean metrics have no value; we will need to sort out a better way. I downloaded ALL the failures and annotations to determine this manually. It was a long process and took a lot of code.