push health should use data as "NEW" failures only
Categories
(Tree Management :: Treeherder: Frontend, task)
Tracking
(Not tracked)
People
(Reporter: jmaher, Unassigned)
References
(Blocks 2 open bugs)
Details
Currently, push health uses a fairly comprehensive design that fetches raw failures and then ignores known intermittents. This is expensive in both CPU time and wall time. Since push health was originally created, we have added "NEW" failures. If a failure is NEW (failure_classification_id=6), we know it isn't a known intermittent (work is done to remove regressions from the list, as well as to sanitize the error lines).
A potential upcoming step in the log parser would take additional data (retriggers, confirm-failure, future pushes on integration branches), compare results for the test group (or the overall task), and determine whether the failures are repeated or intermittent. If a failure is intermittent, the task will have failure_classification_id=8, so a filter on failure_classification_id=6 will keep returning accurate data as classifications are updated.
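As a minimal sketch of the filter described above, assume each failed task is available as a dict carrying a `failure_classification_id` field. The dict shape, the sample ids, and the function name `new_failures` are illustrative assumptions, not Treeherder's actual API; only the two classification ids come from this bug.

```python
# Classification ids taken from the description above.
NEW_FAILURE = 6    # "NEW": not a known intermittent
INTERMITTENT = 8   # set once follow-up data shows the failure is intermittent


def new_failures(tasks):
    """Return only the tasks still classified as NEW failures."""
    return [t for t in tasks if t.get("failure_classification_id") == NEW_FAILURE]


# Hypothetical task records; only failure_classification_id comes from the bug.
tasks = [
    {"id": 101, "failure_classification_id": NEW_FAILURE},
    {"id": 102, "failure_classification_id": INTERMITTENT},
    {"id": 103, "failure_classification_id": NEW_FAILURE},
]

# A task reclassified from 6 to 8 drops out of the result on the next query.
print([t["id"] for t in new_failures(tasks)])
```

Because the filter only checks for id 6, no extra logic is needed on the push health side when the log parser later reclassifies a task.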
Other aspects of push health will also need updating: showing similar tasks (maybe we don't need to; if we do, add in confirm-failure tasks), and retriggering, which should use confirm-failure instead, since we don't need multiple retriggers.
Upon resolution of this bug, please file a bug for the next logical step/work-item. That might be a UI update, database cleanup, refactor code, add/remove functionality, etc.
Comment 2 (Reporter) • 11 days ago
Current usage will follow these specifics:
- load a first query with general push/task information, which is really fast and displays the basic info
- load the more expensive query for test cases afterwards
- load failure_classification_id=6 by default
- show 50 test cases by default
- offer a button to "show all failures". This could be something like "64 tasks failed but only 15 have new failures, click here to see all failures". If the button is clicked, show failure_classification_id=[1,6,8] and do not limit the number of test cases shown.
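The default and "show all failures" views in the list above could be sketched roughly as follows. This assumes the expensive query returns a list of dicts with `task_id` and `failure_classification_id` keys; those field names and both function names are hypothetical, while the ids and the limit of 50 come from the comment.

```python
DEFAULT_CLASSIFICATIONS = (6,)    # NEW failures only
ALL_CLASSIFICATIONS = (1, 6, 8)   # ids listed for "show all failures"
DEFAULT_LIMIT = 50                # test cases shown by default


def select_test_cases(failures, show_all=False):
    """Pick the failed test cases to display.

    `failures` is a hypothetical list of dicts, each with a
    `failure_classification_id` key.
    """
    wanted = ALL_CLASSIFICATIONS if show_all else DEFAULT_CLASSIFICATIONS
    selected = [f for f in failures if f["failure_classification_id"] in wanted]
    # The 50-item cap applies only to the default view.
    return selected if show_all else selected[:DEFAULT_LIMIT]


def summary_line(failures):
    """Build the prompt shown next to the "show all failures" button."""
    failed_tasks = {f["task_id"] for f in failures}
    new_tasks = {
        f["task_id"] for f in failures if f["failure_classification_id"] == 6
    }
    return (
        f"{len(failed_tasks)} tasks failed but only {len(new_tasks)} "
        "have new failures, click here to see all failures"
    )
```

Keeping the cap out of the `show_all` path matches the comment's requirement that clicking the button removes the limit on displayed test cases.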
This "show all failures" method will let users look for known failures that have become perma failures without adding a lot of logic. Potentially, a round 2 could look for a given test case that fails more than once and include it in the original list of "new failures".
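The "round 2" idea could be approximated by counting how often each test case fails within a push. The sketch below assumes each failure record has a `test` key holding the test case name; that key and the function name `repeated_test_cases` are hypothetical.

```python
from collections import Counter


def repeated_test_cases(failures, min_count=2):
    """Return names of test cases that fail at least `min_count` times."""
    counts = Counter(f["test"] for f in failures)
    return sorted(name for name, n in counts.items() if n >= min_count)


# Hypothetical failure records for one push.
failures = [
    {"test": "test_a.js", "failure_classification_id": 8},
    {"test": "test_b.js", "failure_classification_id": 8},
    {"test": "test_a.js", "failure_classification_id": 8},
]

# Candidates to add to the default "new failures" list.
print(repeated_test_cases(failures))
```

The default `min_count=2` corresponds to "fails >1 time"; a different threshold may be more appropriate once real data is available.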