Open Bug 1978477 Opened 3 months ago Updated 11 days ago

push health should use data as "NEW" failures only

Tracking

(Not tracked)

Status:

NEW

People

(Reporter: jmaher, Unassigned)

References

(Blocks 2 open bugs)

Details

Joel Maher ( :jmaher ) (UTC -8) (PTO back normal Nov 17)

Reporter

Description

•

3 months ago

Currently, push health uses a more comprehensive design that fetches raw failures and ignores known intermittents. This is expensive in terms of both CPU time and wall time. Since push health was originally created, we have added "NEW" failures. If a failure is NEW (failure_classification_id=6), then we know it isn't a known intermittent (work is done to remove regressions from the list, as well as to sanitize the error lines).

A potential upcoming step in the log parser would take additional data (retriggers, confirm-failure, future pushes on integration branches), compare results of the test group (or overall task), and determine whether the failures are repeated or intermittent. If intermittent, the task will have failure_classification_id=8, so a filter on failure_classification_id=6 will continue to return accurate data as classifications are updated.
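To illustrate the filter described above, here is a minimal sketch in plain Python. It is not Treeherder code: `TaskFailure`, `new_failures`, and `mark_intermittent` are hypothetical names, and only the two classification ids (6 and 8) come from this bug.

```python
from dataclasses import dataclass, replace

# Classification ids as described in this bug.
NEW_FAILURE = 6    # failure is NEW, so not a known intermittent
INTERMITTENT = 8   # later determined to be intermittent


@dataclass(frozen=True)
class TaskFailure:
    """Hypothetical record for one failing task; not a Treeherder model."""
    task_name: str
    test_name: str
    failure_classification_id: int


def new_failures(failures):
    """Keep only failures currently classified as NEW."""
    return [f for f in failures if f.failure_classification_id == NEW_FAILURE]


def mark_intermittent(failure):
    """Stand-in for the proposed log parser step: reclassify a failure
    once retriggers or confirm-failure show it is intermittent."""
    return replace(failure, failure_classification_id=INTERMITTENT)
```

Because the reclassification changes the id from 6 to 8, a failure drops out of `new_failures` on the next query without push health needing any intermittent-detection logic of its own.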

Other aspects of push health will also need updating. One is showing similar tasks: maybe we don't need to, but if we do, add in confirm-failure results. Another is retriggering: use confirm-failure instead; we don't need multiple retriggers.

Upon resolution of this bug, please file a bug for the next logical step/work-item. That might be a UI update, database cleanup, refactor code, add/remove functionality, etc.

Joel Maher ( :jmaher ) (UTC -8) (PTO back normal Nov 17)

Reporter

Comment 1

•

3 months ago

some reference info

Joel Maher ( :jmaher ) (UTC -8) (PTO back normal Nov 17)

Reporter

Updated

•

3 months ago

Blocks: 1978686

Joel Maher ( :jmaher ) (UTC -8) (PTO back normal Nov 17)

Reporter

Comment 2

•

11 days ago

Current usage will follow these specifics:

Load a first query with general push/task information; this is really fast and displays basic info.
A more expensive query to get test cases will load afterwards.
Load failure_classification_id=6 by default.
Show 50 test cases by default.
Offer a button to "show all failures". This could be something like: "64 tasks failed but only 15 have new failures, click here to see all failures".
If the button is clicked, show failure_classification_id=[1,6,8] and do not limit the number of test cases shown.
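The default view and the "show all failures" view above can be sketched as follows. This is an illustration under stated assumptions, not the push health implementation: `FailureLine`, `select_failures`, and `show_all_prompt` are hypothetical names, and the data is an in-memory list rather than a database query. The ids 6 and [1,6,8] and the limit of 50 come from the list above.

```python
from dataclasses import dataclass

DEFAULT_CLASSIFICATIONS = (6,)     # NEW failures only
ALL_CLASSIFICATIONS = (1, 6, 8)    # ids shown by "show all failures"
DEFAULT_LIMIT = 50                 # test cases shown by default


@dataclass(frozen=True)
class FailureLine:
    """Hypothetical record for one failing test case in one task."""
    task_id: str
    test_name: str
    failure_classification_id: int


def select_failures(failures, show_all=False):
    """Return the failures to display.

    Default: NEW failures only, capped at DEFAULT_LIMIT.
    show_all: every failure in ALL_CLASSIFICATIONS, with no cap.
    """
    wanted = ALL_CLASSIFICATIONS if show_all else DEFAULT_CLASSIFICATIONS
    matching = [f for f in failures if f.failure_classification_id in wanted]
    return matching if show_all else matching[:DEFAULT_LIMIT]


def show_all_prompt(failures):
    """Build the button text from counts of distinct failed tasks."""
    failed = {f.task_id for f in failures
              if f.failure_classification_id in ALL_CLASSIFICATIONS}
    new = {f.task_id for f in failures
           if f.failure_classification_id in DEFAULT_CLASSIFICATIONS}
    return (f"{len(failed)} tasks failed but only {len(new)} have new "
            "failures, click here to see all failures")
```

In the real page the first, cheap push/task query would render before this selection runs against the more expensive test case query; that ordering is not modeled here.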

This "show all failures" method will allow users to look for known failures that have become permanent failures without adding a lot of logic. Potentially, a round 2 could look for a given test case that fails more than once and include it in the original list of "new failures".
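One possible shape for that round 2, as a rough sketch only: count how often each test case that is not classified as NEW fails on the push, and surface the ones that fail more than once. `repeated_known_failures` is a hypothetical helper, and the input is assumed to be simple `(test_name, failure_classification_id)` pairs.

```python
from collections import Counter

NEW_FAILURE = 6  # classification id for NEW failures, per this bug


def repeated_known_failures(failures):
    """Return test names that are not NEW but fail more than once.

    `failures` is an iterable of (test_name, failure_classification_id)
    pairs; each pair represents one failure of that test in some task.
    """
    counts = Counter(
        test_name
        for test_name, classification_id in failures
        if classification_id != NEW_FAILURE
    )
    return sorted(name for name, count in counts.items() if count > 1)
```

The names returned here are candidates to add to the default "new failures" list, since a known failure that repeats across tasks may have become permanent.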


Categories

(Tree Management :: Treeherder: Frontend, task)
