Closed Bug 1315110 Opened 9 years ago Closed 9 years ago

Add push count and orange factor (or similar) to OrangeFactor Robot bug comments

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: gbrown, Assigned: gbrown)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

add push count and orangefactor to comments, 9 years ago, Geoff Brown [:gbrown], 1.59 KB, patch
add push count and orangefactor to comments, 9 years ago, Geoff Brown [:gbrown], 1.67 KB, patch, emorley: review+

Geoff Brown [:gbrown]

Assignee

Description

•

9 years ago

Today's daily/weekly OrangeFactor Robot bug comments look like:

NN automation job failures were associated with this bug (yesterday|in the last 7 days).
Repository breakdown: ...
Platform breakdown: ...
For more details, see: <link>

If I see "10 automation job failures were associated with this bug yesterday" one day and "30 automation job failures were associated with this bug yesterday" on another day, one interpretation is "this bug just got 3x worse/more frequent". Another possibility is that there were 3x as many pushes that day (consider weekends and regional holidays, obviously, but also tree closures, company meetings, etc.).

In addition, it is currently unclear whether NN failures in a day (or week) is frequent or not. A particular concern might be: if I push to try, how many failures should I expect in 10 retries?
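To make the two questions above concrete, here is a minimal Python sketch. It is not OrangeFactor code: the function names and the sample counts are hypothetical, and it assumes the failures-per-push figure can be read as an independent per-run failure probability, which is only roughly true when the failing job runs once on every push.

```python
def failures_per_push(failures: int, pushes: int) -> float:
    """Failure count normalized by push count (0.0 when there were no pushes)."""
    return failures / pushes if pushes else 0.0


def expected_failures(rate: float, retries: int) -> float:
    """Expected number of failing runs out of `retries` runs."""
    return rate * retries


def chance_of_any_failure(rate: float, retries: int) -> float:
    """Probability of at least one failure in `retries` independent runs."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be a probability between 0 and 1")
    return 1.0 - (1.0 - rate) ** retries


# Hypothetical days: the raw failure count triples, but so does the push count,
# so the normalized rate is unchanged.
quiet_day = failures_per_push(10, 50)
busy_day = failures_per_push(30, 150)
print(quiet_day, busy_day)

# Rough answer to "how many failures should I expect in 10 retries?"
print(expected_failures(busy_day, 10))
print(round(chance_of_any_failure(busy_day, 10), 3))
```

The normalized rate is what lets a 10-failure day and a 30-failure day be compared; the retry estimate is only as good as the independence assumption stated above.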

Geoff Brown [:gbrown]

Assignee

Comment 1

•

9 years ago

Attached patch: add push count and orangefactor to comments (obsolete)

This adds 1 new line to each comment. For example:

# Bug 1285173:

19 automation job failures were associated with this bug yesterday.
(19 failures in 119 pushes, or 0.16 failures per push.)

Repository breakdown:
* autoland: 14
* mozilla-aurora: 2
* try: 1
* mozilla-release: 1
* mozilla-central: 1

Platform breakdown:
* linux64: 7
* osx-10-10: 6
* windows7-32-vm: 5
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1285173&startday=2016-11-02&endday=2016-11-02&tree=all

Joel Maher ( :jmaher ) (UTC -8)

Comment 2

•

9 years ago

Another way to look at this is: what is the 7 day total for this bug, what is the 1 day total, and what ranking does this bug have?

If we switch to an engineering model where we say "thou shalt not have any intermittent test with >XX instances in a 7 day window", it would be nice to be able to see in the bugzilla comment: "This bug exceeds our acceptable weekly limit, now is the time to increase the priority of this bug".

Alternatively, if we have a secondary rule that says "any bug over the threshold for >14 days will be disabled automatically", then we could have OF query that for us and comment in the bug so we are all aware that we need to disable the test.

I am looking forward to the original data mentioned and possibly playing with additional rules/info.
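The two rules floated above could be sketched roughly as follows. This is a hypothetical illustration, not OrangeFactor code: the function names, the input shape (a list of daily failure counts, oldest first), and the wording of the second notice are assumptions, and the weekly limit is left as a parameter because the comment leaves XX unspecified.

```python
from typing import List, Optional


def rolling_7_day_totals(daily_failures: List[int]) -> List[int]:
    """7-day trailing total for each day (shorter window for the first 6 days)."""
    return [sum(daily_failures[max(0, i - 6): i + 1])
            for i in range(len(daily_failures))]


def days_over_limit(daily_failures: List[int], weekly_limit: int) -> int:
    """Number of consecutive most-recent days whose 7-day total exceeds the limit."""
    count = 0
    for total in reversed(rolling_7_day_totals(daily_failures)):
        if total <= weekly_limit:
            break
        count += 1
    return count


def weekly_limit_notice(daily_failures: List[int], weekly_limit: int,
                        max_days_over: int = 14) -> Optional[str]:
    """Return a notice for the bug comment, or None if the bug is within limits."""
    days_over = days_over_limit(daily_failures, weekly_limit)
    if days_over > max_days_over:
        # Secondary rule: over the threshold for more than 14 days.
        return ("This bug has exceeded the weekly limit for more than "
                "%d days; the test should be disabled." % max_days_over)
    if days_over > 0:
        return ("This bug exceeds our acceptable weekly limit, now is the "
                "time to increase the priority of this bug")
    return None
```

For instance, `weekly_limit_notice([5] * 10, weekly_limit=30)` returns the priority notice, because the trailing 7-day total of 35 has been above the limit for the last four days.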

Phil Ringnalda (:philor)

Comment 3

•

9 years ago

So if I have two intermittent-failure bugs assigned, one which fails in Linux32 opt e10s M(3) which runs every push, and one which fails in Linux32 opt e10s M(4), which runs every 8th push, I should work on the M(3) one because it's "0.125 failures per push" while the M(4) one is "0.042 failures per push" despite failing 25% of the time it runs?
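The distortion raised here can be sketched with a few lines of Python. The numbers are hypothetical and chosen only to show the effect, not taken from any real job; `run_every_n_pushes` is an assumed stand-in for whatever scheduling reduces how often a job runs.

```python
def per_push_rate(fail_rate_per_run: float, run_every_n_pushes: int) -> float:
    """Failures per push for a job that only runs on every Nth push."""
    return fail_rate_per_run / run_every_n_pushes


# Hypothetical job A: runs on every push and fails 10% of its runs.
job_a = per_push_rate(0.10, 1)

# Hypothetical job B: runs on every 5th push and fails 30% of its runs.
job_b = per_push_rate(0.30, 5)

# Job B fails three times as often per run, yet looks better per push.
print(job_a, job_b)
```

Dividing by pushes rather than by runs under-reports any job that is scheduled less often than once per push, which is why a per-push figure is a poor basis for ranking one bug against another.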

Joel Maher ( :jmaher ) (UTC -8)

Comment 4

•

9 years ago

That is an interesting point, :philor. I think that is something we need to figure out. Right now, the data OrangeFactor provides for the #1 orange or the top 50 oranges is calculated as failures/push, so SETA being involved only means our orange factor should be much higher. In the short term this will help provide additional data which can help make decisions; ideally we can collect more data to make what Orange Factor presents even more useful.

Geoff Brown [:gbrown]

Assignee

Comment 5

•

9 years ago

My primary motivation for adding push count and per-bug orange factor is to provide better comparison of failure rates over time within each bug: Is this test failing more frequently now than it was yesterday/last week? I want to reduce misunderstandings such as:
- there were half as many failures yesterday as there were the day before, so maybe I don't need to worry about this bug (forgetting that the trees were closed yesterday);
- there were twice as many failures yesterday as there were the day before, so maybe I should perform some regression analysis to determine which changeset caused the change in frequency (only to find that the failure rate has not changed significantly).

I am not trying to draw conclusions about the relative importance of one bug to another; I think rankings, like bug 1315275, or jmaher's ideas in comment 2, or even the existing simple failure counts, better address that issue.

SETA, and perhaps other load-reducing mechanisms, certainly complicate the use of push count. I'd much rather provide test run counts: "(19 failures in 119 runs of this test, or 0.16 failures per test run.)", but I don't see how to implement that. Push count seems like the best approximation that is readily available. Remember that failure and push counts in bug comments are totals from across all trees, mitigating SETA effects, a little.

Also, I am using parentheses around the comment addition in an effort to say, 'this is just FYI, the important thing is the simple failure count, above'.

I have two concerns:
- That this effort to provide more insight into the meaning of the failure counts will result in new misunderstandings -- particularly :philor's scenario in Comment 3;
- That the addition of failure counts -- and perhaps rank and other information, in other bugs -- may complicate and confuse the overall message.

I like the simplicity of the current messaging: N1 failures yesterday, N2 failures the day before, .... I don't want to end up with a daily paragraph of statistics that no one will read.

Overall, I think I want to go ahead with this idea. I wonder if we can find better wording or otherwise tweak the concept.

Joel Maher ( :jmaher ) (UTC -8)

Comment 6

•

9 years ago

I think we should be able to move ahead with this, but possibly tweak it as we see what it ends up like in practice.

How do we avoid misunderstandings and too much wording...

proposed in comment 0:
19 automation job failures were associated with this bug yesterday.
(19 failures in 119 pushes, or 0.16 failures per push.)

alternative 1:
19 failures in 119 pushes (0.16 failures/push) were associated with this bug yesterday.

alternative 2:
Yesterday:
19 failures
119 pushes
0.16 failures/push

to address the concern of confusion or making this irrelevant, could we do something like:

alternative 3:
Yesterday:
19 failures (increase from 17)
119 pushes
0.16 failures/push (increase from 0.14)
^ that might be redundant as you can see that in the previous comments.

alternative 4:
Yesterday 19 failures (0.16 failures/push)
^ here I removed total pushes as that isn't always useful.

To help provide relevancy, maybe we have priority following a pattern:
Priority 0: >100 failures in a 7 day window (if there are 7 days, otherwise project it)
Priority 1: 50<x<=100 failures in a 7 day window
Priority 2: 10<x<=50 failures in a 7 day window
Priority 3: <=10 failures in a 7 day window
^ NOTE: 10, 50, 100 are arbitrary. If we choose something like this, then we should discuss what to pick; I would think maybe 10, 30, 100 would be more appropriate.
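The priority pattern above can be sketched as a small function. This is an illustration only: `weekly_priority` and its parameters are hypothetical names, the thresholds default to the arbitrary 10/50/100 from the comment, and "project it" is assumed to mean linear scaling of a shorter window up to 7 days.

```python
from typing import Tuple


def weekly_priority(failures: int, days_observed: int = 7,
                    thresholds: Tuple[int, int, int] = (10, 50, 100)) -> int:
    """Map a failure count over up to 7 days onto priority 0 (worst) to 3."""
    if not 1 <= days_observed <= 7:
        raise ValueError("days_observed must be between 1 and 7")
    # Project a shorter window linearly onto a full 7 day window.
    projected = failures * 7 / days_observed
    low, mid, high = thresholds
    if projected > high:
        return 0
    if projected > mid:
        return 1
    if projected > low:
        return 2
    return 3


print(weekly_priority(19, days_observed=1))          # one day of data, projected
print(weekly_priority(30, thresholds=(10, 30, 100)))  # the alternative thresholds
```

Keeping the thresholds as a parameter makes it cheap to compare candidate cutoffs such as 10/50/100 and 10/30/100 against real bug data before settling on one.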

Geoff Brown [:gbrown]

Assignee

Comment 7

•

9 years ago

Thanks much for the alternatives. My favorite is alternative 1:

19 failures in 119 pushes (0.16 failures/push) were associated with this bug yesterday.

...brief, with no loss of information.

Geoff Brown [:gbrown]

Assignee

Comment 8

•

9 years ago

Attached patch: add push count and orangefactor to comments

For example:

# Bug 1206887:

7 failures in 606 pushes (0.012 failures/push) were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 4
* mozilla-inbound: 1
* mozilla-central: 1
* fx-team: 1

Platform breakdown:
* android-4-3-armv7-api15: 7

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1206887&startday=2016-10-31&endday=2016-11-06&tree=all
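For readers without access to the patch, the summary line could be produced by something like the sketch below. This is not the patch itself: `summary_line` is a hypothetical name, and rounding to three decimal places is an assumption that happens to reproduce both examples in this bug (0.16 and 0.012).

```python
def summary_line(failures: int, pushes: int, period: str) -> str:
    """Build the first line of the bug comment in the alternative 1 wording."""
    # Guard against a period with no pushes (for example, a closed tree).
    rate = round(failures / pushes, 3) if pushes else 0.0
    return ("%d failures in %d pushes (%s failures/push) were associated "
            "with this bug %s." % (failures, pushes, rate, period))


print(summary_line(7, 606, "in the last 7 days"))
print(summary_line(19, 119, "yesterday"))
```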

Attachment #8807348 - Attachment is obsolete: true

Attachment #8808417 - Flags: review?(emorley)

Ed Morley [:emorley]

Comment 9

•

9 years ago

Comment on attachment 8808417: add push count and orangefactor to comments

Not tested, but looks fine to me :-) Thank you for doing this!

Attachment #8808417 - Flags: review?(emorley) → review+

Geoff Brown [:gbrown]

Assignee

Comment 10

•

9 years ago

https://hg.mozilla.org/automation/orangefactor/rev/92e0c2c84355d1e468d96c3986e9a222f35db7f1

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

Ed Morley [:emorley]

Updated

•

9 years ago

Blocks: 1317303

BMO Automation

Updated

•

5 years ago

Product: Tree Management → Tree Management Graveyard

Categories

(Tree Management Graveyard :: OrangeFactor, defect)
