Closed Bug 1136774 Opened 9 years ago Closed 9 years ago

Reduce Handoffs: Revision of plan for minimized execution/triage process

Categories

(Firefox OS Graveyard :: Gaia::UI Tests, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlorenzo, Assigned: jlorenzo)

References

()

Details

Attachments

(1 file)

QA Whiteboard: [fxosqa-auto-s11]
Attached file Wiki link
I started to describe the workflows to stop relying on the report for the non-flaky suites and stop taking a look at every failure in the flaky suite. I think this will reduce the workload; nonetheless, the workflow might not be optimized enough. What do you think?
Attachment #8570528 - Flags: feedback?(gmealer)
Comment on attachment 8570528 [details]
Wiki link

I think these are good steps in the right direction. I have some specific comments:

Regarding Alert Automation, that's really Martijn's to be on point for, so I'd pull him into this and work with him on your thoughts there. 

Some initial thoughts though:

* I don't think we need to retire Jenkins chatter on fxos-automation. We don't really own it, and if that's what that team wants to do I have no issue with it. Hopefully with this strategy the chatter goes way down anyway.

* Regarding the flaky suite, we need to check to see if it's a product bug every single time it fails, unfortunately. Just because it failed due to automation error yesterday doesn't mean it was an automation error today.

* I would fold flaky trend analysis into the triage, at least initially. I think we should all be part of that process initially, and only start rotating it once it's on track and we know what we're doing.

* We need a transition plan. We don't have split suites yet. So how do we minimize process right now?

* It would be useful to articulate what we are doing now, at a slightly more simplified level than the documentation we inherited.

* I'd like to see more about minimizing the reporting, though possibly I didn't dig it out of the workflow charts yet. I'd expect a report template somewhere along the line though.

I'm keeping the ? because I haven't worked through the two flowcharts yet. I'll have more to say when I have.

One general concern, though, is that this seems like a very crafted process based on the level of detail of the charts. 

Are you sure we've learned enough lessons yet to predict it at this level of detail, or is this going to be a straw man that we learn from? How flexible will it be to change?

Either way, we're going to need to sum this up as a series of bullet points, too, I think, to wrap heads around it, possibly letting the flow chart specialize in the decision-tree parts of it. So we should be looking in that direction.
Comment on attachment 8570528 [details]
Wiki link

(In reply to Geo Mealer [:geo] from comment #2)
> Regarding Alert Automation, that's really Martijn's to be on point for, so
> I'd pull him into this and work with him on your thoughts there. 
Adding Martijn for feedback on this.

> * I don't think we need to retire Jenkins chatter on fxos-automation.
Corollary removed

> * Regarding the flaky suite, we need to check to see if it's a product bug
> every single time it fails, unfortunately. Just because it failed due to
> automation error yesterday doesn't mean it was an automation error today.
I don't understand the argument here. If it failed due to automation yesterday, and it fails today due to a product bug, we would catch the product bug by checking it today. The problem could be about the regression window. In that case, we would need to check yesterday's build.
 
> * I would fold flaky trend analysis into the triage, at least initially. 
Added.

> * We need a transition plan. We don't have split suites yet. So how do we
> minimize process right now?
Added.

> * It would be useful to articulate what we are doing now, at a slightly more
> simplified level than the documentation we inherited.
Would these explanations fall under "Reduce Handoffs: Documentation of existing execution/triage process into wiki"?
 
> * I'd like to see more about minimizing the reporting, though possibly I
> didn't dig it out of the workflow charts yet. I'd expect a report template
> somewhere along the line though.
Added the template we started to work on by email.

> Are you sure we've learned enough lessons yet to predict it at this level of
> detail, or is this going to be a straw man that we learn from? How flexible
> will it be to change?
Because B2G and the automation infrastructure are a couple of years old, this workflow seems like a good start. I currently don't see how adding or removing a step in the process would call the entire workflow into question. To be honest though, these 2 artefacts are indeed aged, and I certainly lack some experience here :)

> Either way, we're going to need to sum this up as a series of bullet points,
> too, I think, to wrap heads around it, possibly letting the flow chart
> specialize in the decision-tree parts of it.
Okay, I summarized the charts into 2-3 bullet points.
Attachment #8570528 - Flags: feedback?(martijn.martijn)
I was looking at the triage workflow drawing.
I'm wondering about the "Do you still have time?"/"Can you do it now?" steps. It seems to me that those steps should not be there and we should just always put it in the https://etherpad.mozilla.org/b2g-automation-daily-standup bucket, from which the work can continue.
Actually, that's what I'm missing too from the workflow diagram: the step where you file the bug.

Regarding the workflow for the flaky suite, I'm missing the case where we compare the failure to the existing open bugs that we've filed, to see whether it's a new failure.
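
To make that comparison step concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the `Failure` and `KnownBug` types, the `signature` substring matching, and the placeholder bug id are hypothetical and do not exist in our harness or in the wiki workflow.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Failure:
    """Hypothetical record of one failing test in the flaky suite."""
    test_name: str
    message: str  # e.g. the first line of the failure output


@dataclass(frozen=True)
class KnownBug:
    """Hypothetical record of an open bug already filed for a failure."""
    bug_id: int
    test_name: str
    signature: str  # substring we expect to see in the failure message


def match_known_bug(failure: Failure,
                    open_bugs: Iterable[KnownBug]) -> Optional[KnownBug]:
    """Return the first open bug that matches this failure, or None.

    None means the failure looks new and still needs triage (and possibly
    a new bug). A match only says the failure resembles a known one.
    """
    for bug in open_bugs:
        same_test = bug.test_name == failure.test_name
        if same_test and bug.signature in failure.message:
            return bug
    return None


if __name__ == "__main__":
    # Placeholder data, not real bugs or tests.
    open_bugs = [KnownBug(1, "test_example", "Timeout")]
    failure = Failure("test_example", "NoSuchElement: element not found")
    match = match_known_bug(failure, open_bugs)
    print("known failure" if match else "new failure, needs triage")
```

Matching on test name plus a message substring is deliberately crude. A failure that matches a known bug can still hide a different product bug, so a match here would only lower the priority of the check, not skip it.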
QA Whiteboard: [fxosqa-auto-s11] → [fxosqa-auto-s11][fxosqa-auto-s12]
Attachment #8570528 - Flags: feedback?(gmealer)
Comment on attachment 8570528 [details]
Wiki link

Got some feedback since. Some major changes will be made. Clearing the flags in the meantime.
Attachment #8570528 - Flags: feedback?(martijn.martijn)
Attachment #8570528 - Flags: feedback?(gmealer)
First iteration done: https://wiki.mozilla.org/B2G/QA/Automation/UI/Minimized_Acceptance_Execution
Status: NEW → RESOLVED
Closed: 9 years ago
QA Whiteboard: [fxosqa-auto-s11][fxosqa-auto-s12] → [fxosqa-auto-from-s11][fxosqa-auto-s12]
Resolution: --- → FIXED