1121655 - Define "tier 2" automated-test frameworks

Reporter

Description

10 years ago

"Tier 2" automated-test frameworks are loosely defined as "jobs that are semi-sheriffable", meaning they will show up in Treeherder, but sheriffs will not deeply investigate failures and will instead refer them to developers. We need to define and document this whole concept and the work to be done both on our tools and infrastructure and in the frameworks themselves.

This work includes

* Definition of Tier 2 requirements.
* Definition of Treeherder features required by sheriffs or test framework owners to sheriff the Tier 2 test framework.
* Definition of Treeherder API features needed by Tier 2 frameworks.
* Definitions of Tier 2 framework API features needed by a framework in order to report to Treeherder. This will include Python code examples.

Mark Côté [:mcote]

Reporter

Updated

10 years ago

OS: Mac OS X → All

Hardware: x86 → All

Bob Clary [:bc] (inactive)

Assignee

Comment 1

10 years ago

Attached file rationale.org (obsolete) — Details

The 'rationale' part of the details in https://wiki.mozilla.org/Auto-tools/Meetings/2015-02-09#Define_and_document_Tier_2_jobs_.5Bbc.5D

Attachment #8561519 - Flags: feedback?(mcote)

Mark Côté [:mcote]

Reporter

Comment 2

10 years ago

Comment on attachment 8561519 [details]
rationale.org

Excellent thinking to start with the rationale. :)

This is a good overview, but there's one significant problem: we're defining tier 2 as something that is "semi-" or "partially" sheriffable. Your definition above is closer to the status quo; the sheriffs either monitor the results and take action, or they completely ignore the results.

The idea for the future, as I understand it anyway, is that sheriffs would still be monitoring tier 2 jobs, and they *may* still do back outs and/or flag intermittents, but it is recognized that there may be failures that they are uncomfortable diagnosing, due to instability or a general lack of knowledge/experience in the test system. In this case, developers need to take action, and the test itself is marked as "broken" until the developers show that it is fixed.

So there are three categories:

* Tier 1 jobs, which are fully sheriffed,
* Tier 2 jobs, which are sheriffed to a certain degree but may be referred to developers, and
* Unsheriffed jobs, which would be hidden from the default view, but may be visible to developers (when that feature is added to Treeherder).

Also, to be effective, sheriffs must be able to determine when a failure is due to an intermittent issue, in which case it is not treated as a reason to back out the patch.

Related, a reason that a framework may not be fully sheriffable, but may be partially sheriffable, is if the sheriffs do not have the necessary knowledge or experience with the framework to determine if a failure is the result of the patch being tested or is an intermittent or infrastructure failure.

I would also not bother mentioning third-party browser tests. I don't think there are any plans to ever display these tests in Treeherder, which is the focus here; things like mozbench are going to remain separate from Treeherder and sheriffing for the foreseeable future.

In light of the above, I think you should distinguish between the points in the second list that may still make a framework eligible for tier 2, versus points that make a framework completely unsheriffable (but still potentially visible to the developers).

Hope that makes sense and I haven't just completely missed something while on PTO. :)
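
To make the three categories above concrete, here is a minimal Python sketch. It is only an illustration of the distinction drawn in this comment, not Treeherder code: the names `Category` and `sheriff_response`, the `can_diagnose` flag, and the returned strings are all hypothetical.

```python
from enum import Enum


class Category(Enum):
    """The three categories from the comment; names are illustrative only."""
    TIER_1 = "fully sheriffed"
    TIER_2 = "partially sheriffed"
    UNSHERIFFED = "unsheriffed"


def sheriff_response(category: Category, can_diagnose: bool) -> str:
    """Return the (hypothetical) response to a failing job in `category`.

    `can_diagnose` models whether the sheriffs feel able to tell a real
    regression from an intermittent or infrastructure failure.
    """
    if category is Category.UNSHERIFFED:
        # Hidden from the default view; sheriffs take no action.
        return "ignore"
    if category is Category.TIER_1 or can_diagnose:
        # Sheriffs may back out the patch or flag an intermittent.
        return "sheriff"
    # Tier 2 failure the sheriffs are uncomfortable diagnosing:
    # mark the test broken and refer it to the developers.
    return "mark broken and refer to developers"
```

Under this reading, the only difference between Tier 1 and Tier 2 is what happens when the sheriffs cannot diagnose a failure themselves.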

Attachment #8561519 - Flags: feedback?(mcote) → feedback-

Bob Clary [:bc] (inactive)

Assignee

Comment 3

10 years ago

(In reply to Mark Côté [:mcote] from comment #2)
> Comment on attachment 8561519 [details]
> rationale.org
>
> Excellent thinking to start with the rationale. :)
>
> This is a good overview, but there's one significant problem: we're defining
> tier 2 as something that is "semi-" or "partially" sheriffable. Your
> definition above is closer to the status quo; the sheriffs either monitor
> the results and take action, or they completely ignore the results.
>
> The idea for the future, as I understand it anyway, is that sheriffs would
> still be monitoring tier 2 jobs, and they *may* still do back outs and/or
> flag intermittents, but it is recognized that there may be failures that
> they are uncomfortable diagnosing, due to instability or a general lack of
> knowledge/experience in the test system. In this case, developers need to
> take action, and the test itself is marked as "broken" until the developers
> show that it is fixed.

Ok. I was definitely approaching it from the idea that Tier 2 jobs would report to Treeherder but not be sheriffable.

mdoglio, what is your understanding about Tier 2 being partially sheriffable?

> So there are three categories:
>
> * Tier 1 jobs, which are fully sheriffed,
> * Tier 2 jobs, which are sheriffed to a certain degree but may be referred
> to developers, and
> * Unsheriffed jobs, which would be hidden from the default view, but may be
> visible to developers (when that feature is added to Treeherder).
>
> Also, to be effective, sheriffs must be able to determine when a
> failure is due to an intermittent issue, in which case it is not treated as
> a reason to back out the patch.
>
> Related, a reason that a framework may not be fully sheriffable, but may be
> partially sheriffable, is if the sheriffs do not have the necessary
> knowledge or experience with the framework to determine if a failure is the
> result of the patch being tested or is an intermittent or infrastructure
> failure.

These two seem appropriate to the necessary features/enhancements to Treeherder?

> I would also not bother mentioning third-party browser tests. I don't
> think there are any plans to ever display these tests in Treeherder, which
> is the focus here; things like mozbench are going to remain separate from
> Treeherder and sheriffing for the foreseeable future.

Ok.

> In light of the above, I think you should distinguish between the points in
> the second list that may still make a framework eligible for tier 2, versus
> points that make a framework completely unsheriffable (but still potentially
> visible to the developers).

From my point of view, the thing that would make a framework Tier 2/partially sheriffable would be that it met all of the requirements for Tier 1 except for running on all of the trees that merge into mozilla-central. If a bad patch lands directly on one of the repos that is being tested, the sheriff could directly back it out. If the bad patch lands on a repo that isn't tested, then the sheriff would mark the framework as failing and the developers would be responsible for identifying the offending patch.

It seems to me that there is no real difference between Tier 1 and Tier 2 with regard to intermittent failures so long as Bug 1080731 - Add mechanism to flag jobs as "ignore failures" until X and Bug 1131071 - Allow to select a visibility profile in the ui are implemented in Treeherder.

mdoglio: would bug 1080731 'ignore failures until X' allow the marking of a specific test, e.g. job_symbol, or framework, e.g. group_name, as ignorable in the Tier 1 profile?

Would bug 1131071 'select a visibility profile' handle the case of making the ignorable failures visible when desired?

I envision the following process:

Treeherder would maintain a "tier" attribute which can be modified by sheriffs. It can have values:

Tier 3 - unsheriffable job which reports to Treeherder.
Tier 2 - partially sheriffable job which doesn't run on all repos merged to mozilla-central.
Tier 1 - fully sheriffable job running on all repos merged to mozilla-central.

A new test framework begins submitting results to Treeherder. It is unknown and therefore automatically classified as a "Tier 3" unsheriffable job. It may or may not immediately meet the Sheriffing/Job Visibility requirements but it is invisible to the default Sheriff visibility profile.

The framework developer continues to add any missing sheriffing/job visibility requirements, while determining which tests are reliable and hiding the broken or intermittent tests.

Once the test framework is green modulo the ignored tests, the framework developer could nominate the framework for Tier 1 or 2 depending on whether it ran on all trees merged into mozilla-central.

Tier 1 and 2 would have the same visibility profile but Tier 2 would indicate which set of tests only run on a limited set of repos. If bustage appeared due to a merge from an untested repo, Sheriffs would then be able to file a bug and mark the framework as failing and invisible to sheriffs until the bug is fixed.

Is this definition of Tier 2 as Tier 1 without the full repo coverage sufficient?
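
The nomination step in the process proposed in this comment could be sketched roughly as follows. This is only an illustration of the proposal as worded here (which later comments describe as a misreading of Tier 2), not an actual Treeherder API: the function `nominate_tier`, its parameters, and the repo names in the usage note are hypothetical.

```python
def nominate_tier(tested_repos, repos_merging_to_central, green_modulo_ignored):
    """Return the tier a framework could be nominated for under the proposal.

    tested_repos:             repos the framework runs on (hypothetical input)
    repos_merging_to_central: all repos that merge into mozilla-central
    green_modulo_ignored:     True once the framework is green, ignoring the
                              tests that have been hidden as broken/intermittent
    """
    if not green_modulo_ignored:
        # New or still-unreliable frameworks stay at the default Tier 3.
        return 3
    if set(repos_merging_to_central) <= set(tested_repos):
        # Runs on every repo merged into mozilla-central: fully sheriffable.
        return 1
    # Green, but with only partial repo coverage.
    return 2
```

For example, with placeholder repo names, `nominate_tier({"repo-a"}, {"repo-a", "repo-b"}, True)` gives Tier 2, because `repo-b` merges into mozilla-central but is not covered by the framework.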

Flags: needinfo?(mdoglio)

Flags: needinfo?(mcote)

Bob Clary [:bc] (inactive)

Assignee

Comment 4

10 years ago

Ryan, sorry for leaving the sheriffs out of the discussion.

Flags: needinfo?(ryanvm)

Mauro Doglio [:mdoglio]

Comment 5

10 years ago

> mdoglio, what is your understanding about Tier 2 being partially sheriffable?

I guess :mcote is referring to bug 1080731 when he says that a job can be partially sheriffable. My understanding is that a Tier 2 job should not be visible to the sheriffs; I think bug 1080731 should be more about making a job invisible until X. Asking the sheriffs' opinion is probably the best thing to do.

Bob Clary [:bc] (inactive)

Assignee

Comment 6

10 years ago

I think we have a major disconnect on what a tier 2 job is and how it is to be used. I'll send out an email asking for a good time to meet up where we can talk it out.

Mauro Doglio [:mdoglio]

Comment 7

10 years ago

> would bug 1080731 'ignore failures until X' allow the marking of a specific
> test, e.g. job_symbol, or framework, e.g. group_name, as ignorable in the
> Tier 1 profile?

As I said above, I think that bug should be "make a job invisible to sheriffs until X".

> would bug 1131071 'select a visibility profile' handle the case of making
> the ignorable failures visible when desired?

A visibility profile will be composed of N visibility rules. Each rule will have an "apply until X" clause as per bug 1080731.

> I envision the following process:
>
> Treeherder would maintain a "tier" attribute which can be modified by
> sheriffs. It can have values:
>
> Tier 3 - unsheriffable job which reports to Treeherder.
> Tier 2 - partially sheriffable job which doesn't run on all repos merged to
> mozilla-central.
> Tier 1 - fully sheriffable job running on all repos merged to
> mozilla-central.
>
> A new test framework begins submitting results to Treeherder. It is unknown
> and therefore automatically classified as a "Tier 3" unsheriffable job. It
> may or may not immediately meet the Sheriffing/Job Visibility requirements
> but it is invisible to the default Sheriff visibility profile.
>
> The framework developer continues to add any missing sheriffing/job
> visibility requirements, while determining which tests are reliable and
> hiding the broken or intermittent tests.
>
> Once the test framework is green modulo the ignored tests, the framework
> developer could nominate the framework for Tier 1 or 2 depending on if it
> ran on all trees merged into mozilla-central.
>
> Tier 1 and 2 would have the same visibility profile but Tier 2 would
> indicate which set of tests only run on a limited set of repos. If bustage
> appeared due to a merge from an untested repo, Sheriffs would then be able
> to file a bug and mark the framework as failing and invisible to sheriffs
> until the bug is fixed.
>
> Is this definition of Tier 2 as Tier 1 without the full repo coverage
> sufficient?

It seems to be coherent, but I'm still not sure whether the Tier 2 jobs will be part of the sheriff activity or not. In the scenario you described, it looks like Tier 1 jobs are important for release managers, Tier 1+2 for sheriffs and devs, and Tier 1+2+3 for some devs.
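
The idea of a visibility profile built from rules, each with an "apply until X" clause, can be sketched as below. This is an assumption-heavy illustration rather than the design from bug 1080731 or bug 1131071: "X" is modelled here as a calendar date, rules match on a job symbol only, and the names `VisibilityRule`, `VisibilityProfile`, `hide_until`, `hides`, and `is_visible` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class VisibilityRule:
    job_symbol: str             # hypothetical matching key for the rule
    hide_until: Optional[date]  # the "until X" clause; None hides indefinitely

    def hides(self, symbol: str, today: date) -> bool:
        """True if this rule currently hides jobs with the given symbol."""
        if symbol != self.job_symbol:
            return False
        return self.hide_until is None or today < self.hide_until


@dataclass
class VisibilityProfile:
    name: str
    rules: List[VisibilityRule]

    def is_visible(self, symbol: str, today: date) -> bool:
        """A job is visible unless at least one active rule hides it."""
        return not any(rule.hides(symbol, today) for rule in self.rules)
```

A sheriff profile holding `VisibilityRule("X1", date(2015, 3, 1))` would hide jobs with symbol `X1` until 1 March 2015 and show them again from that date on. A developer profile with no rules would show everything.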

Flags: needinfo?(mdoglio)

Ryan VanderMeulen [:RyanVM]

Comment 8

10 years ago

I'm going to defer to our meeting tomorrow.

Flags: needinfo?(ryanvm)

Mark Côté [:mcote]

Reporter

Comment 9

10 years ago

Hopefully the meeting cleared everything up; if there are still open questions, needinfo me again.

Flags: needinfo?(mcote)

Bob Clary [:bc] (inactive)

Assignee

Comment 10

10 years ago

Comment on attachment 8561519 [details]
rationale.org

Obsoleting the rationale as it was completely off target.

Attachment #8561519 - Attachment is obsolete: true

Bob Clary [:bc] (inactive)

Assignee

Comment 11

10 years ago

Attached file amended notes from 2015-02-11 meeting — Details

Bob Clary [:bc] (inactive)

Assignee

Updated

10 years ago

Status: NEW → ASSIGNED

Bob Clary [:bc] (inactive)

Assignee

Updated

10 years ago

Depends on: 1137519

Bob Clary [:bc] (inactive)

Assignee

Comment 12

8 years ago

https://wiki.mozilla.org/Auto-tools/Reporting_Test_Results_to_Treeherder_for_non-buildbot_Test_Frameworks

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

rationale.org 10 years ago Bob Clary [:bc] (inactive) 3.25 KB, text/plain	mcote : feedback-	Details
amended notes from 2015-02-11 meeting 10 years ago Bob Clary [:bc] (inactive) 3.15 KB, text/plain		Details

Categories

(Testing :: General, defect)

Tracking

(Not tracked)

People

(Reporter: mcote, Assigned: bc)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)
