Open Bug 1916716 Opened 2 months ago Updated 17 days ago

browser/base|components :: enable skipped tests that are passing, standardize remaining skip-if conditions

Categories

(Testing :: General, task)

task

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: jmaher, Assigned: jmaher)

References

(Blocks 1 open bug)

Details

Attachments

(1 obsolete file)

in experimenting with methods for running known failures and re-enabling them if they pass, I ended up with a lot of test manifest cleanup. this is about 46% of the overall browser-chrome tests, but many are enabled. I think a few high frequency failures ended up in the skip-if list.

here is the process (not really supported with tools yet) that I am trying to follow:

  • run all tests in failure mode (fail on passing, green on failure/timeout/crash)
  • analyze results and remove annotations for tests that are passing
  • take those annotations and run them with the regular set of browser-chrome tests
  • skip very frequent failures

in reality, this didn't work out as expected, instead I ended up doing something more like this:

  • ignore all skip-if statements (some hacks here)
  • run all browser-chrome tests with --rebuild 5
  • use ./mach manifest skip-fails to add annotations
  • run all browser-chrome again and skip-fails if needed, in general things are a lot greener
  • hand edit the resulting diff to ensure things look right

overall there are some incoming changes that cause conflict, so this isn't a scenario that is once and done. I would like to do this mass change.

one thing that ./mach manifest skip-fails does is add unique lines for each test type, not something like os == 'linux', that will skip ALL tests on linux, instead we narrow down to the build type, achitecture, os version, and test variant. Having this in place will allow ./mach manifest skip-fails in the future to only make changes for new failures it sees, since all failures will be documented in a common format.

This isn't a perfect situation, part of the hand editing was to switch many tests to use run-if as we do not hack that, only the skip-if conditions. I need to leave the verify conditions alone as well, those are there for a different purpose and I need to re-run all of those and determine if we should remove those conditions.

Assignee: nobody → jmaher
Status: NEW → ASSIGNED

I am going to split this work into a few parts- as my original patch is very big touching many components.

  1. transition from skip-if -> run-if where possible
  2. adjust 5-10 manifests at a time- this will make reviews easier and regressions easier to pinpoint.

I will also hand edit what the tools generate- mostly as a way to help transition. some rules:

  1. keep things as a single line where possible
  2. unless a test is failing 4/5 or 5/5 times, do not add any new skip-ifs.
  3. if we have skip-if = ["os == 'mac'"] and now we have 6 long form lines, manually adjust this to be os == 'mac' && os_version == '10.15' and os == 'mac' && os_version == '11.20'; this will be a special case when all instances are skipped.
  4. for existing cases that remain do the best I can to add os == '...' && os_version == '...'. This will prevent us from blindly skipping tests when we migrate to new versions.

it was suggested that I consider changing from run-if -> always-skip-if, mostly to avoid confusion when humans edit. run-if and skip-if can be confusing, and I agree. It is also mentioned that using terminology like !msix is confusing, and where possible it would be nice to avoid using the ! version. To that note, I have worked to add opt as a buildtype which will allow us to get away from !debug (bug 1916928).

I will prefer to move forward now, knowing that some manifests will get revisited while we figure out run-if -> only-run-because-it-is-not-intended-to-be-supported-if = ...

there is also a hiccup where we have:

skip-if = [
  "os == 'mac' && opt && socketprocess_networking",
  "os == 'mac' && opt",
]

this causes confusion because the 2nd condition supersedes the first condition. the 2nd condition is assuming the default (i.e. no variant). I would like to propose something like os == 'mac' && opt && default which would mean only the default case (i.e. ALL known variants must be the default (usually false) value). These are defined in variants.yml if you search for mozinfo those are the keys. We could assume all equal false (except e10s/fission...we can fix that if needed)

:ahal, do you have thoughts on how to handle comment 3?

Flags: needinfo?(ahal)
Attachment #9422781 - Attachment is obsolete: true
Depends on: 1917021
Depends on: 1917073
Depends on: 1917158
Depends on: 1917219
Depends on: 1917222
Depends on: 1917223

Yeah that makes sense to me. Can we call it no_variant instead? default isn't very descriptive.

Flags: needinfo?(ahal)
Depends on: 1919776
Blocks: 1920535
Depends on: 1925104
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: