Closed Bug 1653986 (opened 4 years ago, closed 4 years ago)

Display information about disabled and failing tests on those tests' pages

Categories

(Webtools :: Searchfox, enhancement)

Tracking

RESOLVED FIXED
firefox81 --- fixed

People

(Reporter: asuth, Assigned: asuth)

References

(Blocks 2 open bugs)

Attachments

(3 files)

In January on dev-platform and firefox-dev I proposed surfacing when tests are disabled or failing on searchfox on the test file pages and in the directory listings. In the discussion thread, James Graham characterized the goal well to "make the data ambiently available to people looking at the tests/code rather than requiring specific action to look at a dashboard or read a recurring email".

Currently, the "Linting opt Test manifest skip/fail information source-test-file-metadata-test-info-all all" task (example) produces a test-info-all-tests.json file that looks roughly like this (after some normalization on my part so the file could live in the searchfox tests repo):

{
  "description": "This imitates the taskcluster `Linting opt Test manifest skip/fail information source-test-file-metadata-test-info-disabled-by-os disabled-by-os` job",
  "summary": {
    "components": 1,
    "failed tests": 3,
    "manifests": 3,
    "skipped tests": 10,
    "tests": 100
  },
  "tests": {
    "Product::Component": [
      {
        "failed runs": 0,
        "skip-if": "toolkit == 'android'",
        "skipped runs": 1569,
        "test": "test_custom_element_base.xul",
        "total run time, seconds": 25916.85,
        "total runs": 5657
      },
      {
        "failed runs": 5,
        "skipped runs": 0,
        "test": "test_DOMWindowCreated_chromeonly.html",
        "total run time, seconds": 1065.66,
        "total runs": 2146
      },
      {
        "failed runs": 0,
        "skip-if": "(os == \"win\" && processor == \"aarch64\") || (os == \"mac\") || (os == \"linux\" && !debug)",
        "skipped runs": 890,
        "test": "test_talosconfig_browser_config.json",
        "total run time, seconds": 3877.57,
        "total runs": 2646
      }
    ]
  }
}

output-file already knows how to import the bugzilla-components.json file, which contains a straightforward per-file mapping. For efficiency, I was thinking we might generalize this so that the indexing process can consolidate multiple sources of per-file information into a single lookup file. For pragmatism, I'd start with a JSON file like the bugzilla data, but we could move to an on-disk representation closer to the crossref file, since it could be useful for the web server to have access to this information to decorate search results.
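A minimal sketch of that consolidation step, assuming each source is already a {path: info} JSON map (the function and namespace keys here are hypothetical, not searchfox's actual API):

```python
def build_per_file_lookup(sources):
    """Merge several per-file maps into one lookup keyed by path.

    `sources` maps a namespace key (e.g. "bugzillaComponent",
    "testInfo") to a dict of {file path: info}; all names here are
    hypothetical.
    """
    lookup = {}
    for key, per_file in sources.items():
        for path, info in per_file.items():
            lookup.setdefault(path, {})[key] = info
    return lookup

bugzilla = {"dom/base/test/test_foo.html": "Core :: DOM"}
test_info = {"dom/base/test/test_foo.html": {"skipped runs": 10}}
combined = build_per_file_lookup(
    {"bugzillaComponent": bugzilla, "testInfo": test_info})
```

A single merged file like this keeps the per-file lookup to one read during indexing, and the same structure could later be serialized in a crossref-like on-disk format.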

Note that for tests there's also more that we could do as the data can be made available in taskcluster jobs that pre-compute the information. For example, James Graham linked to a very cool WPT dashboard at https://jgraham.github.io/wptdash/?tab=Gecko+Data&bugComponent=core%3A%3Adom%3A+core+%26+html. Searchfox need not replicate all existing dashboard functionality, but it would be great to 1) provide direct access to bugzilla links for immediacy, and 2) link out to more powerful dashboards.

:kats, thoughts on the per-file lookup mechanism (and general functionality)?

In general this seems like a reasonable thing to do. It might be a little tricky UX-wise to display the set of conditions under which a test is disabled, because sometimes parts of tests are disabled and sometimes the whole test is disabled, and the conditions can be quite numerous. Perhaps, given that there's already a WPT dashboard for it, we can just link to that? I don't have strong feelings about this either way though.

If we were going to make it pretty, I'd probably crib what I did back in the day for ArbPL. But I think it's likely sufficient to just show the contents of the skip-if verbatim, as arguably the presence of any skip directive is something that should be dealt with. And then link out to the appropriate dashboard(s), which would be some combination of https://treeherder.mozilla.org/intermittent-failures.html, jgraham's wptdash, wpt.fyi, etc.

In the event dashboards can commit to a reasonably fast load time, we could even use an iframe with fixed dimensions embedded directly in the info box.

Assignee: nobody → bugmail
Status: NEW → ASSIGNED

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #2)

But I think it's likely sufficient to just show the contents of the skip-if verbatim, as arguably the presence of any skip directive is something that should be dealt with.

For mochitests that's ok, but I was more wondering about the WPT case which can be arbitrarily complex. e.g. see https://searchfox.org/mozilla-central/rev/3b6958c26049c1e27b2790a43154caaba9f6dd4a/testing/web-platform/meta/css/css-fonts/animations/font-variation-settings-interpolation.html.ini which has a bunch of subtests with conditions.

(For future reference, cd testing/web-platform/meta; find . -type f | xargs ls -lS | head produces some of these massive condition files).

Ah, sorry, my brain bounced off the "parts of tests are disabled" aspect of your comment 1. Yeah, unfortunately it seems test-info-all-tests.json doesn't know anything about the WPT expectations for the test, it just operates at the "does it get run" granularity:

      {
        "failed runs": 0,
        "skipped runs": 0,
        "test": "testing/web-platform/tests/css/css-fonts/animations/font-variation-settings-interpolation.html",
        "total run time, seconds": 2882.57,
        "total runs": 1861
      },

This unfortunately does create a situation where my naive implementation thus far really lies to people about what's going on. I thought it might be worth showing successful run counts too, as well as the average runtime. Here's a skip-if resulting in skips, a skip-if not resulting in skips, and failures but no skips. It appears the skip count needs to be subtracted from the run count, which would push the average run time calculation upward.
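The adjustment described above can be sketched as follows, using the test_custom_element_base.xul numbers from comment 0 (the helper name is mine):

```python
def average_run_time(total_run_time, total_runs, skipped_runs):
    # Skipped runs contribute no runtime, so exclude them from the
    # denominator; this pushes the average up relative to the naive
    # total_run_time / total_runs calculation.
    executed = total_runs - skipped_runs
    if executed <= 0:
        return 0.0
    return total_run_time / executed

naive = 25916.85 / 5657                             # ~4.58 s
adjusted = average_run_time(25916.85, 5657, 1569)   # ~6.34 s
```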

For the WPT situation you raise, I think for the per-file info-box we want a few digested numbers like: expected failures, unexpected passes, etc. with a link to the .ini file in searchfox and any fancier dashboard. The heuristic would be to walk the specific ini file sections and bin that section for tally purposes based on what the possible results are. I suppose picking the worst in the set or just counting all possible outcomes would be equally valid.
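As a sketch of the "pick the worst in the set" variant of that binning heuristic (the severity ordering and the pre-parsed section shape are assumptions, not the actual .ini parser output):

```python
from collections import Counter

# Assumed worst-to-best ordering; the real WPT status set is larger.
SEVERITY = ["CRASH", "TIMEOUT", "ERROR", "FAIL", "PASS"]

def bin_sections(sections):
    """Tally meta-file sections by the worst status they can produce.

    `sections` is a hypothetical pre-parsed form: (subtest name,
    list of possible expected statuses) pairs from one .ini file.
    """
    tally = Counter()
    for _name, expected in sections:
        worst = min(expected, key=SEVERITY.index)
        tally[worst] += 1
    return tally

tally = bin_sections([
    ("subtest 1", ["FAIL"]),
    ("subtest 2", ["PASS", "FAIL"]),      # condition-dependent
    ("subtest 3", ["TIMEOUT", "PASS"]),
])
```

Counting all possible outcomes instead would just mean incrementing the tally for every status in `expected` rather than only the worst one.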

In the super cool future, I think the best integration with searchfox would be to display the failures against the specific source lines that stack traces identified as involved in producing the error check. Because all those checks frequently involve dynamic strings, statically mapping them is hard. But if we make sure the test runner logs the stack whenever a failure happens, a taskcluster helper task could scrape the stack to isolate all frames that were in the test file; these could be digested into a single output file that output-file then uses to annotate the file, presented as a configurably visible blame bar, etc.
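The scraping step could be sketched like this; the frame format in the regex is a hypothetical testharness-style stack line, not a format any runner is guaranteed to emit:

```python
import re

# Hypothetical frame format: "    at name (path/to/file:line:col)"
FRAME_RE = re.compile(r"\(([^():]+):(\d+):\d+\)")

def frames_in_test(log_lines, test_path):
    """Return the line numbers of stack frames located in the test file."""
    hits = []
    for log_line in log_lines:
        m = FRAME_RE.search(log_line)
        if m and m.group(1) == test_path:
            hits.append(int(m.group(2)))
    return hits

log = [
    "    at assert_equals (resources/testharness.js:510:9)",
    "    at check_foo (dom/base/test/test_foo.html:42:7)",
]
frames = frames_in_test(log, "dom/base/test/test_foo.html")  # [42]
```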

:jgraham, what are your thoughts about this (the simpler aggregate WPT expectations case)? Is there an existing taskcluster job or could there be a taskcluster job[1] that could provide Searchfox with pre-computed per-file WPT aggregated info? I only see "wpt" in https://github.com/mozilla/active-data-recipes for code coverage purposes (xref https://github.com/mozilla/ActiveData). Thanks!

1: The goal is to keep searchfox's indexing times down by ensuring that any additional data sources are maximally pre-computed.

Flags: needinfo?(james)

Another potential thing we could do is push the work to the client, like a "Show test info" button that loads data from e.g. wpt.fyi.

A link to the right dashboard / thing to look info about a WPT test would already be a massive improvement IMHO.

There's a wpt-meta job that produces a json summary of the metadata ini files; that's what wptdash uses. That's generated by the mach wpt-metadata-summary command, so it should be relatively easy to customise if you want slightly different information.

Linking things to specific lines of code is hard; we do log stacks in some cases so this isn't strictly impossible, but it doesn't always make sense (e.g. for reftests there isn't any stack).

Thanks, :jgraham! The summary file is very helpful! I've updated the patch to digest and merge this information in, and it's now reflected on https://asuth.searchfox.org/. The test info boxes can now do 3 broad categories of things for WPTs whose test file directly corresponds to the meta file; that correspondence does not (always) hold for multi-global any.js tests, which will likely need the JSON file to explain the relationship:

Other related changes:

  • For accessibility purposes (screen readers and red/green color-blindness), the info box heading now delineates between Errors (red)/Warnings (yellow)/FYI (green)/Info (green). I still need to wrap the contents in a <section> or otherwise make the contents of the infobox hierarchically skippable.
  • !fission and fission skip-ifs are specialized to be FYI rather than warnings. Some tests are intentionally specific to these different conditions, and the fission team has been very proactive about tracking the conditions in spreadsheets and bugzilla, so I don't think we need to make them seem bad.

Next steps:

  • Fix the meta file links. As of this moment, the info boxes attempt to link to the WPT meta files, but I was sleepy and replaced the extension with .ini rather than just appending it. I'm going to fix this, as well as making the links look like links.
  • Add wpt.fyi links for WPT tests to the "Navigation" Panel "Other Tools" section.
    • I was thinking about putting links like this toward the right of the test info box, because anecdotal reports suggest people blind themselves to the navigation panel, and I worry about the panel becoming ridiculously tall; but the navigation panel being a dynamically sized position:fixed makes this a hassle.
  • See if I can clean up some of the JSON processing.
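The meta-link fix from the first step above amounts to appending `.ini` instead of swapping it in for the extension; a sketch (the tests/ to meta/ substitution is inferred from the paths quoted earlier in the bug):

```python
def wpt_meta_path(test_path):
    # Map the test under testing/web-platform/tests/ to its metadata
    # file under testing/web-platform/meta/.
    meta = test_path.replace("/tests/", "/meta/", 1)
    # Append ".ini" rather than replacing the extension: the bug was
    # producing "foo.ini" instead of the correct "foo.html.ini".
    return meta + ".ini"

wpt_meta_path("testing/web-platform/tests/a/b.html")
# -> "testing/web-platform/meta/a/b.html.ini"
```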

Things I'm punting on:

Summary: Display information about disabled and failing tests on those tests' pages and in directory listings → Display information about disabled and failing tests on those tests' pages

The enhancements in Bug 1653986 to display information about tests derive their
data from these 2 jobs, so it's appropriate to explicitly depend on them.

The current status of these jobs in the tree, as far as I can tell, is that:

This is amazing.

Flags: needinfo?(james)

Thanks!

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #8)

  • Indicates the presence of WPT meta disabling conditions, lists them, and provides linkified bug links by naively prepending the BMO quicksearch URL, so it'll break if people put actual URLs in the field.

I've pushed a fix for this and am re-running the "asuth" channel indexer.
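For reference, the naive linkification and a passthrough fix can be sketched as follows (the function name and the URL-detection check are mine, not necessarily what the patch does):

```python
from urllib.parse import quote

BMO_QUICKSEARCH = "https://bugzilla.mozilla.org/buglist.cgi?quicksearch="

def linkify_condition_bug(value):
    # If the meta file already holds a full URL, naively prepending
    # the quicksearch URL would produce a broken link, so pass URLs
    # through unchanged.
    if value.startswith(("http://", "https://")):
        return value
    return BMO_QUICKSEARCH + quote(value)
```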

Pushed by bugmail@asutherland.org:
https://hg.mozilla.org/integration/autoland/rev/988343da285b
Add new test metadata taskcluster searchfox task deps. r=kats
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Blocks: 1655952
Blocks: 1656010
