Initial integration of instantly available code coverage data into searchfox via direct integration into HTML DOM
Categories
(Webtools :: Searchfox, enhancement)
Tracking
(Not tracked)
People
(Reporter: asuth, Assigned: asuth)
References
Details
Attachments
(3 files)
The server at https://mozilla.github.io/code-coverage-reports/ is able to dynamically provide coverage data via a RESTy API that returns JSON. https://addons.mozilla.org/en-US/firefox/addon/gecko-code-coverage/ already exists and supports integrating the coverage data into searchfox's code listing[1].
So a good first step to integrating code coverage would be to add an entry to the navigation box that can trigger dynamic loading of code coverage leveraging the already-existing code for this. See attached screenshot for what it currently looks like. We might want to get slightly fancier with the styling, perhaps decorating the blame bar with green/red borders or using a less saturated green/red for the lines.
Future work after that could involve processing the coverage data as part of the indexing job so that the information can be provided with zero latency[2] and potentially inform a more clever search experience. That is, the coverage data provides the number of times a line was executed, not just a boolean. That could be used to derive a weighting for symbols people are more likely to be looking for, etc.
1: https://github.com/mozilla/code-coverage/blob/master/addon/src/searchfox.js
2: Latency-wise, https://mozilla.github.io/code-coverage-reports/#latest:dom/indexedDB/IDBKeyRange.cpp for a 415 line file took 2.8 seconds, but the world-ender https://mozilla.github.io/code-coverage-reports/#latest:dom/indexedDB/ActorsParent.cpp only took 5.6 seconds.
Comment 1 (Assignee) • 2 years ago
Er, note that the screenshot seems to be indicating an off-by-1 error. All the green lines should be shifted down one. (Searchfox processing the coverage data during indexing time might also be able to perform fix-ups for expressions/statements that span multiple lines, such as constructors with a lot of arguments, where the whole constructor signature/function should be green rather than just 1 line.)
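An indexing-time fix-up for that off-by-one could be as simple as re-keying the per-line hit map. A hypothetical sketch (the helper name and the dict-of-hits representation are assumptions for illustration, not searchfox code):

```python
def shift_coverage(line_hits, offset=1):
    # Re-key a {line_number: hit_count} map, shifting every entry down by
    # `offset` lines to compensate for a systematic off-by-one in the
    # upstream report. -1 entries (uninstrumented lines) shift too.
    return {line + offset: hits for line, hits in line_hits.items()}
```

The multi-line-statement fix-up mentioned in the parenthetical would need actual syntax awareness and is not captured by this.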
Comment 2 • 2 years ago
Hello Andrew,
I've been working lately on the code-coverage backend and frontend you mention.
Regarding the latency, it's due (in part) to waiting for HGMO to send us the file at the specific revision. You certainly already have a more efficient mechanism than we do for that!
Secondly, we are currently hosting those services on Heroku, on a single small dyno, with a remote Redis instance as a cache. That's not the most efficient architecture, and we are hoping to move soon to cloudops infra, which would give us a nice boost.
Anyway, it would be awesome to have native searchfox integration for code coverage!
Comment 3 • 2 years ago
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #0)
> Future work after that could involve processing the coverage data as part of the indexing job so that the information can be provided with zero latency
I would be really happy if we did this (in fact, this was our team's long term plan!), we could even retire part of our (hastily-written) frontend.
We can easily add a new endpoint for the indexing job to retrieve the full coverage data for a given revision (we do need to make the indexing job wait on the coverage data to be available though).
Comment 4 (Assignee) • 2 years ago
(In reply to Marco Castelluccio [:marco] from comment #3)
> We can easily add a new endpoint for the indexing job to retrieve the full coverage data for a given revision (we do need to make the indexing job wait on the coverage data to be available though).
I put some docs up at https://github.com/mozsearch/mozsearch-mozilla#how-searchfoxorg-stays-up-to-date that cover how searchfox jobs are scheduled. (https://github.com/mozsearch/mozsearch/blob/master/docs/aws.md has some other related context.)
The short story is that searchfox indexer runs start independently of taskcluster at a hard-coded time (14:00 UTC) against the nightly builds (which taskcluster starts at ~10:00 UTC), with the windows searchfox taskcluster task being the slowest task. We can move the searchfox indexer run to happen later as needed. It looks like, in general, the linux64 ccov jobs complete by 12:00 UTC, which means they're already fast enough.
I guess the question is what then aggregates that data. Ideally whatever does it would be a taskcluster task that generates some artifact files that searchfox could ingest. Searchfox has had issues with services like the git-hg revision mapper that don't have 100% uptime. Consuming deterministic batch jobs that anyone can reschedule on taskcluster is very much preferable.
For initial at-indexing-time processing, the HTML output stage (documented at https://github.com/mozsearch/mozsearch/blob/master/docs/output.md) would ideally want a parallel file tree like analysis produces (documented at https://github.com/mozsearch/mozsearch/blob/master/docs/analysis.md) so that for any given path, it could open a corresponding file that contains the corresponding coverage data and ingest it. I see that the ccov jobs generate artifacts like code-coverage-jsvm.zip and code-coverage-grcov.zip which are in (lcov?) info format. It seems like there's already a ton of rust parsers for this format, so that would be easy enough for searchfox to ingest if the data was split out per-file by whatever processing step there is and searchfox could just download a tarball/zip/whatever and unzip that.
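As a sketch of the per-file splitting described above, a minimal parser over the lcov info records (`SF:`, `DA:`, `end_of_record`) might look like the following. The helper name and dict representation are assumptions for illustration, not actual searchfox or grcov code, and the function/branch record types are simply skipped:

```python
from collections import defaultdict

def split_lcov_per_file(info_text):
    # Split an lcov-format report into {source_path: {line: hits}} so a
    # per-file tree (parallel to the analysis tree) could be written out.
    # Only SF:/DA:/end_of_record records are handled; FN:/FNDA:/BRDA:
    # and the summary records are ignored in this sketch.
    per_file = defaultdict(dict)
    current = None
    for raw in info_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current = line[3:]
        elif line.startswith("DA:") and current is not None:
            lineno, hits = line[3:].split(",")[:2]
            per_file[current][int(lineno)] = int(hits)
        elif line == "end_of_record":
            current = None
    return dict(per_file)
```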
Comment 5 • 2 years ago
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #4)
> (In reply to Marco Castelluccio [:marco] from comment #3)
> > We can easily add a new endpoint for the indexing job to retrieve the full coverage data for a given revision (we do need to make the indexing job wait on the coverage data to be available though).
> I put some docs up at https://github.com/mozsearch/mozsearch-mozilla#how-searchfoxorg-stays-up-to-date that cover how searchfox jobs are scheduled. (https://github.com/mozsearch/mozsearch/blob/master/docs/aws.md has some other related context.)
> The short story is that searchfox indexer runs start independently of taskcluster at a hard-coded time (14:00 UTC) against the nightly builds (which taskcluster starts at ~ 10:00 UTC), with the windows searchfox taskcluster task being the slowest task. We can move the searchfox indexer run to happen later as needed. It looks like in general that the linux64 ccov jobs completed by 12:00 UTC which means they're already fast enough.
> I guess the question is what then aggregates that data. Ideally whatever does it would be a taskcluster task that generates some artifact files that searchfox could ingest. Searchfox has had issues with services like the git-hg revision mapper that don't have 100% uptime. Consuming deterministic batch jobs that anyone can reschedule on taskcluster is very much preferable.
> For initial at-indexing-time processing, the HTML output stage (documented at https://github.com/mozsearch/mozsearch/blob/master/docs/output.md) would ideally want a parallel file tree like analysis produces (documented at https://github.com/mozsearch/mozsearch/blob/master/docs/analysis.md) so that for any given path, it could open a corresponding file that contains the corresponding coverage data and ingest it. I see that the ccov jobs generate artifacts like code-coverage-jsvm.zip and code-coverage-grcov.zip which are in (lcov?) info format. It seems like there's already a ton of rust parsers for this format, so that would be easy enough for searchfox to ingest if the data was split out per-file by whatever processing step there is and searchfox could just download a tarball/zip/whatever and unzip that.
Yeah, we built our own a while ago (https://github.com/mozilla/grcov). We parse those artifacts and upload the result (a single JSON file containing the coverage data for all files) to GCP. Searchfox could presumably download that and use it (there wouldn't be a file for each path, just a single file containing data for all paths, but it's easy to "extract" the per-path data).
Comment 6 (Assignee) • 2 years ago
Can you provide a link to an example GCP json file? And is that just coverage data for the single build, or is there historical data in that too?
Also, is that parsing done on heroku right now, or elsewhere? Ideally, the processing could be a taskcluster job with the JSON exported via artifact.
Comment 7 • 2 years ago
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #6)
> Can you provide a link to an example GCP json file?
Here's an example: https://github.com/mozilla/code-coverage/blob/4bb73d8ce774b9e8af3ef43bab916b071836e8cb/backend/tests/fixtures/covdir.json. It's not a full report, it's just a mock file used for testing, but you can get an idea of the format.
> And is that just coverage data for the single build, or is there historical data in that too?
Just coverage data for the single build.
> Also, is that parsing done on heroku right now, or elsewhere? Ideally, the processing could be a taskcluster job with the JSON exported via artifact.
It's done on Taskcluster (not from mozilla-central CI though), the Taskcluster task is uploading to GCP. We could upload it as an artifact too, but downloading it from GCP might be better (so we avoid uploading the same file twice and pay for the storage twice).
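Going by that fixture, the report looks like a nested tree of nodes with "children" maps and per-line "coverage" arrays (-1 for uninstrumented lines). A hedged sketch of extracting one file's data, with the structure assumed from the fixture rather than verified against a full production report:

```python
def extract_file_coverage(covdir, path):
    # Walk a covdir-style nested report down to a single file and return
    # its per-line coverage array (list of ints; -1 = not instrumented).
    # The {"children": ..., "coverage": ...} shape is assumed from the
    # covdir.json test fixture linked above.
    node = covdir
    for part in path.split("/"):
        node = node["children"][part]
    return node["coverage"]
```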
Comment 8 (Assignee) • 2 years ago
(In reply to Marco Castelluccio [:marco] from comment #7)
> It's done on Taskcluster (not from mozilla-central CI though), the Taskcluster task is uploading to GCP. We could upload it as an artifact too, but downloading it from GCP might be better (so we avoid uploading the same file twice and pay for the storage twice).
Longer term, I think the ideal situation is that all of this (searchfox indexing, code coverage runs, searchfox analysis generation) happens inside of taskcluster and without involving external services. In that case, I think the costs are acceptable. (Uploading the artifact to S3 is free (in the same region), storage is $0.0023/GB/mo, and retrieval is free (same region).)
We're not there yet though, so I think the big question is can you provide shell script logic that tells searchfox how to derive the URL for your GCP storage and fetch it inside the script at https://github.com/mozsearch/mozsearch-mozilla/blob/master/shared/fetch-tc-artifacts.sh.
Comment 9 • 2 years ago
We are in the process of adding support for test platforms & suites in the code coverage bot: that will change the generated paths.
To be able to add a link in your patch, we'll need to:
- add a Taskcluster route using the MC revision to the code coverage bot,
- export our generated reports as Artifacts
Comment 10 • 2 years ago
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #8)
> Longer term, I think the ideal situation is that all of this (searchfox indexing, code coverage runs, searchfox analysis generation) happens inside of taskcluster and without involving external services. In that case, I think the costs are acceptable. (Uploading the artifact to S3 is free (in the same region), storage is $0.0023/GB/mo, and retrieval is free (same region).)
> We're not there yet though, so I think the big question is can you provide shell script logic that tells searchfox how to derive the URL for your GCP storage and fetch it inside the script at https://github.com/mozsearch/mozsearch-mozilla/blob/master/shared/fetch-tc-artifacts.sh.
It would be pretty easy for us to also publish the report as a Taskcluster artifact and not only upload to GCP, it's really just a few lines change.
(In reply to Bastien Abadie [:bastien] from comment #9)
> To be able to add a link in your patch, we'll need to:
> - add a Taskcluster route using the MC revision to the code coverage bot,
> - export our generated reports as Artifacts
I've filed https://github.com/mozilla/code-coverage/issues/150 to index the code coverage parsing task in the Taskcluster index, and https://github.com/mozilla/code-coverage/issues/151 to publish the full report as a Taskcluster artifact.
Comment 11 • 10 months ago
(In reply to Marco Castelluccio [:marco] from comment #10)
> (In reply to Bastien Abadie [:bastien] from comment #9)
> > To be able to add a link in your patch, we'll need to:
> > - add a Taskcluster route using the MC revision to the code coverage bot,
> > - export our generated reports as Artifacts
> I've filed https://github.com/mozilla/code-coverage/issues/150 to index the code coverage parsing task in the Taskcluster index, and https://github.com/mozilla/code-coverage/issues/151 to publish the full report as a Taskcluster artifact.
Since those have landed, is it now possible to fetch the URL of the artifact given an arbitrary revision in https://github.com/mozsearch/mozsearch-mozilla/blob/master/shared/fetch-tc-artifacts.sh?
Comment 12 • 10 months ago
(In reply to Brian Grinstead [:bgrins] from comment #11)
> (In reply to Marco Castelluccio [:marco] from comment #10)
> > (In reply to Bastien Abadie [:bastien] from comment #9)
> > > To be able to add a link in your patch, we'll need to:
> > > - add a Taskcluster route using the MC revision to the code coverage bot,
> > > - export our generated reports as Artifacts
> > I've filed https://github.com/mozilla/code-coverage/issues/150 to index the code coverage parsing task in the Taskcluster index, and https://github.com/mozilla/code-coverage/issues/151 to publish the full report as a Taskcluster artifact.
> Since those have landed, is it now possible to fetch the URL of the artifact given an arbitrary revision in https://github.com/mozsearch/mozsearch-mozilla/blob/master/shared/fetch-tc-artifacts.sh?
Exactly. For example, for 7b8c4f32cdde947c1b1f942905fbc5039b6d1c65 on mozilla-central, the URL on the TC index is https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/project.relman.code-coverage.production.repo.mozilla-central.7b8c4f32cdde947c1b1f942905fbc5039b6d1c65/artifacts/public/code-coverage-report.json.
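The index URL is mechanical to derive from the repo and revision. A sketch mirroring the example URL above (the helper name is illustrative, and only the pattern shown in this comment is assumed):

```python
def coverage_report_url(repo, revision):
    # Build the Taskcluster index artifact URL for a coverage report,
    # following the project.relman.code-coverage.production route shown
    # in the example above.
    return (
        "https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/"
        f"project.relman.code-coverage.production.repo.{repo}.{revision}"
        "/artifacts/public/code-coverage-report.json"
    )
```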
Comment 13 (Assignee) • 10 months ago
:bgrins and I had a video call yesterday and the plan is for me to get up an MVP prototype of this in the next week. Specifically, integration of code coverage data into the HTML renderings of source files. This will not initially include anything at the directory listing level[1] or anything in the search results[2].
My current UX plan is to expose a persistent (like hotkeys) <select> combo-box in the navigation panel for styling choices, which includes disabling the coverage, in order to let us land this sooner and iterate and get feedback on which presentations are least disruptive for normal code browsing and most helpful for reading the code while directly processing (which are different use cases). I'll provide a few options and then ideally other (possibly new!) contributors can help provide additional options and/or refinements of the initial options.
1: It should be reasonably easy to make a subsequent directory level enhancement, although, as with the test info bug, the harder issue is likely the UX of the directory listing.
2: The cross-referencing process would need to be involved in processing coverage for these purposes, and for accuracy this might want some additional help from the indexers to most appropriately attribute the coverage to the right symbol on a line, either in the data emitted by the indexer or as policy information taught to crossref so that it can perform proper attribution.
Comment 14 (Assignee) • 10 months ago
(In reply to Marco Castelluccio [:marco] (On holiday until Aug 25) from comment #12)
> Exactly. For example, for 7b8c4f32cdde947c1b1f942905fbc5039b6d1c65 on mozilla-central, the URL on the TC index is https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/project.relman.code-coverage.production.repo.mozilla-central.7b8c4f32cdde947c1b1f942905fbc5039b6d1c65/artifacts/public/code-coverage-report.json.
Something weird is going on with these hook jobs that seems to be introducing a 24 hour delay or something.
- It seems like the jobs are associated with the hook at https://firefox-ci-tc.services.mozilla.com/hooks/project-relman/code-coverage-repo-production
- The most recent job right now is https://firefox-ci-tc.services.mozilla.com/tasks/R1auWv1wQdCd38u8se0sYQ with a creation date of 2020-08-08T21:46:16.194Z. Its task definition lists a "payload.env.REVISION" of "768023eab2270602610ef80988e29773f52d1be8".
- That revision corresponds to this treeherder push of basically exactly 1 day before.
- Arbitrarily picking its "Linux x64 opt" "Bocf" task, it has a creation time of 2020-08-07T21:43:54.282Z, which is just over 24 hours before.
To put them above each other for more direct comparison:
- 2020-08-08T21:46:16.194Z - Tardy coverage artifact job.
- 2020-08-07T21:43:54.282Z - Timely Bocf task that corresponds to that hg revision.
I noticed this after trying to do the naive URL transformation that corresponds to today's searchfox run giving me https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/project.relman.code-coverage.production.repo.mozilla-central.fa0dbdf15f291e814b4854d515d7ef3e4548b7fb/artifacts/public/code-coverage-report.json which didn't exist. All the coverage jobs on the treeherder push seem to have finished in a timely fashion, so I'm guessing there's some kind of amusing off-by-1 style error somewhere, but I don't understand what's going on enough.
I see :marco is on holiday so I'll try and sprinkle around some requests for help in various chat.mozilla.org channels.
Comment 15 (Assignee) • 10 months ago
Good news! Although the job confusingly doesn't seem to have directly defined-in-the-task routes, it turns out the usual trick of using "latest" for the revision works and we get the day old data. This is likely too misleading to ship/enable for mozilla-central for searchfox.org, but should get us through the development process and is fine for dev.searchfox.org / etc.
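A hedged sketch of the resulting fetch logic: try the exact-revision index route first and fall back to "latest", which as noted above may lag by about a day. The constant and function names are illustrative, and real ingestion would want timeouts and retries:

```python
import json
import urllib.error
import urllib.request

# Index route pattern from comment 12; "latest" substitutes for the
# revision when the revision-specific task has not been indexed yet.
INDEX_URL = ("https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/"
             "project.relman.code-coverage.production.repo.mozilla-central.{rev}"
             "/artifacts/public/code-coverage-report.json")

def fetch_report(revision):
    # Try the exact revision first, then fall back to "latest". Returns
    # (route_used, parsed_report) or (None, None) if both routes 404.
    for rev in (revision, "latest"):
        try:
            with urllib.request.urlopen(INDEX_URL.format(rev=rev)) as resp:
                return rev, json.load(resp)
        except (urllib.error.HTTPError, urllib.error.URLError):
            continue
    return None, None
```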
Comment 16 (Assignee) • 10 months ago
tl;dr: It seems that the bot is dependent on the exchange/taskcluster-queue/v1/task-group-resolved pulse queue resolving in a timely fashion. But in the event there are things that never run, the group won't resolve until its deadline. It appears that there were a bunch of tier 3 "Bg" jobs like "build-linux64-gcp/opt" that got scheduled but never ran. I've asked in the channel if there's an easy way for the bot to avoid caring about the tier3 jobs by choosing a different queue.
So as a basic timeline of an example of Saturday's very delayed coverage report:
- https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&searchStr=searchfox&revision=b579c12907dc3bb210177454a696ccc181c1ded0 is the Saturday afternoon push to mozilla-central.
- 2020-08-08T21:33:50.820Z - Decision task RrtxoDFqSUKiPyyKj--Wdw is created and promptly runs, which is what gives us the primary.RrtxoDFqSUKiPyyKj--Wdw.gecko-level-3._ routing key message that will eventually trigger the bot.
- 2020-08-08T21:38:41.869Z - An M1 mochitest coverage task W09jrnyITZeo828Ddccg4w is created and promptly runs:
  - 2020-08-08T21:38:41.869Z created
  - 2020-08-08T22:01:42.493Z scheduled
  - 2020-08-08T22:39:47.382Z started
  - 2020-08-08T23:00:17.079Z resolved
- 2020-08-09T21:33:50.820Z - The decision task/task group's deadline.
- 2020-08-09T21:41:01.832Z - The coverage bot hook task Fp7LhNPcTRqebSMoEJOXfA is created:
  - 2020-08-09T21:41:01.832Z created
  - 2020-08-09T21:41:01.866Z scheduled
  - 2020-08-09T21:41:02.650Z started
  - 2020-08-09T23:43:06.839Z resolved
Chat log links:
- #taskcluster
- Initial questions and me thinking aloud on Saturday: https://matrix.to/#/!whDRjjSmICCgrhFHsQ:mozilla.org/$gDd5sweCehsSj-dMOSq2Q3hcc6_M6ptlk-h2ctoD7kw?via=mozilla.org&via=matrix.org&via=t2bot.io
- Discussion of the tasks queue issue on Monday: https://matrix.to/#/!whDRjjSmICCgrhFHsQ:mozilla.org/$VOcAKJi_6f3lqKsWuTffs7QvUYaGMFFk6bpXy6Nmx6Q?via=mozilla.org&via=matrix.org&via=t2bot.io
Comment 17 (Assignee) • 10 months ago
Comment 18 (Assignee) • 10 months ago
Comment 19 (Assignee) • 10 months ago
Initial attempt at this is up at https://asuth.searchfox.org/ and using https://asuth.searchfox.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp for a stress-test. The pull requests include synthetic coverage data for the "tests" and "searchfox" repo for the simple.cpp test file.
As proposed in the pull requests, I was thinking we could more canonically discuss styling here but any CSS selector mechanics in the mozsearch pull request. I'm also going to put a pointer to the server in the #searchfox chat room on chat.mozilla.org for more informal discussion, but this will be the conversation of record.
:Jamie, this current revision doesn't yet do anything to surface the code coverage information to screen readers. Right now the patch maintains the existing DOM and just uses hacky CSS selectors, but we can obviously modify the DOM, such as adding an additional cell next to the blame bar or including the data in the blame bar. I suppose the information could also go in the blame popup too?
Right now the data is reflected into the DOM via CSS classes on the "row", ex:
<div role="row" id="line-150" class="source-line-with-number cov-hit cov-known cov-log10-3">
<div role="cell"><div class="blame-strip"></div></div>
<div role="cell" class="line-number" data-line-number="150"></div>
<code role="cell" class="source-line"> <span data-symbols="V_e7dc91c08f154bab_63a795">xy</span>.<span data-symbols="_ZN2NS1XIiE1fEv" data-i="86">f</span>();
</code>
</div>
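For illustration, the classes on that row could be derived from a raw hit count roughly like this. The scheme (cov-known for instrumented lines, cov-hit vs. a miss state, and a log10 magnitude bucket) is inferred from the example row, not taken from the searchfox source:

```python
import math

def coverage_classes(hits):
    # Map a raw per-line hit count (-1 = not instrumented) to the CSS
    # classes seen in the example row above. The cov-miss name for
    # executed-zero-times lines is an assumption.
    if hits < 0:
        return []
    if hits == 0:
        return ["cov-miss", "cov-known"]
    return ["cov-hit", "cov-known", f"cov-log10-{int(math.log10(hits))}"]
```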
Comment 20 • 10 months ago
Thanks for reaching out.
I'm not super familiar with code coverage reports and how that info is presented.
Can you give me an idea of the different indicators that might be exposed for a line?
Can you give some scenarios where it's useful to be able to "glance" at this info for any given line of a file without any additional action? My initial feeling was that it'd be easiest to just expose this in the blame popup, but there is obviously a good reason we want to expose this visually without the user needing to open the popup. I'm trying to understand that reason so I can figure out an equivalent for screen reader users if possible.
Comment 21 (Assignee) • 10 months ago
Context
(In reply to James Teh [:Jamie] from comment #20)
> Can you give me an idea of the different indicators that might be exposed for a line?
Sure! The Mozilla prior art for this is the UI surfaced by https://coverage.moz.tools/, which is consistent with my other coverage experiences, so I'll use that as my basis for description. Each line has:
- A visual background color encoding where each source line has one of 3 states:
- Uncolored AKA normal background color: Coverage data was a -1 indicating source lines that are not instrumented. This happens for whitespace lines, comment lines, as well as lines that are part of multi-line statements/expressions where things like function call arguments don't get individually tracked by coverage.
- Red = Coverage Miss: Coverage data was a 0 indicating that the source line was instrumented and capable of recording execution hits, but didn't get executed.
- Green = Coverage Hit: Coverage data was greater than 0, indicating that the source line was instrumented and recorded at least one hit.
- For covered lines, there's also a "number bubble" on the right of the source listing in its own table column. The number bubble is a span with a dark background color and white text for maximum contrast. The number bubble has a tooltip hover of "This line has been hit NUMBER times."
- For coverage hit counts of less than a thousand, the background color of the number bubble is black and the exact number of coverage hits is displayed. So like "2" or "509".
- For coverage hits at kilo and mega breakpoints, the bubble is rendered with a blue background instead and is expressed with "k" or "M" suffixes, rounding down. For example "19 k" will be displayed for a hover that says "This line has been hit 19954 times." and "1 M" for "This line has been hit 1734587 times."
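The number-bubble formatting described above is easy to pin down in code. A sketch, with the thresholds and rounding-down behavior taken from the examples given and the helper name assumed:

```python
def bubble_label(hits):
    # Format a hit count the way the coverage.moz.tools number bubble is
    # described above: exact below 1000, then "k"/"M" rounding down.
    if hits < 1_000:
        return str(hits)
    if hits < 1_000_000:
        return f"{hits // 1_000} k"
    return f"{hits // 1_000_000} M"
```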
> Can you give some scenarios where it's useful to be able to "glance" at this info for any given line of a file without any additional action? My initial feeling was that it'd be easiest to just expose this in the blame popup, but there is obviously a good reason we want to expose this visually without the user needing to open the popup. I'm trying to understand that reason so I can figure out an equivalent for screen reader users if possible.
Important context for visual processing: There's a concept known as preattentive processing which Perception in Visualization documents well and I keep incorrectly calling "visual prefiltering". My bad digest of it for the purposes here is that this mechanism can make it relatively low-effort to identify regions where there is no coverage and to conceptually think of them as a single continuous entity if they are sufficiently rare and continuous. One problem I think our visual approach to code coverage needs to deal with is that multi-line statements are very common and the uncovered line presentation creates a lot of visual noise that distracts and makes it more difficult to leverage preattentive processing. I may look at visually interpolating/filling-in these gaps so that visual transitions only occur at transitions between covered and uncovered, and this is something that would likely also be important for screen reader purposes.
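The interpolation idea could be sketched as follows: fill a run of uninstrumented (-1) lines only when the instrumented lines on both sides agree on hit vs. miss. The helper name and list-of-ints representation are assumptions, not searchfox code:

```python
def fill_uninstrumented_gaps(cov):
    # cov is a per-line list of ints: -1 uninstrumented, 0 miss, >0 hit.
    # Fill each interior run of -1 when both neighbors share the same
    # hit/miss state, so visual transitions only occur at real
    # covered/uncovered boundaries. Runs touching either edge are kept.
    out = list(cov)
    i = 0
    while i < len(out):
        if out[i] == -1:
            j = i
            while j < len(out) and out[j] == -1:
                j += 1
            if 0 < i and j < len(out):
                left, right = out[i - 1], out[j]
                if (left > 0) == (right > 0):
                    fill = 1 if left > 0 else 0
                    for k in range(i, j):
                        out[k] = fill
            i = j
        else:
            i += 1
    return out
```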
The benefits of presenting the ambient coverage data given a good visual encoding and preattentive processing in my experience are:
- From the green/red: Ability to identify potentially dead code, especially methods that aren't used when the entire method is red. This can also be a manifestation of a complete lack of test coverage for something, either due to the test being disabled or the tests not existing.
- From the green/red: Ability to identify whether edge case handling and general error handling is something that actually gets used and/or tested. Especially in older platform code it's very common to have logic where there are no MOZ_ASSERT calls and so one frequently has to guess about whether a given code path is used even in tests. The coverage makes it explicit.
- From the order of magnitude from the black/blue encoding and no-suffix, K-for-kilo, M-for-mega: Ability to determine how common a given control flow path is taken relative to other control flow paths.
In terms of future potential, a few things may be worth keeping in mind:
- The coverage data searchfox is currently ingesting with this patch is aggregated over all test corpuses, but one could imagine exposing the data more granularly and with the ability to compare different scenarios. Examples:
- Comparing the coverage result of high level smoke tests that use something like Selenium/WebDriver to drive actual web browsing versus the test corpus.
- Comparing coverage data from the same high level smoke tests with and without a screen reader active.
- Comparing the coverage difference of a specific feature's functional tests versus a baseline start-and-stop Firefox profile to better understand the feature's transitive system dependencies. For example, there recently was somewhat of a surprising regression for the search box where users with a partially corrupt profile had the search box UI entirely break because of a dependency of the search box on the Remote Settings subsystem which depends on IndexedDB which depends on Quota Manager. This scenario is most densely presented visually as a treemap which is a 2-dimensional nested hierarchy of rectangles that encodes data with size and color. Normally the naive hierarchy mapping would be over the source code with sizing based on lines-of-code in a file and the colors being the (relative) coverage. But the scenario could also be expressed as a tree which could be more accessible.
Possibilities
"Glasses Mode" Hierarchical Coverage Digests
Context: Glasses Mode is an emacs mode that actively re-writes CamelCase text into underscore delimited camel_case. It's notable because it's the injection/re-writing of source code, which people usually don't want. For the point below, I'm referring to "glasses mode" as potentially manipulating the source line cells themselves instead of placing the information in an explicitly separate cell like we do for the blame data.
The implementation of the position: sticky magic that makes the enclosing namespaces, classes, and methods stick to the top of the scrolling viewport has resulted in the construction of an explicit DOM hierarchy that reflects the underlying semantic hierarchy. It's not perfect, but it makes it straightforward to compute the number of coverage hits and misses and ratios/etc. within a given class. This would allow the optional injection of some type of content or markup that conveys when a method is fully covered, fully uncovered, partially covered with a ratio of 90%, etc. Unfortunately, the position: sticky hierarchy currently does not exist at any granularity inside methods, but heuristics that look at code indentation and/or syntax aware tokenization like is used for code folding could certainly attempt to provide summaries for code blocks and conditionals.
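Given a method's line range from that sticky-DOM hierarchy, a digest could be computed like this. The function name, the 1-based-range convention, and the bucket wording are all illustrative:

```python
def summarize_range(cov, start, end):
    # Digest per-line coverage over a method's line range [start, end]
    # (1-based, inclusive) into the fully/partially/uncovered buckets
    # discussed above. cov: -1 uninstrumented, 0 miss, >0 hit.
    lines = [c for c in cov[start - 1:end] if c >= 0]
    if not lines:
        return "not instrumented"
    hit = sum(1 for c in lines if c > 0)
    if hit == len(lines):
        return "fully covered"
    if hit == 0:
        return "fully uncovered"
    return f"partially covered ({100 * hit // len(lines)}%)"
```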
Another possibility would also be a glasses mode that injects content or markup at points of transition between covered and uncovered lines, potentially with the content/markup conveying the length of the run of hits/misses.
Aside: :sg and I have talked about a glasses mode that would display inferred C++ "auto" types as inline which I think would be an example of a glasses mode that's useful for both visual and accessibility-tree interaction modes.
Audio "Visualization" / Alternate Text Encodings to convey coverage levels
First: I'm not sure how much other people value the different level of magnitudes for code coverage. I have the CSS attributes encoding the log10 logarithm of the coverage level for a line and have themes that encode this as the width of a visual bar, but I don't think any feedback has endorsed that approach. (Aside: we could still visually encode the level, however.)
According to the NVDA docs on Document Formatting line indentation reporting, there is a tones mode that: "If Tones is selected, when the amount of indentation changes, tones indicate the amount of change in indent. The tone will increase in pitch every space, and for a tab, it will increase in pitch the equivalent of 4 spaces."
I suspect it's inadvisable for searchfox itself to try to do its own audio thing when using NVDA or other screen readers, but perhaps this existing NVDA mechanism for indentation could be used in a specific mode that would cause Searchfox to re-write the indents of source lines based on the logarithm of the coverage level of a line. Or maybe trying to use tones is silly or unreliable because humans without perfect pitch are bad at determining relative frequencies, and it's just faster to optionally prefix the source line with a terse textual summary of the coverage level like "miss", "small hit", "hit", "big hit".
Layers
Map and GIS (Geographic Information System) user interfaces frequently have different layers that can be toggled on and off because you don't necessarily need or want to see everything all at once. After having talked about the visual noise of the current prototype, I'm thinking that although we don't want the theme picker (and its feature matrix) that the prototype has, it might be useful for all searchfox interface modes to have a concept of layers.
For example, there's a proposed blame bar mode that would visually convey the age of changes via a color encoding as well, and this would fight the visual coverage encoding. It would probably be problematic to turn them both on by default unless the code coverage was presented quite differently, like wrapping uncovered runs of code in a big high contrast box that's visually separate from the coverage strip.
Comment 22•10 months ago
Wow. Thanks for these super detailed and valuable thoughts.
As things are now for screen reader users, I think the nearest equivalent we can get is to have summaries for methods/classes. That said, that sounds like more work than you probably want to do in this first version, and I'm not sure it's worth going that far just yet.
For now, it'd be good to at least expose the information for each line in the DOM. Where to expose it is an interesting question:
- The easiest would be to expose it in the blame bar, but that does seem like odd conflation.
- We could expose it in the same cell as the blame bar.
- We could expose it as an extra cell.
I'm guessing all of these potentially have visual impact, which you may not want. I'm normally not a fan of off-screen text - it's an ugly hack - but in this case, it might be the best option. If we were going for off-screen text, I'd probably put it in the same cell as the blame cell, thus avoiding any visual impact. (That said, having some way for sighted users to see the text might make the coverage colours more "discoverable".)
The text should be as terse as possible while still being clear. For not instrumented, there should be no text. For misses, we could say "coverage miss". For 10000 hits, we could say "coverage hit 10k". The "coverage" prefix is somewhat verbose and I kinda want to drop it for efficiency, but I'm concerned that "hit" and "miss" might not be understood by users, though to be fair, that's probably also true of the colours. If we're not worried about discoverability, maybe "miss" and "hit 10k" are enough.
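A sketch of the terse text proposed here, including the abbreviation of large hit counts ("hit 10k"); the exact cutoffs and the decision to drop the "coverage" prefix are assumptions for illustration:

```javascript
// Hypothetical sketch: no text for uninstrumented lines, "miss" for zero
// hits, and "hit <N>" with k/M abbreviation for large counts.
function terseCoverageText(hits) {
  if (hits === null) return "";  // not instrumented: no text at all
  if (hits === 0) return "miss";
  if (hits >= 1000000) return `hit ${Math.round(hits / 1000000)}M`;
  if (hits >= 1000) return `hit ${Math.round(hits / 1000)}k`;
  return `hit ${hits}`;
}
```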
Comment 23•10 months ago
To address some of your other points, though not so relevant right now:
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #21)
I suspect it's inadvisable for searchfox itself to be trying to be doing its own audio thing when using NVDA or other screen readers,
That's true, though it's a fascinating idea.
but perhaps this existing NVDA mechanism for indentation could be used in a specific mode that would cause Searchfox to re-write the indents of source lines based on the logarithm of the coverage level of a line.
This is a problem we keep hitting that no one really has a good solution for yet: how to do efficient, domain-specific a11y without horrible hacks, specific support in assistive tech, or implementing things like audio in the app itself (thus restricting versatility). Trying to abuse NVDA's indentation feature will lead to pain because it also impacts things for, say, braille users. And yet, how do we make NVDA use tones in a useful way here? No good answers, I'm afraid.
Or maybe trying to use tones is silly
On the contrary, while there are problems with this (steep learning curve, discoverability), using tones in parallel with speech makes for huge efficiency gains. I absolutely love the indentation tones functionality in NVDA for this reason.
or unreliable because humans without perfect pitch are bad at determining relative frequencies
I don't have any solid data, but as I understand it, there is a much higher incidence of absolute pitch among blind people as compared with those who are not blind. Also, you don't need absolute pitch to make good use of, say, indentation tones; you only need good relative pitch once you establish the base pitch. Again, the incidence of good relative pitch among blind people is probably higher.
Comment 24•10 months ago
As a brief update on the status of this patch, my plans for the next rev (which may be up to 2 weeks out given recent changes happening in Mozilla that have impacted priorities and my general concerns about changing a UI used daily by people dealing with these stressful changes):
- Visual presentation:
- I want to focus on reducing the visual distraction factor of these changes.
- The current candidate for iteration based on feedback from :kats and :emilio is the fixed-size strip but positioned to the left of the blame bar. The tentative plan is to lighten the colors used (based on feedback from :emilio) and to establish a hover box that contextualizes what the box means.
- I'll probably try a few variations of theming for this, including a few hover variations.
- I also want to look at interpolating coverage data to eliminate the uninstrumented visual indication to reduce the number of visual transitions which I believe are not useful.
- Accessibility presentation:
- The fixed-size visual strip maps well to an extra cell on each row. Although we could imitate the blame pop-up mechanism for this, the coverage text seems like it could be terse enough that it can simply be read as the cells are navigated. The strings could be chosen so that any entropy is front-loaded. Example strings: "Not covered", "Covered with 100 hits", "Uninstrumented". The popup mechanism could still be available to explain things in greater depth. It's not clear to me whether this would want the coverage interpolation to eliminate "Uninstrumented" or not.
- Stretch Goal:
- Aggregating a coverage digest for the position: sticky hierarchy.
- Adding keyboard navigation between the position: sticky hierarchy nodes via j/k, and ensuring this is built on top of exposing these navigation points to the accessibility tree/screen readers.
Comment 25•8 months ago
:Jamie, I've resumed working on this patch. For exposing the coverage text to screen readers, I was thinking of adding an aria-label to the coverage cell with the terse digest of what's going on. The enter key would then trigger the more verbose popup that is also displayed on mouse hover, using the same mechanism as blame. I think it might also make sense to similarly expose a label for blame.
Questions:
- Does setting an aria-label on the role=cell div sound appropriate given that this terse information isn't meant to be perceived outside of the accessibility tree?
- Does it make sense to expose a label for blame too? If so, do you have any suggestions on what labels might be good digests for blame? My notes:
- Currently the visual encoding is "zebra striping" where we alternate gray colors whenever the blame source revision of a line isn't the same as its preceding line. So a run of lines with the same source revision will all share the same color gray, and there will be a visible transition indicating when a line is from a different revision.
- Right now there's a limited amount of data synchronously available when generating the file: the revision the line came from, the filename it came from if it's not the current filename, the line number it had in that file, and the display name (not email) of the author associated with the line. There's an enhancement suggestion somewhere to support coloring blame based on the age of the revision, which would likely result in us also including the timestamp of the revision. All the additional data we display in the popup comes from dynamic git lookups that aren't something we can currently consider running as part of output-file.
- Because of the lack of extra metadata, I think this limits us to doing something similar to zebra striping, like saying "prev hash differs", "same hash", and possibly "next hash differs". Alternately, if we assume a top-down order, the text could always be stateful relative to the previous line, so we'd just say "new hash" and "same hash". This could be suffixed with a prefix of the hash, or by mapping the hashes observed in the file onto a more human-friendly space. So a boring file that only has 10 distinct git revisions feeding into it might have lines that, moving down the column, read: "new hash 1", "same hash 1", "new hash 2", "same hash 2", "new hash 3", "new hash 4", "new hash 1", "same hash 1", "new hash 4".
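The "new hash N" / "same hash N" scheme can be sketched directly: number each distinct blame revision in order of first appearance, and prefix relative to the preceding line. Function and variable names here are illustrative:

```javascript
// Sketch: map each line's blame revision hash onto a small human-friendly
// ordinal, prefixed with "new"/"same" relative to the preceding line.
function blameLabels(lineRevs) {
  const revNumbers = new Map(); // full hash -> small ordinal (1-based)
  const labels = [];
  let prevRev = null;
  for (const rev of lineRevs) {
    if (!revNumbers.has(rev)) revNumbers.set(rev, revNumbers.size + 1);
    const prefix = rev === prevRev ? "same" : "new";
    labels.push(`${prefix} hash ${revNumbers.get(rev)}`);
    prevRev = rev;
  }
  return labels;
}
```

Running this over a file whose lines come from revisions a, a, b, b, c, d, a, a, d reproduces the example sequence in the comment above.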
Note that assuming we go ahead with adding aria-label, I'm going to implement bug 1631466 wherein we store the output-file-generated HTML files pre-compressed (gzip?) on disk to offset the expected growth of the files and help offset any performance hits by avoiding the need to compress on the fly.
Comment 26•8 months ago
(In reply to Andrew Sutherland [:asuth] (he/him) from comment #25)
- Does setting an aria-label on the role=cell div sound appropriate given that this terse information isn't meant to be perceived outside of the accessibility tree?
Unfortunately, that's not going to be read by screen readers. aria-label would need to be used on an interactive element (e.g. a role="button") or we'd need off-screen text (urgh).
Given that there's a pop-up, I assume we'll have a role="button" like we do for blame? Can we put the text in that aria-label?
- Does it make sense to expose a label for blame too? If so, do you have any suggestions on what labels might be good digests for blame?
- Because of the lack of extra metadata, I think this limits us to doing something similar to zebra striping like saying "prev hash differs", "same hash", and possibly "next hash differs". Alternately, if we assume a top-down order, the text could always be given to be stateful relative to the previous line, so we'd just say "new hash" and "same hash". This could be suffixed by including a prefix of the hash or mapping the hashes observed in the file onto a more human friendly space. So a boring file that only has 10 distinct git revisions that feed into it might have lines that, moving down the column would read: "new hash 1", "same hash 1", "new hash 2", "same hash 2", "new hash 3", "new hash 4", "new hash 1", "same hash 1", "new hash 4".
I like the last idea: "new hash 1", "same hash 1", etc. I was wondering whether we should just do "hash 1" without the new/same prefix to eliminate the reading order dependence. However, that could get messier in a file with a lot of revs. Having to listen to "hash 1532" and "hash 1597" is a lot slower than hearing "new" or "same" and ignoring the rest if it's not of interest.
Thanks!
Comment 27•8 months ago
Patch updated
The most recent test run is up at https://asuth.searchfox.org/. My summary of key changes from #searchfox on chat.mozilla.org was:
- The coverage is now to the left of the blame bar, fixed width.
- When not hovered, coverage hits/misses are interpolated in order to reduce visual noise from transitions between uncovered lines and hits/misses.
- When hovered, it's assumed you would appreciate more detail and so we drop the interpolation; uncovered lines don't get colored. I also opted to go with a log scale for hits. Lines that get hit a lot will have increasingly darker blues in the scale.
- The accessibility changes in terms of aria-label have been implemented for both coverage and blame but I ran into bug 1668701 which is independent of my changes here when testing with NVDA. I do still need to make it possible to use the enter key to toggle the popups off. (That's also a pre-existing issue that's somewhat self-mooting since you can only ever have one popup up at a time, but it should be simple to fix.)
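The non-hover interpolation mentioned above might look something like the following sketch (an assumed behavior: an uninstrumented run bounded on both sides by the same hit/miss status inherits that status, removing the visual transition; the real implementation may differ):

```javascript
// Sketch: fill uninstrumented gaps (null) when both neighbors agree.
// lines: array of "hit" | "miss" | null (null = uninstrumented)
function interpolateCoverage(lines) {
  const out = lines.slice();
  let i = 0;
  while (i < out.length) {
    if (out[i] !== null) { i++; continue; }
    let j = i;
    while (j < out.length && out[j] === null) j++; // [i, j) is a null run
    const before = i > 0 ? out[i - 1] : null;
    const after = j < out.length ? out[j] : null;
    if (before !== null && before === after) {
      for (let k = i; k < j; k++) out[k] = before; // fill the gap
    }
    i = j;
  }
  return out;
}
```

A gap between differing statuses (e.g. a hit line followed by a miss line) is left uninstrumented rather than guessed at.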
Disk Usage
Disk usage slightly increased. We have 110G spare on the "release" config.json before the patch and 102G after; unfortunately, pre-patch we only have 19G spare on "mozilla-releases.json", which is not a sufficiently comfortable buffer, so I'm going to try to get to bug 1631466 soon because it seems likely to be an easy win. For example, the ServiceWorkerPrivate.cpp file I've been using as an example is 1.9M of uncompressed HTML post-patch; gzip with default settings dropped that to 140K, and brotli defaults hit 80K. Using brotli_static, the server can decompress for clients that don't report Accept-Encoding: br. The question is whether it's worth also storing gzip-encoded versions so that an Accept-Encoding header that includes gzip but not br gets a similarly precomputed experience. We don't currently log that header, so we don't know, but my guess is it's not an optimal use of resources (although we could enable caching of dynamically re-encoded gzips if needed).
Indexed dirs, with pre-patch and post-patch values:
- comm-central: 40G increased to 43G
- ecma262: 140M increased to 145M
- glean: 572M increased to 579M
- kaios: 49G increased to 52G
- mozilla-build: 1.8G stayed 1.8G
- mozilla-central: 68G increased to 70G
- mozilla-mobile: 3.8G increased to 4.0G
- nss: 4.1G increased to 4.3G
- version-control-tools: 196M increased to 210M
- whatwg-html: 3.5G increased to 3.6G
Comment 28•8 months ago
This is a great rendering, bravo!
Comment 29•7 months ago
This landed.