Closed Bug 646216 Opened 10 years ago Closed 2 years ago

Collapse a11y tree when possible

Categories

(Core :: Disability Access APIs, enhancement)

x86
Linux
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox67 --- fixed

People

(Reporter: fherrera, Assigned: MarcoZ)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

Currently we are exposing mostly a 1<->1 map from layout tree as the accessible tree. Things like:

<div><div><div><p>foobar</p></div></div></div> 

are mapped into:

section->section->section->paragraph

even in that is presented on the screen as a single paragraph

We should try to collapse a11y tree hierarchy to make everything more performant (and easier on the AT apps side).
Agree, I carried this idea for a long time, but never did get close to technical side. Any engineering thoughts, Fernando?
Blocks: treea11y
We could try cut off div elements without any semantics since they are wide used rather for DOM elements grouping than for layout stuffs. If there's no good reason why all of them were made accessible then we could try to get a build and test it well. Our hierarchy is really big because of these divs.
I think this is a good idea but doesn't need an open bug.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
(In reply to David Bolter [:davidb] from comment #4)
> I think this is a good idea but doesn't need an open bug.

at least wontfix is a wrong status, because this is something we'd need to fix. also it's not probably bad to keep ideas/enhancements in the bugzilla, for tracking proposes.
let's reopen the bug, the topic gets active.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Attached patch kill them allSplinter Review
just kill them all, curious if it makes any difference in user experience in terms of better performance rather than in terms of broken web sites.

here's a try server build: https://tools.taskcluster.net/index/artifacts/gecko.v2.try.revision.ed1212a6b6fa0e762fc5f1bf06ef8aae259984da
Flags: needinfo?(jteh)
Sorry, but that task cannot be found. I only noticed afterwards that this isn't a treeherder link as it usually is. So I cannot currently test this.
Flags: needinfo?(surkov.alexander)
(In reply to Marco Zehe (:MarcoZ) from comment #8)
> Sorry, but that task cannot be found. I only noticed afterwards that this
> isn't a treeherder link as it usually is. So I cannot currently test this.

their message is actually misleading, but it seems they have all the builds https://tools.taskcluster.net/index/artifacts/gecko.v2.try.revision.ed1212a6b6fa0e762fc5f1bf06ef8aae259984da.firefox: a list on the left of all available platforms, and then a list of builds will be shown on the right when you click a specific platform.
Flags: needinfo?(surkov.alexander)
I was finally able to give this a spin with something like Wikipedia, Gmail, our IRCCloud etc. The results are that I don't see much of a difference compared to a regular nightly from around the same time. Yes, a conversation in Gmail might open a bit faster, but returning from there to the inbox is just as slow. I am seeing far better results with the caching we're now doing in bug 1416986 and bug 1417327 on top of that.
(In reply to Marco Zehe (:MarcoZ) from comment #10)
> I was finally able to give this a spin with something like Wikipedia, Gmail,
> our IRCCloud etc. The results are that I don't see much of a difference
> compared to a regular nightly from around the same time. Yes, a conversation
> in Gmail might open a bit faster, but returning from there to the inbox is
> just as slow. I am seeing far better results with the caching we're now
> doing in bug 1416986 and bug 1417327 on top of that.

thanks, Marco. So fetching info by AT is a bigger bottleneck than a tree size. Btw, did you see broken things with the patch? How they were noticeable/bad?

The cut-them-all approach is the easiest and the most performant one. As soon as we do more sophisticated rules, then we have to support more sophisticated tree updates, so that for every DOM insertion/removal, we may need to inspect its children or parents, which goes at cost for sure. So we have to be careful: slower tree updates vs less info to expose.

Yura, did you have a chance to mine more data?
Flags: needinfo?(jteh) → needinfo?(yzenevich)
The spot checks I did, Twitter, Gmail, IRCCloud, Slack, still looked OK in the buffer, but these were nothing more than that. I'd say since this didn't really improve performance that much overall, adding tree rules would slow things down further, mittigating the slight advantage the reduction of the tree size might give us. But this approach to fully cut the divs out might introduce regressions we cannot foresee. Given that the caching stuff Jamie is working on is proving to yield very good results, we might even consider WONTFIXing this bug again. But let's see what Yura says.
I used the following query:

[...document.querySelectorAll(
  "div:not([role]):not([title]):not([aria-label]):not([aria-labelledby]):not([aria-describedby])")]
    .filter(div => !div.id || 
            document.querySelectorAll(
              `:-moz-any([aria-activedescendant~="${div.id}"], [aria-controls~="${div.id}"], [aria-describedby~="${div.id}"], [aria-details~="${div.id}"], [aria-errormessage~="${div.id}"], [aria-flowto~="${div.id}"], [aria-labelledby~="${div.id}"], [aria-owns~="${div.id}"])`).length === 0).length

/

document.getElementsByTagName("div").length

to calculate a rough ratio of divs on the page that probably do not have any semantics attached to it for several high traffic properties (gdocs, gdrive, gmail, twitter facebook) and the ratio of these divs vs all divs is as follows:

Filtered, Unfiltered, Ratio
GDocs: 1207, 1630, 0.7404907975460123
GDrive: 861, 1062, 0.8107344632768362
Google: 192, 196, 0.9795918367346939
Baidu: 21, 21, 1
Twitter after a couple scrolls: 3278, 4062, 0.8069916297390448
Gmail: 2833, 3556, 0.797
Facebook: 1486, 1582, 0.939
Youtube: 948, 962, 0.985
Amazon: 684, 731, 0.9357045143638851
Reddit: 362, 498, 0.7269076305220884
Yahoo: 1199, 1245, 0.9630522088353414
Wikipedia main page: 100, 116, 0.8620689655172413
Flags: needinfo?(yzenevich)
So Yura findings suggests that most of DIVs have no semantics (I realize that some with implicit semantics are missed in the report, but they don't make big number I believe).

I'm curious what other browsers do. Ideally we should end up building same trees between all browsers. So if other browser don't expose semantci-less DIVs, then neither we should do.
Some analysis with Chrome.

Test cases:

C1. <div>foo</div>
Exposes the div.

C2. <div><div>foo</div></div>
Exposes one of the divs.

C3. <div><div id="b">foo</div></div>
Exposes only div#b.

C4. <div id="a"><div id="b">foo</div></div>
Exposes both divs.

C5. <div class="a"><div class="b">foo</div></div>
Exposes only div.b.

C6. <div><div>foo</div><div>bar</div></div>
Exposes the two inner divs.

C7. <div>foo<div>bar</div>baz</div>
Exposes the inner div only. (foo and baz are text on the document.)

C8. <div>foo<div><div>bar</div>baz</div></div>
Exposes only the inner most div containing "b". ("a" and "c" are text on the document.)

Observations:

O1. As per C4, divs with an id are considered to have semantic value and are always exposed. (I believe this is so that <a name="#id"> works correctly.) 
O2. But as per C5, divs with a class are not treated specially.
O3. Where there are nested divs, only the inner most is exposed. See C2, C3, C5, C6, C7, C8.
Thanks, Jamie. So we could try then to cut out all divs but those who has text. 

We have more complicated rule about IDed elements (for example spans): we don't expose them if no one refers to it. We might want to keep it simpler, then would reduce our complexity.
here's one more sample (exclude divs containing non whitespace text):

(function(coll) {
  let cnt = 0;
  for (let d of coll) {
    if (d.firstChild && /\S/.test(d.firstChild.wholeText)) { cnt++; }
  }
  return cnt;
})
(coll)

returns 1907 on gmail on Yura's collection, divs total: 3789, total elements: 7797. The ratio of semanticless  DIV's is high and about 25% in gmail case.
With improvements in NVDA 2018.4, and in combination with a patch to only fire sync text changes for live regions that Jamie is working on, I am revisiting this and have made some great progress in thinning out the tree, and I believe this can be performant. This work will most likely continue well into the new year.
Assignee: nobody → mzehe
Status: REOPENED → ASSIGNED
Current work in this try build, test failures still expected: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ff9e462dd234bfa732eb6ce05f59e6e2e5d8039e

Relevant divs are:

  • Those that have an ID attribute. This is important so anchors still work.
  • Those whose first or last child is a text-only node.
  • Those whose first or last child has an inline frame.
  • Those whose descendants contain an img tag.
Attachment #9038490 - Attachment description: Bug 646216 - Thin out the tree by only creating accessibles for relevant divs, r=Jamie → Bug 646216 - Thin out the tree by only creating accessibles for relevant divs, r=Jamie,timdream
Pushed by jteh@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5440a333cf4b
Thin out the tree by only creating accessibles for relevant divs, r=Jamie,timdream
Pushed by mzehe@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dbb019cc077c
Thin out the tree by only creating accessibles for relevant divs, r=Jamie,timdream

Landed with fixed test. The fix was the same as for the similarly named mochitest.

Flags: needinfo?(mzehe)
Blocks: 1523513
Status: ASSIGNED → RESOLVED
Closed: 4 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Depends on: 1523931

I think there's a problem with the taken approach, which is you handle tree updates properly, i.e. you don't rebuilt a tree if a reduced tree is altered. Say, if you have denied to create an accessible for a div in <h1><div></div></h1>, and then JS inserts a <span>hey</span> as a first child of <h1>, according to logic in MARKUPMAP(div), div has to be accessible now, but it doens't happen.

Any suggestions where to handle this, Alex?

(In reply to Marco Zehe (:MarcoZ) from comment #28)

Any suggestions where to handle this, Alex?

I agreed that cutting the number of elements in accessible tree looks very tempting, but it goes at cost of the tree update, of course if we do this. So, if update the tree part can be skipped, i.e. no one is regressed in the wilds (I guess you can be certain on this when the patch is shipped on release build), then we are in good shape I'd say. Otherwise, we would need to implement update the tree part, however it will go at performance cost. Thus it'd be good to improve our talos coverage.

Anyway, I'd say this is something that is good to work on with other browsers vendors to make sure the accessible tree we expose is consistent though the browsers. Maybe HTML-AAM can be a good platform to keep discussion on.

Cc'ing Aaron

You need to log in before you can comment on or make changes to this bug.