Closed Bug 538594 (a11yperfcoalesce) Opened 15 years ago Closed 14 years ago

Improve event filtering (coalescence) performance

Categories

(Core :: Disability Access APIs, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: davidb, Assigned: davidb)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 2 obsolete files)

Attachment #420737 - Attachment is patch: true
Attachment #420737 - Attachment mime type: application/octet-stream → text/plain
This is basically backing out the core part of the fix for bug 513213, no?
(In reply to comment #1)
> This is basically backing out the core part of the fix for bug 513213, no?

Yes. And it sounds not very correct.
David, is this bug different from bug 522847?
This bug is specifically about experimenting with backing out the core part of the fix for bug 513213 yes. I didn't want to clutter bug 522847 (yet).

I ran this patch against our tests locally (OSX). It always runs green on test_coalescence when that test is run alone.

In a run of our a11y tests suite I got 3 errors in test_coalescence.
On a subsequent suite run I got 2 errors in test_coalescence.

(Note fragility of these tests: bug 537936)
(In reply to comment #4)
> This bug is specifically about experimenting with backing out the core part of
> the fix for bug 513213 yes. I didn't want to clutter bug 522847 (yet).
> 
> I ran this patch against our tests locally (OSX). It always runs green on
> test_coalescence when that test is run alone.

That's interesting and that should mean our testsuite is not perfect :)

There is description of the testcase in bug 513213 comment #0. The current way allows to decrease reorder events count.
One additional note: I think it's not enough to traverse an array once to coalesce all events.
Blocks: 522847
No longer blocks: 538530
Comment on attachment 420737 [details] [diff] [review]
coalesce less frequently

Given the danger of accessibility hurting more than helping (see bug 538530) I think we should take this patch for now. It reverts the perf regression BZ points out in bug 513213. Given bug 522847 comment 30, I think we can follow up with a better approach once the event code cleanup has happened. What do you think?
Attachment #420737 - Flags: review?(surkov.alexander)
Attachment #420737 - Flags: review?(ginn.chen)
(In reply to comment #7)
> (From update of attachment 420737 [details] [diff] [review])
> Given the danger of accessibility hurting more than helping (see bug 538530) I
> think we should take this patch for now. It reverts the perf regression BZ
> points out in bug 513213. Given bug 522847 comment 30, I think we can follow up
> with a better approach once the event code cleanup has happened. What do you
> think?

1) I believe this is not correct coalescing. This is not a backout of the patch 513213.

2) Does it fix the bug 522847? How much is it quicker?
(In reply to comment #8)
> (In reply to comment #7)
> > (From update of attachment 420737 [details] [diff] [review] [details])
> > Given the danger of accessibility hurting more than helping (see bug 538530) I
> > think we should take this patch for now. It reverts the perf regression BZ
> > points out in bug 513213. Given bug 522847 comment 30, I think we can follow up
> > with a better approach once the event code cleanup has happened. What do you
> > think?
> 
> 1) I believe this is not correct coalescing. This is not a backout of the patch
> 513213.

Can you explain how it is not correct? It would be good to capture your thoughts here. Not a full backout - right.

> 2) Does it fix the bug 522847? How much is it quicker?

Yes. The test case takes time as follows:

no accessibility: ~5 seconds
with accessibility: at least 32 seconds... i kept getting unresponsive script dialogs.
accessibility with patch: ~6.4 seconds
(debug build)
Attachment #420737 - Flags: review?(marco.zehe)
(In reply to comment #9)

> Can you explain how it is not correct? It would be good to capture your
> thoughts here. Not a full backout - right.

It was O(N^2) before and after the patch 513213. Now it is O(N). It can't be equivalent.

> > 2) Does it fix the bug 522847? How much is it quicker?
> 
> Yes. The test case takes time as follows:

What test case? What events are fired there?
(In reply to comment #10)
> (In reply to comment #9)
> 
> > Can you explain how it is not correct? It would be good to capture your
> > thoughts here. Not a full backout - right.
> 
> It was O(N^2) before and after the patch 513213. Now it is O(N). It can't be
> equivalent.

Only the accessibility code is O(N^2) and with a large constant... the big draw is outside a11y with this patch applied.

> 
> > > 2) Does it fix the bug 522847? How much is it quicker?
> > 
> > Yes. The test case takes time as follows:
> 
> What test case? What events are fired there?

https://bugzilla.mozilla.org/attachment.cgi?id=406846
> It was O(N^2) before and after the patch 513213.

That's not quite right.

If I read the code correctly, after the patch for bug 513213 we do coalescing after every append.  The coalescing algorithm seems to me to be O(N^2) in list length at first glance.  So on every append we do O(N^2) work, for an overall O(N^3) behavior.
Ah, nevermind.  ApplyEventRules is just O(N).
At least as long as ApplyToSiblings doesn't get called...
My understanding is this patch would break Bug 513213.

For the case of Bug 522847, per Bug 522847 comment #30, it is not correct coalescing no matter Bug 513213 is fixed or not.
Backing out Bug 513213 makes the case of Bug 522847 run faster.

Per Bug 522847 comment #19, I had a patch to get rid of ApplyToSiblings (I need to search my harddisk to find the draft), but it would not help much if ApplyEventRules is O(N^2) and with the case of Bug 522847.
(In reply to comment #16)
> My understanding is this patch would break Bug 513213.
> 
> For the case of Bug 522847, per Bug 522847 comment #30, it is not correct
> coalescing no matter Bug 513213 is fixed or not.
> Backing out Bug 513213 makes the case of Bug 522847 run faster.

Ginn these statements sounds like mutual exclusive. Could you give clarification please?

> Per Bug 522847 comment #19, I had a patch to get rid of ApplyToSiblings (I need
> to search my harddisk to find the draft), but it would not help much if
> ApplyEventRules is O(N^2) and with the case of Bug 522847.

It would be great if you file this patch or its idea or finish the patch.
(In reply to comment #17)
> (In reply to comment #16)
> > My understanding is this patch would break Bug 513213.
> > 
> > For the case of Bug 522847, per Bug 522847 comment #30, it is not correct
> > coalescing no matter Bug 513213 is fixed or not.
> > Backing out Bug 513213 makes the case of Bug 522847 run faster.
> 
> Ginn these statements sounds like mutual exclusive. Could you give
> clarification please?

We have several cases of wrong coalescing.
Case 1) The case in Bug 513213.
Case 2) The case in Bug 522847.
Perhaps the result of coalescing is correct, but the calculating process is wrong.
Because the parents of removed node and the parent of the sibling node being removed are different.
We can not judge if they're siblings.
We can not judge if the node being removed is the parent of a removed node, either.
Case 3) Other cases.

The patch of Bug 513213 fixed the wrong coalescing of Case 1), but the side effect it caused a regression of performance of Case 2).
Ok. I see. Thanks. Bug 513213 makes things correct, it just covered some hidden code problems. I think I can fix it if you're fine with this.
The patch here did changed the behavior of the case of Bug 522847.

Before this patch, 5001 out of 15001 events were sent for "Run Test" button.
After this patch, 14999 out of 15001 events were sent.
(In reply to comment #20)
> The patch here did changed the behavior of the case of Bug 522847.
> 
> Before this patch, 5001 out of 15001 events were sent for "Run Test" button.
> After this patch, 14999 out of 15001 events were sent.

David, I wonder what screen reader you tried when you got numbers from comment #9?
(In reply to comment #21)
> (In reply to comment #20)
> > The patch here did changed the behavior of the case of Bug 522847.
> > 
> > Before this patch, 5001 out of 15001 events were sent for "Run Test" button.
> > After this patch, 14999 out of 15001 events were sent.
> 
> David, I wonder what screen reader you tried when you got numbers from comment
> #9?

Oh, I triggered accessibility via UIElementInspector; no screen reader involved. Hopefully Marco can give the try build a spin.

Ginn, thanks for your investigations.
(In reply to comment #22)

> Oh, I triggered accessibility via UIElementInspector; no screen reader
> involved. Hopefully Marco can give the try build a spin.

Is it necessary?
Comment on attachment 420737 [details] [diff] [review]
coalesce less frequently

Removing review requests. If we decide the perf hit outweighs the good parts of bug 513213 solution we can reconsider.
Attachment #420737 - Flags: review?(surkov.alexander)
Attachment #420737 - Flags: review?(marco.zehe)
Attachment #420737 - Flags: review?(ginn.chen)
(In reply to comment #24)
> (From update of attachment 420737 [details] [diff] [review])
> Removing review requests. If we decide the perf hit outweighs the good parts of
> bug 513213 solution we can reconsider.

I still think bug 513213 approach is right, however it covered code problems I didn't expected (and possibly no one of us). Also I think we can get back perf results without backing out bug 513213, see Ginn's comment #18.
(In reply to comment #23)
> (In reply to comment #22)
> 
> > Oh, I triggered accessibility via UIElementInspector; no screen reader
> > involved. Hopefully Marco can give the try build a spin.
> 
> Is it necessary?

Not high priority, but I'm curious about the impact of the extra events on the screen readers. Also how noticeable regressing 513213 would be especially given our test coverage for that bug sometimes fails.
(In reply to comment #26)

> Not high priority, but I'm curious about the impact of the extra events on the
> screen readers. 

It depends on what they do. Screen readers update their cache on mutation events so I think it 
the difference should be if events count is 5000 or in 3 times more. All I want to say the performance should be measured when screen reader is running.
 
> Also how noticeable regressing 513213 would be especially given
> our test coverage for that bug sometimes fails.

I didn't catch your question.
OK I think this bug is partially consumed by bug 541474.

(In reply to comment #27)
> (In reply to comment #26)
> 
> > Not high priority, but I'm curious about the impact of the extra events on the
> > screen readers. 
> 
> It depends on what they do. Screen readers update their cache on mutation
> events so I think it 
> the difference should be if events count is 5000 or in 3 times more. All I want
> to say the performance should be measured when screen reader is running.

Yes would be interesting.
Attached patch refreshed against latest trunk (obsolete) — Splinter Review
We need to fix this.

This patch will fire more events (thanks Ginn), but do a lot less processing on our end to fire them.

Marco can you test this with NVDA? I suspect there might be a performance hit if they are doing a lot of processing for the extra events that are fired.
Assignee: nobody → bolterbugz
Attachment #420737 - Attachment is obsolete: true
Attachment #442756 - Flags: feedback?(surkov.alexander)
Attachment #442756 - Flags: feedback?(marco.zehe)
Comment on attachment 442756 [details] [diff] [review]
refreshed against latest trunk

Adding bz for algorithmic performance feedback and any other advice here.

It isn't anything special, just an updated version of the last patch.

I'll add a link to a trybuild when it finishes.
Attachment #442756 - Flags: feedback?(bzbarsky)
Comment 7 is probably a good summary.
David, please describe your patch. Please point why we fire the same number of events.
(In reply to comment #29)
> Created an attachment (id=442756) [details]
> refreshed against latest trunk
> 
> We need to fix this.
> 
> This patch will fire more events (thanks Ginn), but do a lot less processing on
> our end to fire them.

I don't think firing more events is an acceptable solution is an acceptable solution.
We should find a way to do coalescence correctly and efficiently.
(In reply to comment #34)

> I don't think firing more events is an acceptable solution is an acceptable
> solution.
> We should find a way to do coalescence correctly and efficiently.

Thanks, Ginn. I share your point.
I tried the try-server build, and other than a little less tendency to lock up on the testcase in bug 522847, I didn't notice much difference.
(In reply to comment #35)
> (In reply to comment #34)
> 
> > I don't think firing more events is an acceptable solution is an acceptable
> > solution.
> > We should find a way to do coalescence correctly and efficiently.
> 
> Thanks, Ginn. I share your point.

I agree philosophically. Right now there are web pages that are unusable by anyone when accessibility is enabled because we do too much computation (every time we queue an event). It is just algorithmically too expensive.

This patch simply moves the coalescence computation to the time when we are ready to fire queued events.

(In reply to comment #36)
> I tried the try-server build, and other than a little less tendency to lock up
> on the testcase in bug 522847, I didn't notice much difference.

Thanks Marco, that's encouraging. Did you notice any speedup or sluggishness on other pages?
> We should find a way to do coalescence correctly and efficiently.

OK.  Can you describe your coalescing requirements?  I would expect that we can coalesce efficiently if we use a data structure actually designed for the coalescing usecase instead of just building on top of a completely unstructured array.
(In reply to comment #33)
> David, please describe your patch. Please point why we fire the same number of
> events.

We almost certainly fire more events with this patch - and note I agree with comment 6.
(In reply to comment #37)
> (In reply to comment #35)
> > (In reply to comment #34)
> > 
> > > I don't think firing more events is an acceptable solution is an acceptable
> > > solution.
> > > We should find a way to do coalescence correctly and efficiently.
> > 
> > Thanks, Ginn. I share your point.
> 
> I agree philosophically. Right now there are web pages that are unusable by
> anyone when accessibility is enabled because we do too much computation (every
> time we queue an event). It is just algorithmically too expensive.

It's not philosophical question. More events means more screen reader processing. We should realize what event dupes are happen and how they affect on screen readers before considering the patch taking.
 
> This patch simply moves the coalescence computation to the time when we are
> ready to fire queued events.

No. Not only. You fire more events. Note, in general moving coalecsence after delay doesn't mean making the browser quicker.
(In reply to comment #38)
> > We should find a way to do coalescence correctly and efficiently.
> 
> OK.  Can you describe your coalescing requirements?  I would expect that we can
> coalesce efficiently if we use a data structure actually designed for the
> coalescing usecase instead of just building on top of a completely unstructured
> array.

I'm aware of two main cases:

When a tree is removed from the DOM, we'd prefer to fire one event (for the root) instead of events for all nodes.

We don't want to fire duplicate events (for the same document).
Comment on attachment 442756 [details] [diff] [review]
refreshed against latest trunk

Killing this patch.

(In reply to comment #40)
> (In reply to comment #37)
> > (In reply to comment #35)
> > > (In reply to comment #34)
> > > 
> > > > I don't think firing more events is an acceptable solution is an acceptable
> > > > solution.
> > > > We should find a way to do coalescence correctly and efficiently.
> > > 
> > > Thanks, Ginn. I share your point.
> > 
> > I agree philosophically. Right now there are web pages that are unusable by
> > anyone when accessibility is enabled because we do too much computation (every
> > time we queue an event). It is just algorithmically too expensive.
> 
> It's not philosophical question. More events means more screen reader
> processing. We should realize what event dupes are happen and how they affect
> on screen readers before considering the patch taking.

Yep. See comment 39 :)  I don't want to regress anyone.

> 
> > This patch simply moves the coalescence computation to the time when we are
> > ready to fire queued events.
> 
> No. Not only. You fire more events. Note, in general moving coalecsence after
> delay doesn't mean making the browser quicker.

Right. I think HasRelatedContent/FindNeighbourPointingToNode is very expensive.
Attachment #442756 - Attachment is obsolete: true
Attachment #442756 - Flags: feedback?(surkov.alexander)
Attachment #442756 - Flags: feedback?(marco.zehe)
Attachment #442756 - Flags: feedback?(bzbarsky)
(In reply to comment #37)
> Thanks Marco, that's encouraging. Did you notice any speedup or sluggishness on
> other pages?

Not noticeably. But NVDA is usually pretty good at filtering out events it doesn't need. Or I visited the wrong pages. :)
(In reply to comment #42)
> Right. I think HasRelatedContent/FindNeighbourPointingToNode is very expensive.

Filed bug 563331.
(In reply to comment #42)

> Right. I think HasRelatedContent/FindNeighbourPointingToNode is very expensive.

So, it makes sense to move to make this fixed. And do not move following approach you suggest in your patch.

My primary idea to get bz's perf testcase fixed is to make quick comparison of the nodes in tree. I don't think we'll be allowed to hack DOM tree to achieve this, therefore I want to switch a11y logic to accessible tree, where we have more freedom. That's not quick process but I think it's the best what we can do.
> I don't think we'll be allowed to hack DOM tree to achieve this

Why not?  What do you need the DOM to provide that it doesn't provide right now?  In general, _please_ talk to me or sicking or just or smaug or peterv or ideally all of us about any issues you have with making your code fast due to DOM limitations.  The worst that will happen is that we'll actually tell you that you have to solve the problem yourself.

> I think HasRelatedContent/FindNeighbourPointingToNode is very expensive.

They may be, but in the testcases I've profiled (e.g. bug 522847) they are NOT what takes the time.

> When a tree is removed from the DOM, we'd prefer to fire one event (for the
> root) instead of events for all nodes.

Why do you end up with events for all nodes here in the first place?  I'd think a subtree removal would just post one event and be done with it, no?

> We don't want to fire duplicate events (for the same document).

What does that mean?  Duplicate in what sense?
(In reply to comment #46)
> > I don't think we'll be allowed to hack DOM tree to achieve this
> 
> Why not?  What do you need the DOM to provide that it doesn't provide right
> now?  In general, _please_ talk to me or sicking or just or smaug or peterv or
> ideally all of us about any issues you have with making your code fast due to
> DOM limitations.  The worst that will happen is that we'll actually tell you
> that you have to solve the problem yourself.

The idea is to map the tree into ordered set. So that I can say when I have two nodes, who contains who or nodes are in different subtrees. The DOM tree has methods to find this but they arnent' optimal since they traverse the DOM to find the info. I need to spend additional memory to store the order of DOM node in the node tree set. I need to play with different algorithms of this idea but any way I need to reserve more than two numbers for each node.
(In reply to comment #46)

> > We don't want to fire duplicate events (for the same document).
> 
> What does that mean?  Duplicate in what sense?

Dupes are either for same DOM node, or for the same subtree. For example, if child and parent hidden then we want to fire hide event for parent only.
> they arnent' optimal since they traverse the DOM to find the info

While true, the exact traversals a11y does right now are at least an order of magnitude slower than they should be due to its use of nsIDOM* (bug 541618 seems to maybe cover this).

And if you don't need to find the info umpteen bajillion times, the nsINode methods would be plenty fast.  My point is that we should change this code to not require umpteen bajillion DOM traversals.

If it turns out we need faster "is this node a descendant of this other node" than walking the GetNodeParent() chain provides, then we can certainly look into adding one.  But all sorts of layout code gets away with not having it, and coalesces various things just fine.

> For example, if child and parent hidden then we want to fire hide event for
> parent only.

Maybe I wasn't clear.  I'd like an exhaustive list of cases we want to coalesce, so that we can usefully think about the right data structure for doing the coalescing...

So far I know about:

1)  A hide on nodes A and B should be coalesced if A is an ancestor of B or B
    itself.
2)  Whatever the "tree removed from the DOM" thing is; not clear to me so far.

What else do we need to coalesce?
(In reply to comment #49)
> > they arnent' optimal since they traverse the DOM to find the info
> 
> While true, the exact traversals a11y does right now are at least an order of
> magnitude slower than they should be due to its use of nsIDOM* (bug 541618
> seems to maybe cover this).

Right. It should be fixed there. I hope to come with the working fix in few weeks.

> And if you don't need to find the info umpteen bajillion times, the nsINode
> methods would be plenty fast.  My point is that we should change this code to
> not require umpteen bajillion DOM traversals.
> 
> If it turns out we need faster "is this node a descendant of this other node"
> than walking the GetNodeParent() chain provides, then we can certainly look
> into adding one.  But all sorts of layout code gets away with not having it,
> and coalesces various things just fine.

Ok. I see your point.

> > For example, if child and parent hidden then we want to fire hide event for
> > parent only.
> 
> Maybe I wasn't clear.  I'd like an exhaustive list of cases we want to
> coalesce, so that we can usefully think about the right data structure for
> doing the coalescing...
> 
> So far I know about:
> 
> 1)  A hide on nodes A and B should be coalesced if A is an ancestor of B or B
>     itself.
> 2)  Whatever the "tree removed from the DOM" thing is; not clear to me so far.
> 
> What else do we need to coalesce?

Technically we checked show/hide events for A and B nodes if one of them contains another once to coalecse. Bug 541474 was an attempt to make coalescence smarter. The patch contains comments of cases we expect (http://mxr.mozilla.org/mozilla-central/source/accessible/src/base/nsEventShell.cpp#259). Also we have to coalecse reorder events (it's fired on container when its children are shown or hidden. We don't do robust coalescence in this case yet.
(In reply to comment #49)
> 2)  Whatever the "tree removed from the DOM" thing is; not clear to me so far.

Sorry bogus case - it was meant to be illustrative.
Hmm I'm running some tests (with the call to coalesce commented out) and I'm seeing a lot less events than I recall seeing a year ago. Perhaps changes in /content?
Assignee: bolterbugz → nobody
OK.  So if I read that right, events are ordered in time and:

1)  A "show" event on a node should suppress all earlier "show" events on
    descendants. (But not sure what the two XXX comments there mean.)
2)  A "hide" event on a node should suppress all earlier "hide" events on
    descendants.
3)  A reorder event on a node should suppress all earlier events on
    descendants.

Is that correct?  If not, what's the correct list of things we want to suppress?

If the above is correct, I assume visibility changes do not trigger show/hide events?

Where does ApplyToSiblings come in?

Other questions: are access nodes created eagerly for all nodes, or lazily when someone asks for them?   If the latter, are show/hide events supposed to force access node creation?
Assignee: nobody → bolterbugz
David, it depends on when and how you fire these events.  Depending on that, you may simply be seeing the effects of lazy frame construction.
(In reply to comment #54)
> OK.  So if I read that right, events are ordered in time and:
> 
> 1)  A "show" event on a node should suppress all earlier "show" events on
>     descendants. (But not sure what the two XXX comments there mean.)
> 2)  A "hide" event on a node should suppress all earlier "hide" events on
>     descendants.
> 3)  A reorder event on a node should suppress all earlier events on
>     descendants.
> 
> Is that correct?

That looks correct.
 
> If the above is correct, I assume visibility changes do not trigger show/hide
> events?

Changes to the display, visibility styles, and adding/removing nodes can all cause us to fire show hide events.

> Where does ApplyToSiblings come in?

Alexander can answer this best, but I think this is for updating the current state of similar events for all sibling nodes. So if a decision has been made to not emit the event, set that state on all similar events for sibling nodes.

> Other questions: are access nodes created eagerly for all nodes, or lazily when
> someone asks for them?

They are created (and cached) when asked for.

> If the latter, are show/hide events supposed to force
> access node creation?

In nsDocAccessible::ProcessPendingEvent we call the GetAccessible method of the event so we've seen a need to do that ourselves unfortunately. This will cause the creation of an access node (if not already in our cache).
blocking2.0: --- → ?
> Changes to the display, visibility styles, and adding/removing nodes can all
> cause us to fire show hide events.

So if a node changes to "visibility: hidden" we fire a hide event on it?  How does that affect still-visible subtrees?

> In nsDocAccessible::ProcessPendingEvent we call the GetAccessible method of the
> event so we've seen a need to do that ourselves unfortunately.

OK.  So we force creation of an access node for anything we plan to fire an event on.  Gotcha.
I found some history of why the coalescence rules were introduced over in bug 418845. (A lot of this pre-dates my core involvement in gecko a11y)
(In reply to comment #57)
> > Changes to the display, visibility styles, and adding/removing nodes can all
> > cause us to fire show hide events.
> 
> So if a node changes to "visibility: hidden" we fire a hide event on it?  How
> does that affect still-visible subtrees?

Not sure of the question. I recall a lot of churn if for example the node that went invisible is say a row in a large table. I need to do some profiling and debugging to refresh myself on particulars.
Note that my question in comment 57 is not a performance question.  It's a "what events are dispatched, and what's needed for correctness?" question.
(In reply to comment #57)
> > Changes to the display, visibility styles, and adding/removing nodes can all
> > cause us to fire show hide events.
> 
> So if a node changes to "visibility: hidden" we fire a hide event on it?  How
> does that affect still-visible subtrees?

Sorry I mis-read "subtrees" last time. I believe we currently don't worry about this case, but probably we should. I'm not sure what the correct events would be in this case TBH.
Here's some Windows AT perspective on this. My apologies if this information is completely irrelevant or already known.

There are two ways that accessible events can be handled on Windows: in-context (where we inject code into the remote process) and out-of-context (where the events must be sent via IPC). NVDA does both. I believe a lot of Windows screen readers are now going almost completely in-process, so they probably don't handle events out-of-context. There are advantages and disadvantages to in-process injection, but that's beyond the scope of this discussion.

In-context events (which we use to render our own representation of the document and to handle live regions) are dispatched synchronously. That is, because we're in the same thread, we get the event as it is fired, we do our work (blocking the process) and then return. In order to speed things up, we don't actually do much processing in the events. Instead, we record which tree was updated and then do the update in a timer (currently 250ms). We figure this slight delay in updates isn't going to be noticed most of the time.

Out-of-context events are dispatched asynchronously via IPC. NVDA uses these for the main part of the screen reader; e.g. handling changes to the focused object. NVDA only listens to events it actually needs, but unfortunately, "show" is one of these, which we need for tooltips, etc. We have code to limit the events we actually process; eliminating duplicates, events for objects we aren't interested in, etc. However, this doesn't change the fact that the event has already been fired, thus incurring the system-wide performance hit of IPC and context switching. In addition, the AT process is "spammed" with events if there are a lot of them, most of which will probably be dropped. This is why coalescing is a good thing in theory.

Note that, as I understand it, in-context handling of events is not possible on platforms other than Windows, which probably makes coalescing even more important there.

Having said that, if the coalescing algorithm is actually slower than just dispatching all of the events anyway (which would seem to be the case sometimes), it would seem better to not bother with coalescing.
(In reply to comment #52)
> Hmm I'm running some tests (with the call to coalesce commented out) and I'm
> seeing a lot less events than I recall seeing a year ago. Perhaps changes in
> /content?

Probably, or probably layout or bug 541474 which makes coalescence smarter.

(In reply to comment #56)
> (In reply to comment #54)
> > OK.  So if I read that right, events are ordered in time and:
> > 
> > 1)  A "show" event on a node should suppress all earlier "show" events on
> >     descendants. (But not sure what the two XXX comments there mean.)
> > 2)  A "hide" event on a node should suppress all earlier "hide" events on
> >     descendants.
> > 3)  A reorder event on a node should suppress all earlier events on
> >     descendants.
> > 
> > Is that correct?
> That looks correct.

That's optimized algorithm. In general show/hide/reorder event targets aren't allowed to contain each other. For example, if A node was appended to B node and then B was appended to the DOM document then we need show event for B node only. That's we do because we don't get notifications when node is not within the document. However it's not true for show events from layout changes, earlier show event target can be contained by older show event target. The XXX comments are about this issue. Refer to InvalidateSubtreeFor (http://mxr.mozilla.org/mozilla-central/ident?i=InvalidateSubtreeFor) to see where layout notifies us about changes.

So if we are sure some cases aren't possible then we should ignore them and get more qucik algorithm, but theoretically we should address all cases. So while your statement seems to be true for show/hide events, it's not valid for reorder events. Reorder events can contain each other which doesn't depend on their order. For example let's A contains B that contains C: "earlied reorder contains older reorder when C is appended to B (reorder on B), B is removed from A (reorder on A); oldest reorder contains earlier reorder when B is shown (reorder on A) and C is appended (reorder on B).

> > If the above is correct, I assume visibility changes do not trigger show/hide
> > events?
> Changes to the display, visibility styles, and adding/removing nodes can all
> cause us to fire show hide events.

Visibility changes are result of show/hide events, however it might be not very correct since visibilty hidden means element isn't shown but a place on the screen is reserved for it. Rather I think we should fire visible state change events and they should be coalesced as well.

> > Where does ApplyToSiblings come in?
> Alexander can answer this best, but I think this is for updating the current
> state of similar events for all sibling nodes. So if a decision has been made
> to not emit the event, set that state on all similar events for sibling nodes.

Yep, that was an attempt to fix performance issue when hundreds of sibling nodes are removed from the parent.

(In reply to comment #57)
> > Changes to the display, visibility styles, and adding/removing nodes can all
> > cause us to fire show hide events.
> So if a node changes to "visibility: hidden" we fire a hide event on it?  How
> does that affect still-visible subtrees?

Yes, we do. I'm sure we process it the same way as we do for display and DOM operations. And that sounds wrong. So we need to deal with accessibility state change events rather than with mutation events as I said above.

> > In nsDocAccessible::ProcessPendingEvent we call the GetAccessible method of the
> > event so we've seen a need to do that ourselves unfortunately.
> OK.  So we force creation of an access node for anything we plan to fire an
> event on.  Gotcha.

Not exactly. We force accessible creation only for show events. We don't do this for hide events.
> For example, if A node was appended to B node and then B was appended to the
> DOM document then we need show event for B node only.

OK.  Does the order of the show events matter?  If we don't include visibility changes in show/hide, so that a hide guarantees the whole subtree is hidden while a show guarantees that the whole subtree is visible, then as long as a hide and a show happen on different nodes the order seems to not matter, right?  If they happen on the same node, the later one wins, of course.

Should events of different types get coalesced with each other (e.g. can a reorder cause shows or hides to be irrelevant)?  Or does the coalescing only need to happen within each type?  If the latter, does relative order of two events of different types (e.g. a show and a reorder) need to be preserved in general?
(In reply to comment #64)
> > For example, if A node was appended to B node and then B was appended to the
> > DOM document then we need show event for B node only.
> OK.  Does the order of the show events matter?  If we don't include visibility
> changes in show/hide, so that a hide guarantees the whole subtree is hidden
> while a show guarantees that the whole subtree is visible, then as long as a
> hide and a show happen on different nodes the order seems to not matter, right?

Yes, mostly the order doesn't matter at least while the events are of the same type. However we would like to keep original ordering. For example, AT would expect menupopup show event before focus events on menuitems if this example makes sense here.

>  If they happen on the same node, the later one wins, of course.
> Should events of different types get coalesced with each other (e.g. can a
> reorder cause shows or hides to be irrelevant)?  Or does the coalescing only
> need to happen within each type?  If the latter, does relative order of two
> events of different types (e.g. a show and a reorder) need to be preserved in
> general?

Yes, events of different types might be coalesced, for example I'm sure we don't want any events (like state change events) for the hidden subtree if they occured before.
OK.  Can we get our ideal set of coalescing rules written down, then?  And identify which orderings we want to preserve (show before focus?) and which we don't care about (different shows or hides)?
(In reply to comment #66)
> OK.  Can we get our ideal set of coalescing rules written down, then?  And
> identify which orderings we want to preserve (show before focus?) and which we
> don't care about (different shows or hides)?

We've begun.
https://wiki.mozilla.org/Accessibility/EventCoalescing
(In reply to comment #67)
> (In reply to comment #66)
> > OK.  Can we get our ideal set of coalescing rules written down, then?  And
> > identify which orderings we want to preserve (show before focus?) and which we
> > don't care about (different shows or hides)?
> We've begun.
> https://wiki.mozilla.org/Accessibility/EventCoalescing

I'm not sure we could write down ideal coalescing rules since there is no specification to follow, currently we don't fire all events we should, some events are missed in some cases and sometimes we fire events in wrong order. That should be very very big work trying to orginize all events.

Now we could follow simple rule and I believe it should be good achievement.

1. Do not fire any events from hidden subtree.
2. Do not fire show/reorder events inside a shown/reordered subtree.

That's what should we have. Algorithm must be a bit more complex. For example, it doesn't make sense to coalesce two hide events if newer event target is child of older event target because this case shouldn't happen in real life.
(In reply to comment #68)
> (In reply to comment #67)
> > (In reply to comment #66)
> > > OK.  Can we get our ideal set of coalescing rules written down, then?  And
> > > identify which orderings we want to preserve (show before focus?) and which we
> > > don't care about (different shows or hides)?
> > We've begun.
> > https://wiki.mozilla.org/Accessibility/EventCoalescing
> 
> I'm not sure we could write down ideal coalescing rules since there is no
> specification to follow, currently we don't fire all events we should, some
> events are missed in some cases and sometimes we fire events in wrong order.
> That should be very very big work trying to orginize all events.

I think we should brain dump anything we can think of - perhaps at the bottom of the wiki. Capturing even edge case ideas is valuable I think.

> Now we could follow simple rule and I believe it should be good achievement.
> 
> 1. Do not fire any events from hidden subtree.
> 2. Do not fire show/reorder events inside a shown/reordered subtree.

OK this is captured in the Ancestral Priority section of the wiki. I removed the section specific to reorder know that I am clearer we are thinking alike. Thanks.
> I'm not sure we could write down ideal coalescing rules 

Can we write down the ones we are trying to implement or may likely try to implement in the near future?

> For example, it doesn't make sense to coalesce two hide events if newer event
> target is child of older event target because this case shouldn't happen in
> real life.

It can happen right now, if you use "hide" for visibility:hidden stuff.

The EventCoalescing wiki page sounds good.  Let me know once you guys agree amongst yourselves on it?
Ping?  Is the wiki page at a point where we can try to design based on it?
Blocks: 565452
(In reply to comment #71)
> Ping?  Is the wiki page at a point where we can try to design based on it?

Not exactly I think. As I said in comment #68 I think it's ok to keep in mind two general rules:

1. Do not fire any events from hidden subtree.
2. Do not fire show/reorder events inside a shown/reordered subtree.

Of course algorithm for these rules is more complicated. Now we drop some cases that won't ever happen. For example, we know we can't have newer hide event target that is contained by older hide event target (at least I didn't get this rule broken in real life) what allows us to avoid target comparisons. However the similar rule for show event isn't true "older show event target can't be contained by newer show event target" (I think the rule should make sense but I saw its broken on web pages).

Therefore I think we need some help to figure out possible cases: when show/hide events can contain show/hide events and when they can't in order to make further algorithm improvement.

Also the current algorithm searches for siblings targets to apply rules to them since it allows to handle children adding or removal much faster. That still makes sense until we have quick nodes comparison.

Also I can think of one more smarter coalescence. Now we coalesce reorder and related show/hide events separately, we could try to do this the same time. For example, if older hide event target is contained by newer hide event target then older reorder event target is contained by newer reorder event target and we don't need to compare reorder targets in this case. I think this improvement should really help, since event filtering might be faster in two times since every show/hide event has related reorder event.

In general I think we have two options
1. Improve algorithm to make it more faster (add new cases to drop unneeded comparisons) and as result make it more complicated.
2) Implement fast tree nodes comparison.

Probably we need to do both.
No longer blocks: 565452
> Not exactly I think.

OK, can we fix that please?

> Therefore I think we need some help to figure out possible cases: when
> show/hide events can contain show/hide events and when they can't in order to
> make further algorithm improvement.

Fine.  Are you expecting that help from me?  In that case you need to describe (or point me to existing documentation about) what the set of events involved is (show, hide, reorder, anything else?) and what changes in the DOM or CSS box model trigger those events.  You also need to explain whether the order of the events (currently represented by the order in the array) is important; presumably it is.

> That still makes sense until we have quick nodes comparison.

Not necessarily, depending on how the whole thing is set up.  Again, I'd like to understand what we're trying to accomplish before working on mechanism.

> Now we coalesce reorder and related show/hide events separately, we could try
> to do this the same time.

Yes, of course.  You want all the coalescing to happen in a single pass.

> since every show/hide event has related reorder event.

This comes back to the "explain to me exactly what events are generated when".

> 1. Improve algorithm to make it more faster (add new cases to drop unneeded
> comparisons) and as result make it more complicated.

That last doesn't necesarily follow.

> Probably we need to do both.

Quite likely.

So can we please get the "what events happen when, and what things do we want to coalesce" data in that wiki, or point me to existing documentation?  I sort of feel like I'm trying to pull teeth here...
(In reply to comment #73)
> > Not exactly I think.
> 
> OK, can we fix that please?
> 
> > Therefore I think we need some help to figure out possible cases: when
> > show/hide events can contain show/hide events and when they can't in order to
> > make further algorithm improvement.
> 
> Fine.  Are you expecting that help from me?

That would be great. Either I or David will put information we have into wiki these days so you can get more clear overview what we need to have. Thank you.
That would be very helpful.  Looking forward to it!
I'm removing myself as the bug owner since my availability is compromised for the next week or two. Probably bz could own this?

We'll add what we know to the wiki; but I don't think we can be exhaustive since we are always learning more details about what AT expect at the other end. Choosing the data structure or solution is going to be a judgment call on what we know today and I think in this case talking about the possible solutions may trigger ideas about we might coalesce.
Assignee: bolterbugz → nobody
OK.  I'd settle for a non-exhaustive list of things you _know_ you need to coalesce and whatever things you think you might want to in the future.
not blocking on this without a concise argument for blocking the release.
blocking2.0: ? → -
Here's my argument:

On machines with tablet drivers or other MSAA-using things (including some AV software), this has a disastrous effect on performance of dynamic page modifications.  The second test in the recent hacks.m.o post about lazy frame construction is a good example: it takes ~400 ms with a11y forcibly disabled, and I gave up after 90 seconds of locked browser the last time I tried with it on.  People have a11y triggered by all sorts of random things on Windows machines these days, unfortunately, and for them it's like driving with the emergency brake partially on.  (Forcing it disabled was a revelation for me, and this bug has been the recurring theme.)

Sorry that is not more concise!

(http://hacks.mozilla.org/2010/05/better-performance-with-lazy-frame-construction/)
blocking2.0: - → ?
blocking2.0: ? → final+
This patch passes tests locally (pushed to try) and reduces the first mozilla hacks testcase from 61 seconds to 17 seconds (on my Win7 machine).

Note: I did this kind of throttling in GOK (Linux accessibility client) because processing all events was too much at times and GOK became noticeably less responsive. Normally I think this throttling should happen in the AT, but it is interesting to explore doing in the a11y server (gecko); especially since it keeps 'n' low enough that our coalescence complexity doesn't kill us.

Feedback welcome.
Assignee: nobody → bolterbugz
BTW 600 is just a magic number (maybe it could be a pref)
Chatted with Shaver, if we go this route we should essentially use a circular buffer.
> reduces the first mozilla hacks testcase from 61 seconds to 17 seconds

What's the number with a11y off?
(In reply to comment #83)
> > reduces the first mozilla hacks testcase from 61 seconds to 17 seconds
> 
> What's the number with a11y off?

7 seconds (trunk debug build)
Rule #1 of performance measurement: do not ever do it in debug builds if you plan to use the numbers for something.  There's various code in the tree that has different asymptotic algorithmic complexity in debug builds and opt builds.  What do all three numbers look like in an opt build?
Moz hacks, 1st example:

Win32-FF
no a11y:                   882ms (Try build)
a11y with lossy throttle: 2166ms (Try build)
a11y no throttle: I gave up after about 90 secs (Nightly build)

Note I have concern about the lossy throttle and screen reader virtual buffers getting out of sync but we could reach out on that.
OK.  So even with lossy throttle, it sounds like we need to get a lot faster no matter what (though the bottleneck may be elsewhere now, of course; profiling is needed to tell).
Depends on: 570275
Alias: a11yperfcoalesce
Blocks: a11yperf
Good news. I am marking this not blocking 2.0 because we now have a far more reasonable situation (much thanks to the sleepless Alexander, and others).

On our table mutation perf test[1], using the current nightly on Windows I tend to get numbers like:

          Time     Post processing
A11y off   570     558
A11y on    623     600
A11y on[2] 730     709

We will continue to work on performance as our top priority and expect to land more improvements in the coming weeks.

[1] https://bugzilla.mozilla.org/attachment.cgi?id=449665
[2] accprobe event monitoring to simulate common event monitoring cases: focus, show, hide, statechange, valuechange
blocking2.0: final+ → ---
> On our table mutation perf test[1]

Another data point: On the testcase in bug 522847 I see about 700ms without accessibility on and between 860ms and 1100ms with accessibility on.

20-50% slowdown is not so great, but definitely better than 100x slowdown...

How did we get here without changing the coalescing behavior?
(In reply to comment #90)

> How did we get here without changing the coalescing behavior?

Coalescence behaviour was optimized in bug 570532. Another reason I think is bug 575052 which makes text events creation fast. However I didn't measure perf for this test case so these are likely assumptions.
Ah, ok.  Sounds good.  Fwiw, we may be able to call this fixed.  It no longer seems to be a hotspot in the few profiles I just did.
Fixed by number of a11y perf bugs made for Firefox 4, bug 570500.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: