Closed Bug 418845 Opened 14 years ago Closed 14 years ago

Enabling a11y massively degrades performance of dynamic <option> addition on Linux

Categories

(Core :: Disability Access APIs, defect, P2)

x86
Linux
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: mcepl, Assigned: ginnchen+exoracle)

References

()

Details

(Keywords: perf)

Attachments

(4 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; cs-CZ; rv:1.8.1.12) Gecko/20080208 Fedora/2.0.0.12-1.fc8 Firefox/2.0.0.12
Build Identifier: Fedora Rawhide different builds of Firefox 3 beta 2 and beta3

This is the original description of the Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=431162 with further long discussion hinting towards FF[23] having problems with allocation of 6000+ items long array via Javascript. On FF3 times for downloading a page are over 16minutes.

-------

I find that firefox, galeon, and epiphany all hang when visiting the query.cgi
expert mode page of redhat bugzilla.  The 'components' select box does not
populate itself, it stays grey.  The cpu spins and the application does not
respond or redraw itself.   It can be closed (gnome recognizes not responding,
force quit dialog). It appears to be javascript related since I haven't seen
this hang on a page without ajax type content yet.

The xulrunner-1.9-0.beta2.15.nightly20080130.fc9 is effected as well as
firefox-3.0-0.beta2.12.nightly20080121.fc9.i386.

Currently seeing the issue from both x86_64 running as vmware guest and as i686
on physical hardware.

Reproducible: Always

Steps to Reproduce:
1.open https://bugzilla.redhat.com/show_bug.cgi?id=431162
2.click on Update components links right of the (one-element by default) component listbox
3.
Actual Results:  
It takes a long time -- most of the reporters' thought that acutally firefox got dead frozen, only most persistent souls got timings of 16 minutes with Firefox 3 (and 4 minutes or so with Firefox 2; both the last versions of both Fedora Rawhide builds AND Mozilla binaries).

Expected Results:  
It should take some reasonable amount of time — let’s say 1 minute max.

Happens every time for some people (probably with slightly limited hardware; the one computer where this was repeatedly reproduced has 512MB RAM).
Product: Firefox → Core
QA Contact: general → general
On my system, Firefox 2 used to take around 20 seconds to process the JS on https://bugzilla.redhat.com/query.cgi?format=simple - Firefox 3 took, well, I stopped waiting after 15 minutes ;)
ugh, that page takes forever to load.
Assignee: nobody → general
Status: UNCONFIRMED → NEW
Component: General → JavaScript Engine
Ever confirmed: true
Flags: blocking1.9?
QA Contact: general → general
Version: unspecified → Trunk
https://bugzilla.redhat.com/query.cgi?format=simple loads for me in a few seconds on my current nightly; what am I missing?
(In reply to comment #3)
> https://bugzilla.redhat.com/query.cgi?format=simple loads for me in a few
> seconds on my current nightly; what am I missing?

Nobody knows, I was not able to reproduce it either myself. My hypothesis is that you have too good computer.
(In reply to comment #4)
> Nobody knows, I was not able to reproduce it either myself. My hypothesis is
> that you have too good computer.

Whatever you and Brian have, Matej, I don't think it's so much better than a dual-core 2GHz machine that it takes 14 minutes less just because of that...
Zack:  Yes, I agree.  My MBP isn't going to cure this problem.  So I'm wondering if my nightly-build does.  Can we be specific about what versions/builds are troublesome?  What about a reduced testcase?
Keywords: perf
To clarify, my systems where this is both 100% repeatable on FF3-linux and not repeatable on FF2-linux or other browsers/platforms are:
P4 2.0G 1024Mb 400mhz rdram (physical install, i686 Fedora rawhide)
Core2 Duo 2.16G 2048Mb 667mhz ddr (vmware guest using 1024Mb ram, x86_64 Fedora rawhide)

The difference between FF3-linux taking over 5 minutes to display the unresponsive script dialog, compared to the 10 seconds to fully populate those lists with FF2-linux or FF3-osx is definitely not a hardware performance issue.  My laptop can populate this list in a Windows XP vmware guest with excellent performance on FF3 and FF2.  This situation must be caused by the FF3-linux javascript interpreter doing something differently than others because it is behaving very differently on other platforms using FF3 nightly builds on the same hardware.
Summary: all gecko-related browsers are intolerably slow on the Red Hat bugzilla page loading components array → Linux-only slowness on the Red Hat bugzilla page loading components array
If it's the JavaScript interpreter (which I doubt, but I'm an open-minded sort), then can you reproduce it if you just have the script parts, without the HTML manipulation?  Can you run oprofile or something similar to see where it's spending the time?  5 minutes for a pageload and something should show up like a supernova on a profile.

I'm wondering if we're doing something painful in the native-widget code on Linux, as we append elements and rebuild the select control.  Is there a way to disable Linux native widget code with a pref or something, to rule that out?

Do you get the same effect with a nightly tarball downloaded from ftp.mozilla.org?
I can look at oprofile, I haven't tried that yet.  You may not have seen but I watched the network traffic and all packets incoming from the server are done very quickly after the javascript begins, then for 5 minutes there is no traffic and after that the dialog appears AND the list populates without any additional traffic.  So, it is completely on the client side that this is misbehaving.

Whether its in rewriting the html or in the javascript I don't know but since the script is unresponsive (according to the dialog) I would imagine the script has failed to finish... indicating its not just a page rendering issue for instance.
No doubt it's the client side, but the unresponsive-script dialog is based on time-since-script-started, so if manipulating the select box takes a long time then it'll pop up, even if the JS interpreter isn't spending a lot of time interpreting the script proper.  So I don't think its appearance rules out it being a problem with rendering or otherwise manipulating the page's widgets.

Can you try extracting the JS and running it alone, excising the parts that manipulate the DOM?  That would tell us if this is a bug in the JS engine -- unlikely, IMO -- or somewhere else that's more likely to differ between Linux and other platforms (like widget code, such as the widget code involved in populating the components list).

mventnor: do you have any tips here on how to rule out/in the widget code?
https://bugzilla.redhat.com/attachment.cgi?id=295524 (callgrind output) from the downstream bug makes this very much not look like a JS engine issue, since the time in Spidermonkey is utterly dwarfed by the time in Gecko code.  Moving to Layout: Form Controls based on the callgrind, though I'm still not convinced that our calls into native-widget machinery are innocent. :)

Would be interesting to hear if it happens with builds from ftp.mozilla.org, though, such as Firefox 3 beta 3, to make sure that we're debugging a problem in the upstream source rather than downstream config.  (We should help downstream resolve a config problem if that's what it turns out to be, but it'll save a lot of time and headache if we can rule that in/out now.)
Assignee: general → nobody
Component: JavaScript Engine → Layout: Form Controls
QA Contact: general → layout.form-controls
I have already confirmed this occurs in upstream FF3 nightlies on both my rawhide boxes (x86_64 multilib and i686), see https://bugzilla.redhat.com/show_bug.cgi?id=431162#c40.  I'll duplicate the callgrind output on my systems and also look at oprofile.  If there are any other hints at how to locate the offending code this please share.
Ways to help narrow it down:

- Isolate widget/layout: make a page that has a static <select> list with the N-thousand entries, see if you see the same slowness
- Isolate JS: edit the page to do something other than add an option for each entry (it could just stick an entry in an array), and see if the slowness remains
Ok I have page [1] setup to test this.  Passing in the GET var c changes the number of options shown or generated [2].  The Array() test just creates c new text entries in an array, showing an alert when done.  The Options() test inserts new Options into the empty select.

[1] http://lordmorgul.net/pub/fedora/testing/ff3-bug431162.php
[2] http://lordmorgul.net/pub/fedora/testing/ff3-bug431162.php?c=6000


My results, using c=6000 (near the components in fedora's list):

FF3 i686 fedora rawhide build (3.0-0.beta3.27.nightly20080223)
   ~4sec to show page (create static select)
   ~10mins to generate the new select by JS
   <1 sec to do the Array test.

FF3 upstream build nightly (3.0b4pre 2008022404) on i686 fedora rawhide
   ~3sec to show page (create static select)
   ~10mins to generate the new select by JS
   <1 sec to do the Array test.

OSX FF2.0.0.12
   ~6sec to show page
   ~8sec to generate select by JS
   <1 sec to do array test.

OSX FF3b4pre (roughly the same as FF2), WinXP FF3b4pre does them all in seconds as well

In any case... I'd have to say this unresponsive script dialog is about 10 minutes late no matter what is causing this problem.  I don't want to wait 10 minutes to stop a script that a deliberately broken site runs no matter what it is trying to do.
Dialog can't be posted while we're off in reflow, and I suspect that the 10 mins is all off in the reflow, long after you've manipulated the Options.  If you want to discuss that further, file another bug to do so, it's not what we're looking at here.

On a hunch, I had our trusty Reed flip his config.use_system_prefs.accessibility setting from false to true, and he can now reproduce the bug: 60x increase in time spent.  On that basis, I am sending this to a11y.
Assignee: nobody → aaronleventhal
Severity: normal → critical
Component: Layout: Form Controls → Disability Access APIs
QA Contact: layout.form-controls → accessibility-apis
Summary: Linux-only slowness on the Red Hat bugzilla page loading components array → Enabling a11y massively degrades performance of dynamic <option> addition
Summary: Enabling a11y massively degrades performance of dynamic <option> addition → Enabling a11y massively degrades performance of dynamic <option> addition on Linux
Matej, could you double-check that the folks reporting this have the GNOME accessibility pref toggled?

To ease testing, I _think_ the following can be used to force accessibility enabled/disabled on Linux for a particular browser run:

env GNOME_ACCESSIBILITY=1
env GNOME_ACCESSIBILITY=0

I can probably do a jprof profile of this tomorrow night, but I'm guessing that there is an O(N) or slower algorithm here that runs for every option added, making the whole thing O(N^2) or slower in the number of options added.
There is a Firefox extension called about:accessibilityenabled which will help here:
https://addons.mozilla.org/en-US/firefox/addon/2407

You can see with 100% certainty whether a11y is enabled when you run. So you can see if the environment variable that Bz mentioned worked (GNOME_ACCESSIBILITY)
I believe this is similar to what the Orca team experienced in this Gnome bug: http://bugzilla.gnome.org/show_bug.cgi?id=505742#c10. The moment comboboxes repopulate due to the selection changing in one of them, a lot of events get fired for each child that's being added or removed. If you have comboboxes or listboxes that are several hundred items long, this will take a while, and would only show up if a11y is enabled.
My config.use_system_prefs.assessibility was set to false already, and the extension does not work (trying to load the url about:assessibility does nothing).  The bug is still occurring.
I forgot to mention, setting/unsetting GNOME_ACCESSIBILITY in the env before launching firefox does not effect it one way or the other.

What else can I look for here?
Reproduced on my box.

Much cpu spent in this stack
nsDocAccessible::FlushEventsCallback
nsDocAccessible::FlushPendingEvents
nsDocAccessible::FireShowHideEvents
nsAccessibleWrap::FireAccessibleEvent
nsAccessibleWrap::FireAtkShowHideEvent

(In reply to comment #16)
> Matej, could you double-check that the folks reporting this have the GNOME
> accessibility pref toggled?

I have checked on one computer where the problem was present (both with the upstream binary as well as with our Fedora package) and with GNOME_ACCESSIBILITY=0 Fedora package works like a charm.

Martin Stránský was testing this with his computer and with GNOME_ACCESSIBILITY=1 he made his Firefox unusably slow.

So, yes we can confirm this. I will let you know, if somebody else will report anything else.
Aaron: can you figure out if this is in our code?
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
So on the testcase from comment 14 I can reproduce the problem only when accessibility is enabled (via GNOME_ACCESSIBILITY=1 in my case, but I'd bet that changing the GNOME preferences would also do the trick).

With that setting, I profiled for a little bit after clicking that link.  The bottom-up profile is not that enlightening: every single function in the top 12 or so (accounting for about 50% of the total time) is either QI/addref/release related or is nsAccUtils::IsAncestorOf or is nsDocAccessible::FireDelayedAccessibleEvent.

Much more interesting is the top-down profile.  Almost all the time (95% or so) is spent under nsDocAccessible::FireDelayedAccessibleEvent.  Most of _that_ is spent under nsAccUtils::IsAncestorOf.

If I recall this code correctly, it has a list of pending changes, and checks to see whether the new change can be coalesced with the existing pending changes.  The whole thing is O(N^2) in number of changes if a bunch of changes happen at once (adding each change is O(N) in number of pending changes unless it can be coalesced with something).  And the constant is big because the code is COM-heavy.

I was sure this exact problem had come up before, but I can't seem to find the bug...
There're thousands of options in HTML option, and then
sel.options.length = 0;

It will remove these options one by one, every removal will cause a
nsAccessibilityService::InvalidateSubtreeFor()

That's very expensive.

On my test script (comment #14) the first time the page is loaded the second SELECT is empty, yet it takes a long time to populate.  The redhat bugzilla query also starts empty for the query.cgi page [1] (creating a new bug) or starts with only 1 entry (when clicking 'update components' [2] while viewing a bug already created).  Those situations are not clearing large sets of options before updating the SELECT.

[1] https://bugzilla.redhat.com/enter_bug.cgi?product=Fedora
[2] https://bugzilla.redhat.com/show_bug.cgi?id=431162 (click 'update components')
By pressing "Update components", the options are added one by one.
It takes as much time as removing them one by one.

In nsDocAccessible::FireDelayedAccessibleEvent()
The aAllowDupes rule is eCoalesceFromSameSubtree.
If thousands of events are in queue,  nsAccUtils::IsAncestorOf(newEventDOMNode, domNode) will take quite a lot time.

Also, by receiving children_changed event, atk-bridge will call atk_object_ref_accessible_child, since we just invalidate subtree, thousands of accessible objects will be recreated.

(In reply to comment #27)
> Also, by receiving children_changed event, atk-bridge will call
> atk_object_ref_accessible_child, since we just invalidate subtree, thousands of
> accessible objects will be recreated.

atk-bridge doesn't need to do so.
Let's fix it at http://bugzilla.gnome.org/show_bug.cgi?id=350552

Anyway, the fix won't make O(N^2) become O(N).
Assignee: aaronleventhal → ginn.chen
Unblocking based on comment 28 - thanks!
Flags: tracking1.9+
Even with a fix in a newer atk-bridge, current users will experience massive fail here, no?  I think we need to have this fixed for the default Linux config (a11y on, apparently) before we ship, or it's going to be pretty grim.
Flags: blocking1.9?
Is GNOME_ACCESSIBILITY=1 the default option? I'm just looking at a case where we have absolutely no notion of a fix and I'm not sure that this is a primary case. I know it's not ideal, but I also don't know if I'd hold ship for it based on the comments in this bug.

Leaving as ? for now ...
Flags: blocking1.9? → blocking1.9+
It's always bothered me that we always fire these events on Linux even if there are no listeners. A lot of users turn on a11y w/o knowing there is a perf impact, but they don't actually have an AT using it.

I've started working on a fix that makes our accessible docs only process events when something is actually using a11y info. I plan to post a fix for that as well early next week, in a different bug. But we still probably need to fix the algorithm for handling option changes, but it should help Linux perf in general for this case.
I don't know what changed but I am now seeing the performance problem go away when starting firefox with 'GNOME_ACCESSIBILITY=0 firefox' on my Fedora systems.  In comment #20 I said it was not working for my systems, but it does now, while GNOME_ACCESSIBILITY=1 is still slowed down.
(In reply to comment #30)
> Even with a fix in a newer atk-bridge, current users will experience massive
> fail here, no?  
Yes, we still have the problem.

(In reply to comment #31)
> Is GNOME_ACCESSIBILITY=1 the default option?
GNOME_ACCESSIBILITY=0 is forcing AT feature off.
GNOME_ACCESSIBILITY=1 is forcing AT feature on.
By default, GNOME_ACCESSIBILITY is not set. User sets AT feature on/off in gnome preferences (it is off by default), and Firefox reads the setting from gconf.
(In reply to comment #34)
> By default, GNOME_ACCESSIBILITY is not set. User sets AT feature on/off in
> gnome preferences (it is off by default), and Firefox reads the setting from
> gconf.

I am not sure whether you are right with this. The default is of course highly distro dependent and it seems to me that at least in some configurations of Fedora/RHEL a11y might be on by default.
(In reply to comment #35)
> (In reply to comment #34)
> > By default, GNOME_ACCESSIBILITY is not set. User sets AT feature on/off in
> > gnome preferences (it is off by default), and Firefox reads the setting from
> > gconf.
> 
> I am not sure whether you are right with this. The default is of course highly
> distro dependent and it seems to me that at least in some configurations of
> Fedora/RHEL a11y might be on by default.

Default in Gnome is apparently a11y on.

I spun off bug 420722 to try and make Mozilla smart about how much work it does for a11y in the case where it's turned on for the desktop but isn't actually being used.
(In reply to comment #36)
> Default in Gnome is apparently a11y on.
> 

I think on most of distros, a11y is default off for Gnome stable releases, e.g. Gnome 2.20,
a11y is default on for Gnome develop releases, e.g. Gnome 2.21.
I think we can fix the problem in FireDelayedAccessibleEvent().
I'm working on it.
Awesome, Ginn.  That would be great.
For the 'update components' case, there're about 6000 options being added.

It takes about 240 seconds of user CPU time on my box.
81 seconds is in IsAncesterOf [1]
64 seconds in refChildCB [2]
62 seconds in getIndexInParentCB [3]
13 seconds in getNameCB [4]

These 4 functions add up to 220 seconds.

[1]
because of the eCoaleceFromSameSubtree rule, my idea is in FireDelayedAccessibleEvent(), we add every event in queue, and apply the rule in FlushPendingEvent().
I think we can use a faster algorithm there.
[2]
a bug of atk-bridge
I have a patch at http://bugzilla.gnome.org/show_bug.cgi?id=350552
We can fix it in GNOME 2.22.1
[3]
By emitting children-changed event, we need to know the index, I don't know how much we can improve here.
[4]
another bug of atk-bridge, filed at http://bugzilla.gnome.org/show_bug.cgi?id=520490
We can fix it in GNOME 2.22.1
(In reply to comment #41)
> For the 'update components' case, there're about 6000 options being added.
>
> 62 seconds in getIndexInParentCB [3]
> [3]
> By emitting children-changed event, we need to know the index, I don't know how
> much we can improve here.

Are you really saying that it has to take a minute to find the indices of 6000 children, on your hardware?  In the case in which most or all of the children have the same parent, or a common parent, it would seem like you should be able to iterate over the child list once and store the child-index for each node processed in a secondary data structure like a tree or hash table.  (I'm presuming here that the huge amount of time spent in getIndexInParentCB is because we're naively walking the child list from the start for each of the 6000 children, which in the case of a large batch of children to process is certain to fail painfully.)

But also: when you're appending a set of children, don't you know that the index is one higher than the last one you just calculated?  Do you need to walk the preceding 5999 children at all in this case?  Even just adding a simple check to store lastCheckedChild and lastCheckedChildIndex such that you can be fast when newChild is lastCheckedChild.nextSibling might win you enough.

Or:
- group the new children from the queue by parent
- walk the children of that parent once in order, storing (child, childIndex) in a hash table
- consult the hash table to find the childIndex when you're firing the event

(Callers of nsAccessible::GetIndexInParent may also want to be examined for this pattern.)

Finally, since I can't find ATK documentation that describes the semantics here: is there really no way of batching the event firings, or of just saying "things changed, you should go re-inspect the tree" and letting the listeners walk through (once!) to rebuild their state as appropriate?  If you were writing a native GTK app that added a couple of thousand options, would you really be expected to fire a children-changed event for each?  (Could we just send "children-changed" without the "::add" detail suffix?  My primitive understanding of GObject signal semantics would have me believe that ATK listeners would need to cope with that anyway.)  If there's a way to give a more general "lots of changes!" event, then we should probably do that if we have more than N events to send for a single parent.  (That's not a substitute for optimizing the index calculation for the < N case, since that time is proportional to the total number of children that need to be walked, and not just the number of times that we're walking.)

Honestly, iterating over the entire *document* and storing the child-index of every node present shouldn't take ten times as long as starting the browser up, so I'm pretty sure that this part can be made much faster.
Mike,

Thank you!
Of course we can improve this part.
Actually in the 60 seconds, we are not only calculating the index, but also creating AtkObject for each one.

I agree with your comment.

This is my idea of reducing calls of IsAncesterOf.
Apply dupe rules in FlushPendingEvents, and apply the same result to siblings.

In "update components" case, ApplyEventRules() only takes me 0.04s.
The save is 80-90 seconds.

If atk-bridge problem [2], [4] are fixed.

The "update components" case will take a little longer than 1 minutes.
And
https://bugzilla.redhat.com/query.cgi?format=simple
will take less than 2 minutes.
It's Intel Core 2 Duo 2.66GHz desktop running Solaris.

It's still not good. I'll continue work on it.

BTW: I think we should consider to add a limitation of no more than N events can be fired for nodes of same parent. Because even we can fire 6000 or 13000 events in a short time, a11y tool will take a extreme long time to deal with them.
It is very simple to add this limitation with this patch, just add one line in ApplyToSiblings().
For getIndexInParentCB [3],
I found getIndexInParentCB itself doesn't use much time,
but nsHTMLSelectListAccessible::CacheChildren() takes quite a lot.

I don't know why we don't have
if (mAccChildCount == eChildCountUninitialized) {} in this function.
(In reply to comment #45)
> I don't know why we don't have
> if (mAccChildCount == eChildCountUninitialized) {} in this function.

Becuase if the child count >= 0 we have already cached the children and don't need to do it again.

It seemed we missed it since first version.

I checked other CacheChildren(), this is the only one.

With these 2 patches and atk-bridge fixes, the page loads in several seconds.
Attachment #307891 - Flags: review?(aaronleventhal)
Comment on attachment 307682 [details] [diff] [review]
draft patch 1 to reduce calls of IsAncesterOf

perhaps we can move ApplyEventRules() to nsAccessibleEventData.cpp
Attachment #307682 - Flags: review?(aaronleventhal)
Attachment #307891 - Flags: review?(aaronleventhal) → review+
Attachment #307891 - Flags: approval1.9?
Comment on attachment 307682 [details] [diff] [review]
draft patch 1 to reduce calls of IsAncesterOf

Surkov, Evan -- please look.
Attachment #307682 - Flags: review?(surkov.alexander)
Attachment #307682 - Flags: review?(aaronleventhal)
Attachment #307682 - Flags: review?(Evan.Yan)
Sounds like there is more to be done here before approving?
Comment on attachment 307891 [details] [diff] [review]
patch 2 for nsHTMLSelectListAccessible::CacheChildren()

Ah, I see.

a1.9+=damons on this patch.
Attachment #307891 - Flags: approval1.9? → approval1.9+
This is a blocker, so it didn't need approval in the first place.
Ginn, is the patch safe enough in timeframe of firefox3?

You put private information into public nsIAccessibleEvent. It doesn't seem to be correct, could you use nsAccEvent for this like we do similar things for nsHyperTextAccessible if you don't want to have private interface for this?
Yes, I think it's safe.

I don't think the event rule is private.

I can use nsAccEvent, but we'd better make nsIAccessibleStateChangeEvent *not* inherit nsIAccessibleEvent, otherwise we will need more code to deal with the 2 inherit paths.
Comment on attachment 307682 [details] [diff] [review]
draft patch 1 to reduce calls of IsAncesterOf


>+  static nsIDOMNode* EventNode(nsIAccessibleEvent *aAccEvent) {
>+    nsCOMPtr<nsIDOMNode> domNode;
>+    aAccEvent->GetDOMNode(getter_AddRefs(domNode));
>+    return domNode;

it's not safe if we won't keep alife aAccEvent, right?

>+    // Helper function for FlushPendingEvents()
>+    void ApplyEventRules();
>+    void ApplyToSiblings(PRUint32 aStart, PRUint32 aEnd, PRUint32 aEventType,
>+                         nsIDOMNode* aDOMNode, PRUint32 aEventRule);

could you document these methods?
(In reply to comment #55)
> (From update of attachment 307682 [details] [diff] [review])
> 
> it's not safe if we won't keep alife aAccEvent, right?

If aAccEvent is not kept, this method should not be used.

> 
> >+    // Helper function for FlushPendingEvents()
> >+    void ApplyEventRules();
> >+    void ApplyToSiblings(PRUint32 aStart, PRUint32 aEnd, PRUint32 aEventType,
> >+                         nsIDOMNode* aDOMNode, PRUint32 aEventRule);
> 
> could you document these methods?

Yes, sure.
(In reply to comment #54)
> I don't think the event rule is private.

I mean event rules are used only in internal accessibility event processing.
They can't be useful outside of accessibility moudle and especially in script.
We should keep our interface simple as possible.

> I can use nsAccEvent, but we'd better make nsIAccessibleStateChangeEvent *not*
> inherit nsIAccessibleEvent, otherwise we will need more code to deal with the 2
> inherit paths.
> 

it's usual way in accessibility module at least.  Why is it bad? Why you can't
use
nsRefPtr<nsAccEvent> event;
Ievent->QueryInterface(NS_GET_IID(nsAccEvent), getter_AddRefs(event));
I think we can use [noscript] if we just want to hide it from script.

Yes, we can use nsRefPtr<>, but we need to write code for QI, CID.
Comment on attachment 307682 [details] [diff] [review]
draft patch 1 to reduce calls of IsAncesterOf

(In reply to comment #58)
> I think we can use [noscript] if we just want to hide it from script.

I don't like [noscript] actually, it brings a mesh into interface.

> Yes, we can use nsRefPtr<>, but we need to write code for QI, CID.
> 

Yes, I like rather to use nsRefPtr approach (or add new private interface) than to spoil public interface. Though if everyone is ok with the current approach I won't press you.
Attachment #307682 - Flags: review?(surkov.alexander) → review+
If it's not for people to use, don't put it on the interface, just use a method?  (Use a nsPIWhatever if you need cross-component access, but cross-component access to private interfaces? fear!).  Don't make things [noscript] just because they're internal; script isn't the issue here.
I think it's no harm for people to use it, the only thing here, there's no use case outside a11y module right now.

I tend to agree the nsRefPtr<> approach now.
I'll work out a patch later.
Attached patch revised patch 1 (obsolete) — Splinter Review
Move the methods into nsAccEvent.
It is easier than I thought.

Thanks!
Attachment #307682 - Attachment is obsolete: true
Attachment #308848 - Flags: review?(surkov.alexander)
Attachment #307682 - Flags: review?(Evan.Yan)
Attachment #308848 - Flags: review?(Evan.Yan)
nit:
void nsAccEvent::ApplyEventRules

void
nsAccEvent::ApplyEventRules
Ginn, can you use exisitng macros not to define QueryInterface or aren't they
applicable here?
nit:

enum EEventRule { eAllowDupes, eCoalesceFromSameSubtree

possibly it's better to document each enum item

/**
  *
  */
enum EEventRule {
  /**
    *
    */
  eAllowDupes,

}
nsAccEvent* accEvent;

please initialize raw pointers
NS_ENSURE_TRUE(aDOMNode1, PR_FALSE);
NS_ENSURE_TRUE(aDOMNode2, PR_FALSE);

you can combine them into one
Comment on attachment 308848 [details] [diff] [review]
revised patch 1

looks ok, r=me
would be nice if Evan will look this too, additional pair of eyes will be great.
Attachment #308848 - Flags: review?(surkov.alexander) → review+
Yes please have Evan look too. If you put up a new patch I will look as well.
Attached patch patch 1 v2Splinter Review
Addressing surkov's comments.

Thanks.
Attachment #308848 - Attachment is obsolete: true
Attachment #309021 - Flags: review?(Evan.Yan)
Attachment #308848 - Flags: review?(Evan.Yan)
Attachment #309021 - Flags: review?(aaronleventhal)
Comment on attachment 309021 [details] [diff] [review]
patch 1 v2

This is great stuff. We should get it in ASAP to ensure adequate testing. If Neil or someone has a moment to in the next day to sr= that would be great. If not we should just go ahead and request a=.

Two nits:
Rename "IsSiblings" to "IsSiblingOf"

In the follow block, please add the comment // Not in another event node's subtree, and no other event is in this event node's subtree.
        if (tailEvent->mEventRule != nsAccEvent::eDoNotEmit) {

	

          // This event should be emitted

	

          // Apply this result to sibling nodes of tailDOMNode

	

          ApplyToSiblings(aEventsToFire, 0, tail, tailEvent->mEventType,

	

                          tailEvent->mDOMNode, nsAccEvent::eAllowDupes);

	

        }

	

      } break; // case eCoalesceFromSameSubtree
Attachment #309021 - Flags: superreview?(neil)
Attachment #309021 - Flags: review?(aaronleventhal)
Attachment #309021 - Flags: review?(Evan.Yan)
Attachment #309021 - Flags: review+
Comment on attachment 309021 [details] [diff] [review]
patch 1 v2

>+nsAccUtils::IsSiblings(nsIDOMNode *aDOMNode1,
>+                       nsIDOMNode *aDOMNode2)
Maybe AreSiblings would be more grammatically correct.

>-NS_IMPL_ISUPPORTS1(nsAccEvent, nsIAccessibleEvent)
>+NS_IMPL_QUERY_HEAD(nsAccEvent)
>+NS_IMPL_QUERY_BODY(nsAccEvent)
>+NS_IMPL_QUERY_BODY(nsISupports)
>+NS_IMPL_QUERY_BODY(nsIAccessibleEvent)
>+NS_IMPL_QUERY_TAIL_GUTS
>+
>+NS_IMPL_ADDREF(nsAccEvent)
>+NS_IMPL_RELEASE(nsAccEvent)
Why this change?
(In reply to comment #72)
> >-NS_IMPL_ISUPPORTS1(nsAccEvent, nsIAccessibleEvent)
> >+NS_IMPL_QUERY_HEAD(nsAccEvent)
> >+NS_IMPL_QUERY_BODY(nsAccEvent)
> >+NS_IMPL_QUERY_BODY(nsISupports)
> >+NS_IMPL_QUERY_BODY(nsIAccessibleEvent)
> >+NS_IMPL_QUERY_TAIL_GUTS
> >+
> >+NS_IMPL_ADDREF(nsAccEvent)
> >+NS_IMPL_RELEASE(nsAccEvent)
> Why this change?
> 
Make nsIAccessibleEvent* convertible to nsAccEvent*
nsAccStateChangeEvent has 2 inherent paths to nsIAccessibleEvent*, so static_cast<> doesn't work.
CC'ing Neil since he probably didn't see you response.
Blocks: 422744
(In reply to comment #73)
> (In reply to comment #72)
> > >-NS_IMPL_ISUPPORTS1(nsAccEvent, nsIAccessibleEvent)
> > >+NS_IMPL_QUERY_HEAD(nsAccEvent)
> > >+NS_IMPL_QUERY_BODY(nsAccEvent)
> > >+NS_IMPL_QUERY_BODY(nsISupports)
> > >+NS_IMPL_QUERY_BODY(nsIAccessibleEvent)
> > >+NS_IMPL_QUERY_TAIL_GUTS
> > >+
> > >+NS_IMPL_ADDREF(nsAccEvent)
> > >+NS_IMPL_RELEASE(nsAccEvent)
> > Why this change?
> > 
> Make nsIAccessibleEvent* convertible to nsAccEvent*
> nsAccStateChangeEvent has 2 inherent paths to nsIAccessibleEvent*, so
> static_cast<> doesn't work.
> 
Sorry, I don't see how nsAccStateChangeEvent is relevant to the code I quoted.
I wanted to cast nsIAccessibleEvent* to nsAccEvent*, so I implemented QueryInterface() to do it.

I have to use QueryInterface(), because static_cast<> could not work if the nsIAccessibleEvent* is from nsAccStateChangeEvent.
Ah, so does it work if you use
NS_ISUPPORTS2(nsAccEvent, nsAccEvent, nsIAccessibleEvent)
nope,

"nsAccessibleEventData.cpp", line 60: Error: Cannot have a parameter of the abstract class nsIAccessibleEvent.
"nsAccessibleEventData.cpp", line 60: Error:     nsISupports::QueryInterface(const nsID&, void**) has not been overridden.
(In reply to comment #77)
> Ah, so does it work if you use
> NS_ISUPPORTS2(nsAccEvent, nsAccEvent, nsIAccessibleEvent)
Sorry, typo, NS_IMPL_ISUPPORTS2 of course.
Great, it works.
Comment on attachment 309021 [details] [diff] [review]
patch 1 v2

OK, so sr=me with IsSiblings renamed and using NS_IMPL_ISUPPORTS2.
Attachment #309021 - Flags: superreview?(neil) → superreview+
Attachment #309021 - Flags: approval1.9?
Comment on attachment 309021 [details] [diff] [review]
patch 1 v2

a1.9+=damons
Attachment #309021 - Flags: approval1.9? → approval1.9+
Attached patch patch checked inSplinter Review
Checking in src/base/nsAccessibilityUtils.cpp;
/cvsroot/mozilla/accessible/src/base/nsAccessibilityUtils.cpp,v  <--  nsAccessibilityUtils.cpp
new revision: 1.30; previous revision: 1.29
done
Checking in src/base/nsAccessibilityUtils.h;
/cvsroot/mozilla/accessible/src/base/nsAccessibilityUtils.h,v  <--  nsAccessibilityUtils.h
new revision: 1.24; previous revision: 1.23
done
Checking in src/base/nsAccessibleEventData.cpp;
/cvsroot/mozilla/accessible/src/base/nsAccessibleEventData.cpp,v  <--  nsAccessibleEventData.cpp
new revision: 1.29; previous revision: 1.28
done
Checking in src/base/nsAccessibleEventData.h;
/cvsroot/mozilla/accessible/src/base/nsAccessibleEventData.h,v  <--  nsAccessibleEventData.h
new revision: 1.32; previous revision: 1.31
done
Checking in src/base/nsCaretAccessible.cpp;
/cvsroot/mozilla/accessible/src/base/nsCaretAccessible.cpp,v  <--  nsCaretAccessible.cpp
new revision: 1.65; previous revision: 1.64
done
Checking in src/base/nsDocAccessible.cpp;
/cvsroot/mozilla/accessible/src/base/nsDocAccessible.cpp,v  <--  nsDocAccessible.cpp
new revision: 1.236; previous revision: 1.235
done
Checking in src/base/nsDocAccessible.h;
/cvsroot/mozilla/accessible/src/base/nsDocAccessible.h,v  <--  nsDocAccessible.h
new revision: 1.78; previous revision: 1.77
done
Checking in src/base/nsRootAccessible.cpp;
/cvsroot/mozilla/accessible/src/base/nsRootAccessible.cpp,v  <--  nsRootAccessible.cpp
new revision: 1.263; previous revision: 1.262
done
Checking in src/html/nsHTMLSelectAccessible.cpp;
/cvsroot/mozilla/accessible/src/html/nsHTMLSelectAccessible.cpp,v  <--  nsHTMLSelectAccessible.cpp
new revision: 1.95; previous revision: 1.94
done
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Thanks for the work on this issue, its all much appreciated!
Indeed, thank you!
Me three. You all rock! I've been really impressed in general with the responsiveness to every bug I've cared about, personally.
Great work, indeed.  Are there any tests that we can add to the tree to ensure that this work isn't inadvertently undone later, or a test suite that can be run to make sure it didn't cause inadvertent effects on other platforms or scenarios?

Seems like a pretty big patch, affecting pretty core areas of a11y, to go in without test support, so I'm hoping the answer to at least part of the above question is "yes".
We haven't now tests for accessibility events at all, it should be much of tests and we should do this eventually I think. In the mean time we could add some performance tests if they are applicable. Also it's worth to try to check delayed of different types events with their queue. For example mutation events might be good to check eCoalesceFromSameSubtree type. Ginn, what do you think?
surkov, I agree, we should have a11y event tests and performance tests.

You need to log in before you can comment on or make changes to this bug.