Closed Bug 376924 (a11yframecrash) Opened 17 years ago Closed 17 years ago

Traversing the accessible tree after changes to CSS display property can crash Firefox

Categories

(Core :: Disability Access APIs, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: listaddr1, Assigned: aaronlev)

References

(Blocks 1 open bug)

Details

(5 keywords, Whiteboard: [sg:critical?] destroyed frame)

Attachments

(8 files, 1 obsolete file)

User-Agent:       Emacs-w3m/1.4.4 w3m/0.5.1+cvs-1.968
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a4pre) Gecko/20070408 Minefield/3.0a4pre

The attached sample page shows and hides a region by setting its CSS
"display" property to "block" or "none". If the accessible tree is
traversed after a change from "block" to "none", Firefox under Linux
crashes. The attached javascript shell log shows the required steps.

The talkback report for this problem is 31019831.

On Windows with the exact same Minefield build (2007-04-08),
the problem does not occur in this simple case. I will attach a
comment for how to provoke a similar crash on Windows.



Reproducible: Always

Steps to Reproduce:
See the attached javascript shell log
Actual Results:  
Firefox crashes

Expected Results:  
The accessible tree should reflect the changes to the "display"
property and Firefox should not crash when the accessible tree is
traversed. 

Some information about the traversal of the accessibel tree:

The code used can be found in:

http://firemacs.googlecode.com/svn/trunk/tests/firefox/moz.js

The relevant functions are "dumpAccessibleTree" and
"dumpAccessibleNode".

The first function will simply traverse the accessible tree
recursively, calling the second fuction for each node.
dumpAccessibleNode will access finalRole, name, finalValue, state, extState and
description for each node. These accesses sometimes throw an
exception which does not make much sense to me. For provoking the
crash, it is apparently important to actually access these node
properties, i.e. if the line calling dumpAccessibleNode is commented
out in dumpAccessibleTre, no crash occurs. In other words, simply
walking the tree structure is not enough for a crash.
this test case should work as follows:

Hitting the link should show or hide a line of text with yellow
backgroundd. This is achieved via the CSS display property. If the
checkbox is checked, the script will also make a spurious update to
the structure of the DOM in order to force an EVENT_REORDER event for
the node just above the element that has its "display" property
changed.
While the Windows version appears to be more resilient, it can also be
crashed or at least confused with a more complicated example: The
comments page on blogger offers the option to show or hide the
original post. This effect is achieved with the CSS display property.
Example:

https://www2.blogger.com/comment.g?blogID=32049549&postID=503788244167924508

The link "Show original post" is what we are interested here.

With jaws, showing the post works fine, but after hitting the
link a second time to hide the post again, jaws is completely confused. When
using a remote javascript shell to click the link programmatically, FF
wil crash after the second click, which should hide the original post
again; this is without any AT running on Windows.

See the attachment for the required steps.

Again, it is important to actually access the properties of the
accessibel nodes while traversing the tree in order to provoke the
crash. The next step should probably be to find out the exact property of
the accessible node whose access leads to the crash.
Assignee: nobody → aaronleventhal
Component: Disability Access → Disability Access APIs
Keywords: access, crash
Product: Firefox → Core
QA Contact: disability.access → accessibility-apis
Blocks: aria, atk
We're using a destroyed frame.  Branches are affected too I think.
Group: security
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Linux → All
Hardware: PC → All
The problem is the cached frame pointers in nsHTMLTextAccessible,
nsHTMLLinkAccessible (and on trunk, nsHTMLListBulletAccessible).
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/accessible/src/html/nsHTMLTextAccessible.h&rev=1.47&root=/cvsroot&mark=68-71#48

AFAICT we notify frame changes from two places:
nsCSSFrameConstructor::RecreateFramesForContent
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/base/nsCSSFrameConstructor.cpp&rev=1.1336&root=/cvsroot&mark=11115-11134#11059

and nsFrameManager::ReParentStyleContext
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/layout/base/nsFrameManager.cpp&rev=1.245&root=/cvsroot&mark=1374-1389#1307

but only when "shell->IsAccessibilityActive()", which requires an
NS_ACCESSIBLE_EVENT to be activated:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/base/nsPresShell.cpp&rev=3.984&root=/cvsroot&mark=5835,5815#5810

The crash occurs because the we never got such an event and hence
the accessible nodes we're not notified of frame changes.
Attached patch Patch rev. 1 (obsolete) — Splinter Review
This fixes the crash and makes the code slightly more robust:
  1. make the shell to be IsAccessibilityActive if not done already
     in nsAccessibilityService::GetAccessible()
  2. check IsAccessibilityActive() in GetFrame() and invalidate
     the cached frame if it's false (since we may have missed events)
  3. always invalidate the cached frame in FireToolkitEvent, not just
     for EVENT_HIDE. I'm think we can have a destroyed frame also for
     EVENT_REORDER (when frame != newFrame in RecreateFramesForContent -
     see nsCSSFrameConstructor.cpp link above).
Attachment #261328 - Flags: review?(aaronleventhal)
I suspect there are other cases where frames are replaced or destroyed that
does not go through RecreateFramesForContent/ReParentStyleContext so I
think we should try to remove this caching of raw frame pointers if we can.
Do we cache these frames for performance only? 
If so, is it still needed? (on trunk)
Can we solve it differently? (in the frame manager perhaps?)
Actually, for primary frames we should always go through one or the other...

That said, why are we doing _anything_ for the DOM mutation if accessibility is not active??
(In reply to comment #9)
> Actually, for primary frames we should always go through one or the other...

I was suspecting the code that creates first-letter frames, but I haven't
checked it closely, maybe you're right that it's covered...

> That said, why are we doing _anything_ for the DOM mutation if
> accessibility is not active??

Good point, but I think the problem here is that accessibility
should have been activated.  I think the intention is to activate
it for the shell "on demand" to avoid frame notifications when a11y
isn't used, but I could be wrong...
> I was suspecting the code that creates first-letter frames

Hmm....  those can indeed go away and reappear as needed, you're right.
Bug 376884 might be a duplicate of this.

> Do we cache these frames for performance only? 
Yes, we cache them  because GetPrimaryFrameFor() must add extra items to primary frame hashtable whenever request comes in for a frame. Since we already have the frame when accessible is created it would seem to save that work. However, it may
have been a premature optimization and not really be a big improvement. It seems that getting rid of that optimization (caching the frames) is wise, to avoid the  crashes. That's something that can be done for branch as well.
Accessibility should be active for the whole app or not at all. If  accessibility is not active, the accessibility service should not be getting called. I think presShell should just use a static global gIsAccessibilityActive, rather than a per-preshell mIsAccessibilityActive.
Well, for list bullet frame it's not a performance enhancement. Caching the list bullet frame is necessary to cache otherwise we don't know how to get it.
This should also fix this bug.

- We'll need to file a follow-up bug to deal with list bullet bounds and other issues. We already have many issues with list bullets because the list bullet text is not stored in anonymous content. Since there is no content node many of our normal methods don't work.
Attachment #261333 - Flags: review?(mats.palmgren)
Comment on attachment 261328 [details] [diff] [review]
Patch rev. 1

+    nsAccessibleEvent event(PR_TRUE, NS_GETACCESSIBLE, nsnull);
+    nsEventStatus status;
+    mIgnoreShellGetAccessible = 1; // protect against recursive calls
+    aPresShell->HandleEventWithTarget(&event, nsnull, nsnull, &status);
+    mIgnoreShellGetAccessible = 0;
+  }

So I guess the presShell doesn't know that accessibility is active because it's being activated by Javascript instead of the normal OS way (via NS_GETACCESSIBLE).

Rather than doing this, I think we should have nsIPresShell::SetAccessibilityActive(PRBool);
and have that be a global static instead of a member variable.
Attachment #261328 - Flags: review?(aaronleventhal) → review-
Lukas, is JS required to get crashes with these mutations? Or can you get it to happen just when using Firefox with MSAA or ATK?
(In reply to comment #16)
> Rather than doing this, I think we should have
> nsIPresShell::SetAccessibilityActive(PRBool);

Can't do that change on branches though.

> and have that be a global static instead of a member variable.

I don't see a problem with having a bit per shell if we want to.
Not sure how much of a performance win that is though.
Comment on attachment 261333 [details] [diff] [review]
Remove caching of frames in accessibility

Looks good.  r=mats

We still need to flip that mIsAccessibilityActive bit if we want these
notifications.  Not sure if that part is important enough for branch though.
Attachment #261333 - Flags: review?(mats.palmgren) → review+
> Lukas, is JS required to get crashes with these mutations? Or can you get it to
> happen just when using Firefox with MSAA or ATK?

A crash can also be provoked just by using orca or lsr to interact with
Firefox. I am still refering to the April 8 version. Should I try with
the latest Minefield again? I do not know if the crash can also be
provoked on Windows without graversing the accessible tree in
javascript.

Also, I do not think that the crash does depend on the fact that no
other screen reader is running. If it helps, I can produce talkback reports
with jaws or orca running. On Linux, when I start Firefox, it will say
on the console that accessibility support is enabled regardless
whether orca is running.

Moreover, I think what makes these crashes show more predictively with my
AT (Firemacs) than with other ATs on Linux is that I use a virtual
buffer and thus have to traverse the whole accessable tree in one go, reading
at least "name" and "finalRole" for each node. The other Linux ATs
like lsr or orca seem to access a node onle when the user moves to it
on the page.
Elaborating on my previous comment, here are the steps to provoke a
crash of Minefield April 8 with orca. I tried the version of April 12,
but it crashes even on startup:

1. Load page e.g.:
http://homepage.hispeed.ch/loehrer/firemacs/tests/display.html
(This is the same as the attachment above)

2. Visit  all the lines on the page with the screen reader. This step
   seems important, probably to trigger the creation of all the nodes
   in the accessible tree.

3. Hit the link "Hide region"

4. move the cursor down one line -> Feedback Agent pops up

Not sure if the last step is needed. I cannot really tell when the
Agent pops up exactly because I cannot see the screen at all. I can
get sighted assistance if more precise information is needed.

Flags: in-testsuite?
Keywords: testcase
Whiteboard: [sg:critical?] destroyed frame
Spun off bug 377288 for issue on getting accessibility notifications even if the a11y service is launched from JS. That bug isn't security sensitive or a crasher.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
We should let this bake in and then consider it for the 1.8 branch. How should we mark it up so we remember to do that in a week or so. Is there a way to mark up the patch as potentially needed for 1.8, without sending it for a= right away?
Mats's testcase in comment 22 crashes FF2.0.0.4pre. Nominating for the branch.
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Flags: blocking1.8.1.4?
Flags: blocking1.8.0.12?
I think we want more baking that we can get for 1.8.1.4 -- not blocking this release but marked "wanted" for the branch so we remember to come back to it.
Flags: blocking1.8.1.4?
Flags: blocking1.8.1.4-
Flags: blocking1.8.0.12?
Flags: blocking1.8.0.12-
Blocks: fox2access
The machine I had with an older version of Visual C++ went down, and the older branches don't build with VC8. Can I get any help from Mozilla to backport this fix?
Alias: a11yframecrash
Attachment #261328 - Attachment is obsolete: true
Attachment #265073 - Flags: approval1.8.1.5?
Attachment #265073 - Flags: approval1.8.0.13?
Attached file stack, branch 1.8
FWIW, the "Standalone testcase" still crash on branch with the attached
branch patch, but I think it's an unrelated (null-ptr) crash.
Mats, how should we deal with the fact that it still crashes?
As a separate unrelated bug IMO.  I looked briefly at the code and it
seemed like a null-ptr crash only.  compare 1.8 branch:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/accessible/src/atk/nsAccessibleHyperText.cpp&rev=MOZILLA_1_8_BRANCH&root=/cvsroot&mark=184,190#182

with trunk:
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/accessible/src/html/nsHyperTextAccessible.cpp&rev=1.58&root=/cvsroot&mark=1378#1375

Since the code appears to be so different I'm not sure if it's worth the
effort+risk of rewriting the code to much for a null-ptr crash...
we could try to wallpaper it with an early return I guess...
Ah. That's a Linux-only bug then, whereas this bug is for all operating systems. Our accessibility support for Linux on branch is really different from what we have now. It's not worth doing much to fix it. Wallpaper is fine.
Spawned off bug 380975 for the additional branch null-ptr crash.
Flags: blocking1.8.1.5+
Comment on attachment 265073 [details] [diff] [review]
Patch for branches

approved for 1.8.1.5 and 1.8.0.13, a=dveditz for release-drivers
Attachment #265073 - Flags: approval1.8.1.5?
Attachment #265073 - Flags: approval1.8.1.5+
Attachment #265073 - Flags: approval1.8.0.13?
Attachment #265073 - Flags: approval1.8.0.13+
Comment on attachment 265073 [details] [diff] [review]
Patch for branches

I landed the branch patch on MOZILLA_1_8_BRANCH and MOZILLA_1_8_0_BRANCH
at 2007-07-01 15:46 PDT.
verified fixed 1.8.1.5 using Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.5pre) Gecko/20070705 BonEcho/2.0.0.5pre ID:2007070504 and Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.5pre) Gecko/2007070503 BonEcho/2.0.0.5pre on Linux Fedora F7. No crash on the testcase from mats (comment #22). Adding verified keyword
verified fixed 1.8.0.13 using the branch testcase in this bug from mats (comment #22) and Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.0.13pre) Gecko/20070822 Firefox/1.5.0.13pre 

No crash on testcase - adding verified keyword
Group: security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: