Hang under accessibility code

RESOLVED FIXED

Status

()

--
critical
RESOLVED FIXED
11 years ago
11 years ago

People

(Reporter: bzbarsky, Assigned: aaronlev)

Tracking

Trunk
x86
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

See http://crash-stats.mozilla.com/report/index/cb031823-55e4-11dc-9e64-001a4bd43e5c?date=2007-08-29-04

I got that stack by sending a kill -SEGV once my browse hung.  All I'd done was type a space to page down.

Two questions:

1)  Why is it hanging?
2)  Why is this code even running?  It shouldn't be.
I have to wonder what fraction of the performance problems I've seen recently is due to accessibility code being enabled when it shouldn't be...
Severity: normal → critical
For what it's worth, I just stepped through the accessibility part of nsWindow::NativeCreate.  sAccessibilityEnabled ends up false after that code runs (gconf doesn't say to turn it on, in other words, and neither does the env var).

So how did I ever end up with that stack?  Is there anything that can enable accessibility code partway though program execution?
(Assignee)

Comment 3

11 years ago
Ginn owns the a11y enablement code well on Linux.
Assignee: aaronleventhal → ginn.chen
Note that this bug is about the hang, though.

Comment 5

11 years ago
(In reply to comment #2)
> So how did I ever end up with that stack?  Is there anything that can enable
> accessibility code partway though program execution?
> 

In DOM Inspector, and check View->Show Accessible Node

I don't know if there's a third way to do this.
I guess this bug is not only reproducible on Linux from looking at the stack.
(Assignee)

Comment 6

11 years ago
Yes, I realize it's about the hang, but I'd like Ginn to look at both issues.

Boris, can you set a breakpoint on the constructor of nsAccessibilityService and see how it's getting initialized?

Comment 7

11 years ago
Boris, what's the page you're browsing?

I tried to reproduce it with a lot of pages, no matter a11y on/off, I didn't get hang.
I'll do more tests.
> In DOM Inspector, and check View->Show Accessible Node

What does that enable?  That doesn't seem to set the presshell global boolean to true.  Does it turn no anything that would lead to accessibility being active in documents other than the one you're inspecting?

> Boris, can you set a breakpoint on the constructor of nsAccessibilityService

It's not.  That's the point.  I start the browser, I browse around a bit.  At no point is it turned on.  Then after a few days of browsing I hang with the stack in question.  What happened?

> Boris, what's the page you're browsing?

Had I known, I would have said.  :(  I was definitely doing some combination of find-as-you-type and page-down scrolling via the space bar on a tinderbox log and an LXR page (nsCSSFrameConstructor.cpp, I think).  I don't know which exact keystroke in which window caused the loop to start.
(Assignee)

Comment 9

11 years ago
Something really strange is happening, if the accessibility service is never constructed, and you have this stack. The a11y service is needed to create the root/doc accessible objects that can own an nsCaretAccessible which sets up the sselection listener.

Boris, can you watch what happens to nsIPresShell::gIsAccessibilityActive and log any changes that occur to it while you're using the app?

In the normal case where a11y code runs, it should start as false until the a11y service is set. Then it would be set to true, just after the a11y service would have been created. Finally at shutdown it would be set to false. 

If a11y doesn't run, it should stay false the whole time. The a11y service would also not be created.

Any other scenario is bizarre.

Ginn, can you figure out why sAccessibilityEnabled is false and the a11y code is running? Why is there a different variable to keep track if whether it's enabled than what's in pres shell. IT's a bit dangerous because they could get out of sync bo?
> if the accessibility service is never constructed

It's never constructed when I try to catch it being constructed in a debug build.  It's pretty clear from that stack that it _does_ get constructed at some point.

> Boris, can you watch what happens to nsIPresShell::gIsAccessibilityActive 

In a nightly build (which is where I see the problem, when I see it)?  How?

Again, what I end up seeing is that after a few days of browsing accessibility has somehow gotten enabled.
I just got this again: http://crash-stats.mozilla.com/report/index/31f58438-567e-11dc-9790-001a4bd46e84?date=2007-08-29-22

This time I know some things for certain:

1)  I never opened DOM inspector
2)  The only pages I opened were LXR and Google groups search
3)  /desktop/gnome/interface/accessibility is NOT checked in gconf-editor
(Assignee)

Comment 12

11 years ago
Do you have any extensions enabled?
> 1)  I never opened DOM inspector

Actually, nevermind.  I did.  I just do it so automatically when I need to check on computed styles for a page that I didn't notice myself doing it.  I've done some testing and filed bug 394253, which covers the "accessibility enabled when it shouldn't be" angle.  The hang issue in this bug remains.
(Assignee)

Updated

11 years ago
Blocks: 396346
(Assignee)

Comment 14

11 years ago
The hanging crash stack from bug 392800 (fixed on 8/20, 8 days before this was filed) also involved nsHyperTextAccessible::QueryInterface() and the call to Role().

It could that Role() call we use in QI does something which calls another QI.

We could put some reentry proofing via a static variable in there but it wouldn't solve the core issue.

For some reason the stack here doesn't show me much, unfortunately.
(Assignee)

Comment 15

11 years ago
Bz, are you still getting this hang? It doesn't look like it hung from you pressing space, but actually the click in the page content right before you pressed space. Any idea what web page you were on?

Here's what I can see from the stack:

On Linux, with a11y active but no screen reader or other AT running:
1. User clicks on some content
2. We are notified of a selection change
3. We try to get the offset for a caret
4. In order to do that, we get the children of the accessible text
5. While it's getting the children, it tries to init an accessible and in that method QI's to nsPIAccessNode
6. The accessible is an HTML text leaf 
7. The QI for HTML the text leaf tries to get the role to see if it's generated text, in which case it's static text

Aaron, I haven't seen this since opening inspector stopped automatically triggering accessibility code.

Before that, it was random; I never figured out a reliable way to reproduce it.  Usually it seemed to hit on lxr or tinderbox logs, but that might be measurement bias: I load those a lot.
(Assignee)

Comment 17

11 years ago
Created attachment 287121 [details] [diff] [review]
Simplify QI so it avoids infinite recursion

I think this will fix it but I can't prove it.
Assignee: ginn.chen → aaronleventhal
Status: NEW → ASSIGNED
Attachment #287121 - Flags: review?(ginn.chen)

Comment 18

11 years ago
Comment on attachment 287121 [details] [diff] [review]
Simplify QI so it avoids infinite recursion

missed nsHTMLImageAccessible?

With this patch, image will have texteditable interface.
Attachment #287121 - Flags: review?(ginn.chen) → review-
(Assignee)

Comment 19

11 years ago
Created attachment 287406 [details] [diff] [review]
Address Ginn's comments

BTW this fix may help improve performance.
Attachment #287121 - Attachment is obsolete: true
Attachment #287406 - Flags: review?(ginn.chen)

Updated

11 years ago
Attachment #287406 - Flags: review?(ginn.chen) → review+
(Assignee)

Comment 20

11 years ago
Comment on attachment 287406 [details] [diff] [review]
Address Ginn's comments

Marco, can you run with this patch for a while?
Attachment #287406 - Flags: approval1.9?

Comment 21

11 years ago
(In reply to comment #20)
> Marco, can you run with this patch for a while?

I do not notice any difference. Should I?
(Assignee)

Comment 22

11 years ago
No, that's good that there's no difference. Thanks for testing.

Updated

11 years ago
Attachment #287406 - Flags: approval1.9? → approval1.9+
(Assignee)

Comment 23

11 years ago
We hope this fixes it, and in any case it's cleaner and less brittle in terms of guture changes. We'll keep looking for any new reports of similar lock ups.
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.