Closed Bug 1388409 (jaws-jank) Opened 3 years ago Closed 2 years ago

Long jank issues with JAWS and Nightly on text heavy pages

Categories

(Core :: Disability Access APIs, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox57 --- affected

People

(Reporter: jimm, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: aes+)

Attachments

(3 files)

STR:

1) open jaws and then nightly
2) in the one open tab visit the westworld wikipedia page

https://en.wikipedia.org/wiki/Westworld_(TV_series)

3) wait for the page to become responsive

In vmware / win10 on my workstation this period of time is around ten minutes.

Occasionally the browser would simply exit without an error. No reports visible in about:crashes when this happens.
By comparison, NVDA takes under 1 second for this article with E10S on and on my regular Windows 10 (without a VM). So I suspect even in a VM NVDA will be a lot faster.
let's go with P2 as it's JAWS specific (I'm sure we can do it better on our side, but we have to sort out generic perf problems at first)
Priority: -- → P2
Duplicate of this bug: 1369215
Blocks: 1354238
Here's a profile with a 642 ms text change events:
https://perf-html.io/public/ac8fb10c13c8c1912d61c457a9423ac7f264e0b8/calltree/?hiddenThreads=&range=70.5309_71.6750&thread=0&threadOrder=0-2-3-4-1-5

We spend time in a few places all of which are tied to the jaws in-proc dll doing com stuff. The most notable seems to be 72 msec spend in qi'ing a remote interface.
Flags: needinfo?(surkov.alexander)
Here' a profiler of the westworld wikipedia page with Jaws, Nightly 57.0a1 on Win 10 64bit
https://perfht.ml/2uxpOb0
(In reply to Jim Mathies [:jimm] from comment #5)
> Here's a profile with a 642 ms text change events:
> https://perf-html.io/public/ac8fb10c13c8c1912d61c457a9423ac7f264e0b8/
> calltree/?hiddenThreads=&range=70.5309_71.6750&thread=0&threadOrder=0-2-3-4-
> 1-5

Is it accurate URL? I don't see 642ms number. I see that 98ms (16%) is spent on PDocAccessible::Msg_ShowEvent and 34% (212ms) is spent by fsDOMSrv.dll. Could you give me please some instructions how to read the profile properly?
hmmm, you'll have to follow the correct link from within wikipedia suggestion.
profile while in heavy hang at westworld wikipedia page. https://perfht.ml/2vXeRCX
(In reply to Tracy Walker [:tracy] from comment #10)
> profile while in heavy hang at westworld wikipedia page.
> https://perfht.ml/2vXeRCX

Looks like the browser's stuck in a long PDocAccessible::Msg_SyncTextChangeEvent from content.
Alias: jaws-jank
Group: mozilla-employee-confidential
Attached file mainthreadstack.txt
(In reply to Jim Mathies [:jimm] from comment #12)
> Here's a dump during a bad hang while loading the westworld page. (taken
> from a vm)
> 
> http://mathies.com/mozilla/firefox.zip

main thread stack for this hang.
(In reply to Jim Mathies [:jimm] from comment #11)
> (In reply to Tracy Walker [:tracy] from comment #10)
> > profile while in heavy hang at westworld wikipedia page.
> > https://perfht.ml/2vXeRCX
> 
> Looks like the browser's stuck in a long
> PDocAccessible::Msg_SyncTextChangeEvent from content.

What do you mean by long? Is it different than large number of events?

It's interesting that on a bottom of the stack is NSWaitForMultipleObjects taking about 100% of time, and there's no any evidence that text events generation itself takes anything, all I can see is the time is spent on events delivering from content to a parent process. Is this observation correct?
Flags: needinfo?(surkov.alexander)
Adding NeedInfo as reminder to add perf profiles here.
Flags: needinfo?(gwimberly)
This profile surface a very interesting idea. Some of this requires some history, so please bear with me:

My first cut of a11y+e10s used a synchronous message, sent immediately after SendPDocAccessibleConstructor, to move the top-level doc's COM proxy over to the parent.

We hit all kind of crashing with that scheme, so instead I figured that maybe we should move the COM proxy stuff into SendPDocAccessibleConstructor. This creates another problem, however, because SendPDocAccessibleConstructor is async. And we don't do any preprocessing on IPDL to allow us to make it sync on Windows only.

Based on some suggestions that were brought up in the a11y meeting, I modified our code to do the following:
* Content calls SendPDocAccessibleConstructor. When the parent is done processing that, it calls SendCOMProxy which is an async message sent to content that says, "Hey, I'm done with that constructor and here's the COM proxy of the top-level doc's parent"

* Between the time that Content calls SendPDocAccessibleConstructor and the time that RecvCOMProxy fires, that PDocAccessible enqueues any events that are supposed to fire on it.

* Once RecvCOMProxy fires, the PDocAccessible then fires all of those events that it had accumulated.


I'm thinking that this scheme is hurting the ability of a11y code to properly coalesce events, since this none of this enqueuing in the IPDL actors is visible to it.

Now that so many crashes have been fixed by other bugs, I think it is worth revisiting the sync SendCOMProxy message. That might fix all of our event problems for free.

I can throw together a test patch + try build and we can see how that goes.
(In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now, please send interceptor reviews to dmajor or handyman) from comment #16)

> I'm thinking that this scheme is hurting the ability of a11y code to
> properly coalesce events, since this none of this enqueuing in the IPDL
> actors is visible to it.

Can you elaborate on this? I believe the event coalescence happens before we send any events over ipc, no?
Flags: needinfo?(aklotz)
(In reply to alexander :surkov from comment #14)
> (In reply to Jim Mathies [:jimm] from comment #11)
> > (In reply to Tracy Walker [:tracy] from comment #10)
> > > profile while in heavy hang at westworld wikipedia page.
> > > https://perfht.ml/2vXeRCX
> > 
> > Looks like the browser's stuck in a long
> > PDocAccessible::Msg_SyncTextChangeEvent from content.
> 
> What do you mean by long? Is it different than large number of events?
> 
> It's interesting that on a bottom of the stack is NSWaitForMultipleObjects
> taking about 100% of time, and there's no any evidence that text events
> generation itself takes anything, all I can see is the time is spent on
> events delivering from content to a parent process. Is this observation
> correct?

Content 1 of 2 is sitting in an ipc wait after calling SendSyncTextChangeEvent. That's pretty much all this thread does for the entire profile. The parent is missing data but there is some on the end.
(In reply to Jim Mathies [:jimm] from comment #18)
> (In reply to alexander :surkov from comment #14)
> > (In reply to Jim Mathies [:jimm] from comment #11)
> > > (In reply to Tracy Walker [:tracy] from comment #10)
> > > > profile while in heavy hang at westworld wikipedia page.
> > > > https://perfht.ml/2vXeRCX
> > > 
> > > Looks like the browser's stuck in a long
> > > PDocAccessible::Msg_SyncTextChangeEvent from content.
> > 
> > What do you mean by long? Is it different than large number of events?
> > 
> > It's interesting that on a bottom of the stack is NSWaitForMultipleObjects
> > taking about 100% of time, and there's no any evidence that text events
> > generation itself takes anything, all I can see is the time is spent on
> > events delivering from content to a parent process. Is this observation
> > correct?
> 
> Content 1 of 2 is sitting in an ipc wait after calling
> SendSyncTextChangeEvent. That's pretty much all this thread does for the
> entire profile. The parent is missing data but there is some on the end.

I see a lot of calls by the JAWS module to AccessibleHandler::get_relations. Aaron, is that something we can optimize?
(In reply to Jim Mathies [:jimm] from comment #19)
> I see a lot of calls by the JAWS module to AccessibleHandler::get_relations.
> Aaron, is that something we can optimize?

The main work that we did there was bug 1384328.

My understanding is that there are newer APIs that JAWS could switch to on their end that might help.

Alex, what is the replacement for get_relations?
Flags: needinfo?(surkov.alexander)
relationTargetsOfType [1], which may be a perf relief if JAWS doesn't need to grab all relations at once, or if IAccessibleRelation interface is costy by itself to expose through COM jungle.

If you see that they call a lot of get_relations, then it could be that they want to grab all possible relations for a whole document, in which case, the method might be not that helpful as I anticipated. If that's a case, then we probably should reach them asking if they can do something else. 

[1] http://git.linuxfoundation.org/?p=a11y/ia2.git;a=blob;f=api/Accessible2_2.idl;h=e90c2a348d48828488ce7e657d82ee446c598f7f;hb=HEAD#l114
Flags: needinfo?(surkov.alexander)
(In reply to Jim Mathies [:jimm] from comment #18)

> Content 1 of 2 is sitting in an ipc wait after calling
> SendSyncTextChangeEvent. That's pretty much all this thread does for the
> entire profile. The parent is missing data but there is some on the end.

So, it is waiting for a parent process to answer? Btw, I see same picture for 'content 2 of 2' and GeckoMain in a pref profile from comment #10. What is a bottleneck then there? Is the inter-process communication a thin place?

(In reply to Jim Mathies [:jimm] from comment #19)

> I see a lot of calls by the JAWS module to AccessibleHandler::get_relations.
> Aaron, is that something we can optimize?

Which profile do you look at?

Do you see a lot of get_relations calls while we deliver text change events? Are there any show/hide events in a profile? I'm asking because it feels weird if JAWS requests relations on text change events.
(In reply to Jim Mathies [:jimm] from comment #5)
> Here's a profile with a 642 ms text change events:
> https://perf-html.io/public/ac8fb10c13c8c1912d61c457a9423ac7f264e0b8/
> calltree/?hiddenThreads=&range=70.5309_71.6750&thread=0&threadOrder=0-2-3-4-
> 1-5
> 
> We spend time in a few places all of which are tied to the jaws in-proc dll
> doing com stuff. The most notable seems to be 72 msec spend in qi'ing a
> remote interface.

Hi Jim are you looking here (see attachment)... when I hover over the SendSyncTextChangeEvent I see 610ms...
Flags: needinfo?(jmathies)
Ah we see a 624ms (642 was probably a typo) activity marker in content process 1. Cancelling NI.
Flags: needinfo?(jmathies)
(In reply to alexander :surkov from comment #22)

> (In reply to Jim Mathies [:jimm] from comment #19)
> 
> > I see a lot of calls by the JAWS module to AccessibleHandler::get_relations.
> > Aaron, is that something we can optimize?
> 
> Which profile do you look at?
> 
> Do you see a lot of get_relations calls while we deliver text change events?
> Are there any show/hide events in a profile? I'm asking because it feels
> weird if JAWS requests relations on text change events.

Thanks to Markus, demoing some profiling tricks. There's indeed a spike on get_relations [1] but those are called by JAWS. My guess will be we are bombing them with events, and that's how they answer it. It'd be interesting to reach them for their insight on the issue.

[1] https://perfht.ml/2w7IsdZ
> (In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now,
> please send interceptor reviews to dmajor or handyman) from comment #16)
> 
> > I'm thinking that this scheme is hurting the ability of a11y code to
> > properly coalesce events, since this none of this enqueuing in the IPDL
> > actors is visible to it.
> 
> Can you elaborate on this? I believe the event coalescence happens before we
> send any events over ipc, no?

From what I am seeing, with 100% async messaging, no events are coalesced. When we force content to wait for acknowledgement from the parent that the COM proxies have been set up, then the events do become coalesced (see below).

(In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now, please send interceptor reviews to dmajor or handyman) from comment #16)
> I can throw together a test patch + try build and we can see how that goes.

I have seen some positive results with NVDA on Marco's WWI wikipedia page. The pause is gone there.

Trying JAWS now but I am having some deadlock issues.(In reply to alexander :surkov from comment #17)
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now, please send interceptor reviews to dmajor or handyman) from comment #26)
> I have seen some positive results with NVDA on Marco's WWI wikipedia page.
> The pause is gone there.
> 
> Trying JAWS now but I am having some deadlock issues.

Upon further testing, I take that back. :-(
Got a perf from a hang during e10s/a11y testing with the live region:
http://bit.ly/2xvaCIu
Flags: needinfo?(gwimberly)
Correction to Comment 28:

http://bit.ly/2xvACIu
I just realized that this isn't in our aes list, so putting it there. This is essentially what I am seeing, too.
Whiteboard: aes+
(In reply to Aaron Klotz [:aklotz])
> > Can you elaborate on this? I believe the event coalescence happens before we
> > send any events over ipc, no?
> 
> From what I am seeing, with 100% async messaging, no events are coalesced.
> When we force content to wait for acknowledgement from the parent that the
> COM proxies have been set up, then the events do become coalesced (see
> below).

how exactly you do that ('force content to wait for acknowledgement from the parent')?
removed the crash dump file linked here. opening this back up.
Group: mozilla-employee-confidential
Getting long hangs while doing JAWS testing for Pre-Beta: This was a 6-7 minute hang when doing ARIA testing-

https://perfht.ml/2f1054o
STR: 
1) run JAWS
2) load test case http://oaa-accessibility.org/example/2/
3) as soon as the page loads, hit ctrl-L
4) wait for the address bar to take focus
5) collect profile

https://perfht.ml/2jwsJPT

Address bar focus delay was about 30-45 seconds here on a surface pro 4 tablet.
STR:
1) run JAWS
2) load test case http://oaa-accessibility.org/example/4/
3) as soon as the page loads, hit ctrl-L
4) wait for the address bar to take focus
5) collect profile

https://perfht.ml/2jx4kJZ

28 second page load hang
STR:

1) open JAWS then nightly
2) search bing.com for 'westworld'
3) when search results display press ctrl-L
4) wait for the address bar to take focus
5) collect profile

https://perfht.ml/2jvLPG3

This collection got a little weird at the end: the browser was still bogged down somewhat after keyboard input returned. In the task manager the tab associated with bing was consuming about 20% cpu with the parent doing the same. I shutdown JAWS and waited but the problem persisted. Finally I terminated the bing content process in the task manager, and then was finally able to ctrl-shift-2 and post this profile.
The profile in comment 36 is interesting because content process 1's message manager is sending a sync message. Then in content process 3, we also end up sending a sync message via PDocAccessible::SendSyncTextChangeEvent.

But it looks to me like JAWS is running on the browser's main thread via the app shell event. So content process 1's sync message is held up while JAWS is doing its thing on the main thread.
Duplicate of this bug: 1353818
Blocks: 1400434
Does this require a Jaws fix?
Flags: needinfo?(jmathies)
(In reply to David Bolter [:davidb] (NeedInfo me for attention) from comment #40)
> Does this require a Jaws fix?

I think it likely requires improvements on both sides.
Flags: needinfo?(jmathies)
Just tested 2018 on one of our reference laptops. I still see hangs on first page load on some pages. I'm not running the latest jaws though, updating now and will collect a profile to see how things look currently in nightly and beta.
(In reply to Jim Mathies [:jimm] from comment #42)
> Just tested 2018 on one of our reference laptops. I still see hangs on first
> page load on some pages. I'm not running the latest jaws though, updating
> now and will collect a profile to see how things look currently in nightly
> and beta.

Latest releases of JAWS 2018 and Firefox nightly:

http://perfht.ml/2gPzTux

On our low end reference hardware this combination is pretty much unusable.
Here's a related profile for latest NVDA 2017.3 on the same low end hardware - 

https://perfht.ml/2iittEE

Generally usable browser, minor page load jank on WWI but overall no major problems.
some recent jaws updates brought perf improvements: https://www.freedomscientific.com/downloads/JAWS/JAWSWhatsNew#Enhancements
jim, can you still reproduce with the steps in comment #0?
Flags: needinfo?(jmathies)
I think it's safe to close this out now. We've done a ton of perf work for JAWS, and continue to. Freedom Scientific continue to improve perf on their end as well.
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(jmathies)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.