Closed Bug 1434822 Opened 2 years ago Closed 2 years ago

Fix usage of MSHLFLAGS_NOPING

Categories

(Core :: IPC: MSCOM, enhancement)

All
Windows
enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: Jamie, Assigned: Jamie)

References

Details

Attachments

(4 files)

Bug 1384328 disabled COM garbage collection (set MSHLFLAGS_NOPING) for interceptors marshaled to the Firefox parent process. This was subsequently abstracted and used for IGeckoBackChannel as well in bug 1393589. Unfortunately, as Aaron noticed this morning, there's a bug in the FastMarshaler abstraction which causes NOPING to be used for external processes and *not* for our parent process, which is the opposite of what was intended. This may well be having a performance impact.

In addition to this, I recently discovered that using NOPING apparently causes objects to never be released. From a Microsoft support article:

> Also, it must be noted carefully that garbage collection is an all-or- nothing proposition. If you turn off garbage collection for an object, then COM-based lifetime management of the object is turned off also. This means that clients can no longer call Release() on their interface pointers and expect the object to receive the call. Release() called on any proxy is a no-op for such objects. This may seem counterintuitive at first, but it is the correct behavior because COM can no longer guarantee that the object will be shutdown when clients disappear.
( https://support.microsoft.com/en-us/help/171414/how-to-turn-off-the-com-garbage-collection-mechanism )

This may be the cause of suspicious high memory usage when accessibility is being used.  Looking at about:memory right now, I see the following for two content processes:
1,537.53 MB (86.47%) ── heap-unclassified
987.04 MB (84.93%) ── heap-unclassified
Marco also noted that a friend of his was seeing very high memory usage with Firefox. It's difficult for us to confirm whether this is a11y specific, since we can't use Firefox without a11y.

So, there are a few possible options to consider:
1. Stop using NOPING altogether. However, Aaron was seeing it cause problems, and with some brief testing, turning it on the way it was intended seemed to make buffer renders a bit faster. For example, I saw World War I render up to 400 ms faster with NVDA, but I didn't test it exhaustively.
2. Fix FastMarshaler so that we use NOPING when intended. However, that may cause memory usage to grow a lot. Since buffers touch every object, we'll be holding every object around forever. In contrast, we currently only hold objects forever which have been touched out-of-process.
3. Along with 2), find a way to release the memory. Ideas:
a) Somehow, when an accessible goes defunct, disconnect all proxies (CoDisconnectObject) and then call Release ourselves.
b) Implement a disconnect method in IGeckoBackChannel which is called by the handler.
Here are two try builds:

1. Use NOPING for our parent process (but ping for external processes) as was originally intended:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b6021899162c9e726b95f60729d0140a84d0aa0
It'd be good to know:
a) Whether this demonstrably improves performance relative to Nightly; and
b) Whether memory usage shoots up significantly relative to Nightly.

2. Never use NOPING:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0f4ece1b01d52a56c3ab67ff387770e2631ee53
It'd be good to know:
a) Whether this demonstrably hurts performance relative to Nightly; and
b) Whether this demonstrably decreases memory usage relative to Nightly.

Marco, if your friend is willing to run try builds, it'd be interesting to see if 2) fixes the memory problems.
(In reply to James Teh [:Jamie] from comment #1)
> 1. Use NOPING for our parent process (but ping for external processes) as
> was originally intended:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=3b6021899162c9e726b95f60729d0140a84d0aa0
> It'd be good to know:
> a) Whether this demonstrably improves performance relative to Nightly; and

Yes, it does! This thing hugely improves everything across the board, like buffer renders even of large documents, focus changes, seemingly everything related to accessibility. BUT:

> b) Whether memory usage shoots up significantly relative to Nightly.

A definite yes to that. Nightly, and the other try build, release memory over time again. For example, if I buffer-refresh World War I 5 times, a segment of between 27 and 30 MB of memory gets added for each refresh. Over time, Nightly and the below second try build release those instances in chunks, like after a few minutes, memory usage of such content processes drops again. But this does not happen for this particular try build. The memory gets added, but never released.

> 2. Never use NOPING:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=b0f4ece1b01d52a56c3ab67ff387770e2631ee53
> It'd be good to know:
> a) Whether this demonstrably hurts performance relative to Nightly; and

Yes, it does. Everything across the board, from buffer renders to tab switching to focus changes gets slow as hell with this build.

> b) Whether this demonstrably decreases memory usage relative to Nightly.

Nope, not radically I would say, but obviously, everything that Nightly *does* use NOPING for, which would not get released before, now does get released as well. So memory usage is better, but performance is bad, bad, bad.
I'm assigning this to me for now, as I'm doing some investigation. Whether or not I'm the final patch writer is yet to be determined. :)

I first want to verify whether CoDisconnectObject releases references associated with remote clients even when NOPING is set. The documentation doesn't specify whether CoDisconnectObject causes Release to be called. I've written a little test case and can confirm that CoDisconnectObject does release remote references without NOPING. Next step is to make it use NOPING and then test with that.
Assignee: nobody → jteh
Using this test case, I've verified that MSHLFLAGS_NOPING causes references for remote proxies to never be released automatically, even after waiting several minutes. However, I've also verified that CoDisconnectObject does release all of these references, even if MSHLFLAGS_NOPING is set.

Unfortunately, two conflicting problems occur to me:
1. We can't just have some mechanism which does an extra Release when we're done with the accessible because the stub manager might actually be holding more than one reference; we don't necessarily know how many references it acquires. Also, it's possible COM might release these references when shutting down (e.g. CoUninitialize), which might cause a crash if we've already released them. I haven't investigated that yet.
2. This was why I was investigating CoDisconnectObject. My thinking was that we could have an IGeckoBackChannel method on the server which gets called when the handler ref count goes to 0. However, it just occurred to me that using CoDisconnectObject will also disconnect any out-of-process clients, not just the Firefox parent process.

So, I think the only way we can do this is to have a Gecko accessible notify its interceptor somehow when it goes defunct. But I suspect that's going to be ugly. :(

Some notes on testing with this test case:

- Compile with: cl msaaTest.cpp /EHsc user32.lib ole32.lib oleaut32.lib oleacc.lib

- The USE_NOPING constant near the top of the file determines whether MSHLFLAGS_NOPING gets set.

- If you're testing with NVDA, to ensure you get pure out-proc COM without UIA interfering (since it acquires in-proc references), disable UIA by doing this in the NVDA Python console:
import config; config.conf["UIA"]["enabled"] = False
Then save config and restart NVDA.

- Sending WM_USER calls CoDisconnectObject and releases the app's own reference to the object. With NVDA Python console, NVDA+control+z in the app's GUI and:
import winUser; winUser.sendMessage(fg.windowHandle, winUser.WM_USER, 0, 0)

- Without NOPING, if you alt+tab into the app's GUI window and then back to its console, you'll notice the ref cout drops to 1. That means all remote references got released.

- Without NOPING, if you restart NVDA while focused in the app's GUI, the ref count does not drop to 1, presumably because the client didn't gracefully release before it exited. (Hmm!) However, after a few minutes, it drops to 1.

- With NOPING, the ref count never drops to 1. However, sending WM_USER does drop it, thus proving CoDisconnectObject cleans up remote refs.
Two more points:
1. VFO did some testing with a relations + NOPING build and didn't see any perf improvement from NOPING with JAWS.
2. I do wonder whether calling some IGeckoBackChannel method when a handler cleans up would mitigate much of the perf benefit we're seeing. In other words, is the major part of the perf gain here just because COM isn't remoting calls to Release? That should be easy enough to confirm: I'll just add a dummy IGeckoBackChannel cleanup method and throw it in a try build.
(In reply to James Teh [:Jamie] from comment #5)
> Two more points:
> 1. VFO did some testing with a relations + NOPING build and didn't see any
> perf improvement from NOPING with JAWS.

Interesting, I felt that there was a difference. And as I already found with NVDA, the NOPING build is much faster overall, including document loads.

> 2. I do wonder whether calling some IGeckoBackChannel method when a handler
> cleans up would mitigate much of the perf benefit we're seeing. In other
> words, is the major part of the perf gain here just because COM isn't
> remoting calls to Release? That should be easy enough to confirm: I'll just
> add a dummy IGeckoBackChannel cleanup method and throw it in a try build.

Hm, I see the perf gain when loading pages, which would not release stuff, but obtain objects, right?
(In reply to Marco Zehe (:MarcoZ) from comment #6)
> Hm, I see the perf gain when loading pages, which would not release stuff,
> but obtain objects, right?

NVDA, at least, releases objects as soon as it has queried all the info from them. In other words, a buffer render always releases all objects it queries. So, without NOPING, there would always be remote Release calls.
New try builds against latest mozilla-central (which now includes the relations caching patches from bug 1431264):

1. Use NOPING for our parent process (but ping for external processes) as was originally intended:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=05f7ec8632c76c7cf6bd1a56370c53de6e2dab08

2. Never use NOPING:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b42ca3c43a90dea7a2cbd721318f5727955be020
(In reply to James Teh [:Jamie] from comment #8)
> New try builds against latest mozilla-central (which now includes the
> relations caching patches from bug 1431264):
> 
> 1. Use NOPING for our parent process (but ping for external processes) as
> was originally intended:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=05f7ec8632c76c7cf6bd1a56370c53de6e2dab08

NVDA loves this build, it is very smooth and loads buffers much faster than in Nightly or the below builds. This is consistent with what I described above.

JAWS, on the other hand, doesn't like this build. It temporarily freezes, focus changes take much longer than with the below build, buffers temporarily disappear, writing in Gmail or other text fields causes delays, and the braille display doesn't follow along smoothly at all.

> 2. Never use NOPING:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=b42ca3c43a90dea7a2cbd721318f5727955be020

NVDA does not like this build. Everything is much slower than with Nightly or the above NOPING As Intended build. Buffer renders, focus changes, etc.

JAWS, on the other hand, loves this build. All strange artifacts of buffers disappearing, temporary freezes, and such are completely gone. Bringing up the Compose dialog in Gmail, which is quite delayed in Nightly and the above build, is instant in this build. Typing in the Subject field and message body causes the braille display to follow along very smoothly. I worked with that build for almost 20 minutes with JAWS without ever encountering a single problem.

For Nightly, I can only say that NVDA likes it more than JAWS does. JAWS there shows more problems, it is almost comparable to the NOPING as Intended build above, maybe works a little more smoothly, but just barely.
To recap here what Jamie and I have been working on over the past couple of days:

The goal is to use NOPING as originally intended, since it does indeed make NVDA noticeably faster with Firefox, and also give JAWS users a good experience. And fix the memory leaks. After an initial try build last week, which made NVDA fast, but leaked memory left and right, and made JAWS and Firefox freeze immediately after startup, on Monday, Jamie gave me another try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6fdd5a2620c7246295cd4836770c637da88d109a

The results here are:

1. NVDA is still fast.
2. JAWS is fast now, too.
3. But the memory leaks are still plenty, although not as plenty as with the original try build from a few weeks ago, and the one I tested last week. So some memory is already freed up, but not enough yet. By comparison, the amount of Heap - Unclassified and Total for a content process in about:memory are decreasing slightly, but not nearly enough yet, whereas they didn't at all decrease last week. And Nightly still decreases those values more than this current try build.

Before Jamie went to bed on Monday, he hinted in the #accessibility IRC channel that he found out that all accessibles fetched via HyperText aren't freed yet, even though they should. This is being investigated next.
Comment on attachment 8953039 [details]
Bug 1434822 part 3: On Windows, when a content Accessible shuts down, disconnect all associated remote clients.

https://reviewboard.mozilla.org/r/222312/#review228212
Attachment #8953039 - Flags: review?(mzehe) → review+
Comment on attachment 8953037 [details]
Bug 1434822 part 1: Disable COM ping functionality for our parent process instead of for external processes.

https://reviewboard.mozilla.org/r/222308/#review229684
Attachment #8953037 - Flags: review?(aklotz) → review+
Comment on attachment 8953038 [details]
Bug 1434822 part 2: mscom: Add a function to disconnect all remote clients associated with a given target.

https://reviewboard.mozilla.org/r/222310/#review229682

::: ipc/mscom/IHandlerProvider.h:26
(Diff revision 1)
>  {
>    virtual STDMETHODIMP GetHandler(NotNull<CLSID*> aHandlerClsid) = 0;
>    virtual STDMETHODIMP GetHandlerPayloadSize(NotNull<IInterceptor*> aInterceptor, NotNull<DWORD*> aOutPayloadSize) = 0;
>    virtual STDMETHODIMP WriteHandlerPayload(NotNull<IInterceptor*> aInterceptor, NotNull<IStream*> aStream) = 0;
>    virtual STDMETHODIMP_(REFIID) MarshalAs(REFIID aIid) = 0;
> +  virtual STDMETHODIMP DisconnectRemotes() = 0;

Can you please rename this to DisconnectHandlerRemotes? It makes things a bit more clear at the call site in Interceptor.

::: ipc/mscom/Interceptor.h:89
(Diff revision 1)
> +   * the Interceptor, its target and any objects associated with the
> +   * HandlerProvider.
> +   * @return S_OK if there was an Interceptor for the given target,
> +   *         S_FALSE if there was not.
> +   */
> +  static HRESULT DisconnectRemotesForTarget(IUnknown* aTarget);

We haven't fixed the "canonical IUnknown in live set" issue yet, so can you please clarify in the comments *which* IUnknown to use?

::: ipc/mscom/Interceptor.cpp:847
(Diff revision 1)
> +  // S_FALSE instead of an error in that case.
> +  RefPtr<IWeakReference> existingWeak(Move(GetLiveSet().Get(aTarget)));
> +  if (!existingWeak) {
> +    return S_FALSE;
> +  }
> +  RefPtr<IWeakReferenceSource> existingStrong;

Nit: please add a blank line between the closing } of the previous block and the next line.

::: ipc/mscom/MainThreadHandoff.cpp:598
(Diff revision 1)
> +MainThreadHandoff::DisconnectRemotes()
> +{
> +  if (!mHandlerProvider) {
> +    return E_NOTIMPL;
> +  }
> +  return mHandlerProvider->DisconnectRemotes();

Nit: please add a blank line between the closing } of the previous block and the next line.
Attachment #8953038 - Flags: review?(aklotz) → review+
Pushed by jteh@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c055d7b57071
part 1: Disable COM ping functionality for our parent process instead of for external processes. r=aklotz
https://hg.mozilla.org/integration/autoland/rev/54526e7c7b6c
part 2: mscom: Add a function to disconnect all remote clients associated with a given target. r=aklotz
https://hg.mozilla.org/integration/autoland/rev/918b41fa66d9
part 3: On Windows, when a content Accessible shuts down, disconnect all associated remote clients. r=MarcoZ
Depends on: 1442523
Backout by btara@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/ffb961bc9a04
Backed out changeset 918b41fa66d9 is causing 1442523. a=backout
https://hg.mozilla.org/mozilla-central/rev/49d0f2d94285
Backed out changeset 54526e7c7b6c for causing 1442523. a=backout
https://hg.mozilla.org/mozilla-central/rev/77e7592cbccf
Backed out changeset c055d7b57071 for causing 1442523. a=backout
Backed out on request for causing hangs (bug 1442523).
Flags: needinfo?(jteh)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla60 → ---
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/1556761288cc
part 1: Disable COM ping functionality for our parent process instead of for external processes. r=aklotz
https://hg.mozilla.org/integration/autoland/rev/86288678c3d6
part 2: mscom: Add a function to disconnect all remote clients associated with a given target. r=aklotz
https://hg.mozilla.org/integration/autoland/rev/4811c426205d
part 3: On Windows, when a content Accessible shuts down, disconnect all associated remote clients. r=MarcoZ
Bug 1442523 has been fixed and landed, so we requested that this be re-landed (comment 24).
Flags: needinfo?(jteh)
https://hg.mozilla.org/mozilla-central/rev/1556761288cc
https://hg.mozilla.org/mozilla-central/rev/86288678c3d6
https://hg.mozilla.org/mozilla-central/rev/4811c426205d
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.