Closed Bug 1116202 Opened 5 years ago Closed 5 years ago

Intermittent run-by-dir leakcheck | default process: 2331 bytes leaked (AsyncLatencyLogger, AudioNodeEngine, AudioNodeStream, CondVar, ControlMessage, ...) from test_AudioNodeDevtoolsAPI.html leaking AudioNodeStreams

Categories

(Core :: Web Audio, defect)

defect
Not set

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: vaibhav1994, Assigned: padenot)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 4 obsolete files)

in bug 1110982, we are looking to enable tests where we run a fresh browser instance per directory.  Usually what happens is that a few tests fail because they accidentally depend on the state of the browser from an earlier test.

In the try run:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c1a0dea1e39c

This test causes a leak in mac debug jobs. The failure goes like this:

 TEST-UNEXPECTED-FAIL | leakcheck | default process: 2331 bytes leaked (AsyncLatencyLogger, AudioNodeEngine, AudioNodeStream, CondVar, ControlMessage, ...)
Blocks: 1110982
Ehsan, I see you are the original author of this test file (from bug 1015783), I am not sure why it leaks and only on osx, maybe it is when we instantiate the AudioContext:
http://dxr.mozilla.org/mozilla-central/source/dom/media/webaudio/test/test_AudioNodeDevtoolsAPI.html#16

I don't have a osx box to verify this is easy to reproduce locally.
No longer blocks: 1110982
Flags: needinfo?(ehsan)
It would have been helpful if this bug mentioned that the failure is intermittent.

Anyway, this assertion explains the leak:

14:13:33     INFO -  [1653] ###!!! ASSERTION: Appshell already destroyed?: 'Error', file /builds/slave/try-osx64-d-000000000000000000/build/src/dom/media/MediaStreamGraph.cpp, line 1743
14:13:33     INFO -  #01: mozilla::MediaStreamGraphImpl::AppendMessage(mozilla::ControlMessage*) [dom/media/MediaStreamGraph.cpp:1798]
14:13:33     INFO -  #02: mozilla::MediaStream::Destroy() [dom/media/MediaStreamGraph.cpp:1983]
14:13:33     INFO -  #03: mozilla::dom::AudioNode::DestroyMediaStream() [xpcom/base/nsRefPtr.h:43]
14:13:33     INFO -  #04: mozilla::dom::AudioDestinationNode::DestroyMediaStream() [dom/media/webaudio/AudioDestinationNode.cpp:418]
14:13:33     INFO -  #05: mozilla::dom::AudioNode::DisconnectFromGraph() [dom/media/webaudio/AudioNode.cpp:188]
14:13:33     INFO -  #06: mozilla::dom::AudioNode::cycleCollection::Unlink(void*) [xpcom/base/nsRefPtr.h:208]
14:13:33     INFO -  #07: mozilla::dom::AudioDestinationNode::cycleCollection::Unlink(void*) [xpcom/glue/nsCOMPtr.h:371]
14:13:33     INFO -  #08: nsCycleCollector::CollectWhite() [xpcom/base/nsCycleCollector.cpp:3316]
14:13:33     INFO -  #09: nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*, bool) [xpcom/base/nsCycleCollector.cpp:3646]
14:13:33     INFO -  #10: nsCycleCollector::Shutdown() [xpcom/base/nsCycleCollector.cpp:3575]
14:13:33     INFO -  #11: nsCycleCollector_shutdown() [xpcom/base/nsRefPtr.h:43]
14:13:33     INFO -  #12: mozilla::ShutdownXPCOM(nsIServiceManager*) [gfx/layers/ipc/AsyncTransactionTracker.h:130]
14:13:33     INFO -  #13: ScopedXPCOMStartup::~ScopedXPCOMStartup() [toolkit/xre/nsAppRunner.cpp:1335]
14:13:33     INFO -  #14: XREMain::XRE_main(int, char**, nsXREAppData const*) [memory/mozalloc/mozalloc.h:234]
14:13:33     INFO -  #15: XRE_main [toolkit/xre/nsAppRunner.cpp:4446]
14:13:33     INFO -  #16: main [browser/app/nsBrowserApp.cpp:292]

The issue is that we destroy the AudioDestinationNode during the last CC in ShutdownXPCOM, at which point the appshell service is already gone away, so we cannot finish destroying the AudioNodeStreams, at which point those objects and anything else that it keeps alive will leak.

I never managed to reproduce this issue, but I wrote an experimental patch which might fix this.  Please test it.
Flags: needinfo?(ehsan)
Summary: test_AudioNodeDevtoolsAPI.html fails when run as a standalone directory → test_AudioNodeDevtoolsAPI.html fails intermittently when run as a standalone directory
Vaibhav, can you please test this patch and let me know if it fixes the issue?  Thanks!
Attachment #8542265 - Flags: feedback?(vaibhavmagarwal)
Component: DOM → Web Audio
7 / 10 10.8 debug m-oth jobs have the leak with this patch:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aed4d244040e
Comment on attachment 8542265 [details] [diff] [review]
Destroy the destination node's media stream when the AudioContext is being shut down because the window is getting destroyed

Review of attachment 8542265 [details] [diff] [review]:
-----------------------------------------------------------------

while this is probably a good patch to have in general, it doesn't seem to be fixing the leak.
Attachment #8542265 - Flags: feedback?(vaibhavmagarwal) → feedback-
So it turns out that nsGlobalWindow::FreeInnerObjects is not guaranteed to be called for chrome window objects, which I think is the reason why the previous patch did not work.  Can you please try this one instead?
Attachment #8542590 - Flags: feedback?(jmaher)
Attachment #8542265 - Attachment is obsolete: true
Comment on attachment 8542590 [details] [diff] [review]
Destroy the destination node's media stream when XPCOM is being shut down

Review of attachment 8542590 [details] [diff] [review]:
-----------------------------------------------------------------

we still have a lot of leaks:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e08d8164d81
Attachment #8542590 - Flags: feedback?(jmaher) → feedback-
(In reply to Joel Maher (:jmaher) from comment #11)
> we still have a lot of leaks:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e08d8164d81

Yes, I never intended to fix all possible leaks.  ;-)  But the AudioNodeStream leaks are gone now.  Please file the rest separately, I think there is another issue with the MediaStreamGraph shutdown that is not related to AudioNodeStreams.  Thanks!
Assignee: nobody → ehsan
Summary: test_AudioNodeDevtoolsAPI.html fails intermittently when run as a standalone directory → test_AudioNodeDevtoolsAPI.html leaks AudioNodeStreams intermittently when run as a standalone directory
Attachment #8542590 - Flags: review?(padenot)
Comment on attachment 8542590 [details] [diff] [review]
Destroy the destination node's media stream when XPCOM is being shut down

Review of attachment 8542590 [details] [diff] [review]:
-----------------------------------------------------------------

jmaher, I've filed bug 1117727 (with a log example and some notes), blocking 1110982 to track the rest of the leak, that I believe is unrelated, as ehsan said.
Attachment #8542590 - Flags: review?(padenot) → review+
This patch bounced because of test failures <https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&author=eakhgari%40mozilla.com>.  Turns out that AudioDestinationNode::DestroyMediaStream() might early return when called during shutdown!
This will prevent us from needing to rely on the last CC to destroy
the stream for us.  The last CC is too late for destroying streams
properly.
Attachment #8544082 - Flags: review?(padenot)
This caused failures on other platforms as well (like OSX 10.8 debug - https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5085703&repo=mozilla-inbound)

Please give this a full Try run before pushing again.
Ah, all of these issues are caused because AudioContext is now being kept alive by the observer service for the duration of the browser's lifetime.  I am testing a fix on the try server right now.
Attachment #8544082 - Attachment is obsolete: true
Attachment #8544082 - Flags: review?(padenot)
Attachment #8542590 - Attachment is obsolete: true
This will prevent us from needing to rely on the last CC to destroy
the stream for us.  The last CC is too late for destroying streams
properly.

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c42aa732530c
Attachment #8544583 - Flags: review?(padenot)
Comment on attachment 8544583 [details] [diff] [review]
Destroy the destination node's media stream when XPCOM is being shut down

Wrong patch...
Attachment #8544583 - Attachment is obsolete: true
Attachment #8544583 - Flags: review?(padenot)
This will prevent us from needing to rely on the last CC to destroy
the stream for us.  The last CC is too late for destroying streams
properly.
Attachment #8544772 - Flags: review?(padenot)
Comment on attachment 8544772 [details] [diff] [review]
Destroy the destination node's media stream when XPCOM is being shut down

Review of attachment 8544772 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, but what's with the android m-10 perma orange on the try push?

::: dom/media/webaudio/AudioDestinationNode.cpp
@@ +403,3 @@
>  
> +    if (target) {
> +      // GetOwner() may return null if we are destroying the media streami

nit: s/streami/stream/
Attachment #8544772 - Flags: review?(padenot) → review+
Hmm, so I tried reverting the AudioDestinationNode hunk in <https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=b49d78696b90> but that didn't fix the m-10 perma orange.

Any chance you can take a look at why that happens, please?
Flags: needinfo?(padenot)
We can't really destroy the destination here, because the audio thread is still running, and the AudioDestinationNode is used all over the place to get the .currentTime. What happens here is that we destroy the AudioDestinationNode's MediaStream, and then it crashes some time later on a virtual call on a nullptr.

I think it would be better to force-shutdown the MSG on XPCOM shutdown or something. The issue is that that's also async (and can take some time, depending on the platform), because the audio callback needs to fire, notice it needs to exit immediately, and return. Is there an even that fires earlier than XPCOM shutdown, but also indicates shutdown we could hook up to?
Flags: needinfo?(padenot) → needinfo?(ehsan)
(In reply to Paul Adenot (:padenot) from comment #23)
> We can't really destroy the destination here, because the audio thread is
> still running, and the AudioDestinationNode is used all over the place to
> get the .currentTime. What happens here is that we destroy the
> AudioDestinationNode's MediaStream, and then it crashes some time later on a
> virtual call on a nullptr.

Ah, right.  Can't we just make that AudioNodeStream* an nsRefPtr<AudioNodeStream>?

> I think it would be better to force-shutdown the MSG on XPCOM shutdown or
> something. The issue is that that's also async (and can take some time,
> depending on the platform), because the audio callback needs to fire, notice
> it needs to exit immediately, and return. Is there an even that fires
> earlier than XPCOM shutdown, but also indicates shutdown we could hook up to?

There are several events that fire before shutdown but an async tear down of the MSG thread will not work with any of them since they're all fired sequentially one after another.

There is this new nsIAsyncShutdownBlocker interface which is supposed to let you spin the event loop before shutdown proceeds, but I can't find anywhere in the code where it's used (at least from C++).  Perhaps Yoric can help us on how to use it?
Flags: needinfo?(ehsan) → needinfo?(dteller)
Due to lack of reviewer, I haven't been able to land C++ code using nsIAsyncShutdownBlocker yet. However, I'm quite willing to help.

Essentially, the use case is a service FooService and clients that need to complete some work before FooService shuts down. In many cases, said work even needs to be started during shutdown. nsIAsyncShutdown* lets you describe this kind of dependencies, in most cases without spinning the event loop. Does this match your need?
Flags: needinfo?(dteller)
Hmm, not sure if we need any kind of dependency tracking here.  We just need to wait for the MSG thread's shutdown sequence to be finished before letting the XPCOM services die.

roc, should we just spin the event loop here?
Flags: needinfo?(roc)
Spinning the event loop during shutdown sounds very dangerous.

Is it not possible to cancel the callback, synchronize with the audio thread and have it shut down while we wait on a monitor?
Flags: needinfo?(roc) → needinfo?(padenot)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #27)
> Spinning the event loop during shutdown sounds very dangerous.

That is my intuition as well, but FWIW it seems like common practice in other code such as IDB.  It seems to have worked in practice...  (and that surprises me!)
Duplicate of this bug: 1121539
Spinning event loop during xpcom-will-shutdown is no-no and documentation is clear about that, but
other than that it should be ok (-ish at least).
(In reply to :Ehsan Akhgari (not reading bugmail, needinfo? me!) from comment #28)
> That is my intuition as well, but FWIW it seems like common practice in
> other code such as IDB.

Also workers!
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #31)
> Also workers!

And PBackground shutdown
How about we spin in profile-before-change too, like all the k00l k1dz do these days? ;-)
(In reply to Robert O'Callahan (:roc) (offline Jan 17-22) (Mozilla Corporation) from comment #27)
> Spinning the event loop during shutdown sounds very dangerous.
> 
> Is it not possible to cancel the callback, synchronize with the audio thread
> and have it shut down while we wait on a monitor?

Reassigning to Paul.  If the answer to this question is yes, he's a better person to finish this anyway.
Assignee: ehsan → padenot
Summary: test_AudioNodeDevtoolsAPI.html leaks AudioNodeStreams intermittently when run as a standalone directory → Intermittent run-by-dir leakcheck | default process: 2331 bytes leaked (AsyncLatencyLogger, AudioNodeEngine, AudioNodeStream, CondVar, ControlMessage, ...) from test_AudioNodeDevtoolsAPI.html leaking AudioNodeStreams
:padenot, can you comment on this, otherwise I am inclined to disable this test for osx debug.
This seems to have mysteriously gone away.
yeah, not sure what's up, although since we now ::Close the context during FreeInnerObject, that certainly have changed the timing.
Flags: needinfo?(padenot)
Closing, feel free to reopen if it starts again.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME
Duplicate of this bug: 1121263
You need to log in before you can comment on or make changes to this bug.