Web Audio CPU usage climbs during long waits for garbage collection

RESOLVED FIXED in Firefox 41

Status

()

defect
P1
normal
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: karlt, Assigned: jesup)

Tracking

(Blocks 2 bugs, {perf})

26 Branch
mozilla41
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox41 fixed)

Details

()

Attachments

(2 attachments, 3 obsolete attachments)

Reporter

Description

5 years ago
1. Load https://dl.dropboxusercontent.com/u/40949268/Bugs/beep.html
2. Select sampling rate of 44100 or 48000, trying to guess native rate but it
   is not critical to reproducing.
   (Leave sample frame size at 2048.)
3. Wait for 10 or 20 seconds (for GCs to finish).
4. Click "start audio"
5. Observe CPU use of "MediaStreamGrph" thread over a minute
   (If on a platform supporting "top", type 'H' for threads.)

Desired results:
CPU use remains consistent (and low).

Actual results:
CPU use climbs for about a minute and then falls back to a stable level.
Here it may climb to over 60% before falling back to 2.6 to 3.6%.

Forcing CC from a preloaded about:memory in another window collects no objects
and has no effect on cpu.
Forcing GC causes CC to collect objects and the cpu usage falls.
Switching tabs also seems to cause objects to be collected and bring cpu down.
Reporter

Comment 1

5 years ago
Each Web Audio node will usually consume cpu for as long as it lives.
Many pages create many nodes (many node types are designed as one-shot nodes.)
The test page is pretty much as simple as it gets.  More complex audio graphs
will run out of available cpu and stutter.

I assume GC is tuned for memory pressure, not cpu resources.
I think we need to stop depending on GC for cleaning up unnecessary streams.
For nodes that can be reconnected (not one-shot source nodes) this may involve
nodes recreating streams after they are destroyed.  Another option might be to
remove the stream from the graph and add later if required.

This is a variation on bug 897796 comment 30:

1. Passive streams without input are destroyed.
   This includes streams with a node.

2. Newly connected source nodes will propagate (re)creation of streams for
   their output nodes.

3. Streams without node or output are destroyed.
   This includes active streams.

4. Occasional cycle collection on the graph collects cycles without nodes or
   outputs to nodes (sinks), and cycles without inputs from active streams.

1 and 2 can be implemented separately in this bug and 3 and 4 in bug 897796.
Reporter

Comment 2

5 years ago
This is a proof of concept, which is effective only for one-shot source nodes.
More complex graphs will have other nodes that will contribute to cpu usage
but are not cleaned up here.

In the simple example here, it reduces MSG CPU usage to a stable 1.2%.  Even
if GC is running frequently, this is still a reduction to less than half.
Reporter

Updated

5 years ago
See Also: → 897796
Reporter

Comment 3

5 years ago
This was a failed attempt to trigger GC more frequently.

sLikelyShortLivingObjectsNeedingGC reached 13100 on 
http://www.cappel-nord.de/webaudio/acid-defender/
and 24200 on http://webaudiodemos.appspot.com/MIDIDrums/index.html
before GC, which was well after MSG CPU reached 100%.

At least, with this change, forcing a CC when
sLikelyShortLivingObjectsNeedingGC > 2500 leads to a GC, but there is nothing
to trigger a CC before we get the GC we need, at that seems to be on a 1 minute timer.
Keywords: perf
Reporter

Updated

4 years ago
Blocks: 923319
Hey Karl -- We've gotten to a point where we really need this bug fixed for Web Audio perf.  It's affecting some important demos, and it's particularly noticeable now because we've solved so many other perf issues.  Will you able able to start back on this bug this week?
Flags: needinfo?(karlt)
Flags: needinfo?(ajones)
I'm running the first patch, rebased, on try because it implements what's we need for now.

It was deadlocking because we were running messages on the main thread during shut down, and then it was crashing because we would try to update a stream that was already destroyed.

Third try here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b17257d47548
Assignee: karlt → rjesup
Status: NEW → ASSIGNED
Comment on attachment 8620801 [details] [diff] [review]
fix deadlocks in destroying streams in MSG due to patch 1

Review of attachment 8620801 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaStreamGraph.cpp
@@ +128,5 @@
>    SetStreamOrderDirty();
>  }
>  
>  void
> +MediaStreamGraphImpl::RemoveStreamUnlocked(MediaStream* aStream)

Let's not clone the entire routine and just omit one line; that's a pain for maintenance.

Instead, pass a bool aLock to control the locking.  Autolocks are a pain when conditional so we have to duplicate the for() loop.

@@ +1298,5 @@
>      mStreamUpdates.SetCapacity(mStreamUpdates.Length() + mStreams.Length());
>      for (uint32_t i = 0; i < mStreams.Length(); ++i) {
>        MediaStream* stream = mStreams[i];
> +      if (!stream->MainThreadNeedsUpdates() ||
> +          stream->IsDestroyed()) {

Mainthread only (see follow-on patch; testing mGraph is a valid replacement to see if it's been destroyed on the MSG thread).

@@ +2125,5 @@
> +    {
> +      mStream->RemoveAllListenersImpl();
> +      auto graph = mStream->GraphImpl();
> +      mStream->DestroyImpl();
> +      graph->RemoveStreamUnlocked(mStream);

let's add a comment about why this is needed
Attachment #8620801 - Flags: review+
Comment on attachment 8620799 [details] [diff] [review]
destroy WebAudio MediaStream when a source finishes

Apply verbal r+ from padenot
Attachment #8620799 - Flags: review+
Attachment #8377800 - Attachment is obsolete: true
Keywords: leave-open
Attachment #8620802 - Flags: review?(roc)
Attachment #8620802 - Flags: review?(padenot)
Attachment #8620802 - Flags: review?(roc)
Attachment #8620802 - Flags: review?(padenot)
Attachment #8620802 - Flags: review+
To test on spark, you can grab builds here once their done: https://treeherder.mozilla.org/#/jobs?repo=try&revision=67a26d625497

This fix should be in tomorrow's Nightly.
Depends on: 1061220
Hi Harald - we got tripped up by a race condition in windows & mac and got backed out (working on fixing it), but you should be able to use the try build in comment 12 to verify that this will resolve the WebAudio game issues for you.  Can you test using that build and let us know how it works for you?

Thanks!
Flags: needinfo?(hkirschner)
Reporter

Comment 16

4 years ago
Comment on attachment 8620801 [details] [diff] [review]
fix deadlocks in destroying streams in MSG due to patch 1

1dcfd15bb387 should be enough to fix the original deadlock from calling back
into the graph during NotifyMainThreadStreamFinished(), which was already on
the main thread.

I don't think this patch is necessary.
Attachment #8620801 - Attachment is obsolete: true
Flags: needinfo?(karlt)
Reporter

Comment 18

4 years ago
I'll open a new bug about destroying and re-creating streams for processing (not just source) nodes.
Keywords: leave-open
Reporter

Updated

4 years ago
Attachment #8620802 - Attachment is obsolete: true
With Karl's landing last night (thanks to Karl, Padenot and Jesup -- a great team effort!), the improvements needed for the games demos are on inbound, and it looks they have stuck (i.e. they won't get backed out).  I expect a merge over to central very soon.

Harald -- Since it's Friday, can you verify today that the remaining issues with the demos are resolved?  Padenot tested the fixes, and it all looks good to him, but I know you were able to reliably repro the problems -- and some of the problems we had trouble repro'iing.  Please feel free to reach out to me or Padenot on irc (we're in #games) if you have any questions or concerns.  Thanks!
https://hg.mozilla.org/mozilla-central/rev/7d4c3379ecb3
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Reporter

Updated

4 years ago
Flags: needinfo?(ajones)

Updated

4 years ago
Depends on: 1175510
Can't test currently, Angry Bots is getting OOM killed during loading in the latest dogfooding version on Aries.
Flags: needinfo?(hkirschner)
Reporter

Updated

4 years ago
Depends on: 1179662
Still running crash, filed bug 1182600
Reporter

Updated

4 years ago
Depends on: 1185176
Reporter

Updated

4 years ago
Blocks: 1189562
Reporter

Updated

4 years ago
Depends on: 1191649
Reporter

Updated

4 years ago
Blocks: 1167681
You need to log in before you can comment on or make changes to this bug.