release shared buffers from downstream so that upstream can re-use

RESOLVED FIXED in Firefox 43

Status

()

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: karlt, Assigned: karlt)

Tracking

Trunk
mozilla43
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(firefox43 fixed)

Details

Attachments

(6 attachments, 2 obsolete attachments)

Assignee

Description

4 years ago
One situation that affects us sometimes including with gain nodes
is that downstream nodes keep a reference to a buffer from
upstream.  Next iteration, the upstream node reallocates because
it sees the buffer is still in use, but the downstream node has
usually finished with the contents of the buffer.
Assignee

Comment 1

4 years ago
This is a simple solution that provides a 15% win on the benchmark at
https://github.com/karlt/webaudio-benchmark/commit/64d958e459dafdf40818c4eed11c843a9f7e92fd
Attachment #8650800 - Flags: review?(padenot)
Assignee

Comment 2

4 years ago
An alternative solution would be to make ReleaseSharedBuffers() virtual and
have a custom implementation for DelayNode and perhaps some other nodes, but
this approach leaves ReleaseSharedBuffers simple and predictable.
Attachment #8650805 - Flags: review?(padenot)
Attachment #8650800 - Flags: review?(padenot) → review+
Comment on attachment 8650805 [details] [diff] [review]
re-use DelayNode output buffer

Review of attachment 8650805 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/webaudio/DelayNode.cpp
@@ +161,5 @@
>      } else {
> +      // AudioNodeStream::ReleaseSharedBuffers() is called on delay nodes in
> +      // cycles first and so may release the buffer reference in aOutput
> +      // because downstream nodes may still be sharing when called.  Therefore
> +      // Keep a separate reference to the output buffer for re-use next

nit: something weird with this sentence. ("Therefore Keep").
Attachment #8650805 - Flags: review?(padenot) → review+
Assignee

Updated

4 years ago
Depends on: 1199113
Assignee

Updated

4 years ago
Flags: in-testsuite-
https://hg.mozilla.org/mozilla-central/rev/0b4230b29db5
https://hg.mozilla.org/mozilla-central/rev/c8d987aeaece
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Assignee

Comment 6

4 years ago
The results of the two benchmarks that I suspect were affected significantly by these patches have changed as follows:

      Simple Mixing with Gains  Granular Synthesis
win       700 ->  580               8800 -> 12950
linux    1050 ->  875              10000 -> 11250
mac      1280 -> 1100              12000 -> 14000

I'm guessing that Granular Synthesis was adversely affected because of the
number of nodes involved.  The extra pass over the mostly inactive nodes seems
to be expensive.  Bug 1189562 should help with that, but I don't think it is
going to undo all the adverse affects here.

I think it is best to back this out and come up with something different.
I have some ideas for keeping track of the number of downstream references.
That would mean that accounting is only required when adding a downstream reference and wanting to reuse a buffer.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee

Comment 8

4 years ago
bug 1197028 move AllocateAudioBlock to AudioBlock.h r?padenot

In a subsequent patch, AllocateAudioBlock will become part of an AudioBlock
class derived from AudioChunk and used for AudioNodeStream members.
Attachment #8656513 - Flags: review?(padenot)
Assignee

Comment 9

4 years ago
bug 1197028 introduce AudioBlockBuffer r?padenot

At this point AudioBlockBuffer is just like SharedBuffer but always with float
channels of length 128.
Attachment #8656514 - Flags: review?(padenot)
Assignee

Comment 10

4 years ago
bug 1197028 use AudioChunk::ChannelCount() r?padenot
Attachment #8656515 - Flags: review?(padenot)
Assignee

Comment 11

4 years ago
bug 1197028 use AudioChunk::GetDuration() r?padenot
Attachment #8656516 - Flags: review?(padenot)
Assignee

Comment 12

4 years ago
bug 1197028 introduce AudioBlock to keep track of downstream references to AudioBlockBuffer r?padenot
Attachment #8656517 - Flags: review?(padenot)
Assignee

Comment 13

4 years ago
bug 1197028 use AudioBlock for web audio processing to reuse buffers shared downstream r?padenot
Attachment #8656518 - Flags: review?(padenot)
Assignee

Updated

4 years ago
Attachment #8650800 - Attachment is obsolete: true
Assignee

Updated

4 years ago
Attachment #8650805 - Attachment is obsolete: true
Comment on attachment 8656513 [details]
MozReview Request: bug 1197028 move AllocateAudioBlock to AudioBlock.h r?padenot

https://reviewboard.mozilla.org/r/18191/#review16481
Attachment #8656513 - Flags: review?(padenot) → review+
Comment on attachment 8656514 [details]
MozReview Request: bug 1197028 introduce AudioBlockBuffer r?padenot

https://reviewboard.mozilla.org/r/18193/#review16483
Attachment #8656514 - Flags: review?(padenot) → review+
Comment on attachment 8656515 [details]
MozReview Request: bug 1197028 use AudioChunk::ChannelCount() r?padenot

https://reviewboard.mozilla.org/r/18195/#review16485
Attachment #8656515 - Flags: review?(padenot) → review+
Comment on attachment 8656516 [details]
MozReview Request: bug 1197028 use AudioChunk::GetDuration() r?padenot

https://reviewboard.mozilla.org/r/18197/#review16487
Attachment #8656516 - Flags: review?(padenot) → review+
Comment on attachment 8656517 [details]
MozReview Request: bug 1197028 introduce AudioBlock to keep track of downstream references to AudioBlockBuffer r?padenot

https://reviewboard.mozilla.org/r/18199/#review16489

::: dom/media/webaudio/AudioBlock.h:90
(Diff revision 1)
> +  bool mBufferIsDownstreamRef = false;

The name is somewhat explicit, but this could use a comment.

::: dom/media/webaudio/AudioBlock.cpp:18
(Diff revision 1)
> + *

There is something weird in this sentence I think.

::: dom/media/webaudio/AudioBlock.cpp:64
(Diff revision 1)
> +    nsrefcnt count = mRefCnt;

https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong !
Attachment #8656517 - Flags: review?(padenot) → review+
Attachment #8656518 - Flags: review?(padenot) → review+
Comment on attachment 8656518 [details]
MozReview Request: bug 1197028 use AudioBlock for web audio processing to reuse buffers shared downstream r?padenot

https://reviewboard.mozilla.org/r/18201/#review16513
Assignee

Comment 21

4 years ago
https://reviewboard.mozilla.org/r/18199/#review16525

::: dom/media/webaudio/AudioBlock.h:14
(Diff revision 1)
> + * An AudioChunk whose buffer contents need to be valid only for one
> + * processing block iteration, after which contents can be overwritten.

Added "if the buffer has not been passed to longer term storage or to another thread, which may happen through AsAudioChunk() or AsMutableChunk()."

::: dom/media/webaudio/AudioBlock.h:90
(Diff revision 1)
> +  bool mBufferIsDownstreamRef = false;

Yes, added:

  // mBufferIsDownstreamRef is set only when mBuffer references an
  // AudioBlockBuffer created in a different AudioBlock.  That can happen when
  // this AudioBlock is on a node downstream from the node which created the
  // buffer.  When this is set, the AudioBlockBuffer is notified that this
  // reference does prevent the upstream node from re-using the buffer next
  // iteration and modifying its contents.  The AudioBlockBuffer is also
  // notified when mBuffer releases this reference.

::: dom/media/webaudio/AudioBlock.cpp:18
(Diff revision 1)
> + *

Yes, made this:

 \* Downstream references are accounted specially so that the creator of the
 \* buffer can reuse and modify its contents next iteration if other references
 \* are all downstream temporary references held by AudioBlock.

::: dom/media/webaudio/AudioBlock.cpp:64
(Diff revision 1)
> +    nsrefcnt count = mRefCnt;

I've changed the comment somewhat to explain:

    // mRefCnt is atomic and so reading its value is defined even when
    // modifications may happen on other threads.  mDownstreamRefCount is
    // not modified on any other thread.
    //
    // If all other references are downstream references (managed on this, the
    // graph thread), then other threads are not using this buffer and cannot
    // add further references.  This method can safely return false.  The
    // buffer contents can be modified.
    //
    // If there are other references that are not downstream references, then
    // this method will return true.  The buffer will be assumed to be still
    // in use and so will not be reused.
Assignee

Updated

4 years ago
Depends on: 1203380
Assignee

Comment 25

4 years ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #6)
> The results of the two benchmarks that I suspect were affected significantly
> by these patches have changed as follows:
> 
>       Simple Mixing with Gains  Granular Synthesis
> win       700 ->  580               8800 -> 12950
> linux    1050 ->  875              10000 -> 11250
> mac      1280 -> 1100              12000 -> 14000
> 
> I'm guessing that Granular Synthesis was adversely affected because of the
> number of nodes involved.  The extra pass over the mostly inactive nodes
> seems
> to be expensive.  Bug 1189562 should help with that, but I don't think it is
> going to undo all the adverse affects here.
> 
> I think it is best to back this out and come up with something different.
> I have some ideas for keeping track of the number of downstream references.
> That would mean that accounting is only required when adding a downstream
> reference and wanting to reuse a buffer.

(In reply to Pulsebot from comment #7)
> Backout:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/4d7fa39e2d3f

That backout reverted the improvements on Simple Mixing with Gains but didn't change the Granular Synthesis results, so that regression must be unrelated to the changes for the first approach here.

The second approach performs a little better on the Simple Mixing with Gains
benchmark than the first, and avoiding the extra pass is theoretically better
for graphs without so many gain nodes.

I couldn't reproduce the Granular Synthesis regression on i7 linux or pentium
win7 systems.  The variability in results was high (~50%).

Bug 1199559 and bug 1199561 were also in that regression range.  I can only
speculate on how they may be adverse affect specifically in Granular Synthesis
results, but they provided such large improvements elsewhere that I don't
think we want to revert them.

One difference is that ThreadSharedFloatArrayBufferList::Create() uses
js_pod_malloc() whereas JS_NewFloat32Array in AudioBuffer::Create() used
pod_calloc.  Those are used for the source node buffer, but the difference
didn't cause adverse results in other tests, and I wouldn't have affected any
difference in allocation location to make a difference on the large buffer for
the source nodes.

Perhaps memory is laid out differently due to previous tests not touching so
many pages.

Improvements from fixing bug 1189506, bug 1205540, and bug 1207003 have more
than won back the regression anyway.
You need to log in before you can comment on or make changes to this bug.