Closed Bug 856361 Opened 7 years ago Closed 6 years ago

Implement MediaStreamAudioSourceNode

Categories

(Core :: Web Audio, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla26
blocking-b2g koi+
Tracking Status
firefox25 --- verified
firefox26 --- verified

People

(Reporter: ehsan, Assigned: roc)

References

(Depends on 2 open bugs)

Details

(Keywords: dev-doc-complete, Whiteboard: [FT: Media Recording, Sprint])

Attachments

(10 files, 17 obsolete files)

1.83 KB, text/plain
Details
1.00 KB, patch
jesup
: review+
Details | Diff | Splinter Review
14.08 KB, patch
ehsan
: review+
Details | Diff | Splinter Review
21.77 KB, patch
ehsan
: review+
Details | Diff | Splinter Review
24.32 KB, patch
ehsan
: review+
Details | Diff | Splinter Review
10.54 KB, patch
ehsan
: review+
Details | Diff | Splinter Review
7.54 KB, patch
padenot
: review+
Details | Diff | Splinter Review
4.21 KB, patch
smaug
: review+
Details | Diff | Splinter Review
8.53 KB, patch
padenot
: review+
Details | Diff | Splinter Review
15.75 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
No description provided.
Blocks: 855568
Comment on attachment 734619 [details] [diff] [review]
Initial work to implement MediaStreamAudioSourceNode from WebAudio spec

Review of attachment 734619 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/AudioContext.cpp
@@ +110,5 @@
> +already_AddRefed<MediaStreamAudioSourceNode>
> +AudioContext::CreateMediaStreamSource(const DOMMediaStream& aMediaStream)
> +{
> +  nsRefPtr<MediaStreamAudioSourceNode> mediaStreamAudioSourceNode = new MediaStreamAudioSourceNode(this);
> +  mediaStreamAudioSourceNode->SetInputMediaStream(&aMediaStream);

See my comments on MediaStreamAudioSourceNode.h.  With those, you want to drop this line and pass the media stream argument directly to MediaStreamAudioSourceNode.

::: content/media/webaudio/AudioContext.h
@@ +45,5 @@
>  class AudioListener;
>  class BiquadFilterNode;
>  class DelayNode;
>  class DynamicsCompressorNode;
> +class MediaStreamAudioSourceNode;

Nit: please maintain the alphabetical order here.

::: content/media/webaudio/MediaStreamAudioSourceNode.cpp
@@ +35,5 @@
> +                                 AudioChunk* aOutput,
> +                                 bool* aFinished)
> +  {
> +    MOZ_ASSERT(mSource == aStream, "Invalid source stream");
> +    *aOutput = aInput;

aInput should be "null" (i.e., aInput.IsNull() should be true here), so this assignment is sort of pointless.

@@ +36,5 @@
> +                                 bool* aFinished)
> +  {
> +    MOZ_ASSERT(mSource == aStream, "Invalid source stream");
> +    *aOutput = aInput;
> +    WriteZeroesToAudioBlock(aOutput, 0, WEBAUDIO_BLOCK_SIZE);

And because aOutput is null, the loop in WriteZeroesToAudioBlock will never run (since channel data is empty).  Speaking of this, we should probably MOZ_ASSERT this inside WriteZeroesToAudioBlock -- filed bug 859335 for that.)  Anyways, here you should call AllocateAudioBlock first (in the final patch of course)

@@ +40,5 @@
> +    WriteZeroesToAudioBlock(aOutput, 0, WEBAUDIO_BLOCK_SIZE);
> +  }
> +
> +  AudioNodeStream* mSource;
> +  AudioNodeStream* mDestination;

I don't think that you'll need to track the source and destination audio node streams here.  They're only used for nodes which have AudioParam attributes (see WebAudioUtils::ConvertAudioParamToTicks).

::: content/media/webaudio/MediaStreamAudioSourceNode.h
@@ +14,5 @@
> +
> +class MediaStreamAudioSourceNode : public AudioNode
> +{
> +public:
> +  explicit MediaStreamAudioSourceNode(AudioContext* aContext);

Please make the ctor take both the context and the media stream in one go, and remove SetInputMediaStream. (You also want to remove the explicit keyword once you add the second argument.)
Attachment #734619 - Attachment is obsolete: true
Attachment #734969 - Flags: feedback+
@ehsan: I think I fixed everything you pointed out.

@roc: ehsan and I spent quite a bit of time today trying to figure out the best way to get data from the stream attached to DOMMediaStream (which is usually a TrackUnionStream) to a subsequent stream (e.g. an AudioNodeStream). Connecting them with an InputPort using AllocateInputPort starts the process off well, but we run into trouble in AudioNodeStream::ObtainInputBlock, since it asserts the use of AudioNodeStream. The patch I submitted here shows what we were attempting to do, but I'd like to ask for advice before digging further into a potential hilarious waste of time.
Attachment #734969 - Flags: feedback+ → feedback?(roc)
Comment on attachment 734969 [details] [diff] [review]
Initial work + attempt at fixing ObtainInputBlock

Review of attachment 734969 [details] [diff] [review]:
-----------------------------------------------------------------

I think MediaStreamAudioSourceNode should use a new AudioNodeStream subclass, let's call it AudioNodeExternalInputStream, that overrides ProduceOutput and doesn't use ObtainInputBlock at all, and only supports a single MediaInputPort, extracting audio directly from the input MediaStream's audio track(s) to fill each buffer of WEBAUDIO_BLOCK_SIZE samples. We need to support sample-rate conversion here. Also, the results of all tracks should be added together, but that's easy. The code for TrackUnionStream shows how to iterate through input tracks, maintain per-input-track state and extract data for each track.

Hope this helps.
Thanks for the help, Roc. Seems like the correct direction so far. At least I'm able to output sine waves.

For the record, I've uploaded my work here: https://github.com/secretrobotron/mozilla-central/compare/media-stream-audio-source-node . It needs some cleanup, so I didn't bother making an hg patch for it again.

The questionable part is here: https://github.com/secretrobotron/mozilla-central/compare/media-stream-audio-source-node#L0R44 . It's obviously incorrect, and I wouldn't recommend listening to it (violent clicks!). My goal was to just get some data from the input stream's tracks and pass it on to the output; seems as though I've succeeded at that at least.
Let's take a look at it today.
Attached patch working 48k patch (obsolete) — Splinter Review
No resampling is implemented yet, so this patch only works for 48k input streams. It does, however, produce sound through the MediaStreamAudioSource node.

The way data is stripped from the input and passed to the output seems a bit heavy-handed, but, was sufficient for getting the correct chunks based on aTo and aFrom.
Attachment #734969 - Attachment is obsolete: true
Attachment #734969 - Flags: feedback?(roc)
Attachment #735985 - Flags: feedback+
Comment on attachment 735985 [details] [diff] [review]
working 48k patch

Review of attachment 735985 [details] [diff] [review]:
-----------------------------------------------------------------

You'll want something like CopyTrackData in TrackUnionStream where it does
      MediaInputPort::InputInterval interval = map->mInputPort->GetNextInputInterval(t);
and test interval.mInputIsBlocked; if the interval is blocked, then you need to treat that time as silence.
Basically this means a paused stream (or media element) will be treated as silence by Web Audio.

::: content/media/AudioNodeExternalInputStream.cpp
@@ +50,5 @@
> +
> +      for (AudioSegment::ChunkIterator chunks(outputSegment);
> +           !chunks.IsEnded(); chunks.Next()) {
> +        AudioChunk& externalChunk = *chunks;
> +        if (!externalChunk.IsNull()) {

You don't want to skip Null chunks. They're silence and should be handled as such.
Thanks for the feedback. Ehsan and I spoke about it yesterday as well, and agreed that I'd need to do what CopyTrackData does, and used speex for rate conversion.
Bobby, are you still working on this, or do you want someone to take it over?
Flags: needinfo?(bobby)
A couple of weeks ago, Ehsan and I spoke about a new implementation strategy since our prior planning failed. A significant amount of work exists in a patch, but it's still incomplete.

Unfortunately, I don't have a lot of free time to finish this work at the moment, but should have some soon -- next week.

If it's urgent, I'm happy to post the patch and explain things to someone else. Otherwise, I'll report back with some more progress soon.
Flags: needinfo?(bobby)
It can wait until next week, I guess.  Thanks for not giving up.  :-)
Mass moving Web Audio bugs to the Web Audio component.  Filter on duckityduck.
Component: Video/Audio → Web Audio
Completely new, queue-based approach which is a lot more functional. Still obviously missing a couple of important things (which cause some underrun problems).
Attachment #735985 - Attachment is obsolete: true
Attachment #760000 - Flags: review?(ehsan)
Attached patch Some cleanup over last patch (obsolete) — Splinter Review
Attachment #760000 - Attachment is obsolete: true
Attachment #760000 - Flags: review?(ehsan)
Attachment #760002 - Flags: review?(ehsan)
Comment on attachment 760002 [details] [diff] [review]
Some cleanup over last patch

Review of attachment 760002 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/AudioNodeExternalInputStream.cpp
@@ +15,5 @@
> +
> +namespace mozilla {
> +
> +AudioNodeExternalInputStream::AudioNodeExternalInputStream(AudioNodeEngine* aEngine)
> +    : AudioNodeStream(aEngine, MediaStreamGraph::INTERNAL_STREAM, IdealAudioRate()),

Please add a sampling rate arg to the ctor and pass it over to AudioNodeStream, since you cannot assume that all AudioNodeStreams run at 48KHz any more.

@@ +22,5 @@
> +}
> +
> +AudioNodeExternalInputStream::~AudioNodeExternalInputStream()
> +{
> +  MOZ_COUNT_DTOR(AudioNodeExternalInputStream);

You forgot the corresponsing MOZ_COUNT_CTOR.

@@ +23,5 @@
> +
> +AudioNodeExternalInputStream::~AudioNodeExternalInputStream()
> +{
> +  MOZ_COUNT_DTOR(AudioNodeExternalInputStream);
> +  

Nit: please clean up the trailing spaces here and elsewhere in the patch.

@@ +34,5 @@
> +AudioNodeExternalInputStream::TrackMapEntry*
> +AudioNodeExternalInputStream::GetTrackMap(const StreamBuffer::Track& aTrack)
> +{
> +  SpeexResamplerState* resampler;
> +  AudioNodeExternalInputStream::TrackMapEntry* map;

You can drop the AudioNodeExternalInputStream:: prefix here and elsewhere in the body of the member functions.

@@ +56,5 @@
> +  resampler = speex_resampler_init((*ci).mChannelData.Length(),
> +                                   aTrack.GetRate(),
> +                                   IdealAudioRate(),
> +                                   SPEEX_RESAMPLER_QUALITY_DEFAULT,
> +                                   nullptr);

You should only do this if the sampling rates do not match.  Also, please call speex_resampler_skip_zeros here too.

@@ +74,5 @@
> +                                                  const AudioChunk& aOutputChunk,
> +                                                  uint32_t aInputOffset,
> +                                                  uint32_t aOutputOffset,
> +                                                  uint32_t& aActualInputSamples,
> +                                                  uint32_t& aActualOutputSamples)

You need to copy the mVolume member of the input chunk to the output chunk to preserve the volume on the incoming data.

@@ +78,5 @@
> +                                                  uint32_t& aActualOutputSamples)
> +{
> +  // Make sure these values are copied as doubles for accurate use later.
> +  double inputPlaybackRate = aInputRate;
> +  double finalPlaybackRate = inputPlaybackRate / IdealAudioRate();

Use mSampleRate here.  You can no longer make assumptions about the sampling rate.  (Tip: if you call IdealAudioRate(), then you're probably doing it wrong!)

@@ +85,5 @@
> +  const uint32_t numberOfChannels = aInputChunk.mChannelData.Length();
> +
> +  // The number of input samples available is the size of the input chunk minus the
> +  // amount of data we've already read from it.
> +  uint32_t inputSamples = aInputChunk.GetDuration() - aInputOffset;

What guarantees this to not underflow?

@@ +88,5 @@
> +  // amount of data we've already read from it.
> +  uint32_t inputSamples = aInputChunk.GetDuration() - aInputOffset;
> +  // The number of output samples available is the size of the output chunk (WEBAUDIO_BLOCK_SIZE)
> +  // minus the amount of data we've already written to it.
> +  uint32_t outputSamples = WEBAUDIO_BLOCK_SIZE - aOutputOffset;

Ditto.

@@ +92,5 @@
> +  uint32_t outputSamples = WEBAUDIO_BLOCK_SIZE - aOutputOffset;
> +
> +  // In case no data is written, initialize these to 0.
> +  aActualOutputSamples = 0;
> +  aActualInputSamples = 0;

Please initialize these values in the caller instead.

@@ +104,5 @@
> +  for (uint32_t i = 0; i < numberOfChannels; ++i) {
> +    // rawBuffer can hold either int16 or float audio data, depending on the format of
> +    // the input chunks.  We're using a raw byte buffer that can fit in a channel of
> +    // floats, which would be more than enough for int16.
> +    char rawBuffer[sizeof(float) * WEBAUDIO_BLOCK_SIZE];

Please remove this and create local buffers in the below condition bodies.

@@ +129,5 @@
> +      speex_resampler_process_int(aResampler, i,
> +                                  inputData, &in,
> +                                  outputData, &out);
> +
> +      // *Add* the result to `finalData`, since multiple tracks could be writing to this chunk in the same

"Mix" is the common terminology for this.

@@ +133,5 @@
> +      // *Add* the result to `finalData`, since multiple tracks could be writing to this chunk in the same
> +      // time window. Use AudioSampleToFloat to ensure that 16bit int data becomes a float for `finalData`.
> +      for (uint32_t k = 0; k < out; ++k) {
> +        finalData[k] += AudioSampleToFloat(outputData[k]);
> +      }

A fast path for the case where the sampling rates are the same will enable you to avoid a lot of memory copying here.

@@ +194,5 @@
> +
> +void
> +AudioNodeExternalInputStream::ConsumeInputData(const StreamBuffer::Track& aInputTrack,
> +                                               AudioNodeExternalInputStream::ChunkMetaData& aChunkMetaData)
> +{

Great news!  This is what the spec says: "The first AudioMediaStreamTrack from the MediaStream will be used as a source of audio."  So, all of the complexity needed to take care of multiple tracks can go away!!!  :-)

Here's a suggestion for a better way of doing this.  For this track, just copy all of the AudioChunk objects that it contains eagerly on input.  On the output, if the sampling rates match and you don't fall onto the boundary of two chunks, then just borrow the chunk like we do in AudioBufferSourceNodeEngine.  Otherwise, construct a new chunk and do the resampling as needed in order to get 128 frames of output.

This also means that you wouldn't need to do the MAX_QUEUE_LENGTH dance, which will make the code simpler.

@@ +218,5 @@
> +  while (!ci.IsEnded()) {
> +    const AudioChunk& currentInputChunk = *ci;
> +
> +    // TODO: these are still ok
> +    if (currentInputChunk.IsNull()) {

You can only do this if mDuration is 0.  If it's not, then a null chunk is just as valid as a non-null chunk.  Null chunks can be used to represent silence efficiently.

@@ +273,5 @@
> +AudioNodeExternalInputStream::ProduceOutput(GraphTime aFrom, GraphTime aTo)
> +{
> +  MOZ_ASSERT(mInputs.Length() == 1);
> +
> +  uint16_t outputCount = std::max(uint16_t(1), mEngine->OutputCount());

OutputCount() is always going to return 1 here.  It's basically AudioNode::NumberOfOutputs, which returns the number of output "ports", which is determined by the type of the node.

@@ +285,5 @@
> +    // is ready to accept data if it's empty.
> +    AudioNodeExternalInputStream::ChunkMetaData lastChunkMetaData;
> +    if (mOutputChunkQueue.empty()) {
> +      AudioChunk tmpChunk;
> +      mOutputChunkQueue.push_back(tmpChunk);

Hmm, AudioChunk doesn't have a useful constructor.  I guess it's guaranteed for this chunk to be filled in with something useful, but I would feel a lot more comfortable if you initialize this chunk here.

@@ +299,5 @@
> +      lastChunkMetaData.mOffset = mLastChunkOffset;      
> +    }
> +
> +    // Mulitple tracks additively write data to the same chunks, so we need to figure out which track wrote
> +    // the most data, and remember the offset of the last chunk the gets created.

You should check with roc or jesup on this, but I believe that this is not needed.  I think you should just look at the longest available track and when mixing them just assume that the rest are padded with silence.

@@ +305,5 @@
> +    maxChunkMetaData.mIndex = 0;
> +    maxChunkMetaData.mOffset = 0;
> +
> +    // Iterate over all the input tracks.
> +    for (StreamBuffer::TrackIter tracks(mInputs[0]->GetSource()->mBuffer, MediaSegment::AUDIO);

Please MOZ_ASSERT the length of mInputs.

@@ +338,5 @@
> +  if (!mOutputChunkQueue.empty() && !(mOutputChunkQueue.size() == 1 && mLastChunkOffset < WEBAUDIO_BLOCK_SIZE)) {
> +    mLastChunks[0] = mOutputChunkQueue.front();
> +    mOutputChunkQueue.pop_front();
> +  }
> +  else {

Nit: } else {

::: content/media/AudioNodeExternalInputStream.h
@@ +29,5 @@
> +
> +  AudioNodeExternalInputStream(AudioNodeEngine* aEngine);
> +  ~AudioNodeExternalInputStream();
> +
> +  virtual void ProduceOutput(GraphTime aFrom, GraphTime aTo);

Nit: MOZ_OVERRIDE.

::: content/media/AudioNodeStream.cpp
@@ +285,5 @@
>      if (a->IsFinishedOnGraphThread() ||
>          a->IsAudioParamStream()) {
>        continue;
>      }
> +    chunk = &a->mLastChunks[mInputs[i]->OutputNumber()];

Hmm, why did you move up the declaration of chunk?  Actually I think this whole hunk can probably be reverted.

::: content/media/MediaStreamGraph.h
@@ +945,5 @@
>                                           AudioNodeStreamKind aKind,
>                                           TrackRate aSampleRate = 0);
> +
> +  AudioNodeExternalInputStream*
> +  CreateAudioNodeExternalInputStream(AudioNodeEngine* aEngine);

This constructor needs to accept aSampleRate as well.

::: content/media/webaudio/MediaStreamAudioSourceNode.cpp
@@ +7,5 @@
> +#include "MediaStreamAudioSourceNode.h"
> +#include "mozilla/dom/MediaStreamAudioSourceNodeBinding.h"
> +#include "AudioNodeEngine.h"
> +#include "AudioNodeExternalInputStream.h"
> +#include "AudioDestinationNode.h"

Nit: this header is not needed.

@@ +8,5 @@
> +#include "mozilla/dom/MediaStreamAudioSourceNodeBinding.h"
> +#include "AudioNodeEngine.h"
> +#include "AudioNodeExternalInputStream.h"
> +#include "AudioDestinationNode.h"
> +#include "WebAudioUtils.h"

Nit: this one is not needed either!

@@ +14,5 @@
> +
> +namespace mozilla {
> +namespace dom {
> +
> +NS_IMPL_ISUPPORTS_INHERITED0(MediaStreamAudioSourceNode, AudioNode)

You need to tell the cycle collector about mInputPort.  In fact I'm not sure how this links...

@@ +18,5 @@
> +NS_IMPL_ISUPPORTS_INHERITED0(MediaStreamAudioSourceNode, AudioNode)
> +
> +MediaStreamAudioSourceNode::MediaStreamAudioSourceNode(AudioContext* aContext,
> +                                                       const DOMMediaStream* aMediaStream)
> +  : AudioNode(aContext, 2, ChannelCountMode::Max, ChannelInterpretation::Speakers)

Nit: for invoking the AudioNode ctor, please use the whitespace style that other nodes in this code use.

@@ +29,5 @@
> +
> +MediaStreamAudioSourceNode::~MediaStreamAudioSourceNode()
> +{
> +  mInputPort->Destroy();
> +  DestroyMediaStream();

A better way to destroy the input port is to override DestroyMediaStream() and do mInputPort->Destroy() before calling the base class version.  This way you can drop this dtor, since DestroyMediaStream() is guaranteed to be called before this.

::: content/media/webaudio/MediaStreamAudioSourceNode.h
@@ +20,5 @@
> +
> +  NS_DECL_ISUPPORTS_INHERITED
> +  NS_DECL_CYCLE_COLLECTION_CLASS_INHERITED(MediaStreamAudioSourceNode, AudioNode)
> +
> +  virtual JSObject* WrapObject(JSContext* aCx, JS::Handle<JSObject*> aScope);

Nit: MOZ_OVERRIDE.

Also, you need to override NumberOfInputs() and make it return 0, otherwise it would be possible to connect other nodes to this node.

::: content/media/webaudio/test/test_mediaStreamAudioSourceNode.html
@@ +12,5 @@
> +  <!-- <audio src="sin440-96k.wav" controls></audio> -->
> +  <!-- <audio src="audio-expected.wav" controls></audio> -->
> +  <!-- <audio src="audio.ogv" controls></audio> -->
> +  <!-- <audio src="ting-dualchannel44.1.ogg" controls loop="true"></audio> -->
> +  <!-- <audio src="ehsan.wav" controls></audio> -->

lol :-)

@@ +14,5 @@
> +  <!-- <audio src="audio.ogv" controls></audio> -->
> +  <!-- <audio src="ting-dualchannel44.1.ogg" controls loop="true"></audio> -->
> +  <!-- <audio src="ehsan.wav" controls></audio> -->
> +  <script type="text/javascript">
> +    var context = new AudioContext();

You need the pref-setting dance we have in other tests in this directory.

@@ +24,5 @@
> +      var stream = audio.mozCaptureStream();
> +      var source = context.createMediaStreamSource(stream);
> +      source.connect(context.destination);
> +      audio.play();
> +    }, false);

While this test is nice (and it would be a good idea to keep it, if you modified it to remove the comments and such, and also remove the unused wav files from the patch), we can do better.  Please create a test similar to test_audioParamTimelineDestinationOffset.html which sets up an async graph and calls the callback when the canplay event is received, feed a 48khz audio file into the MediaStreamAudioSourceNode, and return the expected buffer from createExpectedBuffers to verify that the correct data is fed to the output of this node.  Please ping me if you need help with the test, but <http://mxr.mozilla.org/mozilla-central/source/content/media/webaudio/test/webaudio.js?force=1#67> should be a nice quick read on how this mini-framework works.

::: dom/bindings/Bindings.conf
@@ +615,5 @@
>      'skipGen': True
>  }],
>  
> +'MediaStreamAudioSourceNode': {
> +},

You don't need to add an entry here if it's going to be empty, please get rid of it.
Attachment #760002 - Flags: review?(ehsan)
Comment on attachment 760002 [details] [diff] [review]
Some cleanup over last patch

Review of attachment 760002 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/MediaStreamAudioSourceNode.h
@@ +14,5 @@
> +
> +class MediaStreamAudioSourceNode : public AudioNode
> +{
> +public:
> +  MediaStreamAudioSourceNode(AudioContext* aContext, const DOMMediaStream* aMediaTream);

Nit: aMediaStream.
You should also use AssociateWithMediaStreamDestinationNode that Josh is adding in bug 865257.
Attached patch Some crash fixes (obsolete) — Splinter Review
Fixed these as part of my work on bug 855568.
If you apply my patch in bug 855568, and then look at this test case <http://people.mozilla.org/~eakhgari/webaudio/mediaelement-webaudio.html>, then you can see a few bugs:

* Volume changes do not take effect, as I've already pointed out.
* If you pause the audio element, the last bits of the played back buffer will continue to play back for a while.

Note that you can easily modify that test to use mozCaptureStream instead of a MediaElementAudioSourceNode, since that's what I'm doing in my patch anyways.
roc, is there a use case in practice which would cause a DOMMediaStream to have multiple audio tracks?  If yes, I think we should raise a spec issue about the spec asking for only the first audio track to be honored.
Flags: needinfo?(roc)
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #24)
> roc, is there a use case in practice which would cause a DOMMediaStream to
> have multiple audio tracks?

Yes, this can easily happen with WebRTC for example, and authors can combine MediaStreams to construct such cases too.

> If yes, I think we should raise a spec issue
> about the spec asking for only the first audio track to be honored.

I think it would make more sense to mix the currently-active audio tracks evenly.
Flags: needinfo?(roc)
(In reply to comment #25)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #24)
> > roc, is there a use case in practice which would cause a DOMMediaStream to
> > have multiple audio tracks?
> 
> Yes, this can easily happen with WebRTC for example, and authors can combine
> MediaStreams to construct such cases too.
> 
> > If yes, I think we should raise a spec issue
> > about the spec asking for only the first audio track to be honored.
> 
> I think it would make more sense to mix the currently-active audio tracks
> evenly.

I see.  Do you mind raising the spec issue please?  Thanks!
Attachment #760002 - Attachment is obsolete: true
Attachment #761199 - Attachment is obsolete: true
Test is incomplete and quite broken, but more review/bug fixes are here.
Attachment #763141 - Attachment is obsolete: true
Attachment #765101 - Flags: review?(ehsan)
Comment on attachment 765101 [details] [diff] [review]
Multi-input buffering -- couple more review fixes over last patch + start of test

Review of attachment 765101 [details] [diff] [review]:
-----------------------------------------------------------------

Please remove the binary files which you don't need from the patch.

::: content/media/AudioNodeExternalInputStream.cpp
@@ +46,5 @@
> +  // Grab the first valid AudioChunk from the track.
> +  AudioSegment::ChunkIterator ci(*aTrack.Get<AudioSegment>());
> +  while ((*ci).IsNull() && !ci.IsEnded()) {
> +    ci.Next();
> +  }

You should handle the case where ci.IsEnded() is true after this loop.  See attachment 761199 [details] [diff] [review].

@@ +111,5 @@
> +    float* finalData = static_cast<float*>(const_cast<void*>(aOutputChunk.mChannelData[i])) + aOutputOffset;
> +
> +    // Depending on the audio format, we need to cast input and output arrays appropriately.
> +    if (aInputChunk.mBufferFormat == AUDIO_FORMAT_S16) {
> +      char rawBuffer[sizeof(int16_t) * (WEBAUDIO_BLOCK_SIZE - aOutputOffset)];

Do not rely on the gcc variable length array extension here, just allocate a possibly larger than necessary buffer here.  Also, please create an int16_t buffer right off the bat instead of doing the cast below.

@@ +126,5 @@
> +                                  inputData, &in,
> +                                  outputData, &out);
> +
> +      // Mixdown. *Add* the result to `finalData`, since multiple tracks could be writing to this chunk in the same
> +      // time window. Use AudioSampleToFloat to ensure that 16bit int data becomes a float for `finalData`.

The comment is incorrect.

@@ +132,5 @@
> +        finalData[k] = AudioSampleToFloat(outputData[k]);
> +      }
> +    }
> +    else {
> +      char rawBuffer[sizeof(float) * (WEBAUDIO_BLOCK_SIZE - aOutputOffset)];

Ditto.

@@ +150,5 @@
> +      // Mixdown. *Add* the result to `finalData`, since multiple tracks could be writing to this chunk in the same
> +      // time window. No float conversion needs to be done here, since we should already be in float land.
> +      for (uint32_t k = 0; k < out; ++k) {
> +        finalData[k] = outputData[k];
> +      }

You can write directly to finalData.

@@ +164,5 @@
> +}
> +
> +void
> +AudioNodeExternalInputStream::MixTracks(const nsTArray<AudioChunk>& aOutputChunks,
> +                                        const AudioChunk& aDestinationChunk)

Nit: Please make this second arg non-const as it seems like it's an input to the function while it's really an output.

@@ +173,5 @@
> +      float* outputChunkData = static_cast<float*>(const_cast<void*>(aOutputChunks[i].mChannelData[j]));
> +      float* aggregateChunkData = static_cast<float*>(const_cast<void*>(aDestinationChunk.mChannelData[j]));
> +      for (uint32_t k = 0; k < WEBAUDIO_BLOCK_SIZE; ++k) {
> +        aggregateChunkData[k] += outputChunkData[k];
> +      }

Instead of this loop, please call AudioBlockAddChannelWithScale with aScale set to aOutputChunks[i].mVolume.

@@ +181,5 @@
> +
> +void
> +AudioNodeExternalInputStream::ConsumeDataFromTrack(AudioChunk& aOutputChunk,
> +                                                   const StreamBuffer::Track& aInputTrack,
> +                                                  TrackMapEntry& aTrackMap)

Nit: indentation.

@@ +183,5 @@
> +AudioNodeExternalInputStream::ConsumeDataFromTrack(AudioChunk& aOutputChunk,
> +                                                   const StreamBuffer::Track& aInputTrack,
> +                                                  TrackMapEntry& aTrackMap)
> +{
> +  // Can we take a fast path for equal sampling rates?

Nit: Why the comment?  Seems like we can!

@@ +210,5 @@
> +        } else if (outputSamples < inputSamples / playbackRateRatio) {
> +          inputSamples = ceil(outputSamples * playbackRateRatio);
> +        }
> +
> +        WriteZeroesToAudioBlock(&aOutputChunk, outputOffset, outputSamples);

Please file a follow-up bug to avoid first writing zeros to the buffer and then possibly overwriting them.

@@ +276,5 @@
> +    // for the track.
> +    TrackMapEntry& trackMap = *(GetTrackMap(inputTrack));
> +
> +    // If something bizarre happened and we're beyond the end of the input track, bail.
> +    if (trackMap.mLastInputTick > inputTrack.GetEnd()) { return; }

Again, see attachment 761199 [details] [diff] [review].

@@ +284,5 @@
> +
> +    if (!trackMap.mChunkQueue.empty()) {
> +      AudioChunk* outputChunk = outputChunks.AppendElement();
> +      AllocateAudioBlock(trackMap.mChunkQueue.front().mChannelData.Length(), outputChunk);
> +      WriteZeroesToAudioBlock(outputChunk, 0, WEBAUDIO_BLOCK_SIZE);

Please file a follow-up to optimize this away.

@@ +292,5 @@
> +  }
> +
> +  if (!outputChunks.IsEmpty()) {
> +    AllocateAudioBlock(numChannels, &mLastChunks[0]);
> +    WriteZeroesToAudioBlock(&mLastChunks[0], 0, WEBAUDIO_BLOCK_SIZE);

Please get rid of this!

::: content/media/AudioNodeExternalInputStream.h
@@ +18,5 @@
> +#else
> +#define LOG(type, msg)
> +#endif
> +
> +// Forward declaration of for Speex for mResamplerMap

of for?  ;-)

::: content/media/webaudio/MediaStreamAudioSourceNode.h
@@ +17,5 @@
> +public:
> +  MediaStreamAudioSourceNode(AudioContext* aContext, const DOMMediaStream* aMediaStream);
> +
> +  NS_DECL_ISUPPORTS_INHERITED
> +  NS_DECL_CYCLE_COLLECTION_CLASS_INHERITED(MediaStreamAudioSourceNode, AudioNode)

You don't need the CC macro since this class doesn't implement any CC logic.

::: content/media/webaudio/test/test_mediaStreamAudioSourceNode.html
@@ +1,4 @@
> +<!DOCTYPE HTML>
> +<html>
> +<head>
> +  <title>Test AudioParam timeline events scheduled after the destination stream has started playback</title>

I did not review the test yet.
Attachment #765101 - Flags: review?(ehsan)
1) Review fixes.
2) Removed unnecessary wav test files.
3) Updated webaudio.js testing framework.
4) Added throw condition to offlineContext.createMediaStreamSource().
Attachment #765101 - Attachment is obsolete: true
Attachment #765960 - Flags: review?(ehsan)
Comment on attachment 765960 [details] [diff] [review]
MediaStreamAudioSourceNode && AudioNodeExternalInputStream: best version yet!

Review of attachment 765960 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/AudioNodeExternalInputStream.cpp
@@ +95,5 @@
> +      for (k = 0; k < outputSamples && k < inputSamples; ++k) {
> +        finalData[k] = AudioSampleToFloat(inputData[k]);
> +      }
> +    }
> +    else {

Nit!

@@ +97,5 @@
> +      }
> +    }
> +    else {
> +      float* inputData = static_cast<float*>(const_cast<void*>(aInputChunk.mChannelData[i])) + aInputOffset;
> +      uint32_t k;

You're masking the k defined above.  Honestly I would just declare and define it inside the for loop in both cases.

@@ +167,5 @@
> +      for (uint32_t k = 0; k < out; ++k) {
> +        finalData[k] = AudioSampleToFloat(outputData[k]);
> +      }
> +    }
> +    else {

} else {

@@ +243,5 @@
> +                      aTrackMap.mFrontChunkOffset, outputOffset,
> +                      out);
> +        in = out;
> +      } else {
> +

Nit: drop the empty line please.

::: content/media/webaudio/MediaStreamAudioSourceNode.cpp
@@ +22,5 @@
> +{
> +  AudioNodeEngine* engine = new AudioNodeEngine(this);
> +  mStream = aContext->Graph()->CreateAudioNodeExternalInputStream(engine);
> +  ProcessedMediaStream* outputStream = static_cast<ProcessedMediaStream*>(mStream.get());
> +  mInputPort = outputStream->AllocateInputPort(aMediaStream->GetStream(), MediaInputPort::FLAG_BLOCK_INPUT);

I think you should just pass 0 here.

::: content/media/webaudio/test/test_mediaStreamAudioSourceNode.html
@@ +13,5 @@
> +SimpleTest.waitForExplicitFinish();
> +var gTest;
> +
> +// This makes loading audio *much* easier.
> +var url = "data:audio/x-wav;base64,UklGRiQIAABXQVZFZm10IBAAAAABAAEAgLsAAAB3AQACABAAZGF0YQAIAAAgAL0F9gtwEYQX/hy+IiYoiC3AMr83qDw9QbxF4knjTY1RAFUkWPxakF3HX7hhUmOSZItlG2ZpZkpm6WUgZQ9koGLgYNJeaVzAWblWf1PkTx9MA0i5QzE/bTqGNV8wISuwJScgghrDFPYOGgk3A1L9b/eR8cTrBOZa4M/aX9UU0PjK/MVBway8XLhDtGqw26yFqYymyqNqoUefh50QnPOaKZq3mZqZ2JlnmlGbj5wfngegPKLDpJentKoersqxt7XpuU++8sLLx8fMBdJN19bcYuIc6N3tsPOY+XD/YgU3CxcR3RaQHDEiqycOLUYyUTcxPN1ATUWLSYNNPVGzVNxXwVpTXZdfkWErY39kc2UVZmFmVWbwZT1lJGTMYgZhCF+kXAJaA1fLUz1QdUxsSBpEnz/fOvY13jCYKzImrCAAG1AVeQ+kCcMD2P359xrySOyL5trgU9vW1ZbQZMt0xqnBFr3AuJ20xbAmrdSpy6YKpJyhdp+tnSycDJs3mr2ZmpnMmVqaN5tunPmd1J8JooGkVqdmqs6tbrFetYC5672BwlfHV8yB0drWStzp4Y7nW+0o8wr57f7TBLUKjBBUFhMcpiE3J40szjHfNsA7ckDrRClJK03oUGVUl1d9Wh9dZV9jYQ5jX2RjZQpmXGZcZvtlUWVDZOxiNWE4X+FcP1pOVxpUjlDYTMJIiUQGQFE7ajZVMRUssiYqIYsbzxUFEC0KSARl/oD4ovLR7A3nYeHO21nWCtHey+TGGMJ8vSO5+7QbsXetIaoKp02kyaGsn9CdTZwim0Wax5mYmceZRZoim02c0Z2on8+hR6QPpxyqfK0Ysf20ILl/vRfC48bgywjRW9bM22PhDOfQ7KXye/hp/kcELQoEENIVhxstIbEmFCxXMWk2UTsIQINEykjRTJJQGVROVz5a5VwwXz5h5mJIZEtlAGZZZl1mDGZeZWZkCGNmYWZfHF2AWpZXZFTpUCxNJknvRG1AxDveNswxkiwwJ60hDBxaFogQuArRBO3+Cvko81ztjOfr4Ufc3tZ/0VnMUceIwuW9hrlZtXGxy61rqk+niKQCotqf951unDebWZrMmZuZvpk1mg2bLJysnXmfmqEKpMym0Kktrb6worS+uBO9rcFwxmrLkNDa1U/b3uCJ5kjsHPL19979vQOmCXsPTRUEG6ggNSaUK+Mw8DXmOpg/IERoSHVMP1DJUwZX/1mmXARfDGHIYiZkOmX0ZVBmZ2YQZndle2QuY45hmV9UXcBa3VexVD5Rg02LSU9F3EAwPFM3QjISLaknMyKPHNwWFhE6C14Fdv+R+bbz2e0d6GPi09xS1wDSy8zHx/bCSr7vubK1zrEbrrWqlqfEpDyiBqAhnoycU5tmmtqZlpm8mSWa9ZoPnIedSZ9nocyjiqaHqdusaLBGtFi4srw7wQDG9soV0F7V0tpT4A3mveuU8W/3UP04AxsJ9Q7EFIAaKCCyJR4rYzCANXM6Lz+3QwdIGUzrT3pTvFa+WWpc0l7fYKNiCmQnZeFlUmZiZiBmiGWTZFNjtmHJX49d+1ooWPpUlFHbTelJt0VDQaE8xDe8MoktLCixIhMdZReYEcgL4gUBABz6OvRm7pzo6+JR3dPXeNJEzTvIYMO7vkm6HLYesnWu+qrjp/ykeqIxoE2eq5xum3aa5JmbmbCZHZrZmvabX50enzKhkqNFpkKphqwYsOSz+rdKvM/AkcV9yp3P5NRM2tvffuU76w3x4/bL/KwCkwhtDj0U/RmkHzMloCrrLws1ATrDPk1Dq0e4S5hPJlN3VndZNVyXXrhgemLuYw5l2WVHZmdmKWaVZbNkcWPdYf9fvl1BW2lYSVXkUTdORUodRqtBET02ODQzAy6qKDIjmR3nFyMSSwxxBocApvrC9O3uJOls49XdTtj30rnNrsjRwyK/s7p0tn+yv65Tqx6oQ6Wsomegcp7SnIGbkJromaCZrJkMmsia15s6nfOe/aBaowCm+6g2rMKvirOat9+7ZsAexQrKJc9m1M7ZV9/65LfqgfBg9j38JgIJCOQNuRN3GSEfsyQjKnIvlTSROVI+70I8R2dLOk/YUi1WNVn2W2dehWBXYtBj92TJZUJmZmYzZqhlyGSTYwdiKmD6XXtbr1iXVTNSkU6kSn5GGUJ4Pa84pzN+LisprSMhHmkYrBLVDPYGEwEu+0r1d++l6ffjT97S2G/TMc4lyTzEkL8Vu9W22LIVr56rZ6iBpeailaCinu+copudmvaZo5mmmQKaspq7mxadyZ7IoCGjv6WyqOqraa80szO3f7v1v7LElcmqzu3TS9nZ3nPkMur679X1t/uaAYEHXA0zE/AYoR4xJKUp+S4gNB056T2AQuJGAUvpToRS5FXxWLlbMV5YYC9is2PeZLtlOGZpZjhmvGXeZLNjLWJbYC5evFvwWONVhlLnTgNL4UZ/Quw9GTkkNPUupykzJJ4e8hgxE10NgAecAbb71fX77y3qeuTS3lHZ6tOqzpfJrsT5v3y7Nrcxs2uv6auyqL+lIKPJoMmeF524m7aa/ZmsmZ2Z+pmcmqGb8pyenpag6KJ+pWuom6sXr9Wy17YVu5C/PsQhyTPOb9PS2FHe8+Oo6XfvSfUv+xAB+QbUDKwSaRgfHq8jKyl9Lqgzrjh4PRtCfEakSpJOMlKZVaxYfFv7XSpgB2KSY8hkqWUzZmVmQ2bIZfhkz2NYYoRgaV70WzVZLVbYUj5PYEtDRw==";

Please just use a wave file.

@@ +34,5 @@
> +        audio.load();
> +        var stream = audio.mozCaptureStream();
> +        var source = context.createMediaStreamSource(stream);
> +          callback(source);
> +          audio.play();

Nit: indentation.

::: content/media/webaudio/test/webaudio.js
@@ +22,5 @@
>    }
>    ok(threw, "The exception was thrown");
>  }
>  
> +var printed = 0;

You left this out!

@@ +98,5 @@
>      SimpleTest.finish();
>    }
>  
>    SimpleTest.waitForExplicitFinish();
> +  var runTestFunction = function() {

Nit: function runTestFuncgion() {

@@ +193,5 @@
> +        testOnOfflineContext(function() {
> +          testOnOfflineContext(done, 44100);
> +        }, 48000);
> +      }
> +      else {

Nit: } else {

@@ +202,5 @@
> +
> +  if (document.readyState !== 'complete') {
> +    addLoadEvent(runTestFunction);
> +  }
> +  else {

Shall I say more?

@@ +203,5 @@
> +  if (document.readyState !== 'complete') {
> +    addLoadEvent(runTestFunction);
> +  }
> +  else {
> +    setTimeout(runTestFunction, 10);

Please just call the function directly.
Attachment #765960 - Flags: review?(ehsan) → review+
Attached patch review fixes from last patch (obsolete) — Splinter Review
Same as 765960, but with review nits fixed.
Attachment #765960 - Attachment is obsolete: true
Attachment #766022 - Flags: superreview?(roc)
Comment on attachment 766022 [details] [diff] [review]
review fixes from last patch

I'll assume that Bobby was looking for a review, not a superreview.  :-)
Attachment #766022 - Flags: superreview?(roc)
Attachment #766022 - Flags: superreview?
Attachment #766022 - Flags: review?(roc)
Attachment #766022 - Flags: superreview?
Comment on attachment 766022 [details] [diff] [review]
review fixes from last patch

Review of attachment 766022 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/AudioNodeExternalInputStream.cpp
@@ +23,5 @@
> +  MOZ_COUNT_DTOR(AudioNodeExternalInputStream);
> +
> +  // Destroy all speex resamplers -- one per input track.
> +  for (uint32_t i = 0; i < mTrackMap.Length(); ++i) {
> +    speex_resampler_destroy(mTrackMap[i].mResampler);

Better to move this call into a destructor for TrackMapEntry

@@ +28,5 @@
> +  }
> +}
> +
> +AudioNodeExternalInputStream::TrackMapEntry*
> +AudioNodeExternalInputStream::GetTrackMap(const StreamBuffer::Track& aTrack)

Should be called GetTrackMapEntry.

@@ +118,5 @@
> +  double inputPlaybackRate = aInputRate;
> +  double playbackRateRatio = inputPlaybackRate / mSampleRate;
> +
> +  // Cache the number of channels in the input chunk.
> +  const uint32_t numberOfChannels = aInputChunk.mChannelData.Length();

It's possible for AudioChunks in a track to have varying numbers of channels. You need to handle that here. See how AudioNodeStream::ObtainInputBlock calls UpMix and DownMix.

I guess the Speex resampler can't handle channel-count changes properly so once we've seen the first packet of non-null audio we're stuck upmixing/downmixing to that number of channels forever? Maybe we could lift that restriction, bypassing the resampler when the track rate matches the AudioContext rate?

@@ +291,5 @@
> +  for (StreamBuffer::TrackIter tracks(mInputs[0]->GetSource()->mBuffer, MediaSegment::AUDIO);
> +       !tracks.IsEnded(); tracks.Next()) {
> +    const StreamBuffer::Track& inputTrack = *tracks;
> +
> +    if (inputTrack.GetRate() == 0) { continue; }

The rate can't be zero, so don't bother with this.

@@ +299,5 @@
> +    // for the track.
> +    TrackMapEntry* trackMap = GetTrackMap(inputTrack);
> +
> +    // If something bizarre happened and we're beyond the end of the input track, bail.
> +    if (!trackMap || trackMap->mLastInputTick > inputTrack.GetEnd()) { continue; }

if (...) {
  continue;
}

@@ +308,5 @@
> +    if (!trackMap->mChunkQueue.empty()) {
> +      AudioChunk* outputChunk = outputChunks.AppendElement();
> +      AllocateAudioBlock(trackMap->mChunkQueue.front().mChannelData.Length(), outputChunk);
> +      WriteZeroesToAudioBlock(outputChunk, 0, WEBAUDIO_BLOCK_SIZE);
> +      ConsumeDataFromTrack(*outputChunk, inputTrack, *trackMap);

This code doesn't try to sync the tracks properly. aFrom and aTo should be used to extract the right set of samples from the resampled track. You'll need to look at the track's mStart as well as take into account any Null data you skipped.

::: content/media/AudioNodeExternalInputStream.h
@@ +23,5 @@
> +typedef struct SpeexResamplerState_ SpeexResamplerState;
> +
> +namespace mozilla {
> +
> +class AudioNodeExternalInputStream : public AudioNodeStream {

Please document this class.

@@ +45,5 @@
> +
> +  // For easily pairing data about output chunks.
> +  struct ChunkMetaData {
> +    uint32_t mIndex;
> +    uint32_t mOffset;

This needs more documentation.

@@ +48,5 @@
> +    uint32_t mIndex;
> +    uint32_t mOffset;
> +  };
> +
> +  std::deque<AudioChunk> mOutputChunkQueue;

So does this.

::: content/media/MediaStreamGraph.cpp
@@ +2254,5 @@
> +MediaStreamGraph::CreateAudioNodeExternalInputStream(AudioNodeEngine* aEngine, TrackRate aSampleRate)
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  if (!aSampleRate) {
> +    aSampleRate = aEngine->NodeMainThread()->Context()->SampleRate();

Why does this function even take aSampleRate as a parameter?

::: content/media/MediaStreamGraph.h
@@ +957,5 @@
>                                           TrackRate aSampleRate = 0);
> +
> +  AudioNodeExternalInputStream*
> +  CreateAudioNodeExternalInputStream(AudioNodeEngine* aEngine,
> +                                     TrackRate aSampleRate = 0);

Please document this, especially aSampleRate if we keep it.

::: content/media/webaudio/AudioContext.cpp
@@ +249,5 @@
> +  if (!mIsOffline) {
> +    nsRefPtr<MediaStreamAudioSourceNode> mediaStreamAudioSourceNode = new MediaStreamAudioSourceNode(this, &aMediaStream);
> +    return mediaStreamAudioSourceNode.forget();
> +  }
> +  aRv.Throw(NS_ERROR_DOM_NOT_SUPPORTED_ERR);

Put the throw under "if (mIsOffline)" so the body of the function is the normal case.

::: content/media/webaudio/test/test_mediaStreamAudioSourceNode.html
@@ +34,5 @@
> +        audio.load();
> +        var stream = audio.mozCaptureStream();
> +        var source = context.createMediaStreamSource(stream);
> +        callback(source);
> +        audio.play();

There is no guarantee that the media decoder will produce data immediately ... There could be some delay during which the MediaStreamAudioSourceNode produces silence. That will cause your test to fail, I think.

Now that MediaStreamDestinationNode has landed I think you might be better off generating audio in an AudioNode, feeding it to MediaStreamDestinationNode, then plugging that stream into MediaStreamAudioSourceNode and testing the result.
Jean-Marc, we can have audio tracks where the number of channels varies from block to block. (The sample rate is constant, however.) We're using the Speex resampler to resample these tracks. Do we have to pick a fixed number of channels for the resampler and upmix/downmix the input blocks, or is there a way to have the resampler vary the number of channels over time? Basically, what should we do here? :-)
Flags: needinfo?(jmvalin)
Is the actual audio continuous between blocks? Are the mono blocks simply for time when all channels are the same anyway or are there discontinuities when the number of channels change. The Speex resampler does not currently allow changing the number of channels (though it can possibly be implemented), so if you don't want glitches, the you would have to upmix/downmix the input. However, if the input is already discontinuous, you can probably just reinitialize the resampler with a different number of channels.
Flags: needinfo?(jmvalin)
I guess we can get rid of aSampleRate and just read it from the AudioNode unconditionally.
Also, about the test, as far as I know we don't have any way to construct a MediaStream (right?) so we can't just create one and connect it to the destination node of another graph.  Is it true that the wave decoder may be unable to start giving samples immediately though?
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #38)
> Also, about the test, as far as I know we don't have any way to construct a
> MediaStream (right?) so we can't just create one and connect it to the
> destination node of another graph.  Is it true that the wave decoder may be
> unable to start giving samples immediately though?

Wait, I'm stupid.  There exists a way to create a MediaStream, it's called AudioContext.createMediaStreamDestination().  :-)  Please disregard this comment.
This interdiff addresses some of the review comments.  Bobby, I factored out the upmixing/downmixing code into AudioNodeStream::UpMixDownMixChunk for you to use.  Note that currently it only works with float channel formats, and I think in the case of an incoming chunk with an int format, perhaps it makes sense to convert it to float format before passing it in to UpMixDownMixChunk.
(In reply to Jean-Marc Valin (:jmspeex) from comment #36)
> Is the actual audio continuous between blocks? Are the mono blocks simply
> for time when all channels are the same anyway or are there discontinuities
> when the number of channels change.

One way the channel count can change is if we are mixing inputs with different numbers of channels. The result of that mixing has the maximum number of channels of any input, so as inputs are added or removed, the channel count can change. (This is per Web Audio spec, not our decision.) In that case, I guess the audio is supposed to be continuous.

> The Speex resampler does not currently
> allow changing the number of channels (though it can possibly be
> implemented), so if you don't want glitches, the you would have to
> upmix/downmix the input.

OK I think we'll stick with the current approach (upmix/downmix) for now. Thanks.
@roc I just need some clarification about track syncing. Is it something that matters only when multiple input tracks are involved? In what situation will this implementation break?

I'm assuming it has something to do with connecting nodes after they've already been generating audio. e.g. capturing a stream from an <audio> after it's already been playing for a while.

Correct me if I'm wrong, but I can probably copy the syncing logic from TrackUnionStream, where it does conversions from GraphTime to StreamTime.
Flags: needinfo?(roc)
(In reply to Bobby Richter [:bobby] from comment #42)
> @roc I just need some clarification about track syncing. Is it something
> that matters only when multiple input tracks are involved? In what situation
> will this implementation break?

Very soon we'll have the ability to make MediaStreams with multiple audio tracks. One way is to do new MediaStream(stream1, stream2) which will take the union of the tracks in stream1 and stream2.

> Correct me if I'm wrong, but I can probably copy the syncing logic from
> TrackUnionStream, where it does conversions from GraphTime to StreamTime.

Right.

Unfortunately we can't actually test multiple audio tracks yet because the features aren't there. But the code can still be written in a way that will work with multiple tracks and also choose the right data for each time window.
Flags: needinfo?(roc)
This is a nice demo to test this patch against: <http://webaudiodemos.appspot.com/input/index.html>.  Currently I don't get audio output with your patches on this demo.
Courtesy of Chris Wilson.
Bobby, would you like me to take this over?
Flags: needinfo?(secretrobotron)
Sorry for the lack of progress. In Brazil @ FISL working on mobile strategy.

@roc, if it needs to get done asap, I don't mind passing the torch. I'll be back in a few days when I can continue work though.
Flags: needinfo?(secretrobotron)
I'd like us to get this done as soon as possible, sorry for stealing this from you, Bobby!
Assignee: bobby → roc
Attached file somewhat rewritten patch (obsolete) —
This mostly rewrites AudioNodeExternalInputStream. It builds, but is completely untested. I need to test it, fix bugs, and simplify the code.

It builds on Bobby's patch, but handles many things the previous iteration did not, such as synchronizing the input stream tracks (including handling periods where the input stream blocks), upmixing/downmixing channels when the AudioChunks of an input track change channel count, removing TrackMapEntries for dead tracks, and correct handling of volume for input chunks.
Attachment #766022 - Attachment is obsolete: true
Attachment #766354 - Attachment is obsolete: true
Attachment #766022 - Flags: review?(roc)
Blocks: 894848
Attached patch Updated patch (obsolete) — Splinter Review
The test passes now!

However, the test doesn't test the resampling paths. I need to figure out a way to test that. Resampling introduces delay, and we don't really want to make a tight constraint on the delay, so it's going to be a bit tricky.
Attachment #777065 - Attachment is obsolete: true
I'll review this patch.  There are a lot of cases where we don't test resampling code paths (mostly because that coming up with a sensible expected buffer is difficult) so I think we can potentially leave that test to another bug even!
Comment on attachment 778846 [details] [diff] [review]
Updated patch

Review of attachment 778846 [details] [diff] [review]:
-----------------------------------------------------------------

Great!
Attachment #778846 - Flags: review+
Blocks: 894856
No longer blocks: 894848
I've still got some bugs to fix. I fixed the security issue about exposing non-same-origin audio data through Web Audio. I wrote a testcase for resampling a MediaStream captured from a media element, which just verifies that there's roughly the right number of nonzero audio samples produced, and the testcase fails, so I'm trying to figure that out now.
Attached patch updated patch (obsolete) — Splinter Review
The resampling test passes now.

I want to add another test, to check that getting data from cross-origin media resources doesn't work. Then I'll clean up the patch a bit and submit it.
Attachment #778846 - Attachment is obsolete: true
The flags are defined and used correctly, the comments describing them are just wrong.
Attachment #780293 - Flags: review?(rjesup)
Attachment #780293 - Flags: review?(rjesup) → review+
Hmm, one thing I haven't addressed in these patches is lifetime. I believe with these patches, nothing prevents a MediaStreamAudioSourceNode from being garbage collected while it's playing a MediaStream, cutting playback off.

Also, the MediaStreamAudioSourceNode doesn't keep the DOMMediaStream alive, and it really should.

So, I think MediaStreamAudioSourceNode should strongly reference DOMMediaStream, and it should add a self-reference while the DOMMediaStream is not finished. But that doesn't give authors a way to remove it. Just disconnecting the MediaStreamAudioSourceNode's output connections wouldn't allow it to be GCed. The same is true for nodes like AudioBufferSourceNode though. Is that wrong? Should self-referencing AudioNodes lose their self-reference when they have no outputs?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #60)
> Hmm, one thing I haven't addressed in these patches is lifetime. I believe
> with these patches, nothing prevents a MediaStreamAudioSourceNode from being
> garbage collected while it's playing a MediaStream, cutting playback off.

You can hold the node alive from the DOMMediaStream.  This is what we did for MediaStreamAudioDestinationNode.

> Also, the MediaStreamAudioSourceNode doesn't keep the DOMMediaStream alive,
> and it really should.

Yes.

> So, I think MediaStreamAudioSourceNode should strongly reference
> DOMMediaStream, and it should add a self-reference while the DOMMediaStream
> is not finished. But that doesn't give authors a way to remove it. Just
> disconnecting the MediaStreamAudioSourceNode's output connections wouldn't
> allow it to be GCed. The same is true for nodes like AudioBufferSourceNode
> though. Is that wrong? Should self-referencing AudioNodes lose their
> self-reference when they have no outputs?

I don't think it's wrong.  If you have an AudioBufferSourceNode which is playing, it shouldn't stop just because you've disconnected it from a node, in case you reconnect it.  That being said, maybe it would be ok, since if you lose a JS reference to the node, then it won't be possible to reconnect it to some other node in the future.  So, perhaps we still need all of that JSBindingFinalized logic?  :(
I'll try to review your patches soon, but I'm in a two day TRIBE program right now, and it's not clear if I'm going to have enough time to sit down and focus on these patches before Friday.  Will do my best though.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #61)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #60)
> > Hmm, one thing I haven't addressed in these patches is lifetime. I believe
> > with these patches, nothing prevents a MediaStreamAudioSourceNode from being
> > garbage collected while it's playing a MediaStream, cutting playback off.
> 
> You can hold the node alive from the DOMMediaStream.  This is what we did
> for MediaStreamAudioDestinationNode.

OK, that sounds good.

> > So, I think MediaStreamAudioSourceNode should strongly reference
> > DOMMediaStream, and it should add a self-reference while the DOMMediaStream
> > is not finished. But that doesn't give authors a way to remove it. Just
> > disconnecting the MediaStreamAudioSourceNode's output connections wouldn't
> > allow it to be GCed. The same is true for nodes like AudioBufferSourceNode
> > though. Is that wrong? Should self-referencing AudioNodes lose their
> > self-reference when they have no outputs?
> 
> I don't think it's wrong.  If you have an AudioBufferSourceNode which is
> playing, it shouldn't stop just because you've disconnected it from a node,
> in case you reconnect it.

It should if you also GC its JS wrapper.

It seems to me that right now, if you have a looping AudioBufferSourceNode, disconnect it from the audio graph and drop the reference to the JS wrapper, it leaks as long as the lifetime of the page.

The same will be true for a MediaStreamAudioSourceNode being fed from an everlasting DOMMediaStream. In that case the author may be able to find a way to shut off the DOMMediaStream, and we can make finished DOMMediaStreams drop their references to MediaStreamAudioSourceNodes. It's still a problem if the author doesn't want to shut down the DOMMediaStream.

I'll fix the bidirectional DOMMediaStream reference here but we need to rethink node lifetimes (again!) in another bug.
Attachment #780303 - Flags: review?(ehsan) → review+
Attachment #780304 - Flags: review?(ehsan) → review+
Comment on attachment 780305 [details] [diff] [review]
Part 5: Implement MediaStreamAudioSourceNode

Review of attachment 780305 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/moz.build
@@ +97,5 @@
>  CPP_SOURCES += [
>      'AudioAvailableEventManager.cpp',
>      'AudioChannelFormat.cpp',
>      'AudioNodeEngine.cpp',
> +    'AudioNodeExternalInputStream.cpp',

I guess you should technically move these two hunks to the previous patch, but it doesn't matter that much.
Attachment #780305 - Flags: review?(ehsan) → review+
Comment on attachment 780748 [details] [diff] [review]
Make MediaStreamAudioSourceNode keep its DOMMediaStream alive, and make the DOMMediaStream keep the MediaStreamAudioSourceNode alive

Review of attachment 780748 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/MediaStreamAudioSourceNode.cpp
@@ +18,5 @@
> +NS_IMPL_CYCLE_COLLECTION_UNLINK_END_INHERITED(AudioNode)
> +
> +NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(MediaStreamAudioSourceNode, AudioNode)
> +  NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mInputStream)
> +NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END

Please use NS_IMPL_CYCLE_COLLECTION_INHERITED_1 instead of these macros.
Attachment #780748 - Flags: review?(ehsan) → review+
Attachment #780295 - Flags: review?(cpearce) → review+
The major cause of test failures is because we don't do a good enough job of copying track data from the input stream. In particular we were hitting a case which is basically the same as https://bugzilla.mozilla.org/show_bug.cgi?id=896353#c35.

I'm afraid this code is quite tricky. I've worked pretty hard to simplify the code and explain it in the comments. I'm fixing TrackUnionStream first because the AudioNodeExternalInputStream code is based on it.
Attachment #783698 - Flags: review?(paul)
Comment on attachment 783742 [details] [diff] [review]
Part 9: Make SpeechStreamListener handle null AudioChunks

Hmm, the code is a bit odd, but let's clean it up once we activate speech API.
Attachment #783742 - Flags: review?(bugs) → review+
Comment on attachment 783698 [details] [diff] [review]
Part 7: Fix copying of track data from input streams to output streams in TrackUnionStream

Review of attachment 783698 [details] [diff] [review]:
-----------------------------------------------------------------

Some comments now, but I want to look at this again tomorrow.

::: content/media/TrackUnionStream.h
@@ +122,5 @@
>    struct TrackMapEntry {
> +    // mEndOfConsumedInputTicks is the end of the input ticks that we've consumed.
> +    // 0 if we haven't consumed any yet.
> +    TrackTicks mEndOfConsumedInputTicks;
> +    // mEndOfLastInputIntervalInInputTime is the timestamp for the end of the

s/mEndOfLastInputIntervalInInputTime/mEndOfLastInputIntervalInInputStream/

@@ +126,5 @@
> +    // mEndOfLastInputIntervalInInputTime is the timestamp for the end of the
> +    // previous interval which was unblocked for both the input and output
> +    // stream, in the input stream's timeline, or -1 if there wasn't one.
> +    StreamTime mEndOfLastInputIntervalInInputStream;
> +    // mEndOfLastInputIntervalInInputTime is the timestamp for the end of the

s/mEndOfLastInputIntervalInInputTime/mEndOfLastInputIntervalInOutputStream/

@@ +274,5 @@
> +        // intervals where both streams are not blocked, the above if condition
> +        // is false and mEndOfConsumedInputTicks advances exactly to match
> +        // the ticks that were consumed.
> +        // Property #2:
> +        // Let originalOutputStart be the value of outputStart when mLastInputSyncPoint

I can't find any reference to |mLastInputSyncPoint| in the code base/other patches. Maybe it got renamed halfway during the patch writing. I guess I could make sense of the proof nonetheless.
Er, yes, I changed the implementation and need to revise the proof. I'll do that.
Attached patch Part 7 v2Splinter Review
Attachment #783698 - Attachment is obsolete: true
Attachment #783698 - Flags: review?(paul)
Attachment #784183 - Flags: review?(paul)
Tweaked the contentDuration*.sjs tests to avoid httpd.js adding Content-Length headers automatically.
https://tbpl.mozilla.org/?tree=Try&rev=3a2fce80629b
Comment on attachment 784183 [details] [diff] [review]
Part 7 v2

Review of attachment 784183 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/TrackUnionStream.h
@@ +297,5 @@
> +        // as required.
> +
> +        if (inputStartTicks < 0) {
> +          // Data before the start of the track is just null.
> +          segment->AppendNullData(-inputStartTicks);

So, here, you are appending 1 ticks of silence the first time, before the start of the track, instead of appending the amount of ticks required to either reach the beginning of the track (if the start of the track is in the [aFrom,aTo] interval), or ((aTo - aFrom) * rate) ticks if the start of the track is well ahead.

Maybe we should change the semantic of GraphTimeToStreamTime to return a negative quantity if the argument is before the stream time (i.e. not clamp the lower bound to zero).

I believe this got unnoticed because the testcase provided in the next patch puts 1 as the delay parameter. If I change it to 30 or something, it fails.
We have to add a small amount of delay to ensure that there is always a sample available if we see an interval that contains a tick boundary on the output stream's timeline but does not contain a tick boundary on the input stream's timeline. 1 tick delay is necessary and sufficient. So I put 1 as the delay parameter because that's what this code adds. Does that make sense?
Part 2 is causing intermittent failures on Mac OSX 10.6 and 10.7 (but not 10.8).
blocking-b2g: --- → koi+
Whiteboard: [FT: Media Recording, Sprint]
I put put this to koi+ since it 896353 (for FFOS 1.2) needs this to record media stream
Attached patch part 2 v2Splinter Review
The problem in part 2 is that as soon as we call mDecoder->Load in FinishDecoderSetup, we start decoding and calling MediaDecoderStateMachine::SendStreamData, which could check mDecoder->IsSameOriginMedia before FinishDecoderSetup has been able to call NotifyDecoderPrincipalChanged. In that case the IsSameOriginMedia call will return false, breaking things.

This patch avoids the problem by making sure we call NotifyDecoderPrincipalChanged before Load. That requires that we set up mDecoder and set mDecoder's mResource before calling NotifyDecoderPrincipalChanged.
Attachment #780295 - Attachment is obsolete: true
Attachment #785628 - Flags: review?(cpearce)
Comment on attachment 784183 [details] [diff] [review]
Part 7 v2

Review of attachment 784183 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/TrackUnionStream.h
@@ +296,5 @@
> +        //   = aInputTrack->GetSegment()->GetDuration()
> +        // as required.
> +
> +        if (inputStartTicks < 0) {
> +          // Data before the start of the track is just null.

I'd make this a bit more explicit, maybe using the comment 82.
Attachment #784183 - Flags: review?(paul) → review+
Attachment #783700 - Flags: review?(paul) → review+
Comment on attachment 785628 [details] [diff] [review]
part 2 v2

Review of attachment 785628 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaDecoderStateMachine.cpp
@@ +619,5 @@
>    if (mState == DECODER_STATE_DECODING_METADATA)
>      return;
>  
> +  if (!mDecoder->IsSameOriginMedia()) {
> +    printf("MediaDecoderStateMachine::SendStreamData Same-origin check failed (decoder %p)!!!\n", mDecoder.get());

Use the LOG() macros, or NS_WARNING().
Attachment #785628 - Flags: review?(cpearce) → review+
Comment on attachment 780293 [details] [diff] [review]
Fix comments around MediaInputPort flags

Patches landed in this bug are needed for Web Audio in Firefox 25.
Attachment #780293 - Flags: approval-mozilla-aurora?
Keywords: verifyme
QA Contact: jsmith
QA Contact: jsmith
checkin-needed for Aurora, a=webaudio.
Keywords: checkin-needed
Attachment #780293 - Flags: approval-mozilla-aurora?
We need to backport bug 899863 as well for this to go green, apparently [1].

[1]: https://tbpl.mozilla.org/?tree=Try&rev=48fdaee46d29
QA Contact: manuela.muntean
With Firefox 25 beta 1 (build ID: 20130917123208) on Mac OS X 10.8.4 in 32bit mode, I get the following results:

1) the testcase from comment 23 seems to be working the same as on Chrome
2) the testcase from comment 44 makes Firefox hang 
3) I wasn't able to use the testcase from comment 45 (I've put it into an .html file, but I couldn't see the functionality, only the code)

Does anyone have any thoughts/suggestions? Thanks! :)
Flags: needinfo?(roc)
File a bug on part 2 please
Flags: needinfo?(roc)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #102)
> File a bug on part 2 please

I've filed bug 918217 and CC'ed you. Thanks!
Marking this verified as fixed, based on comment 101 and comment 103.
With Mac OS X 10.8.4 in 32-bit mode and latest Aurora (build ID: 20131016004005) I get the following:

1) the testcase from comment 23 works the same as on Chrome
2) the testcase from comment 44 has the following issue:

- after clicking on the "Share selected device" button when prompted to share your microphone, with the 3D Sonogram radio button selected, start talking in the microphone; 

- on Chrome, the animation keeps on showing on and on, as long as you're speaking; but with Aurora the animation stops after ~20-30 seconds, even if you continue talking.

Does anyone have any thoughts/suggestions? Thanks!
Flags: needinfo?(roc)
Depends on: 934512
With Mac OS X 10.9 in 32-bit mode and 26 beta 7 (build ID: 20131122094025)
1) the testcase from comment 23 works the same as with Chrome

I can still experience the below scenario:

> 2) the testcase from comment 44 has the following issue:
> 
> - after clicking on the "Share selected device" button when prompted to
> share your microphone, with the 3D Sonogram radio button selected, start
> talking in the microphone; 
> 
> - on Chrome, the animation keeps on showing on and on, as long as you're
> speaking; but with 26 beta 7 the animation stops after ~20-30 seconds, even if
> you continue talking.

The other radio buttons (Frequency, Sonogram, Waveform) also work poorly by comparison to Chrome.

Does anyone have any thoughts/suggestions? Thanks!
Flags: needinfo?(karlt)
(In reply to Manuela Muntean [:Manuela] [QA] from comment #105)
> 2) the testcase from comment 44 has the following issue:
> 
> - after clicking on the "Share selected device" button when prompted to
> share your microphone, with the 3D Sonogram radio button selected, start
> talking in the microphone; 
> 
> - on Chrome, the animation keeps on showing on and on, as long as you're
> speaking; but with Aurora the animation stops after ~20-30 seconds, even if
> you continue talking.
> 
> Does anyone have any thoughts/suggestions? Thanks!

This is most likely bug 934512.

The workaround would be to keep a reference to the MediaStream from getUserMedia, so I guess bug 934512 doesn't need to be blocking-b2g koi+.

(In reply to Manuela Muntean [:Manuela] [QA] from comment #106)
> The other radio buttons (Frequency, Sonogram, Waveform) also work poorly by
> comparison to Chrome.

Is this the same problem with the animation stopping?
Do they work OK until it stops?
(They seem to work OK here but my Chromium 30.0.1599.37 doesn't play this demo, so I haven't been able to compare.)
Flags: needinfo?(karlt)
> (In reply to Manuela Muntean [:Manuela] [QA] from comment #106)
> > The other radio buttons (Frequency, Sonogram, Waveform) also work poorly by
> > comparison to Chrome.

> Is this the same problem with the animation stopping?

Yes, it's the same problem.

> Do they work OK until it stops?

They work OK until it stops, but I can see intermittently the next issue: if I select a different radiobutton while the current selection is still "playing", the animation doesn't always change accordingly, like it does with Chrome.

Tested with the same environment: Mac OS X 10.9 in 32-bit mode & Firefox 26 beta 7.

> (They seem to work OK here but my Chromium 30.0.1599.37 doesn't play this
> demo, so I haven't been able to compare.)
Please retest in the next nightly and file new bugs as necessary. Thanks!
Flags: needinfo?(roc)
> They work OK until it stops, but I can see intermittently the next issue: if
> I select a different radiobutton while the current selection is still
> "playing", the animation doesn't always change accordingly, like it does
> with Chrome.

Robert, I've logged bug 945716 for this and CC'ed you. 

For the scenario detailed in comment 106 I didn't log a bug, since Karl said this is most likely bug 934512.

Since I've logged bug 945716, and this bug is verified in the first part of comment 106 ("1) the test case from comment 23 works the same as with Chrome"), I'm marking this as verified for Firefox 26.
Depends on: 983023
Depends on: 983052
Depends on: 983058
Blocks: 983062
Depends on: 983066
Depends on: 987976
Depends on: 989907
No longer blocks: 983062
Depends on: 983062
You need to log in before you can comment on or make changes to this bug.