Closed Bug 982490 Opened 11 years ago Closed 11 years ago

Add an audio mixer API to the MediaStreamGraph

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla31

People

(Reporter: padenot, Assigned: padenot)

References

Details

Attachments

(6 files, 7 obsolete files)

17.69 KB, text/plain
Details
3.14 KB, patch
Details | Diff | Splinter Review
22.83 KB, patch
Details | Diff | Splinter Review
27.12 KB, patch
roc
: review-
jesup
: review-
Details | Diff | Splinter Review
20.06 KB, patch
roc
: review+
jesup
: review+
Details | Diff | Splinter Review
38.94 KB, patch
Details | Diff | Splinter Review
This will be useful for the AEC.
We have a bug where the number of frames we write to the AudioStream depends on the StreamTime of the MediaStream that writes the audio. Because we want to mix the data, we need exactly the same number of frames from each stream each cycle. This patch decorrelates the number of frames written from the stream time to achieve that.
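As a minimal sketch of the idea (the names are taken from the patch excerpts quoted in the review comments further down; everything else is illustrative only):

// Derive the frame count from the graph-time interval itself, which is the
// same for every stream this cycle, instead of from per-stream stream time.
TrackTicks ticksNeeded = TimeToTicksRoundDown(IdealAudioRate(), aTo - aFrom);
// The per-stream offset now only picks where in the track we read from;
// it no longer changes how many frames we write.
TrackTicks offset = track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, aFrom));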
This adds a mixer and its test, and hooks it up to the MSG, but does not actually connect it to the AEC in gUM. What remains to be done is just the glue between the mixer callback and the AudioOutputObserver, in bug 694814. Note that at the moment, as soon as a MediaEngineWebRTCAudioSource is connected to the graph, we enable audio mixing. Maybe this is not desirable.
To get this to build, I had to add const char *typeName() const { return "mozilla::AudioMixer"; } to AudioMixer (though I doubt adding this raw is the right way to solve things). Note: debug build
Working up a patch of fixes (guards on mMixer for null, etc). Also hit this:

#6  0x00007fc69a4204cc in mozilla::AudioMixer::Mix (this=0x7fc673de7b20, aSamples=0x7fc6657fa450, aChannels=2, aFrames=14) at ../../../content/media/AudioMixer.h:48
48        MOZ_ASSERT(aFrames == mFrames);

with aFrames=14, mFrames=512
Attached patch some fixes (obsolete) — Splinter Review
Note: my testcase is http://mozilla.github.io/webrtc-landing/gum_test.html

Fails reliably with the 14/512 mismatch. I have these patches and bug 818822 in my repo.
Attached file gum_test logs
Selected Audio on gum_test
Randell, I forgot to mention that you need to have the patch from bug 818822 applied.
Depends on: 818822
New patch; it should work better. There are a bunch of debugging printfs and this needs cleanup, but I wanted to put a patch up today for testing.
Attachment #8389641 - Attachment is obsolete: true
Attachment #8390046 - Attachment is obsolete: true
Updated patches with (hopefully) the rounding issue fixed.
Attachment #8389638 - Attachment is obsolete: true
Attachment #8390410 - Attachment is obsolete: true
Comment on attachment 8392250 [details] [diff] [review]
Ensure that each MSG cycle, each MediaStream write the same number of frames to their AudioStream. r=

>Bug 982490 - Ensure that each MSG cycle, each MediaStream write the same number of frames to their AudioStream. r=

"Ensure *for* each MSG cycle that each MediaStream writes" would clarify the goal here, if that is what you mean.

>+    TrackTicks ticksNeeded = TimeToTicksRoundDown(IdealAudioRate(), aTo - aFrom);
>+    TrackTicks offset = track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, aFrom));

I'm not sure that offset + ticksNeeded will equal the offset next time this function is called. Does something provide that? i.e. a single sample may be skipped or repeated.

>-      TrackTicks startTicks =
>-          track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, t));
>-      TrackTicks endTicks =
>-          track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, end));

I don't know why this approach didn't provide the goal here, given each PlayAudio() call had the same From and To. Do you understand what was going wrong?
(In reply to Karl Tomlinson (:karlt) from comment #13)
> >-          track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, t));
> >-      TrackTicks endTicks =
> >-          track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, end));
>
> I don't know why this approach didn't provide the goal here, given each
> PlayAudio() call had the same From and To.
> Do you understand what was going wrong?

Oh, I see the problem was from rounding audioOutput.mBlockedAudioTime instead of the end value from aStream->mBlocked.GetAt().
Yes, sorry about the lack of comments/explanations, I plan to explain the changes once this is ready for review. But yes, comment 14 is right.
So, we had this problem where the conversion from stream time/graph time to ticks would end up rounding the number of samples we wrote to the stream in different directions across AudioStreams. This is a problem if we write a mixer, because it's much easier to work with buffers of the same size. This patch fixes it by converting the difference between aFrom and aTo, so we work on the same numbers for all the streams, regardless of the absolute value in stream time or graph time, and get consistent rounding.
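To illustrate with made-up numbers: at a hypothetical rate of 3 ticks per unit of time, a stream spanning 0.0 to 0.5 rounds down to ticks 0 and 1, i.e. 1 frame, while a stream spanning 0.2 to 0.7 rounds down to ticks 0 and 2, i.e. 2 frames, even though both cover the same duration. Rounding the shared difference instead gives floor(0.5 * 3) = 1 frame for both streams.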
Attachment #8397162 - Flags: review?(roc)
Attachment #8397162 - Flags: review?(rjesup)
Attachment #8392250 - Attachment is obsolete: true
Comment on attachment 8397162 [details] [diff] [review]
Ensure for MSG cycle that each MediaStream write the same number of frames to their AudioStream. r=

Review of attachment 8397162 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaStreamGraph.cpp
@@ +868,5 @@
>     // blocked period has completed. But we do need to make sure we play from the
>     // right offsets in the stream buffer, even if we've already written silence for
>     // some amount of blocked time after the current time.
>     GraphTime t = aFrom;
> +    while (ticksNeeded) {

MOZ_ASSERT(ticksNeeded >= 0); IIRC TrackTicks is signed

@@ +1297,5 @@
> +      if (!ticksPlayed) {
> +        ticksPlayed = ticksPlayedForThisStream;
> +      } else {
> +        MOZ_ASSERT(!ticksPlayedForThisStream || ticksPlayedForThisStream == ticksPlayed,
> +            "Each stream should have the same number of frame.");

same number of frames
Attachment #8397162 - Flags: review?(rjesup) → review+
This implements the mixer. The AudioMixer class itself is pretty straightforward (it's just a for loop, a buffer, two functions and a callback).

This patch works in the following way:
- The MSG has a weak ref to the AudioMixer.
- When a WebRTC engine needs to get a mixed stream, it asks for a mixer, which is created and addref'd in the engine. The engine itself does not use this ref; it's merely there to keep the mixer alive: we need an output mix as long as a source needs to be echo cancelled, and no longer.
- When the MSG goes to output something to AudioStreams, it passes a pointer to the mixer to the AudioSegment::WriteTo method, which null-checks it and mixes the track in if the pointer is not null.
- When an MSG cycle is done, and if we have a mixer, we tell the mixer it can send its data somewhere else. The mixer calls its callback, which goes to jesup's patches (currently the callback is a no-op, but this patch is intended to land alongside jesup's).

This patch changes the complicated logic in WriteTo so that we write only once to the AudioStream (and therefore we have the guarantee that we write the same thing to the AudioStream and the mixer, so we can detect problems easily). Note that this will go away when we complete the rest of the MSG refactoring plan.
Attachment #8397194 - Flags: review?(roc)
Attachment #8397194 - Flags: review?(rjesup)
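A rough sketch of the shape the AudioMixer class described above could take, pieced together from that description and the review excerpts further down. The mixed-buffer parameter of the callback, the accumulation loop, the buffer sizing and the output format constant are assumptions, and the refcounting/WeakPtr details debated below are left out:

#include "AudioSampleFormat.h"      // AudioDataValue, AudioSampleFormat
#include "mozilla/Assertions.h"     // MOZ_ASSERT
#include "mozilla/PodOperations.h"  // PodZero
#include "nsTArray.h"

using namespace mozilla;

// Called once per MSG iteration with the interleaved mix of all tracks.
// The buffer parameter is an assumption for this sketch.
typedef void (*MixerFunc)(AudioDataValue* aMixedBuffer,
                          AudioSampleFormat aFormat,
                          uint32_t aChannels,
                          uint32_t aFrames);

class AudioMixer
{
public:
  AudioMixer(MixerFunc aCallback)
    : mCallback(aCallback),
      mFrames(0),
      mChannels(0)
  { }

  /* Mix one track's interleaved samples into the shared buffer. Every track
   * mixed during a cycle must have the same channel and frame counts. */
  void Mix(AudioDataValue* aSamples, uint32_t aChannels, uint32_t aFrames)
  {
    if (!mFrames) {
      mFrames = aFrames;
      mChannels = aChannels;
      mMixedAudio.SetLength(mFrames * mChannels);
      PodZero(mMixedAudio.Elements(), mMixedAudio.Length());
    }
    MOZ_ASSERT(aFrames == mFrames && aChannels == mChannels);
    // Naive sum; clipping/overflow handling is out of scope for this sketch.
    for (uint32_t i = 0; i < aFrames * aChannels; i++) {
      mMixedAudio[i] += aSamples[i];
    }
  }

  /* All tracks are mixed for this cycle: hand the mix to the callback and
   * reset for the next cycle. */
  void FinishMixing()
  {
    mCallback(mMixedAudio.Elements(), AUDIO_OUTPUT_FORMAT, mChannels, mFrames);
    PodZero(mMixedAudio.Elements(), mMixedAudio.Length());
    mChannels = mFrames = 0;
  }

private:
  MixerFunc mCallback;
  uint32_t mFrames;
  uint32_t mChannels;
  nsTArray<AudioDataValue> mMixedAudio;
};

The important property is that Mix() asserts that every track delivers the same channel and frame counts each cycle, which is exactly what the frame-count patch above guarantees.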
(In reply to Karl Tomlinson (:karlt) from comment #13)
> >+    TrackTicks ticksNeeded = TimeToTicksRoundDown(IdealAudioRate(), aTo - aFrom);
>
> >+    TrackTicks offset = track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, aFrom));
>
> I'm not sure that offset + ticksNeeded will equal the offset next time this
> function is called.
> i.e. a single sample may be skipped or repeated.

Is there a reason why this is not a problem?

I wondered whether it would be better to have

TrackTicks ticksNeeded = TimeToTicksRoundDown(IdealAudioRate(), aTo) - TimeToTicksRoundDown(IdealAudioRate(), aFrom);

but we also need to know that the stream has produced up to offset + ticksNeeded - 1. And blocking periods that may not be integer ticks are complicated too.
Well, yes, it's broken, but short of refactoring the MSG right now to get rid of all the different units and unify everything on track ticks (because we now have only one sample rate), we can only hack around it. That is the idea of this little patch. I'd like to move towards a simpler API for AudioSegment where it consumes or produces a number of ticks and tracks its playback position internally. That would fix this, but it's more involved, and we would like to have AEC in gUM quickly.
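A hypothetical sketch of what that simpler AudioSegment API could look like (illustrative only; this is not what lands in this bug):

class AudioSegment {
public:
  // Hypothetical: write exactly aFrames frames to aOutput (and into aMixer
  // when non-null), advancing an internal playback position so rounding
  // happens in exactly one place.
  void WriteTo(AudioStream* aOutput, AudioMixer* aMixer, TrackTicks aFrames);
private:
  TrackTicks mPlaybackPosition;
};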
Target Milestone: --- → mozilla31
Comment on attachment 8397162 [details] [diff] [review]
Ensure for MSG cycle that each MediaStream write the same number of frames to their AudioStream. r=

Review of attachment 8397162 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaStreamGraph.cpp
@@ +860,5 @@
>     MediaStream::AudioOutputStream& audioOutput = aStream->mAudioOutputStreams[i];
>     StreamBuffer::Track* track = aStream->mBuffer.FindTrack(audioOutput.mTrackID);
>     AudioSegment* audio = track->Get<AudioSegment>();
> +    AudioSegment output;
> +    TrackTicks offset = track->TimeToTicksRoundDown(GraphTimeToStreamTime(aStream, aFrom));

Please assert that the track rate is IdealAudioRate.

@@ +899,5 @@
> +      } else {
> +        MOZ_ASSERT(track->IsEnded(), "Not enough data, and track not ended.");
> +        // If we are at the end of the track, maybe write the remaining
> +        // samples, and pad with/output silence.
> +        if (endTicksNeeded < endTicksAvailable) {

This can't be true here!

::: content/media/MediaStreamGraphImpl.h
@@ +326,5 @@
>   void CreateOrDestroyAudioStreams(GraphTime aAudioOutputStartTime,
>                                    MediaStream* aStream);
>   /**
>    * Queue audio (mix of stream audio and silence for blocked intervals)
> +   * to the audio output stream. Returns the number of frame played.

"frames"
Attachment #8397162 - Flags: review?(roc) → review-
Comment on attachment 8397194 [details] [diff] [review]
Add an audio mixer API to the MediaStreamGraph. r=

Review of attachment 8397194 [details] [diff] [review]:
-----------------------------------------------------------------

Rather than exposing a Mixer object like this, would it be simpler to just have the MSG expose a callback that gets called with the mixed samples?

::: content/media/AudioMixer.h
@@ +16,5 @@
> +                          AudioSampleFormat aFormat,
> +                          uint32_t aChannels,
> +                          uint32_t aFrames);
> +
> +class AudioMixer : public SupportsWeakPtr<AudioMixer>

Please add a comment explaining what this class is for

Single space before :

@@ +23,5 @@
> +  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(AudioMixer)
> +  MOZ_DECLARE_REFCOUNTED_TYPENAME(AudioMixer)
> +
> +  AudioMixer(MixerFunc aCallback)
> +    :mCallback(aCallback),

space after :

::: content/media/AudioSegment.cpp
@@ +193,2 @@
>
> +      if(!c.mTimeStamp.IsNull()) {

Space before (

::: content/media/MediaStreamGraph.cpp
@@ +919,5 @@
> +
> +  WeakPtr<AudioMixer> mixer;
> +  if (mMixer) {
> +    mixer = mMixer->asWeakPtr();
> +  }

Why are we doing this WeakPtr thing?
Attachment #8397194 - Flags: review?(roc) → review-
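Spelling out the alternative suggested in the review above, a hypothetical shape it could take (none of these names exist in the tree; this is only to illustrate the suggestion, and the patch as written keeps the AudioMixer object instead):

// Hypothetical: instead of handing out an AudioMixer object, the graph
// itself takes a callback and invokes it with the mixed samples once per
// iteration.
typedef void (*MixedAudioCallback)(AudioDataValue* aMixedBuffer,
                                   AudioSampleFormat aFormat,
                                   uint32_t aChannels,
                                   uint32_t aFrames);

class MediaStreamGraph {
public:
  void RegisterMixedAudioCallback(MixedAudioCallback aCallback);   // hypothetical
  void UnregisterMixedAudioCallback(MixedAudioCallback aCallback); // hypothetical
};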
Comment on attachment 8397194 [details] [diff] [review]
Add an audio mixer API to the MediaStreamGraph. r=

Review of attachment 8397194 [details] [diff] [review]:
-----------------------------------------------------------------

I think if you just dump the WeakPtr stuff, and deal with TSAN/etc issues around Register, we're good.

::: content/media/AudioMixer.h
@@ +16,5 @@
> +                          AudioSampleFormat aFormat,
> +                          uint32_t aChannels,
> +                          uint32_t aFrames);
> +
> +class AudioMixer : public SupportsWeakPtr<AudioMixer>

What are we actually gaining by having this be a WeakPtr? Since WeakPtr's aren't threadsafe, only MSG will be able to touch it anyways, so why not make it a RefPtr? (Or if you prefer a raw ptr and store in an AutoPtr.)

@@ +19,5 @@
> +
> +class AudioMixer : public SupportsWeakPtr<AudioMixer>
> +{
> +public:
> +  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(AudioMixer)

This can't possibly be safe when combined with WeakPtr

@@ +23,5 @@
> +  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(AudioMixer)
> +  MOZ_DECLARE_REFCOUNTED_TYPENAME(AudioMixer)
> +
> +  AudioMixer(MixerFunc aCallback)
> +    :mCallback(aCallback),

nit: space after :

@@ +27,5 @@
> +    :mCallback(aCallback),
> +     mFrames(0),
> +     mChannels(0)
> +  { }
> +  /* Get the data from the mixer. This is supposed to be called when all the

nit: blank line before comment (repeat through file)

@@ +37,5 @@
> +              mFrames);
> +    PodZero(mMixedAudio.Elements(), mMixedAudio.Length());
> +    mChannels = mFrames = 0;
> +  }
> +  /* aSample is interleaved */

nit: aSamples

::: content/media/AudioSegment.h
@@ +215,5 @@
> #endif
>     return chunk;
>   }
>   void ApplyVolume(float aVolume);
> +  void WriteTo(uint64_t aID, AudioStream* aOutput, AudioMixer* aMixer = 0);

nullptr

::: content/media/MediaStreamGraph.cpp
@@ +26,5 @@
> #include "DOMMediaStream.h"
> #include "GeckoProfiler.h"
> #include "mozilla/unused.h"
> #include "speex/speex_resampler.h"
> +#include "stdio.h"

remove

@@ +1190,5 @@
> +
> +AudioMixer*
> +MediaStreamGraphImpl::RegisterForAudioMixing()
> +{
> +  if (!mMixer) {

I'm concerned what thread this gets called on, and if setting mMixer here can cause a TSAN or related issue (or races if there was more than one caller). Either this needs locking or it needs to go through the command queue I think.

Note: WeakPtr usages aren't thread-safe, though this doesn't violate that so long as the caller doesn't try to use WeakPtr functions from a thread other than MSG.

::: content/media/compiledtest/TestAudioMixer.cpp
@@ +1,1 @@
> +#include "AudioMixer.h"

Needs MPL2
Attachment #8397194 - Flags: review?(rjesup) → review-
This sits on top of the previous patches and addresses the review comments, especially the WeakPtr thing. It also fixes a problem where the rounding-tracking hack would break in case of a global underrun.
Attachment #8398555 - Flags: review?(roc)
Attachment #8398555 - Flags: review?(rjesup)
Attachment #8397788 - Attachment is obsolete: true
Comment on attachment 8398555 [details] [diff] [review]
Make sure we don't skip or play a sample twice. r=

Review of attachment 8398555 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/AudioMixer.h
@@ +19,5 @@
> +/**
> + * This class mixes multiple streams of audio together to output a single audio
> + * stream.
> + *
> + * AudioMixer::Mix is to be called repeatidly with buffers that have the same

repeatedly

::: content/media/MediaStreamGraph.cpp
@@ +581,5 @@
> +                                 AudioSampleFormat aFormat,
> +                                 uint32_t aChannels,
> +                                 uint32_t aFrames)
> +{
> +  // xxx

File a bug for an API to register for mixer callbacks and put the number here
Attachment #8398555 - Flags: review?(rjesup) → review+
Attachment #8397162 - Attachment is obsolete: true
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Depends on: 995543
Depends on: 995690
Blocks: 1015519
Depends on: 1027963
No longer blocks: 1015519
Depends on: 1015519
Depends on: 1057274