Closed Bug 830707 Opened 12 years ago Closed 12 years ago

Make MediaStream AudioSegments not have a fixed number of channels

Categories

(Core :: Audio/Video, defect)

18 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: roc, Assigned: roc)

References

Details

Attachments

(4 files, 3 obsolete files)

We need to allow the number of channels in a MediaStream's audio track to change from moment to moment. In particular Web Audio needs this. I guess theoretically chained Oggs and similar kinds of media resources would need this too.
I have patches for this bug, but no good way to test them yet.
Comment on attachment 704415 [details] [diff] [review] Part 1: Add code for upmixing and downmixing following Web Audio's spec Review of attachment 704415 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/AudioChannelFormat.cpp @@ +2,5 @@ > +/* This Source Code Form is subject to the terms of the Mozilla Public > + * License, v. 2.0. If a copy of the MPL was not distributed with this file, > + * You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > +#include "AudioChannelFormat.h" You need to #include <algorithm> here, for std::min and std::max.
Comment on attachment 704417 [details] [diff] [review] Part 2: Mix channels to output channel count when playing audio Review of attachment 704417 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/AudioSegment.cpp @@ +89,5 @@ > } > > +static const int AUDIO_PROCESSING_FRAMES = 640; /* > 10ms of 48KHz audio */ > +static const int GUESS_AUDIO_CHANNELS = 2; > +static const uint8_t gZeroChannel[MAX_AUDIO_SAMPLE_SIZE*AUDIO_PROCESSING_FRAMES]; clang says: /Users/ehsanakhgari/moz/mozilla-central/content/media/AudioSegment.cpp:93:22: error: default initialization of an object of const type 'const uint8_t [2560]' static const uint8_t gZeroChannel[MAX_AUDIO_SAMPLE_SIZE*AUDIO_PROCESSING_FRAMES]; ^ 1 error generated. You probably want: static const uint8_t gZeroChannel[MAX_AUDIO_SAMPLE_SIZE*AUDIO_PROCESSING_FRAMES] = {0};
Comment on attachment 704418 [details] [diff] [review] Part 3: Don't constrain AudioSegment to a fixed number of channels Review of attachment 704418 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/MediaStreamGraph.cpp @@ +1159,3 @@ > // something here ... preallocation, async allocation, multiplexing onto a single > // stream ... > AudioSegment* audio = tracks->Get<AudioSegment>(); With this change, this variable is now unused.
Comment on attachment 704418 [details] [diff] [review] Part 3: Don't constrain AudioSegment to a fixed number of channels Review of attachment 704418 [details] [diff] [review]: ----------------------------------------------------------------- You probably want to remove the Init call from media/webrtc/signaling/test/mediapipeline_unittest.cpp too.
(In reply to :Ehsan Akhgari from comment #12) > Comment on attachment 704418 [details] [diff] [review] > Part 3: Don't constrain AudioSegment to a fixed number of channels > > Review of attachment 704418 [details] [diff] [review]: > ----------------------------------------------------------------- > > You probably want to remove the Init call from > media/webrtc/signaling/test/mediapipeline_unittest.cpp too. (And from media/webrtc/signaling/test/FakeMediaStreamsImpl.h too.)
I had already made all these changes locally, sorry :-(
Comment on attachment 704415 [details] [diff] [review] Part 1: Add code for upmixing and downmixing following Web Audio's spec Review of attachment 704415 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/AudioChannelFormat.cpp @@ +111,5 @@ > + aChannelArray->ElementAt(0) = surroundChannels[SURROUND_L]; > + aChannelArray->ElementAt(1) = surroundChannels[SURROUND_R]; > + aChannelArray->ElementAt(2) = surroundChannels[SURROUND_C]; > + aChannelArray->ElementAt(3) = surroundChannels[SURROUND_SL]; > + aChannelArray->ElementAt(4) = surroundChannels[SURROUND_SR]; I assume it's intentional for no-center to with-center upmixes to not generate a center channel? Normally in multi-channel audio upmixes generate virtual centers (correlated R+L) and usually derived rears (anti-correlated R&L), typically with a delay. This may well not be the intention here, but it isn't clear. Same question for down-mixes. I'm sure there are tons of articles on all the different ways you can up and downmix. @@ +118,5 @@ > + } > + > + memcpy(aChannelArray->Elements(), surroundChannels, sizeof(surroundChannels)); > + inputChannelCount = CUSTOM_CHANNEL_LAYOUTS; > + } If inputChannelCount > CUSTOM_CHANNEL_LAYOUTS (and it's tested, so let's assume it's possible), then we'll return NULL I presume for the first 6 channels. not zerochannel. Either assert on it or move the initializer.
Attachment #704415 - Flags: review?(rjesup) → review+
Attachment #704416 - Flags: review?(rjesup) → review+
Comment on attachment 704417 [details] [diff] [review] Part 2: Mix channels to output channel count when playing audio Review of attachment 704417 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/AudioSegment.cpp @@ +98,5 @@ > + uint32_t outputChannels = aOutput->GetChannels(); > + nsAutoTArray<AudioDataValue,AUDIO_PROCESSING_FRAMES*GUESS_AUDIO_CHANNELS> buf; > + nsAutoTArray<const void*,GUESS_AUDIO_CHANNELS> channelData; > + nsAutoTArray<float,AUDIO_PROCESSING_FRAMES*GUESS_AUDIO_CHANNELS> downmixConversionBuffer; > + nsAutoTArray<float,AUDIO_PROCESSING_FRAMES*GUESS_AUDIO_CHANNELS> downmixOutputBuffer; is it possible/useful to remember the number of channels from the last iteration and use that to seed the TARRAY sizes to avoid re-allocations when channels != 2? Perhaps not as coded. From one write to another they rarely change.... Also, holding allocations from one iteration to another may be useful and allow avoiding reallocs - I always try to avoid a memory allocator where possible in realtime output paths.... @@ +134,5 @@ > + for (uint32_t i = 0; i < channelData.Length(); ++i) { > + float* conversionBuf = downmixConversionBuffer.Elements() + (i*duration); > + const int16_t* sourceBuf = static_cast<const int16_t*>(channelData[i]); > + for (uint32_t j = 0; j < duration; ++j) { > + conversionBuf[j] = AudioSampleToFloat(sourceBuf[j]); I'd want to code this with a mind to doing an optimized (SSE/neon/etc) implementation(s) (ConvertAudioSamplesToFloats(dest,source,duration) or some such). @@ +142,5 @@ > + } > + > + downmixOutputBuffer.SetLength(duration*outputChannels); > + nsAutoTArray<float*,GUESS_AUDIO_CHANNELS> outputChannelBuffers; > + nsAutoTArray<const void*,GUESS_AUDIO_CHANNELS> outputChannelData; Ditto on remembering arrays @@ +165,5 @@ > + } else { > + // Assumes that a bit pattern of zeroes == 0.0f > + memset(buf.Elements(), 0, buf.Length()*sizeof(AudioDataValue)); > + } > + aOutput->Write(buf.Elements(), int32_t(duration)); May need Start() similar to patch that landed last weekend
Attachment #704417 - Flags: review?(rjesup) → review+
Comment on attachment 704418 [details] [diff] [review] Part 3: Don't constrain AudioSegment to a fixed number of channels Review of attachment 704418 [details] [diff] [review]: ----------------------------------------------------------------- r+ assuming GetFirstChunk == GetLastChunk is correct (or with an appropriate fix if not) ::: content/media/MediaSegment.h @@ +281,5 @@ > + if (mChunks.IsEmpty()) { > + return nullptr; > + } > + return &mChunks[mChunks.Length() - 1]; > + } How is this different than GetLastChunk()? ::: content/media/webrtc/MediaEngineDefault.cpp @@ -335,5 @@ > mSource = aStream; > > // AddTrack will take ownership of segment > AudioSegment* segment = new AudioSegment(); > - segment->Init(CHANNELS); With this removed, there are no uses of CHANNELS in this file; it can probably go too
Attachment #704418 - Flags: review?(rjesup) → review+
(In reply to Randell Jesup [:jesup] from comment #15) > I assume it's intentional for no-center to with-center upmixes to not > generate a center channel? Normally in multi-channel audio upmixes generate > virtual centers (correlated R+L) and usually derived rears (anti-correlated > R&L), typically with a delay. I don't understand everything you said here, but I'm trying to just follow the Web Audio spec for 1, 2, 4 and 6 channels and do something plausible for the rest: https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#UpMix > If inputChannelCount > CUSTOM_CHANNEL_LAYOUTS (and it's tested, so let's > assume it's possible), then we'll return NULL I presume for the first 6 > channels. not zerochannel. Either assert on it or move the initializer. That's a horrible bug. In the inputChannelCount >= CUSTOM_CHANNEL_LAYOUTS case, I need to copy the channels to the output array. I'll fix it. > is it possible/useful to remember the number of channels from the last iteration > and use that to seed the TARRAY sizes to avoid re-allocations when channels != 2? Probably best to just up the nsAutoTArray buffer size if necessary. > How is this different than GetLastChunk()? Stupid bug. Good thing it's not used. I'll just remove it. > With this removed, there are no uses of CHANNELS in this file; it can probably > go too I'll remove it.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #18) > I don't understand everything you said here, but I'm trying to just follow > the Web Audio spec for 1, 2, 4 and 6 channels and do something plausible for > the rest: > https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#UpMix We've discussed this in #media and even suggested proposing the downmixing matrices we're currently using with Opus (<http://mxr.mozilla.org/mozilla-central/source/content/media/ogg/OggReader.cpp#525>). But roc is right, this is what the spec currently says to do.
> > If inputChannelCount > CUSTOM_CHANNEL_LAYOUTS (and it's tested, so let's > > assume it's possible), then we'll return NULL I presume for the first 6 > > channels. not zerochannel. Either assert on it or move the initializer. > > That's a horrible bug. In the inputChannelCount >= CUSTOM_CHANNEL_LAYOUTS > case, I need to copy the channels to the output array. I'll fix it. Actually this is wrong --- I was right the first time. Since we're updating aChannelArray in place, when inputChannelCount >= CUSTOM_CHANNEL_LAYOUTS there is no need to touch the first inputChannelCount channels.
roc doesn't seem to have noted the mozilla-inbound push yet, but I'll note the trivial bustage fix I just landed: https://hg.mozilla.org/integration/mozilla-inbound/rev/703cfc290199
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: