Closed Bug 790458 Opened 8 years ago Closed 8 years ago

Support Opus multichannel on mobile

Categories

(Core :: Audio/Video, defect)

ARM
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: rillian, Assigned: achronop)

References

Details

(Whiteboard: [mentor=rillian] [lang=c++])

Attachments

(1 file, 7 obsolete files)

Our current support for multichannel (surround audio) files only works with float audio playback, not with the short sample type used on android.

This bug is about adding a fixed-point version of the downmix matrix code.
Blocks: 790459
(In reply to Ralph Giles (:rillian) from comment #0)
> Our current support for multichannel (surround audio) files only works with
> float audio playback, not with the short sample type used on android.
> 
> This bug is about adding a fixed-point version of the downmix matrix code.

To do this, we also need to switch to the fixed-point decoder, or increase the decoder thread's stack size. Currently libopus allocates a stack buffer proportional to 120 ms of audio multiplied by the channel count when the internal sample format and the API sample format don't match. That can be up to 180 kB by itself, but our decoder thread stacks are only 128 kB.
Do we have architectures where the sample format is a short, but we'd prefer to use the floating-point decoder, or it just building the fixed-point decoder on arm enough?
(In reply to Ralph Giles (:rillian) from comment #2)
> Do we have architectures where the sample format is a short, but we'd prefer
> to use the floating-point decoder, or it just building the fixed-point
> decoder on arm enough?

On, e.g., OMAP4 or later, the float decoder is somewhat faster, but on anything older it's much, much slower. Since we don't do custom builds for each chip, just building the fixed-point decoder on ARM is probably the best strategy (and even if we did want to do custom builds for a chip, we should just use a float sample format all the way up the stack, and only convert when talking to the audio driver).
Whiteboard: [WebRTC], [blocking-webrtc-]
Component: WebRTC: Audio/Video → Video/Audio
QA Contact: jsmith
Whiteboard: [WebRTC], [blocking-webrtc-]
Sorry about the misleading component. This doesn't affect WebRTC.
Summary: Support Opus mulitchannel on mobile → Support Opus multichannel on mobile
Whiteboard: [mentor=rillian] [lang=c++]
Attached patch Fix for bug 790458 (obsolete) — Splinter Review
This is an initial effort for supporting fixed-point samples. I used the downmix matrix from opusfile.c which mentions in the comment "The matrices with 5 or more channels are normalized to a total volume of 2.0". Please review it and I can change it according to suggestions if it not suitable. 

Please note that the patch has been tested with opus streams found in various web resources but I am not certain if they  actually do check the new code. Maybe we need some specific multichannel fixed-point opus samples to verify.
Attachment #680450 - Flags: review?(giles)
Comment on attachment 680450 [details] [diff] [review]
Fix for bug 790458

Review of attachment 680450 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

To test this you need to run a fixed-point build. The easiest way to do that is to build fennec for Android and test it on a phone or tablet. https://wiki.mozilla.org/Mobile/Fennec/Android for how to get set up for that if you haven't before.

I'll also push to the try server and see what it says.

The basic idea looks good. r+ if you remove the testing defines, and pending some test verification on android.

::: content/media/ogg/nsOggCodecState.cpp
@@ +808,4 @@
>  
>    return NS_OK;
>  }
> +#define MOZ_OPUS

This should be defined by the build system, you shouldn't need to set it here.

::: content/media/ogg/nsOggReader.cpp
@@ +407,4 @@
>    }
>    return NS_OK;
>  }
> +#define MOZ_OPUS

Same here; this shouldn't be necessary.
Attachment #680450 - Flags: review?(giles) → review+
Comment on attachment 680450 [details] [diff] [review]
Fix for bug 790458

Review of attachment 680450 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/ogg/nsOggCodecState.cpp
@@ -922,5 @@
>        }
> -#ifndef MOZ_SAMPLE_TYPE_FLOAT32
> -      // Downmixing more than 2 channels it is not supported for integer
> -      // output samples. It is only supported for float output.
> -      if (mChannels>2)

If you remove this check, then when we call opus_multistream_decode() it will allocate a temporary stack buffer proportional to the channel count, which could be up to 180 kB. However, we only have 128 kB of stack on the decoder threads (see bug 664341).

We either need to increase the decoder stack size, actually use a fixed-point build of libopus (which eliminates the need for the temporary buffer), or apply the upstream patch <https://git.xiph.org/?p=opus.git;a=commit;h=a40689e6>).

Landing this patch as-is will render us vulnerable to a stack-smashing attack.
Attachment #680450 - Flags: review-
https://tbpl.mozilla.org/?tree=Try&rev=6dfe9cdec1aa is attachment 680450 [details] [diff] [review] and shows build failures on armv6.

https://tbpl.mozilla.org/?tree=Try&rev=9d5bd2d1533e is after removing the spurious MOZ_OPUS defines.
Try run for 6dfe9cdec1aa is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=6dfe9cdec1aa
Results (out of 9 total builds):
    success: 3
    failure: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/rgiles@mozilla.com-6dfe9cdec1aa
Try run for 9d5bd2d1533e is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=9d5bd2d1533e
Results (out of 13 total builds):
    success: 12
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/rgiles@mozilla.com-9d5bd2d1533e
I tried the files from http://people.xiph.org/~tterribe/opus/multichannel/ in a build with the MOZ_OPUS defines removed on my Android phone. There's no audio output with >2 channels.
Attached patch Fix for bug 790458 (obsolete) — Splinter Review
I removed the defines and corrected a bug that should cause the lack of output when the code is executed. The problem was that the initial buffer used to store the modified data. Corrected and uploaded for review.

About derf's comment I need your advice whether you prefer to change the temporary buffer or the change will be in the libopus.
Attachment #680450 - Attachment is obsolete: true
Attachment #681190 - Flags: review?(giles)
(In reply to Alexandros Chronopoulos [:achronop] from comment #12)
> About derf's comment I need your advice whether you prefer to change the
> temporary buffer or the change will be in the libopus.

I think the best solution is to do a fixed-point build of libopus on platforms where we aren't using floating-point sample formats, but that's also the most work.
Comment on attachment 681190 [details] [diff] [review]
Fix for bug 790458

Review of attachment 681190 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. With this version I get correct output for all 8 test files.

One thing I noticed with the mobile device is that the level drop from the normalization is a lot more obvious that it is on desktop. Maybe because of cheaper headphones in a noisier environment giving less functional dynamic range. If I turn up the volume until the surround files are audible, the mono and stereo files are painfully loud. I wonder if it's better to not normalize, or normalize to a larger value, for fixed-point devices.

::: content/media/ogg/nsOggCodecState.cpp
@@ -808,4 @@
>  
>    return NS_OK;
>  }
> -

Please remove this whitespace change.
Depends on: 811544
I changed the downmix matrix normalization to four for 5-8 channels and to two for the rest. I did not have the chance to go through the instructions in order to test it properly. Since the only change here is the matrix I expect it to work and the question is the sound level output of multichannel files comparing to stereo. I will try to find sometime during the weekend to setup the environment.
Attachment #682150 - Flags: review?(giles)
I tested the latest patch in Android mobile phone. The sound level difference on stereo and multichannel files remains noticeable. I'll try without normalization

Would be useful to upload the .apk somewhere?
(In reply to Alexandros Chronopoulos [:achronop] from comment #16)

> Would be useful to upload the .apk somewhere?

I can build my own, but it might be useful for other folks to test.
Attached patch Downmix matrix x5 (obsolete) — Splinter Review
After some tries I concluded to this normalization. It is x5 times more than the original one (so normalization to 5 for channels 3,4 and to 10 for 5-8).

The .apk can be found at:
https://rapidshare.com/files/3564194320/fennec-19.0a1.en-US.android-arm.apk
Attachment #684941 - Flags: review?(giles)
If no one is working on this I'd like to take it over, just getting my feet wet with mozilla development but I do have everything setup for native c/arm development for android and would love to get started with this.
Welcome, Brian. acronop has a patch almost ready, but you're welcome to follow along, help test and comment. Have you done much with surround sound before?

You might look at bug 790459 as well. We had a contributor who offered to work on in in October, but they haven't responded since, so that would be ok to take over.
Comment on attachment 684941 [details] [diff] [review]
Downmix matrix x5

Review of attachment 684941 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think this normalization works. It sounds ok on the test files, but I get clipping on the louder sections of https://people.xiph.org/~giles/2012/tearsofsteel-surround.opus
Attachment #684941 - Flags: review?(giles) → review-
After several combinations I end up with normalization to 2 for 3,4 and 4 for 5-8. This doe not cause any clipping on louder sections testing with the file above. That patch lies in attachment 682150 [details] [diff] [review] uploaded with Comment 15. As I mentioned there, the difference between the stereo and multichannel samples remains obvious.

The .apk with latest + patch in attachment 682150 [details] [diff] [review] can be found at:
http://rapidshare.com/files/69155304/fennec-19.0a1.en-US.android-arm.apk

Can you please review the patch and let me know if it is approaching the desired result? I guess what we need is the smallest possible attenuation due to downmix that causes no clipping when the input is strong.
Comment on attachment 682150 [details] [diff] [review]
Downmix matrix normalized to four

Review of attachment 682150 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for taking so long to get back to this.Thanks for your patience.

I did some listening today; I agree this seems like the direction to go with. Provisional r+

Please update the patch to apply against current m-c (most of the filenames and classes have lost their ns prefix). And address the comment issue, then request review again. I still need to check the matrix numbers.

::: content/media/ogg/nsOggReader.cpp
@@ +527,5 @@
>        dBuffer[i*out_channels]=sampL;
>        dBuffer[i*out_channels+1]=sampR;
>      }
> +#else
> +    // Downmix matrix for channels up to 8, normalized to 2.0

Please update the comment to match the actual normalization.
Attachment #682150 - Flags: review?(giles) → review+
Comment on attachment 681190 [details] [diff] [review]
Fix for bug 790458

Mark this patch obsolete in favour of the newer one.
Attachment #681190 - Attachment is obsolete: true
Attachment #681190 - Flags: review?(giles)
Attached patch Normalization to 2 updated (obsolete) — Splinter Review
Thank you for the comments. This is the updated patch. I tested with the new files and sounds good. I mark the other patches obsolete due to this one.
Attachment #682150 - Attachment is obsolete: true
Attachment #684941 - Attachment is obsolete: true
Attachment #695735 - Flags: review?(giles)
Comment on attachment 695735 [details] [diff] [review]
Normalization to 2 updated

Review of attachment 695735 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. I like how the patch keeps getting smaller.

I finally got a chance to re-derive the matrix coefficients. We're close now, but marking r- because of the overflow with Q15 coefficients. Please redo the matrix with Q14 (or a 1.0 norm) so we can retest that.

::: content/media/ogg/OggReader.cpp
@@ +540,5 @@
>        dBuffer[i*out_channels]=sampL;
>        dBuffer[i*out_channels+1]=sampR;
>      }
> +#else
> +    // Downmix matrix. Channels 3,4 normalized to 2, channels 5-8 normalized to 4

Please put a period ('.') at the end of comments to conform to local style.

I think our comment about 'normalized to 2' on the the float matrix is misleading. At least, I found it confusing in both cases when re-deriving the dmatrix coefficients. Would be be more clear to speak in terms of the per-row normalization. In both this patch and the float matricies, for 3 and 4 channels the coefficients in each row sum to '1', while the 5-8 channel matricies the coefficients sum to '2'.

@@ +556,5 @@
> +      for (uint32_t j = 0; j < channels; j++) {
> +        sampL+=buffer[i*channels+j]*dmatrix[channels-3][j][0];
> +        sampR+=buffer[i*channels+j]*dmatrix[channels-3][j][1];
> +      }
> +      sampL = (sampL + 32768)>>16;

So you're using a 16 bit floating point radix here, but the coefficients seem to be calculated with a 15 bit radix. That is, 19196 ~= 0.5858f * (1<<15). I wonder if this accounts for my comment about the surround downmix sounding obviously quieter on the phone?

Even with a 15 bit radix and the norm of 2.0 per output channel, the 8-channel matrix multiplication will overflow the 32-bit intermediaries in sampL and sampR. For this reason, Tim used Q14 (a 14-bit radix) is his implementation in the opusfile library. We'll need to use a 14 point radix to avoid introducing distortion noise as well as clipping.
Attachment #695735 - Flags: review?(giles) → review-
I should also mention: Tim said he distributed the rounding error so the fixed-point coefficients in each row sum to 2^radix (or 2*2^radix for norm 2). I don't understand his reasoning, but that's worth doing as well. He has a lot more experience with floating point mathematics than I do.
I continue maintaining the patch keeps small :)

Thanks you for your comments. In response to that I have updated the downmix matrix with Q14 coefficient. In addition I used 14 point radix. I have to say that your previous comment made more clear to me the meaning of those values. 

I have also updated the comment in the top of downmix matrix for both floating and fix point. I am not sure though if you meant this or to put the corresponding normalization value next to each row. Please let me know if you want it the other way.

I tested with the given samples and it sound ok without distortion or clipping in the loud sounds.
Attachment #695735 - Attachment is obsolete: true
Attachment #698449 - Flags: review?(giles)
Comment on attachment 698449 [details] [diff] [review]
Fix for bug 790458 latest review response

Review of attachment 698449 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. Sorry for taking all week to get to this. You need to change the offset as well as the shift for Q14. Otherwise looks good, although some of the coefficients aren't exact. I've suggested corrections where they were off by more than .5.

::: content/media/ogg/OggReader.cpp
@@ +544,5 @@
> +    // Downmix matrix. Per-row normalization 1 for rows 3,4 and 2 for rows 5-8.
> +    // Coefficients in Q14.
> +    static const int16_t dmatrix[6][8][2]= {
> +        /*3*/{{9598, 0},{6786,6786},{0,   9598}},
> +        /*4*/{{6924, 0},{0,   6924},{5996,3464},{3464,5996}},

6925, 5997 and 3462 would distribute the rounding error more evenly, but it doesn't really matter.

@@ +545,5 @@
> +    // Coefficients in Q14.
> +    static const int16_t dmatrix[6][8][2]= {
> +        /*3*/{{9598, 0},{6786,6786},{0,   9598}},
> +        /*4*/{{6924, 0},{0,   6924},{5996,3464},{3464,5996}},
> +        /*5*/{{10666,0},{7537,7537},{0,  10666},{9234,5331},{5331,9234}},

Likewise 10663 and 7540.

@@ +548,5 @@
> +        /*4*/{{6924, 0},{0,   6924},{5996,3464},{3464,5996}},
> +        /*5*/{{10666,0},{7537,7537},{0,  10666},{9234,5331},{5331,9234}},
> +        /*6*/{{8668, 0},{6129,6129},{0,   8668},{7507,4335},{4335,7507},{6129,6129}},
> +        /*7*/{{7459, 0},{5275,5275},{0,   7459},{6460,3731},{3731,6460},{4568,4568},{5275,5275}},
> +        /*8*/{{6368, 0},{4502,4502},{0,   6368},{5515,3183},{3183,5515},{5515,3183},{3183,5515},{4502,4502}}

5514 and 3184

@@ +557,5 @@
> +      for (uint32_t j = 0; j < channels; j++) {
> +        sampL+=buffer[i*channels+j]*dmatrix[channels-3][j][0];
> +        sampR+=buffer[i*channels+j]*dmatrix[channels-3][j][1];
> +      }
> +      sampL = (sampL + 32768)>>14;

You need to shift by the midpoint of Q14 here. (1<<14)/2 = 8192 as well as shifting down by 14 instead of 16 bits.
Attachment #698449 - Flags: review?(giles) → review-
Thanks for the detailed comments. Sorry for waiting the whole week for that. I updated the mid point offset and the coefficients of downmix matrix. Please let me know if I am missing anything.
Attachment #698449 - Attachment is obsolete: true
Attachment #704307 - Flags: review?(giles)
Comment on attachment 704307 [details] [diff] [review]
Fix for bug 790458 additional comments

Review of attachment 704307 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, thanks!

We need a proper commit message before we can commit. Would you like to do that, or should I?
Attachment #704307 - Flags: review?(giles) → review+
Thanks a lot for the review! I will propose:
"Bug 790458 - Support multichannel for fixed point opus audio data (mobile)"
but please change at will and go on with check in. I always miss the important details so it will be nice to add your style as well. Thanks again for your support and your time!
Updated patch with a commit message. Carrying forward my earlier r+.

https://tbpl.mozilla.org/?tree=Try&rev=48a7b1046c72
Attachment #704307 - Attachment is obsolete: true
Attachment #705646 - Flags: review+
Pretty green try.

https://hg.mozilla.org/integration/mozilla-inbound/rev/a9697b07a494
Assignee: nobody → achronop
Flags: in-testsuite-
OS: Linux → Android
Hardware: x86_64 → ARM
https://hg.mozilla.org/mozilla-central/rev/a9697b07a494
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Blocks: 706327
You need to log in before you can comment on or make changes to this bug.