Looping <audio> files should be seamless

REOPENED
Core
Audio/Video: Playback
P3
normal
REOPENED
6 years ago
3 hours ago

People

(Reporter: huskyr@gmail.com, Assigned: chunmin, Mentored)

Tracking

(Depends on: 1 bug, Blocks: 2 bugs)

unspecified
x86
Mac OS X

Firefox Tracking Flags

(tracking-b2g:backlog)

Details

(Whiteboard: [games:p3], URL)

Attachments

(5 attachments, 2 obsolete attachments)

(Reporter)

Description

6 years ago
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: 4.0.1

When looping an <audio> file, there is a small but audible gap each time playback restarts from the beginning.

This makes it impossible to make something like an audio sequencer or drum loops.

Currently the 'loop' HTML5 attribute is not implemented (see #449157), but JavaScript like this should have the same effect:

<audio id="audio" src="loop.wav" loop autoplay controls />

<script>
document.getElementById('audio').addEventListener('ended', function() {
    this.currentTime = 0;
}, false);
</script>

Reproducible: Always
The current plan is to fix bug 449157 in such a way that the looping is seamless.  I'm not sure it's feasible to provide (or, at least, guarantee) seamless looping when script sets currentTime to the start of the media.
(Reporter)

Comment 2

6 years ago
Great, thanks for taking that into account. I'm just wondering: is there any technical difference between setting 'loop' to true and setting 'currentTime = 0' in an ended event handler? They seem the same to me...
There are; I forgot to mention them earlier.  One is that the loop attribute lets the decoder know in advance that the media is looping, so it can prepare by not tearing down resources (such as the decode threads and the audio device) at the end of playback.

The second is that your example sets currentTime in an ended event handler.  The ended event is dispatched after playback ends, which happens only after we have drained and closed the audio device.  The event handler runs asynchronously on the main thread, so (in addition to the resource teardown mentioned above) there is some delay between the end of playback, when the event is dispatched, and the time the event handler runs.

To allow seamless looping, the ended event would need to be dispatched early enough for the event handler to set currentTime at exactly the right moment.  Given the indeterminate delay between event dispatch and the handler running, that isn't really possible.

Having said that, there are probably improvements that can be made to reduce the delay between dispatching the ended event and starting playback again.  It's just that it's not possible to guarantee anything close to seamless looping without the loop attribute.
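
To get a rough feel for the delay described above, one could time how long the main thread sees between the 'ended' event and the next 'playing' event. The following is a minimal sketch, not a definitive measurement: `measureRestartGap`, `onGap` and `now` are hypothetical names introduced here, the element must not carry the loop attribute (otherwise 'ended' never fires), and event timestamps only approximate the audible gap because they say nothing about when samples actually reach the audio device.

```javascript
// Restart playback from an 'ended' handler and report how long it took,
// as seen from the main thread, until the 'playing' event fired again.
// `now` is injectable so the function can be exercised without a browser.
function measureRestartGap(audio, onGap, now = () => performance.now()) {
  let endedAt = null;

  audio.addEventListener('ended', () => {
    endedAt = now();
    audio.currentTime = 0;
    const result = audio.play();
    // play() returns a promise in current browsers; ignore rejections here.
    if (result && typeof result.catch === 'function') {
      result.catch(() => {});
    }
  });

  audio.addEventListener('playing', () => {
    if (endedAt !== null) {
      onGap(now() - endedAt);
      endedAt = null;
    }
  });
}

// Browser usage (assumes an <audio id="audio"> element without `loop`):
// measureRestartGap(document.getElementById('audio'),
//                   (ms) => console.log('restart took ' + ms + ' ms'));
```

Any number this reports is a lower bound on what the listener hears, since device teardown and re-creation happen outside the script's view.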
(In reply to comment #3)
> It's just that it's not possible to guarantee anything close to seamless
> looping without the loop attribute.

Note also that when using lossy compression (such as Vorbis), even if the first sample of the file immediately follows the last, there may be glitches caused by differing quantization used on the two blocks. For truly seamless looping, you want to crosslap the end of the file with the beginning using the MDCT windows, which may require specially prepared files for best results (see http://xiph.org/vorbis/doc/vorbisfile/crosslap.html for details). This is something I expect would be done if the loop attribute is present, but may not be the behavior you would want on a normal seek.
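
As a rough illustration of preparing a file so that its end blends into its beginning, here is a plain linear crossfade on already-decoded PCM samples. This is explicitly not the MDCT-window crosslap that the linked Vorbis documentation describes; it is a simpler stand-in, and `crossfadeLoop` and its parameters are hypothetical names used only for this sketch.

```javascript
// Fold the last `overlap` samples of a clip into its first `overlap` samples
// with a linear crossfade, returning a shorter clip whose last sample is
// followed naturally by its first when played in a loop.
function crossfadeLoop(samples, overlap) {
  if (overlap <= 0 || overlap * 2 > samples.length) {
    throw new RangeError('overlap must be positive and at most half the clip');
  }
  const bodyLength = samples.length - overlap;
  const out = new Float32Array(bodyLength);
  for (let i = 0; i < bodyLength; i++) {
    if (i < overlap) {
      // Fade the head in while the tail (which continues from the end of
      // the body) fades out.
      const headWeight = i / overlap;
      out[i] = samples[i] * headWeight +
               samples[bodyLength + i] * (1 - headWeight);
    } else {
      out[i] = samples[i];
    }
  }
  return out;
}

// Example: a ramp 0..7 with a 2-sample overlap.
console.log(Array.from(crossfadeLoop([0, 1, 2, 3, 4, 5, 6, 7], 2)));
// [6, 4, 2, 3, 4, 5]
```

A linear fade can dip in loudness for uncorrelated material; an equal-power curve is a common alternative, but that choice depends on the audio.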
Based on comment 1 I think it would be safe to add a dependency to bug 449157 or dupe it.
Whiteboard: DUPEME?
Status: UNCONFIRMED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 449157
Reopening this to track seamless looping.  Bug 449157 now tracks basic (non-seamless) looping support.
Status: RESOLVED → REOPENED
Depends on: 664918
Ever confirmed: true
Resolution: DUPLICATE → ---
Whiteboard: DUPEME?
Depends on: 449157
Created attachment 575792 [details]
testcase - 440Hz sine

Updated

6 years ago
Blocks: 710398

Updated

5 years ago
Whiteboard: [games:p2]
Depends on: 782507

Comment 9

4 years ago
This still seems to be an issue on Linux at least.
(In reply to Alexander Brüning from comment #9)
> This still seems to be an issue on Linux at least.

No patches have landed to resolve this issue yet.

Comment 11

3 years ago
Hey, any progress here? We are getting bit by this issue as well.

It looks like the loop scheduling behind the audio loop attribute does not even attempt sample-precise joining; the next iteration of the loop starts far too late after the previous one ends.

When I have an audio clip that looks like this: https://dl.dropboxusercontent.com/u/40949268/Bugs/firefox_audio_loop/Screen%20Shot%202014-05-16%20at%204.45.56%20PM.png

we are getting audio playback which when recorded back looks like this:

https://dl.dropboxusercontent.com/u/40949268/Bugs/firefox_audio_loop/Screen%20Shot%202014-05-16%20at%204.44.00%20PM.png

There is a gap of about 80 ms in the scheduled playback between loops. I don't think this is a lossy-compression issue: the loop="true" attribute should by itself be able to schedule sample-precise joins of the audio data, and how good the join then sounds is a matter of the audio source. At a source sample rate of 48 kHz, an 80 ms delay means that the audio buffer scheduling was off by some 3840 samples.
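
The sample count above is just the gap duration multiplied by the sample rate. A small sketch of that arithmetic, where `gapInSamples` is a hypothetical helper name:

```javascript
// Convert a playback gap (in seconds) into a number of samples at the
// given sample rate, rounded to the nearest whole sample.
function gapInSamples(gapSeconds, sampleRateHz) {
  return Math.round(gapSeconds * sampleRateHz);
}

console.log(gapInSamples(0.080, 48000)); // 3840
```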

Here is a minimal STR with the loop="true" attribute: https://dl.dropboxusercontent.com/u/40949268/Bugs/firefox_audio_loop/audio_loop.html

Some suggested using the audio 'onended' event to fire the next loop. Here is a handwritten STR #2 for that workaround: https://dl.dropboxusercontent.com/u/40949268/Bugs/firefox_audio_loop/audio_loop_onended.html
which produces the same results. I agree with the earlier comments that this is not a good way to code, since the browser cannot prepare for this in advance.

I know that Web Audio API implements this properly, so at first I thought to use that, but I hit two obstacles:

  - currently we use HTMLAudioElements to load up audio files, so that they're loaded via browser WAV and OGG codecs. Looking here, http://updates.html5rocks.com/2012/02/HTML5-audio-and-the-Web-Audio-API-are-BFFs , it seems that <audio> elements integrate with Web Audio, but still, the playback api is not as flexible as when playing back AudioBufferSourceNodes? I was not able to start the playback in the Web Audio API graph and control looping nor manually schedule there. Is it possible to work around the looping some way with Web Audio API?

  - my other thought was to use Web Audio API with AudioBufferSourceNodes, but because of the above limitation, that means I would have to implement WAV and OGG codecs in JavaScript, just to be able to drive the looping from the Web Audio API graph. Or is there some other way to get the raw audio buffer data from <audio> so that one could use the same API as AudioBufferSourceNodes to play back?

Are there known workarounds for this issue?
Created attachment 8423887 [details]
loop.tar.gz

Web Audio API is perfectly suitable for this, see the attached demo (unzip, untar and run loop.html, the audio sample is included). I threw in a play/pause button, as it seemed to be a needed feature.

Note that you can put any codec that works in <audio> in the decodeAudioData call, it'll get decoded for you by the Web Audio API.
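
For readers without the attachment, the general shape of this workaround can be sketched as follows. This is not the attached demo's code, only a minimal sketch of the same idea: `startSeamlessLoop` is a hypothetical name, the file name in the usage comment is a placeholder, and the promise-returning form of decodeAudioData is assumed (older implementations only accept callbacks).

```javascript
// Decode a whole encoded clip up front and play it as a looping buffer.
// `audioCtx` is an AudioContext; `encodedBytes` is an ArrayBuffer holding
// the encoded file (any codec the browser can decode for <audio>).
async function startSeamlessLoop(audioCtx, encodedBytes) {
  const buffer = await audioCtx.decodeAudioData(encodedBytes);
  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.loop = true;            // loop inside the audio graph
  source.connect(audioCtx.destination);
  source.start(0);
  return source;                 // call source.stop() to end playback
}

// Browser usage (placeholder URL):
// const ctx = new AudioContext();
// const bytes = await (await fetch('loop.ogg')).arrayBuffer();
// const source = await startSeamlessLoop(ctx, bytes);
```

The whole clip is held decoded in memory, so this suits short loops better than long music tracks.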

Jukka, does that work for you?
Flags: needinfo?(jujjyl)

Comment 13

3 years ago
Thanks, I downloaded the file, but I wonder if it has the wrong file? It did not contain a file loop.html.
Flags: needinfo?(jujjyl)
Created attachment 8423890 [details]
loop.tar.gz

Oops, went too fast, this has the right files.
Attachment #8423887 - Attachment is obsolete: true

Comment 15

3 years ago
Perfect, that workaround is working great! Thanks Paul!

Comment 16

3 years ago
Testing more, I am hitting this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1012801

How difficult would this bug be to fix? (I expect between this and 1012801, this bug would be the easier one to tackle?) I seem to be trading bugs now when trying to find a workaround.

Updated

3 years ago
Blocks: 1014243

Updated

3 years ago
blocking-b2g: --- → 1.4?
Moving this to backlog.

This is too late to take in 1.4
blocking-b2g: 1.4? → backlog
Note: For games we need either this bug fixed or Bug 1012801 to be able to do seamlessly looping audio. Likely target is 2.0.
blocking-b2g: backlog → ---
tracking-b2g: --- → backlog
Component: Audio/Video → Audio/Video: Playback
Whiteboard: [games:p2] → [games:p?]

Comment 19

a year ago
Looking back at this, currently in order to play back looping audio clips, one can
   1. decodeAudioData() the whole clip up front, and play it as a looping audio buffer with WebAudio (AudioBufferSourceNode.loop=true), which will be seamless
   2. Emscripten compile Ogg Vorbis and implement the asset decoding and streaming manually in a looping manner

The first one is a good solution if the sound clip is really short, so that decodeAudioData() takes very little time and very little memory. The second solution is often used in asm.js compiled games to keep the performance good and memory footprint low.

Can Web Audio API produce seamlessly looping audio while decoding on demand? I.e. not having to do decodeAudioData() of the whole clip? When Paul posted comment 14, I believe it was not possible then, but that was some time ago - what is the situation today?

In any case, marking as games:p3 for now, given that we do have options 1. and 2. that solve the most common cases.
Flags: needinfo?(padenot)

Updated

a year ago
Whiteboard: [games:p?] → [games:p3]
(In reply to Jukka Jylänki from comment #19)
>    2. Emscripten compile Ogg Vorbis and implement the asset decoding and
> streaming manually in a looping manner

Not sure exactly how this handles seamless looping, but perhaps MSE could be used to manually stream the encoded data, if that can be looped.

> Can Web Audio API produce seamlessly looping audio while decoding on demand?

Only through MediaElementAudioSourceNode, if a media element can play the looping audio.

Or maybe the assets can be broken in parts for the client to pass to decodeAudioData() shortly before required.
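
The "broken in parts" idea could look roughly like the sketch below, which schedules already-decoded chunks back to back on the AudioContext clock. It is untested against real assets and rests on assumptions: `scheduleBackToBack` is a hypothetical name, each chunk is assumed to have been decoded with decodeAudioData() beforehand, and independently encoded chunks of a lossy codec may carry padding that prevents sample-exact joins.

```javascript
// Schedule decoded AudioBuffers one after another, each starting exactly
// where the previous one ends. Returns the scheduled start times.
function scheduleBackToBack(audioCtx, buffers, firstStart) {
  let when = firstStart;
  const starts = [];
  for (const buffer of buffers) {
    const source = audioCtx.createBufferSource();
    source.buffer = buffer;
    source.connect(audioCtx.destination);
    source.start(when);          // start time on the AudioContext clock
    starts.push(when);
    when += buffer.duration;     // next chunk begins where this one ends
  }
  return starts;
}

// Browser usage (chunkA and chunkB are AudioBuffers decoded earlier):
// scheduleBackToBack(ctx, [chunkA, chunkB], ctx.currentTime + 0.1);
```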
Yes, what karlt said, the situation has not changed much on the Web Audio API front.
Flags: needinfo?(padenot)

Comment 22

a year ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #20)
> (In reply to Jukka Jylänki from comment #19)
> > Can Web Audio API produce seamlessly looping audio while decoding on demand?
> 
> Only through MediaElementAudioSourceNode, if a media element can play the
> looping audio.

Can you elaborate on this a bit? See the testcase audio_loop.html in comment 11:

> <html><head></head>
> <body>
> <audio id='a' src="noise.ogg" loop="true" /> </audio>
> <p>Pressing play should generate seamlessly joined audio playback (modulo any potential snapping occurring from issues with the authored audio):
> <p><b>TECHNIQUE 1: The audio.loop="true" property.</b>
> <input type="button" onclick="document.getElementById('a').play();" value="PLAY">
> </body>
> </html>

That does not currently produce seamlessly looping audio. Is the playback technically any different if this <audio> element was wrapped to a MediaElementAudioSourceNode in WebAudio graph, or will the audio output result (codepath) be the same?
(In reply to Jukka Jylänki from comment #22)
> Is the playback
> technically any different if this <audio> element was wrapped to a
> MediaElementAudioSourceNode in WebAudio graph, or will the audio output
> result (codepath) be the same?

The output is the same.  MediaElementAudioSourceNode is only useful for looping if a media element can do that.  The only way it currently might be able to do that would be if the encoded data could be continually streamed into a MediaSource buffer.

Updated

3 months ago
Flags: needinfo?(ajones)
See Also: → bug 1371202
ChunMin,
Per discussion, please put this on your plate.
Flags: needinfo?(ajones) → needinfo?(cchang)
Priority: -- → P3
(Assignee)

Comment 25

2 months ago
Sure.
Assignee: nobody → cchang
Flags: needinfo?(cchang)
Mentor: jwwang@mozilla.com
(Assignee)

Comment 26

13 days ago
My original plan was to change seek's behavior first. Suppose the playback position is P and the audio file has already been demuxed/decoded up to time Q, with P < Q. With the current seek mechanism, wherever the playback position is seeked, the whole decoder state is reset and the previously demuxed/decoded data is discarded. If instead we don't reset the decoder when the playback position is seeked to R, where P <= R <= Q, the silent gap should be slightly shorter.

This model also applies here if we treat this bug as a matter of making seeking faster. Currently, looping is implemented by seeking the playback position to 0 when playback finishes. If the demuxer/decoder loops back to the beginning of the audio file once it has demuxed/decoded to the end, then when the playback position is seeked to 0 the decoder state won't be reset, because the demuxed/decoded time Q' is definitely greater than 0.

After a brief discussion with JW, he offered a simpler solution for this bug. Playback is controlled by MediaDecoderStateMachine, and the demuxer/decoder is controlled by MediaFormatReader. We can seek MediaFormatReader to the beginning in advance, as soon as the file has been demuxed/decoded to the end, so that when MediaDecoderStateMachine seeks back to the beginning it no longer has to wait for decoded data.
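
The idea of having the reader wrap around ahead of the playback position can be pictured with a toy model. This is not Gecko code and does not reflect the real MediaFormatReader interface; `LoopingReader` and its methods are hypothetical names used only to show why a consumer never has to wait at the loop point when the producer has already gone back to the start.

```javascript
// Toy model: a reader that, on reaching the end of its frames, continues
// from the first frame instead of signalling end-of-stream.
class LoopingReader {
  constructor(frames) {
    if (frames.length === 0) throw new Error('need at least one frame');
    this.frames = frames;
    this.position = 0;
  }

  // Return the next frame, wrapping to the beginning after the last one.
  next() {
    const frame = this.frames[this.position];
    this.position = (this.position + 1) % this.frames.length;
    return frame;
  }
}

const reader = new LoopingReader(['a', 'b', 'c']);
const played = [];
for (let i = 0; i < 5; i++) played.push(reader.next());
console.log(played.join(' ')); // a b c a b
```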
(Assignee)

Comment 27

6 days ago
Created attachment 8895727 [details]
seamless-loop-sample.wav

I extracted one part from one of the samples on http://wavy.audio/
There is a fade-in and a fade-out in the waveform of the sample.

In my testing, Chrome cannot loop the audio seamlessly, but Edge can.
(Assignee)

Comment 28

2 days ago
Created attachment 8896860 [details]
loop-log.zip

Summary for loop.log
=================================
The normal time difference between data callback is 0.01 sec.

EOS		6.922
		  |  2.078
Drain		8.900
		  |  0.002
Seek		8.902
                  |
		  |  0.09
                  |
Audio start	8.993
		  |  0.002
First callback	9.000

- We may have about 2 seconds to do things in advance (between EOS and Drain).
- The goal is to make the time difference between Seek and Audio-start closer to 0.01 sec.
(Assignee)

Comment 29

2 days ago
(In reply to Chun-Min Chang[:chunmin] from comment #28)

> Summary for loop.log
> =================================
> The normal time difference between data callback is 0.01 sec.
> 
> EOS		6.922
> 		  |  2.078
> Drain		8.900
> 		  |  0.002
> Seek		8.902
>                 |
> 		  |  0.09
>                 |
> Audio start	8.993
> 		  |  0.002
> First callback	9.000
> 
> - We may have 2 second to do things in advance(between EOS to Drain).
> - The goal is to make the time difference between Seek and Audio-start
>   closer to 0.01 sec.
From the log below, we can estimate how much time is needed from the first request for audio data until playback starts (prerolling). It takes 0.082 sec. and 0.076 sec. respectively, so we should have enough time (2.078 sec.) to request the audio data in advance.

1:
> 2017-08-14 06:32:06.592000 UTC - [MediaPlayback #2]: D/MediaDecoder Decoder=15532650 state=DECODING_METADATA change state to: DECODING_FIRSTFRAME
> ...
> 2017-08-14 06:32:06.592000 UTC - [MediaPlayback #1]: V/MediaFormatReader MediaFormatReader(0D7FD000)::RequestAudioData: 
> ...
> 2017-08-14 06:32:06.674000 UTC - [MediaPlayback #2]: D/MediaDecoder Decoder=15532650 MaybeStartPlayback() starting playback
> ...
> 2017-08-14 06:32:06.674000 UTC - [MediaPlayback #2]: D/AudioStream 1555C6A0 mozilla::AudioStream::Init channels: 2, rate: 44100
> ...
=> 6.674 - 6.592 = 0.082

2:
> 2017-08-14 06:32:08.903000 UTC - [MediaPlayback #2]: D/MediaDecoder Decoder=15532650 state=SEEKING change state to: DECODING
> ...
> 2017-08-14 06:32:08.903000 UTC - [MediaPlayback #2]: V/MediaFormatReader MediaFormatReader(0D7FD000)::RequestAudioData: 
> ...
> 2017-08-14 06:32:08.978000 UTC - [MediaPlayback #3]: D/MediaDecoder Decoder=15532650 MaybeStartPlayback() starting playback
> ...
> 2017-08-14 06:32:08.979000 UTC - [MediaPlayback #3]: D/AudioStream 1555C880 mozilla::AudioStream::Init channels: 2, rate: 44100
> ...
=> 8.979 - 8.903 = 0.076
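
The two subtractions above can be reproduced mechanically from the log timestamps. A small sketch, where `logTimeMs` and `elapsedMs` are hypothetical helper names and the log lines are shortened copies of the ones quoted in this comment:

```javascript
// Extract the leading "YYYY-MM-DD HH:MM:SS.ffffff UTC" stamp from a log
// line and return it as milliseconds since the Unix epoch.
function logTimeMs(line) {
  const m = /(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}\.\d{3})\d* UTC/.exec(line);
  if (!m) throw new Error('no timestamp found in: ' + line);
  return Date.parse(m[1] + 'T' + m[2] + 'Z');
}

// Milliseconds elapsed between two log lines.
function elapsedMs(fromLine, toLine) {
  return logTimeMs(toLine) - logTimeMs(fromLine);
}

const firstPlay = elapsedMs(
  '2017-08-14 06:32:06.592000 UTC - RequestAudioData',
  '2017-08-14 06:32:06.674000 UTC - AudioStream::Init');
const afterLoop = elapsedMs(
  '2017-08-14 06:32:08.903000 UTC - RequestAudioData',
  '2017-08-14 06:32:08.979000 UTC - AudioStream::Init');

console.log(firstPlay, afterLoop); // 82 76
```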
(Assignee)

Comment 30

2 days ago
Created attachment 8896903 [details]
test page and its logs

Same logs as before, with the test page added.
Attachment #8896860 - Attachment is obsolete: true
You can't make assumptions about the time it takes to open an audio stream. We have been measuring it in the field using telemetry probes, and it can take up to 8 seconds, with a very wide distribution.

It seems particularly wasteful to re-create an audio stream each time we loop here.
(Assignee)

Comment 32

a day ago
(In reply to Paul Adenot (:padenot) from comment #31)
Actually, I am interested in how much time prerolling needs [0]. After it finishes, playback starts and the audio stream is initialized. That's why I use it as a checkpoint.

Yes, it would be better if we keep using the original audio stream while looping.

[0] http://searchfox.org/mozilla-central/rev/e5b13e6224dbe3182050cf442608c4cb6a8c5c55/dom/media/MediaDecoderStateMachine.cpp#833
[1] http://searchfox.org/mozilla-central/rev/e5b13e6224dbe3182050cf442608c4cb6a8c5c55/dom/media/MediaDecoderStateMachine.cpp#686
(Assignee)

Comment 33

3 hours ago
Created attachment 8897799 [details]
Dirty test

Per comment 28, the bottleneck is demuxing/decoding audio data, so I tried preloading the audio data to see how much it helps.

For testing, I pre-request the audio data in the completed state and save it into a separate queue. After seeking to the beginning (looping), the preloaded audio data is pushed into the original audio queue.

This shortens the time from 0.09 sec. to 0.016 sec. However, the delay is still detectable. The goal is to bring it down to 0.01 sec. or less.

Summary for loop.log
=================================
EOS		59.507
		  |  1.98
Drain		01.487
		  |  0.001
Seek		01.488
		  |  *0.016*
Audio start	01.504
		  |  0.002
First callback	01.506