Closed Bug 1294753 Opened 8 years ago Closed 8 years ago

MediaRecorder sometimes records broken audio timestamps

Categories

(Core :: Audio/Video: Recording, defect, P1)

48 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox48 --- wontfix
firefox49 --- fixed
firefox-esr45 --- unaffected
firefox50 --- fixed
firefox51 --- fixed

People

(Reporter: jeremy.noring, Assigned: jesup, NeedInfo)

References

Details

(Keywords: regression)

Attachments

(5 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36

Steps to reproduce:

Does not reproduce consistently; on systems that do reproduce the issue, it is fairly easy to hit.

1. Go to https://webrtc.github.io/samples/src/content/getusermedia/record/
2. Record a longer segment (10+ minutes--this may or may not be necessary, I've had users hit this bug with recordings as short as one minute)
3. Download the segment and analyze with ffmpeg. 



Actual results:

The audio track contains both huge jumps in audio timestamp, and non-monotonous timestamps.  This can be seen by running "ffprobe -show_frames -select_streams 'a' -of json 3.webm > AUDIO_FRAMES" and examining the output.  Also, playing back the file around 6 minutes in, you can clearly see AV sync is incorrect.

Examining the attached file, there's a huge jump around 26 seconds in:

        {
            "media_type": "audio",
            "stream_index": 1,
            "key_frame": 1,
            "pkt_pts": 26059,
            "pkt_pts_time": "26.059000",
            "pkt_dts": 26059,
            "pkt_dts_time": "26.059000",
            "best_effort_timestamp": 26059,
            "best_effort_timestamp_time": "26.059000",
            "pkt_duration": 20,
            "pkt_duration_time": "0.020000",
            "pkt_pos": "4737526",
            "pkt_size": "124",
            "sample_fmt": "fltp",
            "nb_samples": 960,
            "channels": 1,
            "channel_layout": "mono"
        },
        {
            "media_type": "audio",
            "stream_index": 1,
            "key_frame": 1,
            "pkt_pts": 91615,
            "pkt_pts_time": "91.615000",
            "pkt_dts": 91615,
            "pkt_dts_time": "91.615000",
            "best_effort_timestamp": 91615,
            "best_effort_timestamp_time": "91.615000",
            "pkt_duration": 20,
            "pkt_duration_time": "0.020000",
            "pkt_pos": "4755903",
            "pkt_size": "126",
            "sample_fmt": "fltp",
            "nb_samples": 960,
            "channels": 1,
            "channel_layout": "mono"
        },

...there's a jump of ~65.556 seconds forward between two successive timestamps.  An example of non-monotonous timestamps can be seen here:

        {
            "media_type": "audio",
            "stream_index": 1,
            "key_frame": 1,
            "pkt_pts": 632391,
            "pkt_pts_time": "632.391000",
            "pkt_dts": 632391,
            "pkt_dts_time": "632.391000",
            "best_effort_timestamp": 632391,
            "best_effort_timestamp_time": "632.391000",
            "pkt_duration": 20,
            "pkt_duration_time": "0.020000",
            "pkt_pos": "49640196",
            "pkt_size": "128",
            "sample_fmt": "fltp",
            "nb_samples": 960,
            "channels": 1,
            "channel_layout": "mono"
        },
        {
            "media_type": "audio",
            "stream_index": 1,
            "key_frame": 1,
            "pkt_pts": 566875,
            "pkt_pts_time": "566.875000",
            "pkt_dts": 566875,
            "pkt_dts_time": "566.875000",
            "best_effort_timestamp": 566875,
            "best_effort_timestamp_time": "566.875000",
            "pkt_duration": 20,
            "pkt_duration_time": "0.020000",
            "pkt_pos": "49640333",
            "pkt_size": "128",
            "sample_fmt": "fltp",
            "nb_samples": 960,
            "channels": 1,
            "channel_layout": "mono"
        },

...a jump of 65.516 seconds *backwards*.  This causes ffmpeg to spew all sorts of warnings.

For files broken like this, they can be fixed by dumping the raw audio to wav format (thus discarding timestamp information) and then remuxing that audio with the original video):

ffmpeg -i 3.webm -vn 3.wav
ffmpeg -i 3.webm -i 3.wav -map 0:v -map 1:a -c:v copy 3.corrected.webm

If you view the corrected file around 6 minutes, you can see the AV sync is corrected.



Expected results:

Something is sometimes completely hosing audio timestamps in Firefox's media recorder implementation; it does not always do it, but it happens enough that I've rolled back our code to replace flash with Media Recorder in Firefox.  We will not be able to abandon flash until this issue is resolved.

We've seen this issue on both Windows and OSX, using Firefox 48.
QA Whiteboard: [bugday-20160815]
Component: Untriaged → Audio/Video: Recording
Product: Firefox → Core
On Linux Nightly:
Jump from 20.819 to 86.375, then from 86.575 to 21.059, then from 21.299 to 86.855
So it jumped forward ~65.5s, jumped back the same amount, then jumped forward the same amount.  Likeluy that's a big clue
later there's a jump from 138.355 to 203.891 (~65.5 seconds!)
And from 254.731 to 320.287 for one sample then 254.77, then 255.191 to 32.747

All of these are ~65.5 seconds.  If I had to hazard a totally unsupported guess, I'd check for something offsetting it by 65.536 seconds....
Status: UNCONFIRMED → NEW
Rank: 17
Ever confirmed: true
Priority: -- → P1
As jesup points out, the offset is suspiciously close to 2^16 ms. This lead him to https://github.com/mozilla/gecko-dev/blob/d602abb016626f9b36fa5ef5fa6afb494e7051ab/dom/media/webm/EbmlComposer.cpp#L149 where the block offset is stored in a short. His tracing reports:

timestamp 106840000, timeCode +32749, mClusterTimecode 139633, mCodecDelay 6500000
timestamp 106860000, timeCode -32767, mClusterTimecode 139633, mCodecDelay 6500000

106860 - 139633 is already outside the range of a signed 16-bit integer, so this isn't the first time it's wrapped.

The timeCode value is an offset relative to the timestamp of the containing WebM cluster, and it written as a signed, 16-bit integer in the file, so we can't just expand the available precision here. We should use a CheckedInt<int16_t> here, and report failure on overflow, but to address the actual problem it looks like we need to emit new Clusters more frequently.
It looks like EBMLComposer only flushes clusters when it receives a new video keyframe. That's an appropriate time to do so, since each cluster MUST start with a video keyframe (https://www.webmproject.org/docs/container/#muxer-guidelines) but the 16-bit timestamp offset limits us to a keyframe spacing of about 30s, so if the encoder hands us keyframes less frequently than that, or if we're recording an audio-only webm, these overflows will occur.

So (a) make sure we request at most 30s between vp8/9 keyframes, and (b) flush clusters anyway when the simpleblock timecode overflows.
However, the VP8TrackEncoder sets an keyframe interval of ~1s:
  // Ensure that we can output one I-frame per second.
  config.kf_max_dist = mEncodedFrameRate;
And it will insert them if the encoder is running slow.
So it's odd if there are >30sec between keyframes
Ok - I'm pretty sure I know what's going on

We're calling OpusTrackEncoder::GetEncodedTrack(), and it's designed to encode one "Packet" of Opus data i.e. 20ms.  It will wait until it has enough data to do so.  We call this, and then we call the VP8TrackEncoder::GetEncodedTrack() similarly.  Since that will wait until we have a frame, we won't call the Opus encoder more than ~ once per 33ms at best.  Since it only encodes 20ms of data....

The solution is to have it wait until it has at least 20ms, and then encode as much as it can (as many integral 20ms Packets it has data for).

Working on a patch...
diff ignoring whitespace changes to follow for easier reviewing
Attachment #8782024 - Flags: review?(giles)
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Comment on attachment 8782027 [details] [diff] [review]
Flush WebM clsuters if the timecode offset will overflow

Review of attachment 8782027 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed. Per IRC discussion, we shouldn't uplift this one; as future-proofing it can ride the trains.

::: dom/media/webm/EbmlComposer.cpp
@@ +130,5 @@
>    if (isVP8IFrame) {
>      FinishCluster();
> +    flush = true;
> +  } else {
> +    if ((aFrame->GetTimeStamp() / PR_USEC_PER_MSEC - mClusterTimecode) +

`else if ((aFrame...))` would be easier to read here, saving an indent.

@@ +155,5 @@
>      mFlushState |= FLUSH_CLUSTER;
>    }
>  
>    bool isOpus = (frameType == EncodedFrame::FrameType::OPUS_AUDIO_FRAME);
> +  auto timeCode = aFrame->GetTimeStamp() / PR_USEC_PER_MSEC - mClusterTimecode;

Does auto work out to uint64_t or int64_t here? Remember timeCode can reasonably be negative in a properly muxed file since you want audio available covering the presentation time of the first video frame in the cluster for better playback results when seeking.

In theory the subtraction could underflow, and the signedness details matter for that so I'd prefer an explicit type declaration here. I'd also check for the underflow. It can't happen as currently constructed, but such things are common sources of bugs in muxer code.

@@ +160,4 @@
>    if (isOpus) {
>      timeCode += mCodecDelay / PR_NSEC_PER_MSEC;
>    }
> +  MOZ_ASSERT(timeCode <= SHRT_MAX);

Please assert (timeCode >= SHRT_MIN) as well.
Attachment #8782027 - Flags: review?(giles) → review+
Comment on attachment 8782025 [details] [diff] [review]
Patch with whitespace changes hidden (qdiff -b)

Review of attachment 8782025 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/encoder/OpusTrackEncoder.cpp
@@ +306,4 @@
>        mReentrantMonitor.Wait();
> +          wait = false;
> +        } else {
> +          goto done; // nested while's...

Huh. I hadn't thought about C++ `goto` needing to call dtors before jumping. That's really nice.

I don't follow your logic with the `else` clause here. Can you elaborate? It seems like we enter a nested while loop here, but only execute the body once. The second time through, `wait` will be false, so we'll jump to the end and return instead of compressing another packet.
> ::: dom/media/encoder/OpusTrackEncoder.cpp
> @@ +306,4 @@
> >        mReentrantMonitor.Wait();
> > +          wait = false;
> > +        } else {
> > +          goto done; // nested while's...
> 
> Huh. I hadn't thought about C++ `goto` needing to call dtors before jumping.
> That's really nice.

Any exit of the block... though I forget how longjmp is handled (avoid at all costs anyways!)

> I don't follow your logic with the `else` clause here. Can you elaborate? It
> seems like we enter a nested while loop here, but only execute the body
> once. The second time through, `wait` will be false, so we'll jump to the
> end and return instead of compressing another packet.

So if there isn't enough data to packetize, it will enter the inner while loop.  If it hasn't packetized anything yet (wait == true_, it will wait until it has.  If it has packetized in this call, then (wait == false) it will goto done and exit.  So min 1 time through, max N times.  But only will wait for data once.
Comment on attachment 8782024 [details] [diff] [review]
encode all available audio on each cycle instead of one 'packet'

Review of attachment 8782024 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, thanks for explaining. Seems convoluted, but fixes the issue, so please land it!
Attachment #8782024 - Flags: review?(giles) → review+
Attachment #8782025 - Flags: feedback?(giles) → feedback+
Pushed by rjesup@wgate.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/09d9b633e335
encode all available audio on each cycle instead of one 'packet' r=rillian
https://hg.mozilla.org/integration/mozilla-inbound/rev/96cf1604d227
Flush WebM clusters if the timecode offset will over/underflow r=rillian
Comment on attachment 8782024 [details] [diff] [review]
encode all available audio on each cycle instead of one 'packet'

Approval Request Comment
[Feature/regressing bug #]: bug 1215115 (MediaRecorder support in Android)

[User impact if declined]: audio is unusable in recorded videos (depending on the player).  All the audio data should be there, but not in the correct WebM cluster.

[Describe test coverage new/current, TreeHerder]: requires parsing the output stream, which we don't have in the test infra, or playing back with an external tool.  Manual testing done.

[Risks and why]: regression that has gone to release in 48.  Low risk; change is not complex (just have it encode 1 or more packets per call, instead of just one).

[String/UUID change made/needed]: none
Attachment #8782024 - Flags: approval-mozilla-beta?
Attachment #8782024 - Flags: approval-mozilla-aurora?
Comment on attachment 8782024 [details] [diff] [review]
encode all available audio on each cycle instead of one 'packet'

Hi Jesup, please renominate the patch for uplift after this has landed in central and stabilized for a few days. Thanks!
Flags: needinfo?(rjesup)
Attachment #8782024 - Flags: approval-mozilla-beta?
Attachment #8782024 - Flags: approval-mozilla-aurora?
There were two issues: one with not checking for mEncodingComplete in the outer loop I added, and the other was a bug in OggWriter which assumed that END_OF_STREAM would only be sent with a single frame of data, even though it loops over the frames sent in
Attachment #8783839 - Flags: review?(giles)
Comment on attachment 8783839 [details] [diff] [review]
interdiff to fix Ogg recording failures

Review of attachment 8783839 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/encoder/OpusTrackEncoder.cpp
@@ +277,5 @@
>  
>    bool wait = true;
>    int result = 0;
>    // Only wait once, then loop until we run out of packets of input data
> +  while (result >= 0 && !mEncodingComplete) {

The logic around mEncodingComplete seems confused. You have a check inside this loop which returns an error if we get there with it set, but if that variable is only being written from this thread, that check will now never succeed. So either it was useless before and should be deleted or it was necessary somehow and now we're not doing it right.

The test for setting mEncodingComplete also uses mEndOfStream outside the section where you hold mReentrantMonitor, so you've got a race condition there, though that's not a new issue with this patch specifically.
Attachment #8783839 - Flags: review?(giles) → review-
Attachment #8783839 - Attachment is obsolete: true
Comment on attachment 8784096 [details] [diff] [review]
interdiff to fix Ogg recording failures

Review of attachment 8784096 [details] [diff] [review]:
-----------------------------------------------------------------

Worth a try. I really hope you can rewrite this in m-c. The code is really hard to reason about.
Attachment #8784096 - Flags: review?(giles) → review+
Pushed by rjesup@wgate.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c9b550ac7d6
encode all available audio on each cycle instead of one 'packet' r=rillian
https://hg.mozilla.org/integration/mozilla-inbound/rev/38b376e74c6e
Flush WebM clusters if the timecode offset will over/underflow r=rillian
Comment on attachment 8782024 [details] [diff] [review]
encode all available audio on each cycle instead of one 'packet'

Approval Request Comment
(Note: for the patch as-landed; that includes the interdiff patch here)

[Feature/regressing bug #]: bug 1215115 (MediaRecorder support in Android)

[User impact if declined]: audio is unusable in recorded videos (depending on the player).  All the audio data should be there, but not in the correct WebM cluster.

[Describe test coverage new/current, TreeHerder]: requires parsing the output stream, which we don't have in the test infra, or playing back with an external tool.  Manual testing done.

[Risks and why]: regression that has gone to release in 48.  Low risk; change is not complex (just have it encode 1 or more packets per call, instead of just one).

[String/UUID change made/needed]: none
Flags: needinfo?(rjesup)
Attachment #8782024 - Flags: approval-mozilla-beta?
Attachment #8782024 - Flags: approval-mozilla-aurora?
I am experiencing what I think is this bug all the time with Firefox 48.0.1 and with 49.0b6. I do not experience it with Firefox 47. The nightly build 51.0a1 crashes when I push stop on my MediaRecorder demo.

In this video I explain what I am doing but the audio is truncated. I count down from 10 and then say stop and close my hand and then I push stop. But as you can see the audio mutes around 12 seconds in and the video ends a few fractions of a second before my hand fully closes.

https://video-spike2.aws.42lines.net/html5_recording/uploads/f43b9db36ca433145673cec6663e2484.webm

In the following video I explain myself and count upwards to 120. Starting with 30 I put up a number of fingers indicating which decade I'm on. The audio cuts out around 33 seconds in. The audio comes back around 1:39 (99 seconds) but it is audio from where it broke off (at number 24 around the 33 second mark) - totally out of sync with the video. 

https://video-spike2.aws.42lines.net/html5_recording/uploads/6b771564a4d403c8732bd82308754f08.webm

The recorder I am using is here: https://video-spike2.aws.42lines.net/html5_recording/

This bug also exists on this demo (set time interval to more than 5000 in order to record a longer video in one chunk, use the save video to download the recorded video): https://www.webrtc-experiment.com/msr/video-recorder.html
This bugfix isn't in Fx51 Nightly yet; you can probably pull a build from 
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=38b376e74c6e
(find your platform, click on a green 'B', in the lower-left box click on Build: link, and download the package for your platform.  I advise installing it in a new/temp directory, since it won't auto-update ever.  Then run from commandline with ./firefox, or if you have another firefox running, ./firefox -profilemanager -no-remote).  

If it merges to mozilla-central today (likely), it will be in tomorrow's nightly.

Verify if this fixes your problem, and also if the crash is gone or not.  Thanks!!!
Flags: needinfo?(zach)
Attached video 895251091.webm
Unfortunately it still crashes the tab when I push "stop" using my spike: https://video-spike2.aws.42lines.net/html5_recording/

I filed a crash report but I don't know how to get a reference to that report to give here. 

If I use this demo https://www.webrtc-experiment.com/msr/video-recorder.html I do get a better result (all the audio is there) but the audio is out of sync with the video so it's still not good. I am attaching a video created via this demo.
Flags: needinfo?(zach)
(In reply to zach from comment #25)
> Unfortunately it still crashes the tab when I push "stop" using my spike:
> https://video-spike2.aws.42lines.net/html5_recording/
> 
> I filed a crash report but I don't know how to get a reference to that
> report to give here. 

Here is the crash report reference: https://crash-stats.mozilla.com/report/index/9c9f1107-7326-4567-857f-b882a2160824
I'm afraid that can't be looked at easily since it's not a nightly (no symbols in crash data).

Local builds of mozilla-inbound don't crash for me on Stop.  ALso, the *first* time I Start (on a Linux debug build) the local video shows jerky jumps between two or so frames - video never continues.  In one case the recording had similar jumps, in another the recorded video didn't freeze/jerk.

Please retry with Nightly tomorrow and put the link from about:crashes if it still crashes.

I'll check the audio delay/sync next
https://hg.mozilla.org/mozilla-central/rev/4c9b550ac7d6
https://hg.mozilla.org/mozilla-central/rev/38b376e74c6e
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
I don't think this is confirmed fixed. I tried the nightly build "51.0a1 (2016-08-25)" and the audio sync is bad AND it mutes out after a few seconds (using https://www.webrtc-experiment.com/msr/video-recorder.html). Perhaps the fix didn't make it into Nightly yet? Nightly also still crashes when I push stop on my test site: https://video-spike2.aws.42lines.net/html5_recording/

Here is a crash report from Nightly of the crash when pushing stop on my test. 
https://crash-stats.mozilla.com/report/index/c7d68637-d35f-4c96-8a44-e8afc2160825

I know that the crash might be a separate bug, let me know if I should file it separately. I'm only talking about it here because it is a regression from FF48 and you have asked me to confirm whether it is fixed.

I will try Nightly again tomorrow.
Oops. I think I was using the wrong build. Here's a crash report from Nightly that seems to have better symbol information: https://crash-stats.mozilla.com/report/index/869532f1-eaac-44fc-97f8-9926e2160825
Ok, that crash is outside of MediaRecorder itself - crash is "MOZ_CRASH(Input stream is not serializable!)" Crash is inside ipc::SerializeInputStream(), called from  	mozilla::net::HttpChannelChild::ContinueAsyncOpen()

Please file a new bug in Core:: Networking: HTTP   and include that crash report URL

I'll check more deeply into the recorder results, perhaps using a different test.  I see no muting using webrtc-experiment on linux; I did see it on Windows.  This will also need a new bug unless it's a problem with the site (it's using slicing incorrectly, for example)
Just to be clear - this bug is fixed.  There might be some other bugs (one in http, one that's windows only (so far as I've tested it)) which are being discussed here, and those discussions should move to new bugs.  This bug patch needs to get uplifted to beta ASAP, since the feature is very broken without it.
Flags: needinfo?(lhenry)
Hello Jeremy, could you please verify this issue is fixed as expected on a latest Nightly build? Thanks!
Flags: needinfo?(jeremy.noring)
Comment on attachment 8782024 [details] [diff] [review]
encode all available audio on each cycle instead of one 'packet'

Severe regression in MediaRecording, the sooner we uplift to Aurora, the sooner we can stabilize it on a broader audience.
Attachment #8782024 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8782024 [details] [diff] [review]
encode all available audio on each cycle instead of one 'packet'

Bad regression in 48, let's get this into 49 beta 8 for verification there.
Flags: needinfo?(lhenry)
Attachment #8782024 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Andrei, hoping your team can replicate the issue and verify this on beta after we build on Monday. Thanks!
Flags: qe-verify+
Flags: needinfo?(andrei.vaida)
Hi Liz,
I've managed to reproduce the issue on Nightly 48.0a1 (id:20160423030220). I couldn't use ffmpeg as described in the STR in comment 0 but after downloading and opening the *.webm file that I recorded using the tool it was clear that A/V was out of sync. Retesting the same thing on the latest Beta 49.0b8 I didn't come across the same problems. 
Taking the testing further to latest Aurora and latest Nightly the issue reproduces like in Nightly 48.0a1.
Flags: qe-verify+
Flags: needinfo?(andrei.vaida)
(In reply to Ritu Kothari (:ritu) from comment #33)
> Hello Jeremy, could you please verify this issue is fixed as expected on a
> latest Nightly build? Thanks!

Using latest nightly, I don't see the specific issue listed above, but I still produced a file with utterly hideous timestamping, to the point where it is completely unusable.  I've attached that file here.

To try and induce the bug, I specified a framerate of 15 fps in the getUserMedia call based on comments by Randall J.
Recorded with 51.0a1 (2016-08-30)
I have filed a new bug about the a/v sync issue that remains in the Nightly build. 

https://bugzilla.mozilla.org/show_bug.cgi?id=1300169
I also filed a new bug about the serialization crash that occurs when trying to upload a recorded video. That's the tab crash I discussed above. 

https://bugzilla.mozilla.org/show_bug.cgi?id=1300175
Blocks: 1298750
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: