Open Bug 1557394 Opened 5 years ago Updated 2 years ago

"MediaRecorder does not support recording multiple tracks of the same type at this time." error is not specified and inconsistent with currently only recording one track kind

Categories

(Core :: Audio/Video: Recording, enhancement, P3)

69 Branch
enhancement


People

(Reporter: guest271314, Unassigned)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux i686; rv:69.0) Gecko/20100101 Firefox/69.0

Steps to reproduce:

  1. Create 2 <canvas> elements
  2. Draw images onto canvases
  3. captureStream() of each canvas
  4. Create new MediaStream() with the 2 canvas MediaStreamTrack's
  5. Pass created MediaStream to MediaRecorder();
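The steps above can be sketched as code. The browser-only parts are shown as comments; combineTracks is a hypothetical helper (not part of any API) that collects the video tracks, so the track-gathering logic can be exercised outside a browser:

```javascript
// Hypothetical helper: gather the video tracks from several MediaStream-like
// objects into one flat array, ready to construct a combined MediaStream.
function combineTracks(streams) {
  return streams.flatMap(s => s.getVideoTracks());
}

// In a browser (assumes two <canvas> elements, canvas1 and canvas2, already drawn):
// const s1 = canvas1.captureStream(30); // 30 fps capture
// const s2 = canvas2.captureStream(30);
// const combined = new MediaStream(combineTracks([s1, s2]));
// const recorder = new MediaRecorder(combined); // Firefox throws the error here
```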

Actual results:

  1. Error is thrown by MediaRecorder: "MediaRecorder does not support recording multiple tracks of the same type at this time. "
  2. dataavailable event of MediaRecorder is dispatched with event.data set to Blob { size: 0, type: "" }

Expected results:

  1. No error should be thrown when 2 or more MediaStreamTracks of the same "kind" within a MediaStream are passed to MediaRecorder; the specification (https://w3c.github.io/mediacapture-record) does not state that an error should be thrown in this case
  2. In lieu of support for recording multiple MediaStreamTracks (of any kind), MediaRecorder should record only the first MediaStreamTrack of each "kind" without throwing an error. Since MediaRecorder does not support recording multiple tracks of the same "kind", the mere presence of multiple such tracks in the MediaStream should not itself cause an error

I have not yet found the source of the thrown error in https://searchfox.org/mozilla-central/source/dom/media/MediaRecorder.cpp or https://searchfox.org/mozilla-central/source/dom/media/MediaRecorder.h

The error was found while trying to persuade MediaRecorder to record multiple tracks using enabled, addTrack(), and removeTrack(); the specification does state that modifying the track set stops recording (https://w3c.github.io/mediacapture-record/#dom-mediarecorder-start, step 5.3). It does not, however, state that recording should be stopped when the initial MediaStream is composed of multiple tracks of the same kind.

Component: Untriaged → Audio/Video: MediaStreamGraph
Product: Firefox → Core

It's true we should allow recording multiple tracks to be fully spec compliant. However the need hasn't really been there. We don't support playback of a media file with more than one track per kind (we only play the first track IIRC). For audio one could also easily circumvent this by using WebAudio.

For video it's a bit more complicated since there's no convenient and performant workaround like for audio, and the spec editors haven't been able to agree on a way to support the track set changing during the recording. What's your use case for multiple video tracks in one recording?
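The Web Audio workaround mentioned above can be sketched as follows: route every audio stream into a single MediaStreamAudioDestinationNode, whose stream then carries one mixed audio track. mixAudioStreams is a hypothetical helper name, not part of any API:

```javascript
// Sketch of the Web Audio workaround: mix several audio MediaStreams into a
// single stream so MediaRecorder only ever sees one audio track.
// `ctx` is expected to be an AudioContext (or a compatible stub when testing).
function mixAudioStreams(ctx, streams) {
  const destination = ctx.createMediaStreamDestination();
  for (const stream of streams) {
    // Each source node feeds the shared destination, producing a mix.
    ctx.createMediaStreamSource(stream).connect(destination);
  }
  return destination.stream; // one MediaStream with a single mixed audio track
}

// In a browser:
// const mixed = mixAudioStreams(new AudioContext(), [streamA, streamB]);
// new MediaRecorder(mixed).start(); // records the mix without throwing
```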

Status: UNCONFIRMED → NEW
Type: defect → enhancement
Component: Audio/Video: MediaStreamGraph → Audio/Video: Recording
Ever confirmed: true
Priority: -- → P3

(In reply to Andreas Pehrson [:pehrsons] from comment #3)

> It's true we should allow recording multiple tracks to be fully spec compliant. However the need hasn't really been there. We don't support playback of a media file with more than one track per kind (we only play the first track IIRC). For audio one could also easily circumvent this by using WebAudio.

> For video it's a bit more complicated since there's no convenient and performant workaround like for audio, and the spec editors haven't been able to agree on a way to support the track set changing during the recording. What's your use case for multiple video tracks in one recording?

First, for clarity, the concept of "multiple tracks" should be clearly defined so that the interested parties are discussing the same technical subject matter.

I do not see any difference or additional complication in changing the track set during recording, whether the pattern is

MediaStream([MediaStreamTrack(audio), MediaStreamTrack(audio), MediaStreamTrack(audio), MediaStreamTrack(video), MediaStreamTrack(video), MediaStreamTrack(video)]); MediaStream[audio[2]].enabled; MediaStream[video[3]].enabled; or, similar to Web Audio,

MediaStream.connect(MediaStreamTrack(video)). This is especially true in Firefox, where differing input widths and heights ARE recorded, reflecting the current image input (https://bugs.chromium.org/p/chromium/issues/detail?id=972470).

One part of the basic use case that I am interested in is creating a video collage without having to use requestAnimationFrame or ReadableStream:

https://github.com/w3c/mediacapture-record/issues/166

The concept itself (concatenating media fragments) was inspired by A Shared Culture (https://creativecommons.org/about/videos/a-shared-culture) and Jesse Dylan (https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm). The use case: create such a video/audio collage from disparate media using only APIs shipped with modern, ostensibly FOSS, browsers.

It should be possible to simply change the src of <video> and the recording should not stop, given that "black frames and silence" is specified to be output when no media is output from a MediaStreamTrack.

What is the technical issue with changing the VIDEO track set during recording? What is the disagreement?

(In reply to guest271314 from comment #4)

> (In reply to Andreas Pehrson [:pehrsons] from comment #3)

>> It's true we should allow recording multiple tracks to be fully spec compliant. However the need hasn't really been there. We don't support playback of a media file with more than one track per kind (we only play the first track IIRC). For audio one could also easily circumvent this by using WebAudio.

>> For video it's a bit more complicated since there's no convenient and performant workaround like for audio, and the spec editors haven't been able to agree on a way to support the track set changing during the recording. What's your use case for multiple video tracks in one recording?

> First, for clarity, the concept of "multiple tracks" should be clearly defined so that the interested parties are discussing the same technical subject matter.

Yes, that's on the spec to do.

> I do not see any difference or additional complication in changing the track set during recording, whether the pattern is

> MediaStream([MediaStreamTrack(audio), MediaStreamTrack(audio), MediaStreamTrack(audio), MediaStreamTrack(video), MediaStreamTrack(video), MediaStreamTrack(video)]); MediaStream[audio[2]].enabled; MediaStream[video[3]].enabled; or, similar to Web Audio,

> MediaStream.connect(MediaStreamTrack(video)). This is especially true in Firefox, where differing input widths and heights ARE recorded, reflecting the current image input (https://bugs.chromium.org/p/chromium/issues/detail?id=972470).

> One part of the basic use case that I am interested in is creating a video collage without having to use requestAnimationFrame or ReadableStream:

> https://github.com/w3c/mediacapture-record/issues/166

> The concept itself (concatenating media fragments) was inspired by A Shared Culture (https://creativecommons.org/about/videos/a-shared-culture) and Jesse Dylan (https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm). The use case: create such a video/audio collage from disparate media using only APIs shipped with modern, ostensibly FOSS, browsers.

> It should be possible to simply change the src of <video> and the recording should not stop, given that "black frames and silence" is specified to be output when no media is output from a MediaStreamTrack.

Can you do this with MSE, or is that where you find it too complicated, i.e., your ReadableStream argument?

> What is the technical issue with changing the VIDEO track set during recording? What is the disagreement?

There's no technical issue really. The issue is in the spec. If the editors could be convinced (typically a compelling-enough-use-case vs too-complicated-api tradeoff) it wouldn't be too hard to implement something. I think so far no (real-world) use case has been presented with enough benefit to justify making the API more complicated (which is the result of adding things to it).

> Can you do this with MSE, or is that where you find it too complicated, i.e., your ReadableStream argument?

The issue is not the complexity. MSE recording does not work in Chromium (follow https://github.com/w3c/media-source/issues/190). Trying to record MSE playback in Chromium STILL crashes the tab; this has been an outstanding issue for a couple of years now.

What do you mean by "real-world"? Is this user not in this "real-world"?

(In reply to guest271314 from comment #6)

>> Can you do this with MSE, or is that where you find it too complicated, i.e., your ReadableStream argument?

> The issue is not the complexity. MSE recording does not work in Chromium (follow https://github.com/w3c/media-source/issues/190). Trying to record MSE playback in Chromium STILL crashes the tab; this has been an outstanding issue for a couple of years now.

Perhaps changing the spec to work around an implementation issue is not the easiest way forward. Fixing the implementation issues seems like an easier path.

> What do you mean by "real-world"? Is this user not in this "real-world"?

Sure, but from my own experience spec proposals often come from a perspective of "if only the api allowed THIS!" when perhaps not much useful would come out of it. That would be the opposite of a real-world use case. Making a compelling spec proposal is always helped by a real-world, concrete use case; some analysis of why the proposed way is the best way; and probably even the PR for the spec changes already written.

(In reply to Andreas Pehrson [:pehrsons] from comment #7)

> (In reply to guest271314 from comment #6)

>>> Can you do this with MSE, or is that where you find it too complicated, i.e., your ReadableStream argument?

>> The issue is not the complexity. MSE recording does not work in Chromium (follow https://github.com/w3c/media-source/issues/190). Trying to record MSE playback in Chromium STILL crashes the tab; this has been an outstanding issue for a couple of years now.

> Perhaps changing the spec to work around an implementation issue is not the easiest way forward. Fixing the implementation issues seems like an easier path.

I have no indication that Chromium/Chrome is actually motivated to "fix" the MSE issue. Digging into the MSE specification and its Chromium implementation makes it clear that the SourceBuffer is NOT intended to be inspected or exposed; the live playback, and the complete playback once endOfStream() is called, are NOT intended to be downloadable. Perhaps related to *ouTube and their use of MSE? Not sure. The bug has been open for over a year now.

>> What do you mean by "real-world"? Is this user not in this "real-world"?

> Sure, but from my own experience spec proposals often come from a perspective of "if only the api allowed THIS!" when perhaps not much useful would come out of it. That would be the opposite of a real-world use case. Making a compelling spec proposal is always helped by a real-world, concrete use case; some analysis of why the proposed way is the best way; and probably even the PR for the spec changes already written.

If a "real-world" use case is needed that appeals to or can be understood by all parties, then the simplest use case would be a playlist for a <video> element, where the src is changed and MediaRecorder is expected to record the entire playlist.

Consider a simple code example for recording a playlist:

const ms = video.captureStream(); // called BEFORE src is set
const recorder = new MediaRecorder(ms);

ms.onaddtrack = e => {
  // do stuff with the current MediaStreamTrack, e.g., apply constraints;
  // if necessary, remove or disable existing audio and video tracks
};

video.onplay = e => {
  if (recorder.state === "inactive") {
    recorder.start();
  } else if (recorder.state === "paused") {
    recorder.resume();
  }
};

video.onpause = e => {
  recorder.pause();
};
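The play handler above amounts to a small state machine over recorder.state. It can be reduced to a pure helper, which makes the intended transitions explicit and testable; actionOnPlay is a hypothetical name, not part of any API:

```javascript
// Hypothetical helper: decide what to do with a MediaRecorder when the
// captured <video> fires "play". Pure function over the recorder state.
function actionOnPlay(recorderState) {
  if (recorderState === "inactive") return "start";
  if (recorderState === "paused") return "resume";
  return "none"; // already recording
}

// In a browser, the handler becomes:
// video.onplay = () => {
//   const action = actionOnPlay(recorder.state);
//   if (action === "start") recorder.start();
//   if (action === "resume") recorder.resume();
// };
```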

From my perspective, not much, if any, language needs to be added to the relevant specifications. Instead, language needs to be removed from them.

There is no reason for MediaRecorder to stop simply because a MediaStreamTrack's enabled is false or its muted is true.

Therefore, the simplest way to allow recording of multiple tracks is for MediaRecorder not to stop the recording simply because one or more tracks have enabled set to false, or even readyState set to "ended".

I will collect the specific portions of Media Capture and Streams, MediaStream Recording, and Media Capture from DOM Elements that should be REMOVED in order for what is essentially described in "Proposal: Specify ability to pause and resume between adding and removing MediaStreamTracks to an active MediaStream" (https://github.com/w3c/mediacapture-record/issues/147) to take effect.

Meanwhile, "Add replaceTrack method to MediaStream" (https://github.com/w3c/mediacapture-record/issues/167)

> This seems like the right repo for it, since the justification for the function is strictly based on the MediaRecorder definition that "adding a track stops recording".
> An alternative would be to let MediaRecorder do the same thing as the <video> playback, and let only one video track be recorded, but the video track can be replaced by add/remove.

(https://github.com/w3c/mediacapture-record/issues/167#issuecomment-493090139) requests that an existing WebRTC method be ADDED to the relevant specifications. However, testing (I always test code in both Firefox and Chromium) found that each implementation subtly clips audio from the last 1 second, to an appreciable degree, similar to a recent Firefox bug that you addressed; the same or similar issue arises when using replaceTrack().

Thus, the universal "real-world" use case is recording a PLAYLIST, and language does not necessarily need to be added to the relevant specifications; rather, language needs to be REMOVED from them.

From Media Capture from DOM Elements https://w3c.github.io/mediacapture-fromelement/#methods-0

> Both MediaStream and HTMLMediaElement expose the concept of a track. Since there is no common type used for HTMLMediaElement, this document uses the term track to refer to either VideoTrack or AudioTrack. MediaStreamTrack is used to identify the media in a MediaStream.

An interesting topic to look into. The videoTracks property of HTMLVideoElement is largely unused save for the SourceBuffer of MediaSource; even then it is not straightforward how to "select" one of several tracks. Not necessarily relevant here, though it acknowledges that a <video> element can have more than one video track.

> A <video> element can therefore capture a video MediaStreamTrack and any number of audio MediaStreamTracks.

is not necessarily correct, as the above code demonstrates. When captureStream() is called on a <video> element, if and when the src is changed, the associated MediaStreamTracks are added to the single MediaStream returned by captureStream(); the <video> element therefore CAN have more than "a video MediaStreamTrack". That is without diving into the considerably varying manner in which the mute, unmute, and ended events are dispatched (or not) in Firefox versus Chromium, an entirely separate interoperability issue that is far harder to unravel and address than simply allowing MediaRecorder NOT to stop when a MediaStreamTrack originating from the src of a <video> is muted, not enabled, or even "ended".

> If the source for the media element ends, a different source is selected.

Does this actually occur as relevant to MediaRecorder? If not, it either SHOULD or MUST; preferably, the user can select which track to enable, mute, or otherwise arbitrarily record from the initial or added set of MediaStreamTracks. That line does not necessarily need to be REMOVED, just implemented. If it is not implemented, then remove the line so that no user gets the impression that they can arbitrarily select one or more audio or video MediaStreamTracks to record.

> and a removetrack event is generated for each track that ceases to be selected or enabled.

does not actually occur in either Firefox or Chromium; new MediaStreamTracks are simply shifted or pushed into the MediaStream.

> Absence of content is reflected in captured tracks through the muted attribute. A captured MediaStreamTrack MUST have a muted attribute set to true if its corresponding source track does not have available and accessible content. A mute event is raised on the MediaStreamTrack when content availability changes.

The actual implementation of the muted attribute varies widely depending on the browser used and the context.

MediaStream Recording https://w3c.github.io/mediacapture-record/MediaRecorder.html

>   1. Set state to recording, and run the following steps in parallel:
>      1. If at any point, a track is added to or removed from the stream's track set, the UA MUST immediately stop gathering data, discard any data that it has gathered, and queue a task, using the DOM manipulation task source, that runs the following steps:
>         1. Set state to inactive.
>         2. Fire an error event named InvalidModificationError at target.
>         3. Fire a blob event named dataavailable at target with blob.
>         4. Fire an event named stop at target.

The above should be REMOVED from the specification. The proposal is NOT to stop the recording simply due to a MediaStreamTrack being disabled, muted, "ended", etc.

The user defines and decides when to stop recording, not the browser; especially since the concept of "black frames and silence" has been introduced into the context by the specification authors/implementers/developers. That is, if there are "black frames and silence", the recording can continue until the user decides to stop recording "black frames and silence".

In particular, when MediaRecorder.pause() has been called, why should MediaRecorder in the "paused" state care if the src of the <video> element on which captureStream() was called changes? Unless the src is set to an empty string, changing the src should be an indication that additional MediaStreamTracks will momentarily be added to the MediaStream being recorded. The user can call removeTrack() or addTrack() to add or remove any MediaStreamTrack they choose, without MediaRecorder stopping for no reason.
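Under the behavior proposed here (which shipping browsers do NOT implement), pausing around a src change could look like the following sketch; advancePlaylist is a hypothetical helper, not an existing API:

```javascript
// Hypothetical sketch of the PROPOSED behavior: pause the recorder while the
// <video> src is swapped, so the track-set change never interrupts recording.
// NOTE: current browsers stop the recorder when the track set changes; this
// assumes the spec change argued for in this bug.
function advancePlaylist(video, recorder, nextSrc) {
  if (recorder.state === "recording") {
    recorder.pause(); // ride out the upcoming track-set change while paused
  }
  video.src = nextSrc; // new MediaStreamTracks will be added to the captured stream
  // resume in the video's "play" handler once the new source produces frames
}
```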

> If the UA at any point is unable to continue gathering data for reasons other than isolation properties or stream track set, it MUST stop gathering data, and queue a task, using the DOM manipulation task source, that runs the following steps:

This has the same 4 steps as 5.3 and should likewise be REMOVED from the specification, even if the MediaRecorder is not paused, and especially if it is. Why should MediaRecorder, in the paused state, care if a MediaStreamTrack is added to or removed from the MediaStream? The user SHOULD BE able to add or remove whichever tracks they choose during the active or paused state, particularly the paused state (to preempt specification authors' objections about switching a live track, even though RTCRtpSender.replaceTrack() exists and shows that the functionality already exists). That functionality should not be available only to RTCPeerConnection; it should also apply to MediaRecorder and/or a MediaStream created by captureStream(), particularly on HTMLMediaElement.

> If any Track within the MediaStream is muted or not enabled at any time, the UA will only record black frames or silence since that is the content produced by the Track.

Does this really occur? If it does, why does the MediaRecorder need to stop when

> If the UA at any point is unable to continue gathering data for reasons other than isolation properties or stream track set

purportedly applies? Simply output "black frames or silence" until the user explicitly calls stop() or requestData().

> If any Track within the MediaStream is muted or not enabled at any time, the UA will only record black frames or silence since that is the content produced by the Track.

The same principle holds true for the above portion of the specification, which should either actually be implemented or be REMOVED. If it does occur, then there is no reason to stop the MediaRecorder.

Severity: normal → S3