Closed Bug 1226450 Opened 9 years ago Closed 9 years ago

Report audio/video codecs used in HTMLMediaElement and WebAudio via telemetry

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla45

Tracking Flags:

Tracking

Status

firefox45

---

fixed

People

(Reporter: cpearce, Assigned: cpearce)

Details

Attachments

(1 file, 3 obsolete files)

Patch: Add telemetry to report codec used 9 years ago Chris Pearce [:cpearce (Not reading bugmail)] 15.71 KB, patch		Details \| Diff \| Splinter Review
Patch v2: Add telemetry to report codec used 9 years ago Chris Pearce [:cpearce (Not reading bugmail)] 14.94 KB, patch	jya : review-	Details \| Diff \| Splinter Review
Patch v3: Add telemetry to report codec used 9 years ago Chris Pearce [:cpearce (Not reading bugmail)] 13.05 KB, patch	jya : review+	Details \| Diff \| Splinter Review
Patch v4:: Add telemetry to report codec used 9 years ago Chris Pearce [:cpearce (Not reading bugmail)] 13.07 KB, patch	vladan : review+	Details \| Diff \| Splinter Review

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Description

•

9 years ago

We should have telemetry to report how often various codecs are used in HTMLMediaElement and WebAudio in order to inform our decisions.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 1

•

9 years ago

Attached patch Patch: Add telemetry to report codec used (obsolete) — Details — Splinter Review

* Rely on the Reader setting mMimeType on the MediaInfo it produces after possibly updating it after the first frame has been decoded.
* For legacy readers (GStreamer, AndroidMediaReader, OmxReader) which don't easily expose what exact codec they're playing, just report the content type of the MediaResource, with a "resource" prefix so we can tell it's not an exact codec type.
* Distinguish between codecs in Ogg and codecs in WebM.
* Add a prefix for WebAudio, so we can distinguish usage of codecs in WebAudio decodeData vs usage in HTMLMediaElement.
* Report the foramt of WAV files.
* I don't bother trying to handle MSE changing the codec type at runtime. Seems hard and not likely to happen much.

r?jya for media.
r?vladan for telemetry.

Attachment #8689862 - Flags: review?(vladan.bugzilla)

Attachment #8689862 - Flags: review?(jyavenard)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 2

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aee1fb70d248

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 3

•

9 years ago

Attached patch Patch v2: Add telemetry to report codec used (obsolete) — Details — Splinter Review

I realised that once the WaveReader is moved over to being implemented in terms of MediaDataDemuxer/Decoder, it won't make sense to add the format to the mimeType. So same as before, but without the format on the Wave mime type.

We can report that by a separate telemetry probe if we really want that.

Attachment #8689862 - Attachment is obsolete: true

Attachment #8689862 - Flags: review?(vladan.bugzilla)

Attachment #8689862 - Flags: review?(jyavenard)

Attachment #8689877 - Flags: review?(vladan.bugzilla)

Attachment #8689877 - Flags: review?(jyavenard)

Jean-Yves Avenard [:jya]

Comment 4

•

9 years ago

Comment on attachment 8689877 [details] [diff] [review]
Patch v2: Add telemetry to report codec used

Review of attachment 8689877 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaDecoderStateMachine.cpp
@@ +2073,5 @@
>  }
>  
>  void
> +MediaDecoderStateMachine::ReportCodecTelemetry() const
> +{ 

trailing space

@@ +2097,5 @@
> +              ("Telemetry MEDIA_CODEC_USED= '%s'", codec.get()));
> +      Telemetry::Accumulate(Telemetry::ID::MEDIA_CODEC_USED, codec);
> +    }
> +  });
> +  AbstractThread::MainThread()->Dispatch(task.forget());

I think this would be better done in the MediaDecoder instead. It has access to all this information and all is done on the main thread.

This could be done in MediaDecoder::MetadataLoaded as we don't need to wait until the first frame is decoded. We know the codec type before that.

It would be much simpler and not require any modifications to any of the decoder or readers ; just the base MediaDecoder class

::: dom/media/webm/WebMDemuxer.cpp
@@ +375,2 @@
>        } else if (mAudioCodec == NESTEGG_CODEC_OPUS) {
> +        mInfo.mAudio.mMimeType = "audio/webm; codecs=opus";

I'm not too keen of having different mimetype for the same codec; but I can see why you would want that. Better granularity in reporting the type

Attachment #8689877 - Flags: review?(jyavenard) → review-

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 5

•

9 years ago

(In reply to Jean-Yves Avenard [:jya] from comment #4)
> Comment on attachment 8689877 [details] [diff] [review]
> Patch v2: Add telemetry to report codec used
> 
> Review of attachment 8689877 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/media/MediaDecoderStateMachine.cpp
> @@ +2073,5 @@
> >  }
> >  
> >  void
> > +MediaDecoderStateMachine::ReportCodecTelemetry() const
> > +{ 
> 
> trailing space
> 
> @@ +2097,5 @@
> > +              ("Telemetry MEDIA_CODEC_USED= '%s'", codec.get()));
> > +      Telemetry::Accumulate(Telemetry::ID::MEDIA_CODEC_USED, codec);
> > +    }
> > +  });
> > +  AbstractThread::MainThread()->Dispatch(task.forget());
> 
> I think this would be better done in the MediaDecoder instead. It has access
> to all this information and all is done on the main thread.

Sure.

> This could be done in MediaDecoder::MetadataLoaded as we don't need to wait
> until the first frame is decoded. We know the codec type before that.
> 
> It would be much simpler and not require any modifications to any of the
> decoder or readers ; just the base MediaDecoder class

We'd still need to modify the Readers which aren't MediaFormatReaders which we still use, as they don't yet fill out the mimetype fields of the MediaInfo.

> 
> ::: dom/media/webm/WebMDemuxer.cpp
> @@ +375,2 @@
> >        } else if (mAudioCodec == NESTEGG_CODEC_OPUS) {
> > +        mInfo.mAudio.mMimeType = "audio/webm; codecs=opus";
> 
> I'm not too keen of having different mimetype for the same codec; but I can
> see why you would want that. Better granularity in reporting the type

Precisely. We want to distinguish ogg/vorbis from webm/vorbis.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 6

•

9 years ago

(In reply to Jean-Yves Avenard [:jya] from comment #4)
> Comment on attachment 8689877 [details] [diff] [review]
> Patch v2: Add telemetry to report codec used
> 
> Review of attachment 8689877 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/media/MediaDecoderStateMachine.cpp
> @@ +2073,5 @@
> >  }
> >  
> >  void
> > +MediaDecoderStateMachine::ReportCodecTelemetry() const
> > +{ 
> 
> trailing space
> 
> @@ +2097,5 @@
> > +              ("Telemetry MEDIA_CODEC_USED= '%s'", codec.get()));
> > +      Telemetry::Accumulate(Telemetry::ID::MEDIA_CODEC_USED, codec);
> > +    }
> > +  });
> > +  AbstractThread::MainThread()->Dispatch(task.forget());
> 
> I think this would be better done in the MediaDecoder instead. It has access
> to all this information and all is done on the main thread.
> 
> This could be done in MediaDecoder::MetadataLoaded as we don't need to wait
> until the first frame is decoded. We know the codec type before that.
> 
> It would be much simpler and not require any modifications to any of the
> decoder or readers ; just the base MediaDecoder class

I guess what you mean here is that the DirectShow, Wave, and Raw Readers are only created if the MediaResource's ContentType is known to be what is equivalent to the mime types I'm adding to the MediaInfo's mimeType. So we could refrain from adding them.

For Ogg, we specifically want to know what codecs are being used, so it makes sense to fill out these fields.

Jean-Yves Avenard [:jya]

Comment 7

•

9 years ago

(In reply to Chris Pearce (:cpearce) from comment #6)

> I guess what you mean here is that the DirectShow, Wave, and Raw Readers are
> only created if the MediaResource's ContentType is known to be what is
> equivalent to the mime types I'm adding to the MediaInfo's mimeType. So we
> could refrain from adding them.
> 
> For Ogg, we specifically want to know what codecs are being used, so it
> makes sense to fill out these fields.

filling up the mimetype is fine. I'm more fuss about where the report is performed, and to me MediaDecoder is a better place, and the information being filled when reading the metadata, it can be done there instead of doing it after decoding the first frame

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 8

•

9 years ago

Attached patch Patch v3: Add telemetry to report codec used (obsolete) — Details — Splinter Review

With review comments addressed.

Attachment #8690601 - Flags: review?(jyavenard)

Jean-Yves Avenard [:jya]

Updated

•

9 years ago

Attachment #8690601 - Flags: review?(jyavenard) → review+

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

9 years ago

Attachment #8689877 - Flags: review?(vladan.bugzilla)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

9 years ago

Attachment #8689877 - Attachment is obsolete: true

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 9

•

9 years ago

Comment on attachment 8690601 [details] [diff] [review]
Patch v3: Add telemetry to report codec used

Re-requesting review by vladan, now this patch has passed review by a media peer.

This patch reports what media codecs' mime types we play in HTML <audio> and <video> elements, and in WebAudio's decodeAudioData API.

Attachment #8690601 - Flags: review?(vladan.bugzilla)

Vladan Djeric (:vladan)

Comment 10

•

9 years ago

Comment on attachment 8690601 [details] [diff] [review]
Patch v3: Add telemetry to report codec used

Review of attachment 8690601 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +8756,5 @@
>      "releaseChannelCollection": "opt-out"
>    },
> +  "MEDIA_CODEC_USED": {
> +    "alert_emails": ["cpearce@mozilla.com"],
> +    "expires_in_version": "never",

* make it expire in 8 versions (~1 year), you can easily renew it then and you'll be notified by an automated email.
* add a bug_numbers field

@@ +8757,5 @@
>    },
> +  "MEDIA_CODEC_USED": {
> +    "alert_emails": ["cpearce@mozilla.com"],
> +    "expires_in_version": "never",
> +    "keyed": true,

how many distinct keys are you expecting? and are they really going to be arbitrary strings? if it's less than a dozen, you should implement this probe as an enum histogram

Attachment #8690601 - Flags: review?(vladan.bugzilla)

Ralph Giles (:rillian)

Updated

•

9 years ago

Priority: -- → P2

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 11

•

9 years ago

On desktop the set of strings I expect we'd encounter are:

audio/mp4a-latm
video/avc
audio/mpeg
resource; audio/mpeg
resource; audio/mp3
video/ogg; codecs="theora"
audio/ogg; codecs="opus"
audio/ogg; codecs="vorbis"
video/webm; codecs="vp8"
video/webm; codecs="vp9"
audio/webm; codecs="opus"
audio/webm; codecs="vorbis"
resource; audio/x-wav
resource; audio/wav
resource; audio/wave
resource; audio/x-pn-wav
resource; video/x-raw
resource; video/x-raw-yuv
webaudio; audio/mp4a-latm
webaudio; audio/mpeg
webaudio; audio/mp3
webaudio;resource; audio/mpeg
webaudio;resource; audio/mp3
webaudio; audio/ogg; codecs="opus"
webaudio; audio/ogg; codecs="vorbis"
webaudio; audio/webm; codecs="opus"
webaudio; audio/webm; codecs="vorbis"
webaudio;resource; audio/x-wav
webaudio;resource; audio/wav
webaudio;resource; audio/wave
webaudio;resource; audio/x-pn-wav

On Linux if ffmpeg is not installed we use GStreamer, and then the set of types would be the above plus what's in the sContainers list in GStreamerFormatHelper.cpp.

On B2G and older versions of Android we support a wider range of mime types, so the set of strings increases; basically it's the above, plus whatever else is in gOmxTypes and gB2GOnlyTypes in DecoderTraits.cpp.

So unless there's a good reason to define an explicit enumeration, I'd prefer to report the strings. Is there a good reason?

Flags: needinfo?(vladan.bugzilla)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 12

•

9 years ago

Attached patch Patch v4:: Add telemetry to report codec used — Details — Splinter Review

bug_numbers and expires_in_version updated. I left the arbitrary strings being reported, as it would be onerous to enumerate and test all permutations on all platforms that we support...

Attachment #8691198 - Flags: review?(vladan.bugzilla)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

9 years ago

Attachment #8690601 - Attachment is obsolete: true

Jean-Yves Avenard [:jya]

Comment 13

•

9 years ago

Worth noting that we are the one setting the strings; it doesn't come from random value and as such their number is limited.

Vladan Djeric (:vladan)

Updated

•

9 years ago

Attachment #8691198 - Flags: review?(vladan.bugzilla) → review+

Vladan Djeric (:vladan)

Comment 14

•

9 years ago

(In reply to Chris Pearce (:cpearce) from comment #11)
> So unless there's a good reason to define an explicit enumeration, I'd
> prefer to report the strings. Is there a good reason?

Enum histograms are simpler, so if possible, they're preferrable. It's easier to compare different "keys" in an enumerated histogram on a single telemetry.mozilla.org graph, they take less space to store in backend & Firefox, they're quicker for the backend to aggregate, quicker to retrieve for dashboarding, etc. In particular, we get backend performance problems if there are too many (e.g. 100+) distinct keys in a keyed histogram.

It seems like the # of possible keys in this histogram is finite, but it looks the # of keys could be large. Do you need to track each of these keys separately from each other or can you coalesce some of these keys?

Flags: needinfo?(vladan.bugzilla)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 15

•

9 years ago

(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #14)
> (In reply to Chris Pearce (:cpearce) from comment #11)
> > So unless there's a good reason to define an explicit enumeration, I'd
> > prefer to report the strings. Is there a good reason?
> 
> Enum histograms are simpler, so if possible, they're preferrable. It's
> easier to compare different "keys" in an enumerated histogram on a single
> telemetry.mozilla.org graph, they take less space to store in backend &
> Firefox, they're quicker for the backend to aggregate, quicker to retrieve
> for dashboarding, etc. In particular, we get backend performance problems if
> there are too many (e.g. 100+) distinct keys in a keyed histogram.
> 
> It seems like the # of possible keys in this histogram is finite, but it
> looks the # of keys could be large. Do you need to track each of these keys
> separately from each other or can you coalesce some of these keys?

We can coalesce some of the keys. There are 2 mappings for mp3, 3 for Wav, and 2 for Raw. That would reduce the number of keys by 8 or so.

We do occasionally add codecs, so using a keyed histogram means we don't need to remember to update the telemetry.

Pulsebot

Comment 16

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/990e65f36d20

Carsten Book [:Tomcat]

Comment 17

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/990e65f36d20

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox45: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla45

You need to log in before you can comment on or make changes to this bug.