Closed Bug 1207785 Opened 4 years ago Closed 4 years ago

Create telemetry probe for tracking number of youtube embed tags seen by users

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox43 --- fixed
firefox44 --- fixed

People

(Reporter: qdot, Assigned: qdot)

References

Details

Attachments

(1 file, 2 obsolete files)

We need a telemetry probe to see how many times users are running into youtube embed tags that require flash to play. This will allow us to plan better for implementing/landing bug 769117.
cpeterson for code review, vladan for data collection peer review
Attachment #8665686 - Flags: review?(vladan.bugzilla)
Attachment #8665686 - Flags: review?(cpeterson)
Removing some test stuff from histograms json
Attachment #8665686 - Attachment is obsolete: true
Attachment #8665686 - Flags: review?(vladan.bugzilla)
Attachment #8665686 - Flags: review?(cpeterson)
Attachment #8665687 - Flags: review?(vladan.bugzilla)
Attachment #8665687 - Flags: review?(cpeterson)
Comment on attachment 8665687 [details] [diff] [review]
Patch 1 (v2) - Telemetry Probe for Youtube Embed Tags Seen

Review of attachment 8665687 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM!

::: dom/base/nsObjectLoadingContent.cpp
@@ +1480,5 @@
> +    do_QueryInterface(static_cast<nsIImageLoadingContent*>(this));
> +  NS_ASSERTION(thisContent, "Must be an instance of content");
> +
> +  // We're only interested in switching out embed tags
> +  if (!thisContent->NodeInfo()->Equals(nsGkAtoms::embed)) {

I assume we don't need to check for YouTube embeds with <object> tags that are missing an inner <embed> tag? YouTube's documentation consistently talks about "<object> embeds", but their example code shows an <object> tag with an inner <embed> tag.

https://developers.google.com/youtube/player_parameters

@@ +1496,5 @@
> +  if (!ok) {
> +    NS_WARNING("Could not parse plugin domain!");
> +    return false;
> +  }
> +  nsAutoCString domain("youtube.com");

YouTube has some alternate domains like youtu.be and youtube-nocookie.com, but we probably don't need to bother checking for them. YouTube's default embed HTML specifies "www.youtube.com". Any legacy <object> embeds using youtu.be or youtube-nocookie.com would have had to manually edited the embed HTML.

@@ +2153,5 @@
>      return NS_OK;
>    }
>  
> +  // Check whether this is a youtube embed.
> +  if (IsYoutubeEmbed()) {

Is this check reached if the user has Flash disabled or click-to-play (when they don't click)? Those are the most important cases to measure because those users won't see embedded video but they could if we rewrite the <object> embed to HTML5 video embed.
Attachment #8665687 - Flags: review?(cpeterson) → review+
Comment on attachment 8665687 [details] [diff] [review]
Patch 1 (v2) - Telemetry Probe for Youtube Embed Tags Seen

Review of attachment 8665687 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +9655,5 @@
>      "expires_in_version": "55",
>      "kind": "count",
>      "description": "Attempt to notify ServiceWorker of push notification resubscription."
> +  },
> +  "YOUTUBE_EMBED_SEEN": {

Because this is reporting on which web content was seen/visited, it makes me a little uncomfortable. 
On the other hand, this is an opt-in probe and reporting on YouTube embeds seems pretty innocuous.
At a minimum we need to set it to expire in a few versions.

Note also that you'll be reporting on the # of embeds seen during a session, with no controlling for session-duration (unless you do a custom analysis on the backend), # of distinct sites or pages visited, etc.

It would seem sufficient to make this a flag histogram and just report if *any* YouTube embed was seen during the session. That would also leak less data about the user's browsing history. What do you think?

I'd also like to get an opinion from another privacy peer. Ally?

@@ +9656,5 @@
>      "kind": "count",
>      "description": "Attempt to notify ServiceWorker of push notification resubscription."
> +  },
> +  "YOUTUBE_EMBED_SEEN": {
> +    "expires_in_version": "never",

same general comments as bug 722110: add alert_emails, set expires_in_version 5 versions in the future max, note that 0-counts won't be explicitly reported

@@ +9658,5 @@
> +  },
> +  "YOUTUBE_EMBED_SEEN": {
> +    "expires_in_version": "never",
> +    "kind": "count",
> +    "description": "Counts number of embed tags (which force flash usage) user sees from youtube.com domain.",

Nit: Trailing comma
Flags: needinfo?(ally)
Oh yeah, also note that probes are *opt-in* on Release channel, unless otherwise specified, so you'll only get data from a somewhat biased 1% of Release-channel users. All probes are opt-out on pre-release channels. I don't know if that's ok for your needs.
Attachment #8665687 - Flags: review?(vladan.bugzilla)
(In reply to Chris Peterson [:cpeterson] from comment #3)
> @@ +2153,5 @@
> >      return NS_OK;
> >    }
> >  
> > +  // Check whether this is a youtube embed.
> > +  if (IsYoutubeEmbed()) {
> 
> Is this check reached if the user has Flash disabled or click-to-play (when
> they don't click)? Those are the most important cases to measure because
> those users won't see embedded video but they could if we rewrite the
> <object> embed to HTML5 video embed.

Yes, this check is always hit here whenever we see an embed tag. Unlike the activations bug, I figured we wanted to know ANY time the plugin path was traversed here, even if the plugin wasn't activated.
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #5)
> Oh yeah, also note that probes are *opt-in* on Release channel, unless
> otherwise specified, so you'll only get data from a somewhat biased 1% of
> Release-channel users. All probes are opt-out on pre-release channels. I
> don't know if that's ok for your needs.

I /think/ we only care about pre-release channels right now? ni?'ing cpeterson to see which he wants.
Flags: needinfo?(cpeterson)
(In reply to Kyle Machulis [:kmachulis] [:qdot] (USE NEEDINFO?) from comment #7)
> (In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #5)
> > Oh yeah, also note that probes are *opt-in* on Release channel, unless
> > otherwise specified, so you'll only get data from a somewhat biased 1% of
> > Release-channel users. All probes are opt-out on pre-release channels. I
> > don't know if that's ok for your needs.
> 
> I /think/ we only care about pre-release channels right now? ni?'ing
> cpeterson to see which he wants.

Ideally, we would collect data from all channels. We think these legacy YouTube embeds are uncommon, but we want to gauge how uncommon, so we would like to reach many long-tail websites. That said, I think just testing the pre-release channels should be good enough to start.
Flags: needinfo?(cpeterson)
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #4)
> Comment on attachment 8665687 [details] [diff] [review]
> Patch 1 (v2) - Telemetry Probe for Youtube Embed Tags Seen
> 
> Review of attachment 8665687 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/Histograms.json
> @@ +9655,5 @@
> >      "expires_in_version": "55",
> >      "kind": "count",
> >      "description": "Attempt to notify ServiceWorker of push notification resubscription."
> > +  },
> > +  "YOUTUBE_EMBED_SEEN": {
> 
> Because this is reporting on which web content was seen/visited, it makes me
> a little uncomfortable. 
> On the other hand, this is an opt-in probe and reporting on YouTube embeds
> seems pretty innocuous.
> At a minimum we need to set it to expire in a few versions.
> 
> Note also that you'll be reporting on the # of embeds seen during a session,
> with no controlling for session-duration (unless you do a custom analysis on
> the backend), # of distinct sites or pages visited, etc.
> 
> It would seem sufficient to make this a flag histogram and just report if
> *any* YouTube embed was seen during the session. That would also leak less
> data about the user's browsing history. What do you think?
> 
> I'd also like to get an opinion from another privacy peer. Ally?

I would prefer less data about the video browsing history, certainly. Chris, would that be sufficent to access how common/uncommon these embeds are?
Flags: needinfo?(ally) → needinfo?(cpeterson)
(In reply to Allison Naaktgeboren please NEEDINFO? :ally from comment #9)
> (In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #4)
...
> > Note also that you'll be reporting on the # of embeds seen during a session,
> > with no controlling for session-duration (unless you do a custom analysis on
> > the backend), # of distinct sites or pages visited, etc.
> > 
> > It would seem sufficient to make this a flag histogram and just report if
> > *any* YouTube embed was seen during the session. That would also leak less
> > data about the user's browsing history. What do you think?
> > 
> > I'd also like to get an opinion from another privacy peer. Ally?
> 
> I would prefer less data about the video browsing history, certainly. Chris,
> would that be sufficent to access how common/uncommon these embeds are?

That sounds like a good trade-off. Reporting whether a session saw at least one legacy embed is probably adequate. That will still give us an idea of the magnitude of these legacy embeds and whether it decreases over time.
Flags: needinfo?(cpeterson)
Changed probe to flag that is activated when embed is seen. Will file followup for release channel opt-out.
Attachment #8665687 - Attachment is obsolete: true
Attachment #8666855 - Flags: review?(vladan.bugzilla)
Attachment #8666855 - Flags: review?(vladan.bugzilla) → review+
https://hg.mozilla.org/mozilla-central/rev/8a905fb2619e
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Kyle, now that your embed probe has landed in Nightly 44 (I see it in my about:telemetry :-), can we uplift it to Aurora 43?
Flags: needinfo?(kyle)
Comment on attachment 8666855 [details] [diff] [review]
Patch 1 (v3) - Telemetry Probe for Youtube Embed Tags Seen; r=cpeterson

Approval Request Comment
[Feature/regressing bug #]: Bug 1207785
[User impact if declined]: Won't know how many people are seeing legacy youtube embeds
[Describe test coverage new/current, TreeHerder]: No tests but has passed try and merged to m-c.
[Risks and why]: None 
[String/UUID change made/needed]: None
Flags: needinfo?(kyle)
Attachment #8666855 - Flags: approval-mozilla-aurora?
It seems a little weird to me to hardcode a person's email address into our product. Is this something we have done before? If we haven't, I am not sure we should start now. On the other hand it's your inbox's funeral :)
Flags: needinfo?(kyle)
Flags: needinfo?(cpeterson)
I should have grepped before speaking. I see we have some other stuff like this already in Telemetry. OK!
Flags: needinfo?(kyle)
Flags: needinfo?(cpeterson)
Comment on attachment 8666855 [details] [diff] [review]
Patch 1 (v3) - Telemetry Probe for Youtube Embed Tags Seen; r=cpeterson

This has been on m-c for a while with no problem. Let's uplift to aurora.
Attachment #8666855 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Blocks: 1211046
You need to log in before you can comment on or make changes to this bug.