Closed Bug 1867126 Opened 1 year ago Closed 1 year ago

Glean.js automatic page load instrumentation

Categories

(Data Platform and Tools :: Glean: SDK, task, P1)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brosa, Assigned: brosa)

References

Details

Attachments

(2 files, 1 obsolete file)

Automatic instrumentation for basic page load events in Glean.js. We are looking to capture the event itself, plus 3 keys:

  1. url
  2. referrer
  3. title

We need to handle 2 scenarios:

  1. Automatic instrumentation where each time Glean.js is initialized, we collect the event by default.
  2. Automatic instrumentation is turned off and we provide the client a way to collect the same event manually. The metric is provided through a Glean API that allows for overriding all 3 values. If a value isn't overridden, we collect its default.
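
The two scenarios above can be sketched as follows. This is only an illustration of the intended behavior, not the Glean.js implementation: `PageLoadRecorder`, `autoPageLoad`, `recordPageLoad`, and `DefaultsProvider` are hypothetical names, and the defaults provider stands in for reading the URL, referrer, and title from the browser.

```typescript
// Hypothetical sketch; none of these names are the Glean.js API.
interface PageLoadExtras {
  url: string;
  referrer: string;
  title: string;
}

interface PageLoadEvent {
  name: "page_load";
  extras: PageLoadExtras;
}

// Stand-in for reading the current page's URL, referrer, and title.
type DefaultsProvider = () => PageLoadExtras;

class PageLoadRecorder {
  readonly events: PageLoadEvent[] = [];
  private readonly defaults: DefaultsProvider;

  constructor(defaults: DefaultsProvider) {
    this.defaults = defaults;
  }

  // Scenario 1: with autoPageLoad on, initialization records the event itself.
  initialize(config: { autoPageLoad: boolean }): void {
    if (config.autoPageLoad) {
      this.recordPageLoad();
    }
  }

  // Scenario 2: manual call; any key that is not overridden keeps its default.
  recordPageLoad(overrides: Partial<PageLoadExtras> = {}): void {
    const d = this.defaults();
    const extras: PageLoadExtras = {
      url: overrides.url ?? d.url,
      referrer: overrides.referrer ?? d.referrer,
      title: overrides.title ?? d.title,
    };
    this.events.push({ name: "page_load", extras });
  }
}

// Fixed values so the sketch runs outside a browser.
const fakePage: DefaultsProvider = () => ({
  url: "https://example.com/a",
  referrer: "https://example.com/",
  title: "Page A",
});

const auto = new PageLoadRecorder(fakePage);
auto.initialize({ autoPageLoad: true });

const manual = new PageLoadRecorder(fakePage);
manual.initialize({ autoPageLoad: false });
manual.recordPageLoad({ title: "Custom title" });
```

In this sketch the automatic path and the manual path go through the same function, so both produce an event of the same shape; only the source of the three values differs.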
Attached file Data Review Request (obsolete)

This is a bit of a unique data review since what we are proposing to collect will ultimately be determined by each client.

Should we require a data-review on a per client basis to enable this? I am interested in hearing the data steward point of view for how we manage this once the code is released.

Attachment #9365979 - Flags: data-review?(tlong)

Bruno, typically for things defined internally in Glean like this, we just default to "on" and permanent collection. The way you have this designed it would sort of imply that each product that wants to use this needs to opt-in to enable it and would require a data-review for it. In the interests of not adding that additional friction to using this new page-load metric, I'd say we go ahead and say that this is permanently collected, and assume it will be used by anyone using Glean.js. It's only user interaction data, so no escalation should be necessary to be able to approve this collection.

I think that makes sense.

:dexter & :janerik does anyone have an issue with making this enabled by default? I think if it is going to be enabled by default, that will mean that my documentation updates will need to be very clear about who should be using this and who should use the manual events, based on the project set up.

Flags: needinfo?(jrediger)
Flags: needinfo?(alessio.placitelli)

(In reply to Travis Long [:travis_] from comment #3)

> In the interests of not adding that additional friction to using this new page-load metric, I'd say we go ahead and say that this is permanently collected, and assume it will be used by anyone using Glean.js.

I agree with Travis

(In reply to Bruno Rosa [:brosa] from comment #4)

> I think that makes sense.
>
> :dexter & :janerik does anyone have an issue with making this enabled by default?

I think for data-review purposes we should assume this is going to be default on. We should likely wait to turn it on by default though, as we want to test and validate it first (but that doesn't matter for data-review purposes).

Flags: needinfo?(jrediger)
Flags: needinfo?(alessio.placitelli)
Comment on attachment 9365979 [details]
Data Review Request

> # Request for data collection review form
>
> **All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.**
>
> 1) What questions will you answer with this data?
>
> This allows clients to automatically collect page view data for their applications in a format that is consistent across products. This will make automated dashboarding and building Glean tooling easier.
>
> 2) Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:
>
> This makes using the Glean ecosystem much easier, allowing for simpler data collection.
>
> 3) What alternative methods did you consider to answer these questions? Why were they not sufficient?
>
> The existing alternative method involved clients setting up their own, non-standard page view events, which makes it harder later to create automated dashboards for them.
>
> 4) Can current instrumentation answer these questions?
>
> This is a new instrumentation, there is nothing currently like this.
>
> 5) List all proposed measurements and indicate the category of data collection for each measurement, using the [Firefox data collection categories](https://wiki.mozilla.org/Data_Collection) found on the Mozilla wiki.
>
> **Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.**
>
> <table>
>   <tr>
>     <td>Measurement Description</td>
>     <td>Data Collection Category</td>
>     <td>Tracking Bug #</td>
>   </tr>
>   <tr>
>     <td>Page Load Event</td>
>     <td>Interaction</td>
>     <td>https://bugzilla.mozilla.org/show_bug.cgi?id=1867126</td>
>   </tr>
> </table>
>
> 6) Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.
>
> * Often the Privacy Notice for your product will link to where the documentation is expected to be.
> * Common examples for Mozilla products/services:
>   * If this collection is Telemetry you can state "This collection is documented in its definitions files Histograms.json, Scalars.yaml, and/or Events.yaml and in the Probe Dictionary at https://probes.telemetry.mozilla.org."
>   * If this data is collected using the Glean SDK you can state "This collection is documented in the Glean Dictionary at https://dictionary.telemetry.mozilla.org/"
>   * In some cases, documentation is included in the project's repository.
>
> This will change depending on the client.
>
> 7) How long will this data be collected? Choose one of the following:
>
> I want to permanently monitor this data. brosa
>
> 8) What populations will you measure?
>
> This is determined by the client and the environments they roll this out to.
>
> 9) If this data collection is default on, what is the opt-out mechanism for users?
>
> The data collection will be default on. The client can turn this off via the Glean configuration. The events will not be collected if the application allows a mechanism for turning off telemetry.
>
> 10) Please provide a general description of how you will analyze this data.
>
> This will depend on the client. More generally, we will be able to use this to create more standard automated dashboards for clients.
>
> 11) Where do you intend to share the results of your analysis?
>
> This will depend on what the client wants to do.
>
> 12) Is there a third-party tool (i.e. not Glean or Telemetry) that you are proposing to use for this data collection? If so:
>
> This is built in to Glean itself for the web.

Attachment #9365979 - Attachment is obsolete: true
Attachment #9365979 - Flags: data-review?(tlong)
Attachment #9366166 - Flags: data-review?(tlong)

Comment on attachment 9366166 [details]
Updated data-review attached to bug

Data Review

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, through the metrics.yaml file and the Glean Dictionary.

  2. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, through the data preferences in the integrating application's settings.

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?

Permanent collection to be monitored by brosa and glean-team@mozilla.com

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction data

  5. Is the data collection request for default-on or default-off?

Default-on

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

No

  7. Is the data collection covered by the existing Firefox privacy notice?

Yes

  8. Does the data collection use a third-party collection tool?

No

Result

data-review+

Attachment #9366166 - Flags: data-review?(tlong) → data-review+
Type: defect → task
Priority: -- → P1
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED