Closed Bug 1520322 Opened 6 years ago Closed 6 years ago

Data Review for send.firefox.com using Amplitude

Categories

(Firefox :: Untriaged, enhancement)

enhancement
Not set
normal

RESOLVED FIXED

People

(Reporter: clouserw, Unassigned)

Details

Attachments

(1 file)

Hi. Send.firefox.com would like to move off of GA and on to Amplitude for its general data collection.

Attached is the questionnaire and https://github.com/mozilla/send/pull/1091 is a PR which includes an Amplitude schema. Thanks!

Attachment #9036721 - Flags: review?(rrayborn)

Hi. Is this the right component for this type of bug? Thanks.

Component: Untriaged → Review
Product: Firefox → Data Science

No, data steward collection reviews aren't owned by the data science team; typically these remain in the component matching the place where the data are collected.

Component: Review → Untriaged
Product: Data Science → Firefox

Hi Wil. Which component do you think would fit this bug best? Where is the data being collected from?

Considering that send.firefox.com is a Firefox Test Pilot experiment, I will choose the component (Data Platform and Tools) Datasets: Experiments, since it covers Test Pilot experiments.

I hope this is finally the right component. If not, please give it a more appropriate component, rather than leaving it Untriaged.

Flags: needinfo?(wclouser)

The directions at https://wiki.mozilla.org/Firefox/Data_Collection say to use Firefox::Untriaged. I'll email the list there and ask them to look at it.

Flags: needinfo?(wclouser)

We should just create a new component. Untriaged isn't designed for that.
Please request the creation of a new component here:
https://bugzilla.mozilla.org/enter_bug.cgi?product=bugzilla.mozilla.org&component=Administration

Flags: needinfo?(wclouser)

(In reply to Tim Smith 👨‍🔬 [:tdsmith] from comment #2)

No, data steward collection reviews aren't owned by the data science team; typically these remain in the component matching the place where the data are collected.

This comment describes how we currently process data review bugs: the component is set to the section/area from which the data is being collected. I could not determine this from the information provided in the bug.

Are you suggesting that a new component should be created? What kind of component are you suggesting? A component for data review tasks that isn't owned by the Data Science team? Or a component for the section/area from which the data is being collected in this bug?

Please tolerate my confusion and help me process this bug further. Thanks.

Flags: needinfo?(sledru)

Answering for :sledru.

Given the instructions in https://wiki.mozilla.org/Firefox/Data_Collection, this bug is in the right component since Firefox Send is managed on GitHub.

:rrayborn needs to review the request.

If we want to make a change to the process so that we have a component for this, and I do think this is a little brittle, then I need to talk with the Data Stewards about it. And that's a separate bug.

Let's continue with the review request here, since that's the current procedure. I've reminded :rrayborn to look at this request.

Flags: needinfo?(sledru)

Comment on attachment 9036721
data collection request form

General Notes

Can you clarify if 1) user_id's generation has been previously approved by legal or another steward and 2) whether the path that uses email address as a seed is salted? Given that this is pretty siloed, it's not as high risk, but I'd like to know.

This has excellent documentation. I'm approving on the basis that ^ would be a larger meta conversation.

1) Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes, metrics.md

2) Is there a control mechanism that allows the user to turn the data collection on and off?
Yes

3) If the request is for permanent data collection, is there someone who will monitor the data over time?
Temporary as we rollout and perfect the service.

4) Using the category system of data types (https://wiki.mozilla.org/Firefox/Data_Collection), what collection type of data do the requested measurements fall under?
Category 2 data (service usage, so not category 3)

5) Is the data collection request for default-on or default-off?
Default on

6) Does the instrumentation include the addition of any new identifier (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
No (clarifying how the existing user_id is generated)

7) Is the data collection covered by the existing Firefox privacy notice?
Yes, send privacy policy

8) Does there need to be a check-in in the future to determine whether to renew the data?
Yes, 6 months from now.
Attachment #9036721 - Flags: review?(rrayborn) → review+

AFAIK, our user ID hashing has not had explicit legal approval but we are using standard best practices. CC'ing Danny Coates for comment on the generation of the user ID in this bug.

Based on Rob's R+ flag on the attachment, I believe we are good to go.

Flags: needinfo?(dcoates)

We've not received approval for our user_id generation. Here's the exact code:

async function hashId(id) {
  const d = new Date();
  const month = d.getUTCMonth();
  const year = d.getUTCFullYear();
  const encoded = textEncoder.encode(`${id}:${year}:${month}`);
  const hash = await crypto.subtle.digest('SHA-256', encoded);
  return arrayToB64(new Uint8Array(hash.slice(16)));
}

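id is the uid received from FxA. It is SHA-256 hashed together with the current UTC year and month, and the second half of the digest (16 bytes) is returned after encoding with arrayToB64.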
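
For anyone who wants to try the scheme outside the browser, here is a minimal Node.js sketch of the same idea. It is not the Send code: hashIdSketch is a hypothetical name, it uses Node's synchronous createHash instead of crypto.subtle, it takes the date as a parameter so results are reproducible, and it assumes arrayToB64 produces unpadded URL-safe base64 (that helper is not shown in this bug).

```javascript
const { createHash } = require('crypto');

// Hypothetical stand-in for hashId above: same input string and truncation,
// but synchronous and with the date injected instead of read from the clock.
function hashIdSketch(id, now = new Date()) {
  const month = now.getUTCMonth(); // zero-based: January is 0
  const year = now.getUTCFullYear();
  const digest = createHash('sha256')
    .update(`${id}:${year}:${month}`, 'utf8')
    .digest(); // 32-byte Buffer
  // Keep bytes 16..31, mirroring hash.slice(16) in the original.
  // Assumption: arrayToB64 yields unpadded URL-safe base64.
  return digest.subarray(16).toString('base64url');
}

// Same uid, two different UTC months.
const january = new Date(Date.UTC(2019, 0, 15));
const february = new Date(Date.UTC(2019, 1, 15));
console.log(hashIdSketch('example-uid', january));
console.log(hashIdSketch('example-uid', february));
```

Under these assumptions, a given uid maps to one stable value within a UTC calendar month and to a different value the following month, so the hashed identifier on its own should not link a user's activity across months.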

Flags: needinfo?(dcoates)

Thanks Danny. Based on that, I think we're all done here so closing this bug. Thanks everyone.

Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(wclouser)
Resolution: --- → FIXED