Closed Bug 1646151 Opened 5 years ago Closed 5 years ago

add Pioneer deletion ping

Categories

(Firefox :: Pioneer, task, P1)

task

Tracking

()

RESOLVED FIXED
82 Branch
Tracking Status
firefox81 + fixed
firefox82 --- fixed

People

(Reporter: rhelmer, Assigned: rhelmer)

References

(Blocks 1 open bug)

Details

Attachments

(5 files)

We need to support a standalone Telemetry deletion ping for Pioneer, since we can't combine the Pioneer ID and Telemetry client ID in the same ping.

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: General → Telemetry
Product: Firefox → Toolkit
Priority: -- → P1

Should this be a P1? I thought from recent meetings that we weren't ready to do this in 80.

Flags: needinfo?(mpasciutowood)
Component: Telemetry → Pioneer
Product: Toolkit → Firefox

Mark, can we use a custom ping for Pioneer deletion? I guess we'd want to only include the Pioneer ID and nothing else.

How should I proceed with this? I'm presuming that we register a schema somewhere, who handles this on the Telemetry server side?

Flags: needinfo?(mreid)
Status: NEW → ASSIGNED

Yes, I think a custom ping with just the pioneer ID is the right thing to do here. Anthony, can you help with the schema / server side q's?

Flags: needinfo?(mreid) → needinfo?(amiyaguchi)

I would recommend a deletion-request ping per study. The upside of this is that the schema definition is minimal and the shredding process is isolated to a study project. However, when a client wants to delete their data globally, the must send a ping to each study they were enrolled in.

This would take the following form:

{
  "$comment": "Bug 1646151 - add Pioneer deletion ping",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "properties": {
    "pioneerId": {
      "description": "Pioneer ID to be deleted.",
      "pattern":"^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$",
      "type": "string"
    }
  },
  "required": [
    "pioneerId"
  ],
  "title": "deletion-request",
  "type": "object"
}

Using the debug study as an example, the directory structure would take the following form:

pioneer-debug
├── debug
│   └── debug.1.schema.json
└── deletion-request
    └── deletion-request.1.schema.json

The alternative is to support a single deletion ping. Since all pioneer pings are being rerouted from the telemetry pipeline family into the pioneer pipeline family, it does not make sense to the deletion request to live within the telemetry/deletion-request/deletion-request.4.schema.json ping. It is also desirable for all Pioneer components to be isolated from the rest of the data platform. A new namespace (logically, a "study") would need to be allocated for deletion requests and assigned a keypair for acceptance of pings.

pioneer-deletion
└── deletion-request
    └── deletion-request.1.schema.json

The deletion-request schema is the same. However, shredder will now delete data across all studies and pings. Note that if data were encrypted at rest, and the ingestion sink project did not have access to all of the study keys, then this method of shredding data would not be viable. We trust the ACL's and give centralized automation control over all of the studies, so this is something we can do in practice.


I recommend a deletion-request per study because there is more control over what is deleted. If requirements dictate that deletion-requests are centralized, then it is possible to do so through a separate study. I do not think it's a good idea to mix data between telemetry and pioneer (e.g. telemetry.deletion-request.v4) because it leaks information about the set of pioneer ids into the telemetry datasets (e.g. count(distinct pioneer_id))

Flags: needinfo?(amiyaguchi)
Flags: needinfo?(mpasciutowood)

After some discussion with :rhelmer, we've decided that the one ping per study approach will be sufficient for deletions. This follows the Glean-style of deletion requests (one table per dataset). This means the client will need to keep track of all of the studies it is currently and historically been part of in order to send the requests to the correct places. The ingestion pipeline will need to be modified to accept empty payloads, since the keys for the study may not exist anymore. The PioneerId will get injected into the table since it's part of the envelope metadata, so the schema is just the empty schema.

[Tracking Requested - why for this release]: this feature is required by trust and legal for our full launch in 81; we cannot launch without it.

(In reply to Marnie Pasciuto-Wood [:marnie] from comment #10)

[Tracking Requested - why for this release]:

In the future, filling this out with a reason would be appreciated.

Updated comment 10 to include a reason for 81 tracking.

Pushed by rhelmer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a70c7ee4b579 send deletion ping for each unenrolled Pioneer study r=sfoster,amiyaguchi

(In reply to Narcis Beleuzu [:NarcisB] from comment #15)

Backed out for bc failures on browser_pioneer_ui.js

Backout link: https://hg.mozilla.org/integration/autoland/rev/9b0bd3262a66f20f5fccbb7a25be46eae4ad7cf9
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=313694751&repo=autoland&lineNumber=5272

Thanks... huh, not seeing this locally w/ a debug build. Could be a race condition in the test.

Flags: needinfo?(rhelmer)

(In reply to Robert Helmer [:rhelmer] from comment #16)

(In reply to Narcis Beleuzu [:NarcisB] from comment #15)

Backed out for bc failures on browser_pioneer_ui.js

Backout link: https://hg.mozilla.org/integration/autoland/rev/9b0bd3262a66f20f5fccbb7a25be46eae4ad7cf9
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=313694751&repo=autoland&lineNumber=5272

Thanks... huh, not seeing this locally w/ a debug build. Could be a race condition in the test.

Oh sheesh, I just needed to rebase, and the test code is too explicit about the number of pings.

Pushed by rhelmer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d6e899005724 send deletion ping for each unenrolled Pioneer study r=sfoster,amiyaguchi

OK now there's a race condition (lots of async code, the test isn't waiting until the telemetry is sent before ending). Fixing that now.

Flags: needinfo?(rhelmer)

(In reply to Robert Helmer [:rhelmer] from comment #20)

OK now there's a race condition (lots of async code, the test isn't waiting until the telemetry is sent before ending). Fixing that now.

I ended up turning off TV and filed bug 1660926 to follow-up (I'm pretty confident it's the test and not the code at this point, but I haven't been able to reproduce the problem locally yet even running --verify multiple times).

Try run before attempting re-landing: https://treeherder.mozilla.org/#/jobs?repo=try&revision=be81126a00d7a73e18209010256cad4098905cf1

Pushed by rhelmer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5b5e0d17dbc3 send deletion ping for each unenrolled Pioneer study r=sfoster,amiyaguchi
Regressions: 1661188
Regressions: 1661198
Flags: needinfo?(rhelmer)
See Also: → 1661250

OK. So a bunch of intermittent bugs are popping up for what should be a pretty simple patch, I'm going to go back to the drawing board a little bit and figure out a more minimal patch. I suspect that I'm doing something wrong in the new test code I've added.

Flags: needinfo?(rhelmer)

Sorry for all the intermittent trouble, I've done a try run https://treeherder.mozilla.org/#/jobs?repo=try&revision=55e5ba98ff9e9106f195bdc1b542d906ec0d426f and also have had mach test --verify going all day, hopefully that got the last of these.

Pushed by rhelmer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4ad4deeb92c4 send deletion ping for each unenrolled Pioneer study r=sfoster,amiyaguchi
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch

Comment on attachment 9171448 [details] [review]
Link to GitHub pull-request: https://github.com/mozilla-services/mozilla-pipeline-schemas/pull/597

Beta/Release Uplift Approval Request

  • User impact if declined: We will be unable to uphold our legal obligations to prove that a user has successfully been deleted from Pioneer.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This is a simple telemetry ping, on the about:pioneer page. No risk to Firefox.
  • String changes made/needed: None
Attachment #9171448 - Flags: approval-mozilla-beta?
Attachment #9170824 - Flags: approval-mozilla-beta?
Attachment #9171448 - Flags: approval-mozilla-beta?

Comment on attachment 9170824 [details]
Bug 1646151 - send deletion ping for each unenrolled Pioneer study r?sfoster

Approved for 81.0b4.

Attachment #9170824 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Hey Chris,

this was lacking data-review, so dr?

Attachment #9175513 - Flags: data-review?(chutten)

Comment on attachment 9175513 [details]
pioneer-deletionrequest.md

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Ted Han is responsible.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical.

Is the data collection request for default-on or default-off?

Default off.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes.

Does there need to be a check-in in the future to determine whether to renew the data?

No. This collection is permanent.


Result: datareview+

Attachment #9175513 - Flags: data-review?(chutten) → data-review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: