Closed Bug 1879154 Opened 3 months ago Closed 2 months ago

Telemetry Ping for User Hardware Characteristics

Categories

(Core :: Privacy: Anti-Tracking, enhancement)

RESOLVED FIXED
125 Branch
Tracking Status
firefox125 --- fixed

People

(Reporter: tjr, Assigned: tjr)

References

(Blocks 1 open bug)

Attachments

(4 files)

This bug creates a ping for user hardware characteristics but does not wire it up for submission, nor does it include any actual data.

Once we have an actual metric to put in there we can test that.

Recalling the conversation in Montreal, here are some points that were mentioned:

  1. include_client_id: false
  2. client_id: type: uuid lifetime: user
  3. send_in_pings: [anti-fingerprinting, deletion-request]
  4. delete_after_days
  5. workgroup: anti-fingerprinting

The last two are, I think, meant to restrict access to the dataset on the backend. The first is obviously because we don't want to include the real client_id we send in other pings. I think the second defines a new client_id so we can deduplicate reports and take only the most recent one. The third restricts the metric to my ping, but I am not sure why 'deletion-request' is there. Is it so we can receive my client_id in the deletion-request ping and use it to wipe the data from the database?

Until we have OHTTP working and deployed, this ping will be disabled. I intend to enable it manually on my Nightly instance, and maybe ask some coworkers to opt in as well, so that we can populate the tables with some test data and bootstrap the analysis step on a few records.

Attached file data-review-request.md
Assignee: nobody → tom
Status: NEW → ASSIGNED
Attachment #9378860 - Flags: data-review?(chutten)

(In reply to Tom Ritter [:tjr] from comment #4)

Recalling the conversation in Montreal, here are some points that were mentioned:

  1. include_client_id: false
  2. client_id: type: uuid lifetime: user
  3. send_in_pings: [anti-fingerprinting, deletion-request]
  4. delete_after_days
  5. workgroup: anti-fingerprinting

The last two are, I think, meant to restrict access to the dataset on the backend. The first is obviously because we don't want to include the real client_id we send in other pings. I think the second defines a new client_id so we can deduplicate reports and take only the most recent one. The third restricts the metric to my ping, but I am not sure why 'deletion-request' is there. Is it so we can receive my client_id in the deletion-request ping and use it to wipe the data from the database?

You are correct in your suppositions. (1) You wanted to have the freedom to fingerprint deeply, but not connect it to anything else we might learn. So no Glean client_id. (2) A domain-specific client_id allows for defeating pseudo-replication bias: if you don't want multiple reports from the same profile to overwhelm your sample, you'll want something like a client_id, but not sent in anything else. (3) ...anything else but the "deletion-request" ping because we have a persistent profile identifier in here, so we should offer self-serve deletion. We will need to hook this domain-specific "client_id" up to Shredder (the data deletion backend) so it'll actually do something. That'll be a bug in Data Platform and Tools :: General. (4) and (5) are indeed on the pipeline side. Specifically in repositories.yaml. There we can ensure that (4) we only keep a certain number of days of data and (5) that only anointed individuals are permitted to access it.
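To make point (2) concrete, here is a minimal sketch of how an analysis step might keep only the most recent report per domain-specific client_id, so that one profile submitting many times does not dominate the sample. The `Report` struct, its field names, and `LatestPerClient` are hypothetical and exist only for this illustration; nothing in this bug describes the actual pipeline code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record shape: one row per received ping.
struct Report {
  std::string clientId;      // domain-specific UUID, not the Glean client_id
  std::int64_t submittedAt;  // submission time, e.g. seconds since the epoch
  std::string payload;       // stand-in for the collected characteristics
};

// Keep only the most recent report for each clientId.
// The order of the returned reports is unspecified.
std::vector<Report> LatestPerClient(const std::vector<Report>& reports) {
  std::unordered_map<std::string, Report> latest;
  for (const Report& r : reports) {
    auto it = latest.find(r.clientId);
    if (it == latest.end() || r.submittedAt > it->second.submittedAt) {
      latest[r.clientId] = r;
    }
  }
  std::vector<Report> out;
  out.reserve(latest.size());
  for (const auto& entry : latest) {
    out.push_back(entry.second);
  }
  return out;
}
```

Because the identifier appears only in this ping (and in "deletion-request"), a step like this can deduplicate without linking the reports to any other telemetry.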

Comment on attachment 9378860 [details]
data-review-request.md

PRELIMINARY NOTES:

Although the overall design enables the collection of particular HW and SW characteristics, please confirm that this review covers only the ping and the two metrics described, and that there will be follow-up data collection review requests when those characteristics are instrumented and added to the ping.

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection can be controlled through the product's preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Tom Ritter is responsible.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical.

Is the data collection request for default-on or default-off?

Default on for all channels.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes.

Does the data collection use a third-party collection tool?

No.


Result: datareview+

Flags: needinfo?(tom)
Attachment #9378860 - Flags: data-review?(chutten) → data-review+

(Oh, and while I'm here, did you know about ./mach data-review? You might find it handy)

(3) ...anything else but the "deletion-request" ping because we have a persistent profile identifier in here, so we should offer self-serve deletion. We will need to hook this domain-specific "client_id" up to Shredder (the data deletion backend) so it'll actually do something. That'll be a bug in Data Platform and Tools :: General. (4) and (5) are indeed on the pipeline side. Specifically in repositories.yaml. There we can ensure that (4) we only keep a certain number of days of data and (5) that only anointed individuals are permitted to access it.

Thanks. Once I've got this ping landed and things are populating, I'll file those bugs. (The ping is default-off; I'm just going to turn it on for myself.)

Although the overall design enables the collection of particular HW and SW characteristics, please confirm that this review covers only the ping and the two metrics described, and that there will be follow-up data collection review requests when those characteristics are instrumented and added to the ping.

Yes, that is correct. I'm going back and forth between a lot of patches and a lot of data review requests for items individually, or batching up many unrelated items into fewer data review requests. If you have a preference, LMK. Also, who else would you recommend I flag to spread the load? I know a lot of the data stewards are in different orgs, are there Firefox-specific stewards you'd recommend?

Flags: needinfo?(tom)
Attachment #9378837 - Attachment description: WIP: Bug 1879154: Create the User Hardware Characteristics Ping → Bug 1879154: Create the User Hardware Characteristics Ping r?janerik
Attachment #9378839 - Attachment description: WIP: Bug 1879154: Add a test for the ping → Bug 1879154: Add a test for the ping r?janerik
Attachment #9378841 - Attachment description: WIP: Bug 1879154: Add code to (potentially) submit the ping during idle → Bug 1879154: Add code to (potentially) submit the ping during idle r?timhuang
Pushed by tritter@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/254b5503de48
Create the User Hardware Characteristics Ping r=chutten
https://hg.mozilla.org/integration/autoland/rev/49bd1eba2ef6
Add a test for the ping r=janerik
https://hg.mozilla.org/integration/autoland/rev/0673192f62c4
Add code to (potentially) submit the ping during idle r=timhuang
https://hg.mozilla.org/integration/autoland/rev/62b681084f12
apply code formatting via Lando

Backed out for causing bustage on nsUserCharacteristics.h

[task 2024-03-03T21:07:18.146Z] 21:07:18     INFO -  In file included from /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.cpp:6:
[task 2024-03-03T21:07:18.146Z] 21:07:18    ERROR -  /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.h:14:10: error: unknown type name 'nsresult'
[task 2024-03-03T21:07:18.146Z] 21:07:18     INFO -     14 |   static nsresult PopulateData();
[task 2024-03-03T21:07:18.147Z] 21:07:18     INFO -        |          ^
[task 2024-03-03T21:07:18.147Z] 21:07:18    ERROR -  /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.h:15:10: error: unknown type name 'nsresult'
[task 2024-03-03T21:07:18.148Z] 21:07:18     INFO -     15 |   static nsresult SubmitPing();
[task 2024-03-03T21:07:18.148Z] 21:07:18     INFO -        |          ^
[task 2024-03-03T21:07:18.149Z] 21:07:18    ERROR -  /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.cpp:25:36: error: use of undeclared identifier 'LogLevel'; did you mean 'mozilla::LogLevel'?
[task 2024-03-03T21:07:18.149Z] 21:07:18     INFO -     25 |   MOZ_LOG(gUserCharacteristicsLog, LogLevel::Debug, ("In MaybeSubmitPing()"));
[task 2024-03-03T21:07:18.150Z] 21:07:18     INFO -        |                                    ^~~~~~~~
[task 2024-03-03T21:07:18.150Z] 21:07:18     INFO -        |                                    mozilla::LogLevel
Flags: needinfo?(tom)
Pushed by tritter@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9890a85aacdd
Create the User Hardware Characteristics Ping r=chutten
https://hg.mozilla.org/integration/autoland/rev/2750454a061c
Add a test for the ping r=janerik
https://hg.mozilla.org/integration/autoland/rev/fdc6b48b7944
Add code to (potentially) submit the ping during idle r=timhuang
https://hg.mozilla.org/integration/autoland/rev/499cae53f42d
apply code formatting via Lando

Backed out for causing bustage on nsUserCharacteristics.cpp

[task 2024-03-04T01:37:07.742Z] 01:37:07    ERROR -  /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.cpp:25:36: error: use of undeclared identifier 'LogLevel'; did you mean 'mozilla::LogLevel'?
[task 2024-03-04T01:37:07.742Z] 01:37:07     INFO -     25 |   MOZ_LOG(gUserCharacteristicsLog, LogLevel::Debug, ("In MaybeSubmitPing()"));
[task 2024-03-04T01:37:07.742Z] 01:37:07     INFO -        |                                    ^~~~~~~~
[task 2024-03-04T01:37:07.742Z] 01:37:07     INFO -        |                                    mozilla::LogLevel
[task 2024-03-04T01:37:07.742Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Logging.h:288:41: note: expanded from macro 'MOZ_LOG'
[task 2024-03-04T01:37:07.743Z] 01:37:07     INFO -    288 |       if (MOZ_LOG_TEST(moz_real_module, _level)) {             \
[task 2024-03-04T01:37:07.743Z] 01:37:07     INFO -        |                                         ^
[task 2024-03-04T01:37:07.744Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Logging.h:231:53: note: expanded from macro 'MOZ_LOG_TEST'
[task 2024-03-04T01:37:07.744Z] 01:37:07     INFO -    231 |     MOZ_UNLIKELY(mozilla::detail::log_test(_module, _level))
[task 2024-03-04T01:37:07.744Z] 01:37:07     INFO -        |                                                     ^
[task 2024-03-04T01:37:07.746Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Likely.h:17:48: note: expanded from macro 'MOZ_UNLIKELY'
[task 2024-03-04T01:37:07.746Z] 01:37:07     INFO -     17 | #  define MOZ_UNLIKELY(x) (__builtin_expect(!!(x), 0))
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -        |                                                ^
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Logging.h:62:12: note: 'mozilla::LogLevel' declared here
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -     62 | enum class LogLevel {
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -        |            ^
[task 2024-03-04T01:37:07.747Z] 01:37:07    ERROR -  /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.cpp:25:36: error: use of undeclared identifier 'LogLevel'; did you mean 'mozilla::LogLevel'?
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -     25 |   MOZ_LOG(gUserCharacteristicsLog, LogLevel::Debug, ("In MaybeSubmitPing()"));
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -        |                                    ^~~~~~~~
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -        |                                    mozilla::LogLevel
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Logging.h:289:53: note: expanded from macro 'MOZ_LOG'
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -    289 |         mozilla::detail::log_print(moz_real_module, _level,    \
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -        |                                                     ^
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Logging.h:62:12: note: 'mozilla::LogLevel' declared here
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -     62 | enum class LogLevel {
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -        |            ^
[task 2024-03-04T01:37:07.747Z] 01:37:07    ERROR -  /builds/worker/checkouts/gecko/toolkit/components/resistfingerprinting/nsUserCharacteristics.cpp:47:32: error: use of undeclared identifier 'Preferences'; did you mean 'mozilla::Preferences'?
[task 2024-03-04T01:37:07.747Z] 01:37:07     INFO -     47 |   auto lastSubmissionVersion = Preferences::GetInt(kLastVersionPref, 0);
[task 2024-03-04T01:37:07.748Z] 01:37:07     INFO -        |                                ^~~~~~~~~~~
[task 2024-03-04T01:37:07.748Z] 01:37:07     INFO -        |                                mozilla::Preferences
[task 2024-03-04T01:37:07.748Z] 01:37:07     INFO -  /builds/worker/workspace/obj-build/dist/include/mozilla/Preferences.h:93:7: note: 'mozilla::Preferences' declared here
[task 2024-03-04T01:37:07.748Z] 01:37:07     INFO -     93 | class Preferences final : public nsIPrefService,
[task 2024-03-04T01:37:07.748Z] 01:37:07     INFO -        |       ^

Sorry about that, I made some changes that will hopefully avoid these types of errors.
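For readers unfamiliar with this class of error, the following is a small self-contained sketch, not the actual patch. The logs above show two kinds of failure: a header using a type (`nsresult`) that it never declared or included, and a .cpp file using names that live in namespace `mozilla` (`LogLevel`, `Preferences`) without qualifying them. All names ending in `_sketch` and the helper functions below are hypothetical stand-ins so the example compiles without a Gecko checkout; the bug does not say which of these fixes was actually applied.

```cpp
#include <cassert>
#include <cstdint>  // a header should include what it uses; here, std::uint32_t

// Hypothetical stand-ins, NOT the real Gecko declarations.
enum class nsresult_sketch : std::uint32_t { Ok = 0, Failure = 1 };

namespace mozilla_sketch {
enum class LogLevel { Disabled, Error, Warning, Info, Debug, Verbose };
}  // namespace mozilla_sketch

// Option 1: spell out the namespace at the point of use.
inline mozilla_sketch::LogLevel QualifiedLevel() {
  return mozilla_sketch::LogLevel::Debug;
}

// Option 2: import the one name the file needs with a using-declaration.
using mozilla_sketch::LogLevel;
inline LogLevel ImportedLevel() { return LogLevel::Debug; }

// With the result type declared above, this declaration compiles on its own,
// regardless of what other files happen to be included before it.
struct UserCharacteristicsSketch {
  static nsresult_sketch PopulateData() { return nsresult_sketch::Ok; }
};
```

In the real files, the equivalent would presumably be writing `mozilla::LogLevel` and `mozilla::Preferences` (as the compiler notes suggest) or adding using-declarations, and including the header that declares `nsresult` from nsUserCharacteristics.h.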

Flags: needinfo?(tom)
Pushed by tritter@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2cffaa0cd7f1
Create the User Hardware Characteristics Ping r=chutten
https://hg.mozilla.org/integration/autoland/rev/5656bb97b54b
Add a test for the ping r=janerik
https://hg.mozilla.org/integration/autoland/rev/3fe0e5c670f2
Add code to (potentially) submit the ping during idle r=timhuang
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 125 Branch