Open Bug 1708798 Opened 3 years ago Updated 11 months ago

[meta] Preferences synced to content process unnecessarily

Categories

(Core :: Preferences: Backend, defect, P1)


People

(Reporter: tjr, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: meta, Whiteboard: [spectre-blocker])

Attachments

(2 files, 4 obsolete files)

When a preference is changed, the value is synced to content processes unless it is excluded by a blocklist. This blocklist is pretty thin. Looking at my own preferences, I can see the following user data:

  • pinned tabs in browser.newtabpage.pinned
  • some detailed info in identity.fxaccounts.account.device.name
  • detailed printer information in print.*
  • my username in services.sync.username

I imagine there are other instances of cross-origin data and/or personal data that could be present.
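The kind of prefix-based denylist check described here can be sketched as follows. This is an illustrative stand-in, not Gecko's actual implementation; the list contents are just the examples from this comment, and the function name is hypothetical:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative denylist of pref-name prefixes that should not be synced to
// content processes. Entries are the examples from this bug, not the real list.
static const std::vector<std::string> kDenylistPrefixes = {
    "browser.newtabpage.pinned",
    "identity.fxaccounts.account.device.name",
    "print.",
    "services.sync.username",
};

// Hypothetical helper: returns true if a changed pref may be sent to a
// content process, false if it matches a denylisted prefix.
bool ShouldSyncPref(const std::string& aName) {
  for (const auto& prefix : kDenylistPrefixes) {
    if (aName.compare(0, prefix.size(), prefix) == 0) {
      return false;  // denylisted: keep the value parent-process-only
    }
  }
  return true;
}
```

The thinness of the list is exactly the problem: any pref not matched falls through to `return true` and is synced.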

Component: JavaScript Engine: JIT → Preferences
Product: Core → Toolkit
Severity: -- → N/A
Priority: -- → P3
Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Assignee: jfkthame → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(jfkthame)

We discussed this and consider it a blocker for disabling Spectre mitigations.

While changing to a safelist approach would be nicest, at a minimum we should add the most revealing preferences to the blocklist.

Whiteboard: [spectre-blocker]

Some other prefs I would add to this list:

  • download file path location in browser.download.lastDir
  • last open path in browser.open.lastDir
  • debugger breakpoints with source addresses in devtools.debugger.pending-selected-location
  • pinned tabs in browser.newtabpage.pinned
  • some detailed info in identity.fxaccounts.account.device.name
  • detailed printer information in print.*
  • my username in services.sync.username

@tjr, what severity would you put this at? And is there a specific timeline you're looking to have this fixed in?

Flags: needinfo?(tom)

(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #7)

@tjr, what severity would you put this at? And is there a specific timeline you're looking to have this fixed in?

I suppose S3, because they don't impact current functionality. But as a blocker for the Spectre mitigation removal, we are planning (in conversations now) to ask teams to prioritize these issues, although I don't know what timeframe expectation will be attached at the moment. It's not a "drop what you're doing now" timeframe, but I'd be surprised if it were later than H1 next year.

Flags: needinfo?(tom)

Thanks

Severity: N/A → S3

(In reply to Tom Ritter [:tjr] (ni? for response to CVE/sec-approval/advisories/etc) from comment #0)

When a preference is changed the value will be synced to content processes unless it is excluded by a blocklist. This blocklist is pretty thin.

Note that this blocklist only applies to preference changes. The initial value (which will generally still contain sensitive data) is still always sent.

Maybe we even want to switch to an allowlist approach for prefs with values that are strings? It feels like the most sensitive stuff will be a string. Though maybe some numbers, too.

(In reply to Kris Maglione [:kmag] from comment #10)

(In reply to Tom Ritter [:tjr] (ni? for response to CVE/sec-approval/advisories/etc) from comment #0)

When a preference is changed the value will be synced to content processes unless it is excluded by a blocklist. This blocklist is pretty thin.

Note that this blocklist only applies to preference changes.
The initial value (which will generally still contain sensitive data) is still always sent.

Tom, is this situation ("The initial value (which will generally still contain sensitive data) is still always sent") something we want to tackle in this blocker, too?

Flags: needinfo?(tom)
Attached file content-prefs.txt

I do think we need to prevent prefs from entering the content process at all (not just when changed) and the best solution would be an allowlist.

Attached is a runtime log of every string pref that was accessed by a content process during a full test suite run (on Linux anyway). This list might be a starting point for an allowlist.

That said, the list still includes print_printer, which contained the name of my printer from the OS; that would be good to remove.

The typical approach for building an allowlist like this would be:

  1. Add a function call in all the 'Get Preferences' codepaths that check the preference name in the content process.
  2. In that function, if the preference name is not in the allowlist, tell the parent process to fire off a Telemetry Event. I would record the preference name in a pref (haha) so we don't submit a million events for the same pref name; we only report each one once.
  3. Add a filter to make sure we don't report user data via Event Telemetry. This is undoubtedly the hardest part. See [0]
  4. Figure out what to do about userscript hacks.
  5. Roll this out as Nightly-only.
  6. Collect Event Telemetry, improve the allowlist.
  7. Iterate until you don't get very much telemetry anymore.
  8. Roll it out to Beta, repeat.
  9. Roll it out to Release.
  10. Switch on the allowlist you develop in Thing 1 below.

Simultaneously you'd be doing two things:

Thing 1:

  1. Build in the infrastructure to not send certain preferences to the Content Process in any capacity (either initially or as updates).
  2. Apply this infrastructure to preferences that are known to be safe to withhold (i.e. those already in the current blocklist).

Thing 2:

  1. Take the preferences currently used in the content process that are problematic (e.g. print_printer) and figure out why they're needed, and if we can refactor things so they are not.
  2. Review the preferences that come in via Event Telemetry to see if more should be treated the same way.

[0] Not collecting user data is the hardest part. There are known quantities like print.printer_* that we don't want to collect because the preference name contains the user data. It's the unknown quantities that would wind up collecting user data we don't want. These could be prefs users set themselves (but if they did that, it seems impossible they'd be accessed in the content process unless they're using userscript hacks.) Or they could be places in our code we're dynamically constructing preference names that we couldn't find or don't know about.
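Step 2's once-per-pref-name reporting could be sketched as below. The function name and the in-memory dedup set are hypothetical; as the comment above suggests, the real thing would persist the already-reported names in a pref:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical sketch of step 2: report each non-allowlisted pref name at
// most once. In practice the "already reported" set would be persisted in a
// pref; here it is an in-memory set for illustration.
static std::set<std::string> gReported;

// Returns true if a Telemetry Event should be fired for this pref name.
bool MaybeReportUnknownPref(const std::string& aName,
                            const std::set<std::string>& aAllowlist) {
  if (aAllowlist.count(aName)) {
    return false;  // known-good pref, nothing to report
  }
  // set::insert returns {iterator, bool}; the bool is true only on the
  // first insertion, so repeated accesses of the same pref report once.
  return gReported.insert(aName).second;
}
```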

Flags: needinfo?(tom)
Depends on: 1749598
Assignee: nobody → dminor
Depends on: 1751228
Depends on: 1751936

The new plan is:

  1. We will expand the denylist to cover the point where preferences are initially sent to the content process. That's Bug 1751936 and dminor will take it.
  2. We will denylist any pref that has a dynamic name or was created by a user (and not Mozilla). We'll check for this using HasDefaultValue.
  3. We will review all the string preferences we can find and consider them for addition to the denylist if needed.
  4. If a denylisted pref is accessed from the content process we will DIAGNOSTIC_ASSERT; this will be controllable with a pref.
  5. We'll do a try run, look for test breakage, and make decisions about crashes on a pref-by-pref basis, most likely asking teams to refactor code (but with uncertain priority at the moment). Bug 1749598 is an example of one of these.
  6. Once local testing and try are clean, we'll land in Nightly and monitor crashes.

I'll be doing 2-6; but the work items from #5 will go to individual teams.
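Steps 2 and 4 of the plan can be modeled in miniature. The `Pref` struct and `AllowedInContentProcess` are hypothetical stand-ins; `HasDefaultValue` mirrors the check named in the plan (a pref with no Mozilla-shipped default value is assumed to be dynamically named or user-created, and therefore denylisted):

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of a pref entry. Only Mozilla-shipped prefs carry a
// default value; user-created or dynamically named prefs have none.
struct Pref {
  std::string name;
  std::optional<std::string> defaultValue;  // set only for shipped prefs
  std::optional<std::string> userValue;

  bool HasDefaultValue() const { return defaultValue.has_value(); }
};

// Step 2 in miniature: a pref without a default is denylisted for content
// processes. In step 4, accessing a denylisted pref from a content process
// would trip a DIAGNOSTIC_ASSERT; here that outcome is modeled as `false`.
bool AllowedInContentProcess(const Pref& aPref) {
  return aPref.HasDefaultValue();
}
```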

Depends on: 1752331
Depends on: 1752332
Depends on: 1752333

Currently we filter preferences sent to the content process when they are updated,
but do not apply this filter to the initial serialization of preferences. This
adds a filter function as an argument to SerializePreferences that allows the
caller to control which preferences are serialized.
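The patch described above can be sketched with simplified, stand-in signatures (the real SerializePreferences operates on pref structures and a shared-memory buffer, not bare name strings):

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in for the aShouldSerializeFn predicate added by this patch.
using ShouldSerializeFn = std::function<bool(const std::string&)>;

// Simplified sketch: the initial serialization now consults the same kind of
// filter the update path uses, so denylisted prefs never reach the content
// process in the first place.
std::vector<std::string> SerializePreferences(
    const std::vector<std::string>& aPrefNames,
    const ShouldSerializeFn& aShouldSerializeFn) {
  std::vector<std::string> out;
  for (const auto& name : aPrefNames) {
    if (aShouldSerializeFn(name)) {
      out.push_back(name);  // only serialize prefs the filter admits
    }
  }
  return out;
}
```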

Attachment #9261570 - Attachment description: WIP: Bug 1708798 - Add aShouldSerializeFn as argument to SerializePreferences → Bug 1708798 - Add aShouldSerializeFn as argument to SerializePreferences; r=tjr!
Attachment #9261571 - Attachment description: WIP: Bug 1708798 - Add mShouldSerializeFn to SharedPreferenceSerializer → Bug 1708798 - Add mShouldSerializeFn to SharedPreferenceSerializer; r=tjr!
Attachment #9261572 - Attachment description: WIP: Bug 1708798 - Update users of SharedPreferenceSerializer to pass ShouldSyncPreference → Bug 1708798 - Update users of SharedPreferenceSerializer to pass ShouldSyncPreference; r=tjr!

Comment on attachment 9261570 [details]
Bug 1708798 - Add aShouldSerializeFn as argument to SerializePreferences; r=tjr!

Revision D137474 was moved to bug 1751936. Setting attachment 9261570 [details] to obsolete.

Attachment #9261570 - Attachment is obsolete: true

Comment on attachment 9261571 [details]
Bug 1708798 - Add mShouldSerializeFn to SharedPreferenceSerializer; r=tjr!

Revision D137475 was moved to bug 1751936. Setting attachment 9261571 [details] to obsolete.

Attachment #9261571 - Attachment is obsolete: true

Comment on attachment 9261572 [details]
Bug 1708798 - Update users of SharedPreferenceSerializer to pass ShouldSyncPreference; r=tjr!

Revision D137476 was moved to bug 1751936. Setting attachment 9261572 [details] to obsolete.

Attachment #9261572 - Attachment is obsolete: true
Attached file pref-datareview.txt
Attachment #9263775 - Flags: data-review?(chutten)
Component: Preferences → Preferences: Backend
Product: Toolkit → Core

Comment on attachment 9263775 [details]
pref-datareview.txt

PRELIMINARY NOTES:

:tjr, could you confirm that this annotation is in Crash Reports only (not "crash" pings)? And that Crash Reports remain opt-in? This review is predicated on that. Otherwise we'll kick this up to Legal who I think will be quite satisfied by your privacy risk mitigations.

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Tom Ritter is responsible.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical -- but with Cat3+ risks (how real these risks are will be determined by the initial rollout).

Is the data collection request for default-on or default-off?

Default off. Users must opt-in to crash report submission.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes.

Does the data collection use a third-party collection tool?

No.


Result: datareview+

Flags: needinfo?(tom)
Attachment #9263775 - Flags: data-review?(chutten) → data-review+

Gabriele, do you know if the string in MOZ_CRASH_UNSAFE_PRINTF is included in a crash ping, as well as a crash report?

Flags: needinfo?(tom) → needinfo?(gsvelto)

Yes, the string is stored in the MozCrashReason annotation and sent in the crash ping.

Flags: needinfo?(gsvelto)

Hi Kris,
I'm setting the priority to P1 to indicate the impact, importance, and urgency of this bug, as it's a blocker for removing JIT mitigations. Please feel free to let me know if you have any questions.

Priority: P3 → P1

So the initial part of the plan where it's default-off and nightly-only (guarded by pref) is the only thing covered by the data-review+, then. You can go ahead with that while we bring in Legal.

Speaking of whom... Hey, :mfeldman, could you run a sensitive data review on this collection? See the attached review. :tjr's your contact for more context.

Flags: needinfo?(mfeldman)

Resetting assignee since my work landed in Bug 1751936.

Assignee: dminor → nobody

Reviewed and discussed the new plan with Tom Ritter. It looks good from the Privacy/Legal perspective.

One thing I'd like to better understand is the sensitive data that is being collected. It's mentioned in the ticket above that it's "Category 1, Technical -- but with Cat3+ risks" type of data. It would help to be more specific (e.g. username, email, address, etc.).

Another related question: are these data personal data, meaning can we identify a person from them? For example, just by seeing the printer name, can we link it to a person? If not, then we don't have a concern from the privacy perspective; but if the sensitive data here can be linked to a person, then the plan Tom Ritter proposed is a good privacy/security mitigation measure.

Tom, in case comment 28 fell off your radar a bit.

Flags: needinfo?(tom)

(In reply to Hsin-Yi Tsai [:hsinyi] from comment #29)

Tom, in case comment 28 fell off your radar a bit.

Oh, I read the comments again. It looks like it's more of a discussion summary than open questions?

I think so, yes. When I spoke with Al I explained that we have pretty good confidence we won't collect anything, based both on the unlikelihood of anything triggering the crash and on our incremental opt-in enrollment before an opt-out enrollment - but ultimately the potential is there, even if it's unlikely.

Flags: needinfo?(tom)
Flags: needinfo?(mfeldman)
Keywords: meta
Summary: Preferences synced to content process unnecessarily → [meta] Preferences synced to content process unnecessarily
Depends on: 1766866
Depends on: 1778747
No longer depends on: 1778747
Depends on: 1780403
No longer depends on: 1780403
Depends on: 1780575
Depends on: 1782544
Depends on: 1791814
Depends on: 1796196
Depends on: 1801711
Depends on: 1808783
Depends on: 1811294
Depends on: 1815893
Blocks: 1819714
Whiteboard: [spectre-blocker] → [spectre-blocker][sp3]
Whiteboard: [spectre-blocker][sp3] → [spectre-blocker]
Depends on: 1840851