Closed Bug 1272294 Opened 4 years ago Closed 4 years ago

_getSearchEngineId's fallback to the engine name can leak user data

Categories

(Firefox :: Search, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 49
Tracking Status
firefox49 --- fixed

People

(Reporter: florian, Assigned: Gijs)

References

Details

Attachments

(1 file)

BrowserSearch.recordSearchInTelemetry calls _getSearchEngineId (https://hg.mozilla.org/mozilla-central/annotate/3461f3cae784/browser/base/content/browser.js#l3642) which returns the engine's identifier (null for user-installed engines) and then fallbacks to the engine's name if the identifier is null.

This would likely be fine if this data was only collected for users who opted-in to Telemetry, but bug 1160220 made the SEARCH_COUNTS histogram opt-out, meaning that we are now (I assume) unintentionally collecting the non-default engine names of release users.
Assignee: nobody → gijskruitbosch+bugs
Blocks: 1089670
Status: NEW → ASSIGNED
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

https://reviewboard.mozilla.org/r/52221/#review49175

Looks reasonable, but:
- this may be covered by tests, so I would suggest doing a try push with at least bc tests
- I hope nobody was using this data

::: browser/base/content/browser.js:3647
(Diff revision 1)
>    _getSearchEngineId: function (engine) {
>      if (!engine) {
>        return "other";
>      }
>  
>      if (engine.identifier) {

We can further simplify by doing:
  if (engine && engine.identifier) \{
    return engine.identifier;
  \}
 
  return "other";
(then the first early return can disappear too)
Attachment #8751807 - Flags: review?(florian) → review+
Ilana, can you confirm we don't need search engine names from non-default providers?
Flags: needinfo?(isegall)
It's pretty valuable information. Consider the case of hijacked browsers - we'd like to target those who have had their engine changed to (avg-toolbar-search-wetakeyourinfo, whatever) to see if it's a valid selection.

I propose 2 solutions, since the data-privacy one is completely valid.

1. Have a separate variable, nondefaultengine, that only records for those with extended telemetry and a non built-in.
2. (less preferable) Record that the engine is one of a set of known top hijackers/alternatives (we could determine the list every few months) and record it if it falls into one of these, ensuring we don't get any data-leaking search domains.

Thoughts?
Flags: needinfo?(isegall)
(In reply to Ilana from comment #5)
> It's pretty valuable information. Consider the case of hijacked browsers -
> we'd like to target those who have had their engine changed to
> (avg-toolbar-search-wetakeyourinfo, whatever) to see if it's a valid
> selection.

Just to be sure there's no misunderstanding here: this records the search volume for each engine.

We have a different probe to record the current default engine, that we added in bug 1164159, mostly to help us investigate hijacking situations.
Ah, sorry about that.
I'd say that there's still some value here, although it is lower for hijacking than the default setting. There's still an application for potential search partner expansion and for knowing the # of engines used (to correlate with satisfaction, etc). I've cc'd gregglind and rweiss, who may have other applications in mind. Otherwise, this is far less useful than knowing the default engine, although it seems to be at the same level of privacy invasion.
Flags: needinfo?(rweiss)
Flags: needinfo?(glind)
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52221/diff/1-2/
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52221/diff/2-3/
Attachment #8751807 - Attachment description: MozReview Request: Bug 1272294 - don't collect non-default search engine names, r?florian → MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r?florian
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

This is a little hacky because it reuses the same histogram. For non-opt-in-users this will produce "other".

I don't think the faff of creating an extra histogram and dealing with all the test fallout is worth it if all we're interested in is/are absolute/relative frequency of other engines. We presumably know how many people opt in to extended telemetry and can use that to judge volumes accordingly.
Attachment #8751807 - Flags: review+ → review?(florian)
(In reply to :Gijs Kruitbosch from comment #11)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=20211baffa40

Gah, this was pushed by autoland using an inbound head, and so now it doesn't have the changes from bug 1197310 and so the tests time out. :-\
(In reply to :Gijs Kruitbosch from comment #12)
> (In reply to :Gijs Kruitbosch from comment #11)
> > https://treeherder.mozilla.org/#/jobs?repo=try&revision=20211baffa40
> 
> Gah, this was pushed by autoland using an inbound head, and so now it
> doesn't have the changes from bug 1197310 and so the tests time out. :-\

Hm, no, this should have had those csets. There must be something wrong with my patch (it does work on OSX, which is where I tested it...).
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52221/diff/3-4/
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c7722ee23a86

This time tested locally on my Linux VM. Seems like "true" as the pref value breaks on Linux but not on OS X? Well, anyway, not using a string makes more sense, so let's see if try is happy now.
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

https://reviewboard.mozilla.org/r/52221/#review50620

::: browser/components/search/test/browser_healthreport.js:73
(Diff revision 4)
>          break;
>      }
>    }
>  
> +  SpecialPowers.pushPrefEnv({set: [["toolkit.telemetry.enabled", true]]}).then(function() {
> -  Services.obs.addObserver(observer, "browser-search-engine-modified", false);
> +    Services.obs.addObserver(observer, "browser-search-engine-modified", false);

Why is the addObserver call made after the pref change? It looks to me like only the addEngine call needs telemetry enabled.
Attachment #8751807 - Flags: review?(florian) → review+
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

https://reviewboard.mozilla.org/r/52221/#review50626

::: browser/components/search/test/browser_healthreport.js:73
(Diff revision 4)
>          break;
>      }
>    }
>  
> +  SpecialPowers.pushPrefEnv({set: [["toolkit.telemetry.enabled", true]]}).then(function() {
> -  Services.obs.addObserver(observer, "browser-search-engine-modified", false);
> +    Services.obs.addObserver(observer, "browser-search-engine-modified", false);

True, I was just being lazy in how I dealt with this. Fixed now.
Attachment #8751807 - Flags: review+
Comment on attachment 8751807 [details]
MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/52221/diff/4-5/
Attachment #8751807 - Attachment description: MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r?florian → MozReview Request: Bug 1272294 - only collect non-default search engine names if the user opted into telemetry, r=florian
https://hg.mozilla.org/mozilla-central/rev/e68856caffda
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 49
Flags: needinfo?(rweiss)
Flags: needinfo?(glind)
You need to log in before you can comment on or make changes to this bug.