Open Bug 1976220 Opened 2 months ago Updated 1 month ago

Consider locale coalescing

Categories

(Core :: Privacy: Anti-Tracking, enhancement, P3)

ASSIGNED

People

(Reporter: fkilic, Assigned: fkilic)

References

(Blocks 2 open bugs)

Details

Attachments

(4 files)

We can try to coalesce locales the users are using to reduce fingerprintability.

Duplicate entry for LOCALE("ja", ...), which may cause ambiguity or undefined behavior in lookup.

Before looking at the current patches, I was going to say that Japanese on Mac differs from other desktops. Also, if this is behind RFP, we do not want to upset Tor Browser's setup, which looks similar: i.e., we enforce the default languages (from somewhere? pierov?) and make the locale match the language.

ni'ing pierov to put it on his radar


FYI: this is from over a year ago. I was checking what the headers and navigator.languages returned in the then-current set of Tor Browser's limited app languages (TZP actually checks this as part of health); this is where I picked up on Japanese on Macs. There's also something about GeckoView being different, but I have not dived down that rabbit hole (e.g. see Bug 1961578). Just be aware that the current approach may end up creating never-before-seen fingerprints, e.g. returning ja on macOS is not a benign value.

IIUIC you're not changing languages at all, right? Just using them to determine the locale? Be aware that this will alter users' formatting etc. (Intl takes a locale, and I have FPed the entropy out of various locale differences), even within variants (e.g. en-GB vs en-US, de-DE vs de-AT, or fr-FR vs fr-CA). The differences can be subtle or outright blatant.

And IIUIC this would also remove the need for spoof_english when in en-*?

          ar,en-US,en | ar,en-US;q=0.7,en;q=0.3
          ca,en-US,en | ca,en-US;q=0.7,en;q=0.3
       cs,sk,en-US,en | cs,sk;q=0.8,en-US;q=0.5,en;q=0.3
          da,en-US,en | da,en-US;q=0.7,en;q=0.3
          de,en-US,en | de,en-US;q=0.7,en;q=0.3
    el-GR,el,en-US,en | el-GR,el;q=0.8,en-US;q=0.5,en;q=0.3
             en-US,en | en-US,en;q=0.5
    es-ES,es,en-US,en | es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
    fa-IR,fa,en-US,en | fa-IR,fa;q=0.8,en-US;q=0.5,en;q=0.3
    fi-FI,fi,en-US,en | fi-FI,fi;q=0.8,en-US;q=0.5,en;q=0.3
    fr,fr-FR,en-US,en | fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3
 ga-IE,ga,en-IE,en-GB,en-US,en
            | ga-IE,ga;q=0.8,en-IE;q=0.7,en-GB;q=0.5,en-US;q=0.3,en;q=0.2
    he,he-IL,en-US,en | he,he-IL;q=0.8,en-US;q=0.5,en;q=0.3
    hu-HU,hu,en-US,en | hu-HU,hu;q=0.8,en-US;q=0.5,en;q=0.3
          id,en-US,en | id,en-US;q=0.7,en;q=0.3
          is,en-US,en | is,en-US;q=0.7,en;q=0.3
    it-IT,it,en-US,en | it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
          ja,en-US,en | ja,en-US;q=0.7,en;q=0.3
    ka-GE,ka,en-US,en | ka-GE,ka;q=0.8,en-US;q=0.5,en;q=0.3
    ko-KR,ko,en-US,en | ko-KR,ko;q=0.8,en-US;q=0.5,en;q=0.3
    lt,en-US,en,ru,pl | lt,en-US;q=0.8,en;q=0.6,ru;q=0.4,pl;q=0.2
    mk-MK,mk,en-US,en | mk-MK,mk;q=0.8,en-US;q=0.5,en;q=0.3
          ms,en-US,en | ms,en-US;q=0.7,en;q=0.3
          my,en-GB,en | my,en-GB;q=0.7,en;q=0.3
 nb-NO,nb,no-NO,no,nn-NO,nn,en-US,en
            | nb-NO,nb;q=0.9,no-NO;q=0.8,no;q=0.6,nn-NO;q=0.5,nn;q=0.4,en-US;q=0.3,en;q=0.1
          nl,en-US,en | nl,en-US;q=0.7,en;q=0.3
          pl,en-US,en | pl,en-US;q=0.7,en;q=0.3
    pt-BR,pt,en-US,en | pt-BR,pt;q=0.8,en-US;q=0.5,en;q=0.3
 ro-RO,ro,en-US,en-GB,en
            | ro-RO,ro;q=0.8,en-US;q=0.6,en-GB;q=0.4,en;q=0.2
    ru-RU,ru,en-US,en | ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3
    sq,sq-AL,en-US,en | sq,sq-AL;q=0.8,en-US;q=0.5,en;q=0.3
    sv-SE,sv,en-US,en | sv-SE,sv;q=0.8,en-US;q=0.5,en;q=0.3
          th,en-US,en | th,en-US;q=0.7,en;q=0.3
    tr-TR,tr,en-US,en | tr-TR,tr;q=0.8,en-US;q=0.5,en;q=0.3
    uk-UA,uk,en-US,en | uk-UA,uk;q=0.8,en-US;q=0.5,en;q=0.3
    vi-VN,vi,en-US,en | vi-VN,vi;q=0.8,en-US;q=0.5,en;q=0.3
 zh-CN,zh,zh-TW,zh-HK,en-US,en
            | zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
    zh-TW,zh,en-US,en | zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3

Mac: identical except the locale is ja-JP not ja
- not a FPing issue since this is OS specific
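
The q-values in the right-hand column follow a regular pattern, which the sketch below reproduces. The formula (q = (n - i) / n for the i-th of n languages, rounded half up to one decimal) is inferred from the rows above rather than taken from Gecko's source, and the function name `accept_language_header` is hypothetical.

```python
def accept_language_header(languages):
    """Build an Accept-Language value from an ordered list of language tags.

    Assumption (inferred from the table, not from Gecko source): the tag at
    0-based position i of n tags gets q = (n - i) / n, rounded half up to one
    decimal. Only intended for short lists (n < 10), where one decimal is enough.
    """
    n = len(languages)
    parts = []
    for i, tag in enumerate(languages):
        if i == 0:
            parts.append(tag)  # the first tag carries an implicit q=1
            continue
        # Integer form of round_half_up(10 * (n - i) / n), avoiding float error.
        tenths = (20 * (n - i) + n) // (2 * n)
        parts.append(f"{tag};q=0.{tenths}")
    return ",".join(parts)


print(accept_language_header(["cs", "sk", "en-US", "en"]))
# cs,sk;q=0.8,en-US;q=0.5,en;q=0.3
```

The same function also yields the longer rows, such as the eight-language nb-NO entry, which is a reasonable sanity check on the inferred rounding rule.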

Flags: needinfo?(pierov)

FWIW: see also Bug 1968951 and Bug 1975931 (I knew Intl in Gecko returned the same for all constructors except collation, but I didn't know of that bug because I don't build engines).

> IIUIC you're not changing languages at all, right

I hadn't finished looking at the patches, but I'm still a bit confused about the game plan.

In Tor Browser we're forcing intl.accept_languages to its default value (the strange thing that loads from chrome://locale/...).
I'd be very happy if we could drop that and stop changing the pref.
Same for spoof English (i.e., remove the code from RFPHelper.sys.mjs and, when spoof English is enabled, skip intl.accept_languages; it'd solve the testing problem as explained to me in https://bugzilla.mozilla.org/show_bug.cgi?id=1963545#c9).
If you do it, you could probably close Bug 1869821, too.

Flags: needinfo?(pierov)
See Also: → 1869821

The goal here is basically to force everyone into one of the chrome://global/locale/intl.properties:intl.accept_languages values. For example, if you have "en, fr" in your intl.accept_languages pref, we'll return the first match, which is "en-US, en"; "fr" will be omitted. If you had "fr, kab", then we would match "fr, fr-fr, en-us, en" because that's what we have on the list.

So, if Tor is forcing intl.accept_languages to chrome://global/locale/intl.properties:intl.accept_languages, then there will be no changes. It will only affect users if they have non-default locales.
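
A minimal sketch of the coalescing described above, not the code in the attached patches. The matching rule (compare the base language of each user tag, in order, with the base language of the first entry of each default list) is an assumption that happens to reproduce the two examples. `DEFAULT_LISTS` is an illustrative subset, and `coalesce` and `base_language` are hypothetical names.

```python
from typing import Optional, Sequence

# Illustrative subset only; the real values live in
# chrome://global/locale/intl.properties and are not reproduced here.
DEFAULT_LISTS = ("en-US, en", "fr, fr-fr, en-us, en", "ja, en-US, en")


def base_language(tag: str) -> str:
    """Return the lowercase primary subtag, e.g. 'en-US' -> 'en'."""
    return tag.strip().split("-")[0].lower()


def coalesce(accept_languages: str,
             defaults: Sequence[str] = DEFAULT_LISTS) -> Optional[str]:
    """Map a user's accept-languages string onto one of the default lists.

    Assumed rule: walk the user's tags in order and return the first default
    list whose leading entry shares a base language with the tag.
    Returns None when nothing matches (the fallback is not specified here).
    """
    for tag in accept_languages.split(","):
        if not tag.strip():
            continue
        wanted = base_language(tag)
        for candidate in defaults:
            if base_language(candidate.split(",")[0]) == wanted:
                return candidate
    return None


print(coalesce("en, fr"))   # en-US, en
print(coalesce("fr, kab"))  # fr, fr-fr, en-us, en
```

Under this rule a value that is already one of the defaults maps to itself, which matches the point that nothing changes for Tor Browser.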

I do agree with not modifying intl.accept_languages, but it is used in lots of places. Do we have a bug on file? At least we can put it on a bug and pick it up one day. (Edit: couldn't find one, so I created bug 1976742.)

Blocks: 1976742

Note that with bug 1760013 we're looking to move the intl.accept_languages default values from localiser-controlled intl.properties files to within code. This is likely to change some of the details like the capitalization of en-US vs. en-us in some values, drop the ja-JP-mac specialization, and it's likely to drop the explicit enumeration of locales for which the preferred order would be $locale, en-US, en.

On a related note, it seems a bit arbitrary to limit the sets of allowed content locales to those for which we provide a localised browser UI. Would it not be acceptable to offer a fixed $locale, en-US, en list for all locales for which we don't explicitly set some other default?

See Also: → 1760013

> On a related note, it seems a bit arbitrary to limit the sets of allowed content locales to those for which we provide a localised browser UI.

I mean, we can add locales to or remove them from the list. We know that a low percentage of users use non-default locale settings. Do note that the query checks whether the locale is a 100% match with the values in the translations: if you have "en, fr", you would be in the "non-default locale settings" group because we don't have a 100% match, despite us having both en and fr. So the percentage is probably even lower.
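
To illustrate the "100% match" point, here is a hedged sketch of an exact-match check. The actual query is not shown in this bug; normalising whitespace and case before comparing is an assumption, and `uses_default_list` is a hypothetical name.

```python
def normalize(value):
    """Split a comma-separated language list into trimmed, lowercase tags."""
    return [tag.strip().lower() for tag in value.split(",") if tag.strip()]


def uses_default_list(accept_languages, defaults):
    """True only if the whole list equals one of the defaults, in the same order."""
    wanted = normalize(accept_languages)
    return any(wanted == normalize(default) for default in defaults)


# Illustrative subset of default values, not the full shipped list.
defaults = ["en-US, en", "fr, fr-FR, en-US, en"]

print(uses_default_list("en-US,en", defaults))  # True
print(uses_default_list("en, fr", defaults))    # False: both languages exist, but no list matches exactly
```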

For

> Would it not be acceptable to offer a fixed $locale, en-US, en list for all locales for which we don't explicitly set some other default?

I'll let Tom and others decide.
