
Don't reveal UI language to site/page -- Change navigator.language to use Accept-Language instead of the UI language

RESOLVED FIXED in mozilla5

Product: Core
Component: DOM
Priority: P4
Severity: normal
Status: RESOLVED FIXED
Opened: 17 years ago
Last modified: 6 years ago
Reporter: BenB
Assignee: BenB
Version: Trunk
Target Milestone: mozilla5
Keywords: dev-doc-complete, privacy
Firefox Tracking Flags: firefox5-, blocking2.0 -
Whiteboard: [defective-privacy]
Attachments: 1 attachment, 5 obsolete attachments

(Assignee)

Description

17 years ago
Reproduce:
1. Visit <http://gemal.dk/browserspy/language.html>.

Actual result:
Your UI language and locale (e.g. en-US) is displayed.

Expected result:
Neither UI nor OS language or locale are revealed to page or site.

Additional Comments:
Compare the HTTP 1.1 spec:
<quote src="http://www.ietf.org/rfc/rfc2616.txt">
15.1.4 Privacy Issues Connected to Accept Headers

   Accept request-headers can reveal information about the user to all
   servers which are accessed. The Accept-Language header in particular
   can reveal information the user would consider to be of a private
   nature, because the understanding of particular languages is often
   strongly correlated to the membership of a particular ethnic group.
   User agents which offer the option to configure the contents of an
   Accept-Language header to be sent in every request are strongly
   encouraged to let the configuration process include a message which
   makes the user aware of the loss of privacy involved.

   An approach that limits the loss of privacy would be for a user agent
   to omit the sending of Accept-Language headers by default, and to ask
   the user whether or not to start sending Accept-Language headers to a
   server if it detects, by looking for any Vary response-header fields
   generated by the server, that such sending could improve the quality
   of service.

   Elaborate user-customized accept header fields sent in every request,
   in particular if these include quality values, can be used by servers
   as relatively reliable and long-lived user identifiers. Such user
   identifiers would allow content providers to do click-trail tracking,
   and would allow collaborating content providers to match cross-server
   click-trails or form submissions of individual users. Note that for
   many users not behind a proxy, the network address of the host
   running the user agent will also serve as a long-lived user
   identifier. In environments where proxies are used to enhance
   privacy, user agents ought to be conservative in offering accept
   header configuration options to end users. As an extreme privacy
   measure, proxies could filter the accept headers in relayed requests.
   General purpose user agents which provide a high degree of header
   configurability SHOULD warn users about the loss of privacy which can
   be involved.
</quote>

While we don't send this info as an HTTP header, we offer it as a JS property. See
the source of the testcase mentioned above for details.

Comment 1

17 years ago
Isn't the language info also in the useragent string?  According to the browser
sniffer at http://www.ufaq.org/ it is.

User Agent: Mozilla/5.0 (X11; U; Linux 2.2.16-3 i686; en-US; m18) Gecko/20001006
Application Name: Netscape
Application Version: 5.0 (X11; en-US)

While this could be considered a privacy thing, it could be really neat if sites
tailored what language their content was according to your locale and language.
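
To make the point about the UA string concrete, here is a minimal sketch of how a site could pull the locale token out of a User-Agent string like the one quoted above. The function name and the token pattern are assumptions made for illustration; they are not taken from any browser or sniffing library.

```javascript
// Illustrative helper (hypothetical, not part of any browser or library):
// pull a language tag such as "en-US" out of the parenthesized comment of
// a Netscape/Mozilla-style User-Agent string.
function extractUaLocale(userAgent) {
  // Grab the first "(...)" comment, e.g. "X11; U; Linux ...; en-US; m18".
  const comment = /\(([^)]*)\)/.exec(userAgent);
  if (!comment) {
    return null;
  }
  // Tokens inside the comment are separated by semicolons.
  const tokens = comment[1].split(";").map((token) => token.trim());
  // Assumption: a locale token looks like "en" or "en-US".
  const localePattern = /^[a-z]{2,3}(-[A-Za-z]{2})?$/;
  const locale = tokens.find((token) => localePattern.test(token));
  return locale === undefined ? null : locale;
}

const sampleUa =
  "Mozilla/5.0 (X11; U; Linux 2.2.16-3 i686; en-US; m18) Gecko/20001006";
console.log(extractUaLocale(sampleUa));
```

Any server that logs the User-Agent header can run the same kind of extraction after the fact, which is why the UA string matters here as much as the JS property.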

Comment 2

17 years ago
Hmm, on second thought maybe this isn't so good.  I recommend disabling it or
at least pref-disabling it.
(Assignee)

Comment 3

17 years ago
David, you are right and everything is as you suggested. See the pref UI pane
under Navigator - it configures the HTTP header to send.
This bug is about the JS property, which is
- much less useful (do you want to send all available language versions and then
select on the client side?)
- not opt-in
- currently not (independently) changeable by the user, but identical to the UI
language.

Comment 4

17 years ago
The useragent should not include this information. But the point of accept is 
to say that the user WANTS the content in that language.  Some RFCs need 
comments like ~this is stupid and counterproductive~.

Verah I think you work on that privacy document, add a paragraph w/ link to 
that rfc:
According to <a>RFC</a> we do hereby warn you that asking for content in a 
language you prefer might divulge information about you (including <span 
style="-moz-type-timeless:shocking">the language you prefer to read</span>, 
which may imply <span style="-moz-type-timeless:shocking">your 
ethnicity</span>).

I presume mozilla does send preferred language headers, if not we need a bug for 
that (mozilla1.0)
Severity: normal → enhancement
Keywords: relnoteRTM
OS: Linux → All
Hardware: PC → All
Whiteboard: [defective-privacy]

Updated

17 years ago
Blocks: 50205
(Assignee)

Comment 5

17 years ago
timeless, I don't understand your last comment. Also, it has nothing to do with
privacy *links* (maybe the *document*). Removing dependency.

Please note the difference between the HTTP header "Accept-Language" and the JS
property. The implementation of the former is OK in Mozilla (I think). This bug
is about the latter.

David Krause, thanks for noting the UA string (anyhow, I missed your comment).
Will investigate.
No longer blocks: 50205
(Assignee)

Updated

17 years ago
Severity: enhancement → normal
Future.
Status: NEW → ASSIGNED
Target Milestone: --- → Future
As there's nothing a user can do about this JS privacy leak, is it worth 
relnoting?

The blurb:
It seems unclear to me whether this bug requires either of a "developer" or 
"user" release note for Netscape 6 RTM. If anyone feels it does, can they please 
draft one and then nominate with the relnote-user or relnote-devel strings in 
the Status Whiteboard.

Thanks :-)

Gerv
(Assignee)

Comment 8

17 years ago
> As there's nothing a user can do about this JS privacy leak, is it worth
> relnoting?

Sure, at least he has to know. He can do something: Use another UA or use
english chrome.
Hasn't Netscape always revealed the UI language somehow? If this behavior is
present in 4.x, then I don't see why we need a relnote. 
(Assignee)

Comment 10

17 years ago
> Hasn't Netscape always revealed the UI language somehow?

Yes, I think so. I am not sure we need a relnote.
(Assignee)

Comment 11

17 years ago
We *already* have something better than a relnote: Tasks|Privacy|Understanding
<chrome://communicator/locale/wallet/privacy.html>. removing relnoteRTM based on
that.
Keywords: relnoteRTM

Updated

17 years ago
QA Contact: czhang → junruh

Comment 12

17 years ago
Mass changing QA to ckritzer.
QA Contact: junruh → ckritzer
Blocks: 71569

Comment 13

16 years ago
We are past the UI freeze for Commercial Beta. We need a UI freeze to be able 
to ship localized products simultaneously with the US. Would it be possible to 
check in any changes to chrome://communicator/locale/wallet/privacy.html after we 
have branched in the commercial tree on 6/29?

Comment 14

16 years ago
By the way here is a handy JS for revealing UA info...

<PRE>
<SCRIPT>
with (document) {
    writeln("navigator.userAgent is ", navigator.userAgent);
    writeln("navigator.appCodeName is ", navigator.appCodeName);
    writeln("navigator.appVersion is ", navigator.appVersion);
    writeln("navigator.appName is ", navigator.appName);
}
</SCRIPT>
</PRE>
Target is now 0.9.5, Priority P1.
Priority: P3 → P1
Target Milestone: Future → mozilla0.9.5
(Assignee)

Comment 16

16 years ago
Removing this info from the UA string is trivial. I did this for Beonex
Communicator, I can attach a patch.
Ben,
   By all means, attach your patch. I don't think we will make it the default,
but it would be nice to have as part of "high-privacy mode," which is something
I'm working on.
(Assignee)

Comment 18

16 years ago
Mitch, why not make it the default? The HTTP spec explicitly recommends against
doing what we do atm. Sites won't break either, if we just always send "en-US".
(But possibly with an Accept-Language header, which is customized by the user.)
(Assignee)

Comment 19

16 years ago
Created attachment 43864 [details] [diff] [review]
Proposed Fix for UA-string. Also contains Win fix for bug 57555.
(Assignee)

Comment 20

16 years ago
> (But possibly with an Accept-Language header, which is customized by the user.)

s/with/, we send/

It's not my decision to make. I will ask around here and find out if changing
the default UA is OK. There may be some resistance to changing it at all,
especially if we've always provided the language information in the UA.

Comment 22

16 years ago
This was discussed in a newsgroup that I read recently (well, the discussion was 
1-4 years old but...) cc.

Comment 23

16 years ago
Ben, does your patch remove the region/language from navigator.appVersion as well?
(Assignee)

Comment 24

16 years ago
Yes, it seems so. The test page now shows "en" (the hardcoded dummy value)
instead of "en-US". Looks like the Javascript function pulls its value out of
the UA-string, which is nice. So, looks like I fixed this bug. I'll install a
German langpack when I get a chance to be sure.

mstoltz, would you mind, if I took the bug? Who has to be asked about checking
this in, apart from dbaron? (There's no pref to turn this on or off, since I see
no value in the current behaviour - see comment in patch and my earlier comments
here for reasons.)

dbaron, what do you think?
(Assignee)

Comment 25

16 years ago
Posted proposal to .netlib: <news://news.mozilla.org/3B665A87.8050807@beonex.com>.

Comment 26

16 years ago
> It's not my decision to make. I will ask around here and find 
> out if changing the default UA is OK. There may be some 
> resistance to changing it at all, especially if we've always
> provided the language information in the UA.

Netscape commercial builds cannot remove the lang info from
UA string. They serve as important tracking tools. 

If Mozilla wants to make that an option, that is fine 
but that option should not be the default. 
If there is a proposed UI for it, commercial builds might
consider removing the UI.
(Assignee)

Comment 27

16 years ago
> If Mozilla wants to make that an option, that is fine 
> but that option should not be the default.

Why? If it's a pref, Netscape can alter it trivially.

Comment 28

16 years ago
>> If Mozilla wants to make that an option, that is fine 
>> but that option should not be the default.

> Why? If it's a pref, Netscape can alter it trivially.

That is true. As long as it is Netscape's default not 
to turn off lang info, that would be fine.

By the way I would like to raise this issue about
the privacy clause in HTTP 1.1. It is wrong-headed to
single out lang info as the only thing compromising
security. What about the fact that you're using Mozilla,
Gecko, or Netscape, Win NT5, etc.? Why, someone could
discriminate against Netscape users or IE users or
whatever. That is also a privacy issue if lang info
is a privacy issue. The fact that something is mentioned
in an RFC document does not mean we need to be implementing
everything that is in it.
In this day and age, the fact that someone might be using
an en-US build means virtually nothing other than the fact that
someone might be able to read English. We have users
all over the world using an en-US build.
The fact that someone is using a ja-JP build does not
mean that that person is Japanese. It simply means that someone
possibly reads Japanese but may be a Canadian, etc.
The whole argument about the lang info being a compromising
factor is moot in my opinion. Whoever wrote the HTTP 1.1 section
on lang info and security should examine issues more broadly
and fairly.

Our proud Mozilla localizers around the world would probably
like to see their L10n work reflected accurately in the
UA string.

(Assignee)

Comment 29

16 years ago
> wrong to single out lang info as the only thing compromising security.

Right. There are other bugs about other issues, e.g. bug 57555.

> en-US build means virtually nothing

Right, but if I speak Hebrew, it does mean something.

You can argue about the severity of this bug. But I do think that it should be
fixed. I care less about the default in Mozilla, but I would prefer that Mozilla
followed the advice of the spec.

I will attach a new patch which makes it dependent on a pref
(browser.reveal-ui-lang or similar) when I have time.

Comment 30

16 years ago
ben: you're german no? I know a bunch of people who contribute to mozilla.org 
who can read hebrew and I wonder if _any_ of them have this concern (I know 
it's really odd ..)

Comment 31

16 years ago
Folks:

Some websites sniff the U-A string to redirect users to appropriate pages for 
downloading localized versions of their software/patch. When locale info is not
present, "en-US" is often used as the default.

Following the spec is a good thing when it does not break existing websites. I agree 
that making it a preference and defaulting to 'on' seems to be a good compromise.

Comment 32

16 years ago
> Some websites sniff the U-A string to redirect users to appropriate pages for 
> downloading localized versions of their software/patch.

IE 5.0, 5.01, 5.5, and presumably earlier versions don't put language in UA.
Opera doesn't put language in UA. Konqueror doesn't put language in UA. This is
a wholly inappropriate and nonportable place for that information. That's the
whole purpose of the Accept-Language header, to specify what language you want
information in.

If you want to ignore the RFC and default Accept-Language to on, that's fine
enough by me (and is a separate bug anyway). But there's no purpose to reveal
the *UI* language, if not for privacy, for correctness. Accept-Language is the
language(s) the user wants to receive information in.
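
As a rough illustration of what a server can do with Accept-Language instead of the UA string, the sketch below orders the language tags in a header value by their quality values. The function name and the sample header are made up for this example, and the parsing is deliberately simplified compared with the full grammar in RFC 2616.

```javascript
// Illustrative parser for an Accept-Language header value. The function
// name is hypothetical and the parsing is a simplification of RFC 2616.
function parseAcceptLanguage(headerValue) {
  return headerValue
    .split(",")
    .map((item, index) => {
      const [tag, ...params] = item.trim().split(";");
      // A missing q parameter means a quality of 1.
      let quality = 1;
      for (const param of params) {
        const [name, value] = param.trim().split("=");
        if (name === "q") {
          const parsed = Number.parseFloat(value);
          // Treat an unparsable q value as "not acceptable".
          quality = Number.isNaN(parsed) ? 0 : parsed;
        }
      }
      return { tag: tag.trim(), quality, index };
    })
    // Drop empty items and languages the user marked with q=0.
    .filter((entry) => entry.tag !== "" && entry.quality > 0)
    // Highest quality first; ties keep the order the user listed.
    .sort((a, b) => b.quality - a.quality || a.index - b.index)
    .map((entry) => entry.tag);
}

console.log(parseAcceptLanguage("de, en-US;q=0.7, en;q=0.3"));
```

The list reflects what the user asked to read, which may differ from the language of the browser's menus.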

Comment 33

16 years ago
I don't think I've ever said that it is proper to use the U-A string for content
negotiation; as you pointed out, Accept-Language in the HTTP header serves that 
purpose. All I said is that there are indeed websites that misuse the U-A string...

Glad to hear from a standards advocate, though :-) 

Comment 34

16 years ago
> But there's no purpose to reveal the *UI* language, if not 
> for privacy, for correctness. Accept-Language is the
> language(s) the user wants to receive information in.

I think you should read the definition of User-agent:

"14.43 User-Agent

   The User-Agent request-header field contains information about the
   user agent originating the request. This is for statistical purposes,
   the tracing of protocol violations, and automated recognition of user
   agents for the sake of tailoring responses to avoid particular user
   agent limitations. User agents SHOULD include this field with
   requests. The field can contain multiple product tokens (section 3.8)
   and comments identifying the agent and any subproducts which form a
   significant part of the user agent. By convention, the product tokens
   are listed in order of their significance for identifying the
   application.

       User-Agent     = "User-Agent" ":" 1*( product | comment )

   Example:

       User-Agent: CERN-LineMode/2.15 libwww/2"

Further, product tokens are defined as:

" Product tokens are used to allow communicating applications to
   identify themselves by software name and version. Most fields using
   product tokens also allow sub-products which form a significant part
   of the application to be listed, separated by white space. By
   convention, the products are listed in order of their significance
   for identifying the application. .... etc."

UI language differs if localization files are different. 
It is clearly a significant part of the application. And though
this is not common, localization itself might reveal a bug that 
was not caught before the product was shipped. This latter type
of case does actually occur. For tracking purposes, it is in my opinion
significant info. 

I don't believe that we should use considerations raised
for Accept-Language for user-agent issues. I just want to point
out that there are arguments for both sides of this issue and also
that the HTTP 1.1 says nothing about not revealing the UI language in
the User-agent header. If MS or Opera wants not to include that
info, that is fine but let that not bind what we should do here.

Comment 35

16 years ago
Keep in mind, that where the RFC says "automated recognition of user agents for
the sake of tailoring responses", I think this would more refer to protocol
tailoring, not content. For example, Apache's default config contains some magic
to disable Keep-Alive for some broken versions of IE that claim to support it.

HTML includes its own means of "avoid[ing] particular user agent limitations",
such as CSS, and other ways of making content still accessible to older browsers.

The only browser revealing language in U-A I know of is Netscape 4.*, which,
last I heard, had 8% market share. Any page basing content off this field isn't
a whole hell of a lot effective right now. If future versions of Mozilla and
Netscape 6 remove it from U-A, more web designers won't be tricked into
mistaking U-A for A-L (see also: Microsoft J++).

OK, I'll shut up now, sorry for all the spam, folks.
The user agent language is used for distribution tracking, right? Personally if
I were creating a language pack I'd be gratified to have a clue how far it had
spread, especially for a minority/endangered language (Navaho? Hawaiian?
Gaelic?). If I were using such a language I'd probably want to signify my
presence (ethnic pride).

This kind of thing should obviously be a pref. (and now we can commence arguing
over the default setting in Mozilla.)  If the UA were easier to change (i.e. via
the pref UI rather than hacking prefs.js) this wouldn't be so much of an issue.

As the person mostly responsible for foisting "navigator.language" on people in
4.x I think we could safely nuke it. Eh, I guess we should return a string so we
don't break pages accidentally on an undefined property, maybe "" or "unknown".

Comment 37

16 years ago
Hi, Kat:

We should probably inform webmasters of whatever change we make in the final so
they can adapt accordingly.
I agree that we should have a pref. I don't know what the default setting should
be, but I would lean towards leaving the language in there by default. We should
have a pref checkbox for "paranoid mode" that will turn off the language part of
the UA as well as other small privacy violations which are the norm.

Comment 39

16 years ago
I believe that as most potential Mozilla users do not live in zones of war or
ethnic cleansing, the default setting should be the present behavior.
(Assignee)

Comment 40

16 years ago
> as most potential Mozilla users do not live in zones of war or ethnic
> cleansing, the default setting should be the present behavior.

The problem is that users might not know that we spread this info. And it's not
worth a UI pref IMO.

Default in Mozilla: So far, I count 1+4 (non-Netscape+Netscape) votes for on,
4+0 for off/dummy.
Any votes gathered here are going to be meaningless because mostly only people
who agree will find this bug. People who are happy with the way things are have
no clue others want to change things--though I will grant that most people
probably don't care one way or another.

The navigator.language issue should be dealt with separately. IMHO axe that,
leave the UA the way it is, and make it a hell of a lot easier for people to
spoof the UA (e.g. pref UI with radio-buttons for common options and then a text
box for custom text). Paranoid nuts who are worried about language in the UA
also don't like giving out OS info and nearly everything else in the UA -- we
shouldn't address UA privacy issues item by item.

So is this bug about navigator.language, or is it about the UA? If the latter it
should be invalid, in favor of some UA uber-bug
Just to show that I am the kind of person who cares about languages : I use a
browser with an english UI. I configured it so it accepts the following
languages in this order : French, English, Swedish, Spanish, Yiddish, Xhosa.
And the person who Cc:ed me (timeless ?)on this bug also knows that privacy is
on top of list of my concerns.

I think this bug is a total non-issue and a waste of time and neurons. We have had
language strings crossing the web in all directions since 1996 and nobody has
ever complained about it.

My 0.02€ only...
(Assignee)

Comment 43

16 years ago
> People who are happy with the way things are have no clue others want to
> change things

Right.

OK, you talked me into making the default on for Mozilla.

> The navigator.language issue should be dealt with separately. IMHO axe that,

I care more about the UA-string, since we are really spreading this all over the
world and placing it in many server logs. Implementation is coupled.

> and make it a hell of a lot easier for people to spoof the UA (e.g. pref UI
> with radio-buttons for common options and then a text box for custom text)

There is a bug about it, but it has its own problems, like side effects
unintended by the user. I am not a fan of having UI for setting the UA-string
to arbitrary values.
> So is this bug about navigator.language, or is it about the UA? If the latter
> it should be invalid, in favor of some UA uber-bug

It's about both ("don't reveal" includes all ways). But it is INVALID in no
case, because it's a legal request. It is also disabled in Beonex Communicator
(by default), so I would appreciate not having to carry around source
patches.

If I attach a patch to make it dependent on a pref, default on, everybody is
happy, no?

Comment 44

16 years ago
I was originally arguing to remove the language from U-A, but enough NSCPies
want to leave it for various reasons, so I say just leave it. I didn't notice
4.7 had been doing it all this time, so it's nothing urgent. If it becomes an
issue, we'll address it post 1.0. But for God's sake, don't make it a pref. And
certainly no UI. 

If anything, a generic U-A pref (also with no UI, or a plain textbox under
Debug... none of this multi pulldown nonsense) could be used to remove it, or, I
have a bug open on disabling U-A altogether. But separate prefs to tweak each
part of the U-A string would be nuts.
Ben, implementation could be trivially uncoupled if we wanted to deal with
navigator.language separately.

>> is it about the UA? If the latter it should be invalid, in favor
>> of some UA uber-bug
>
> It's about both ("don't reveal" includes all ways). But it is INVALID in no
> case, because it's a legal request.
It's a valid concern, but wrong to consider in isolation from the other
user-agent privacy concerns. Do you really want a bunch of "Hide UA Language",
"Hide UA platform", "Hide UA OS", "Hide UA OS version", etc. prefs? The UI would
be ugly which means they'd be hidden prefs, and that means most people who might
benefit would have no clue they were there.
If IE 5.0 and 5.5 don't send the language in the UA string, how many sites can
we really be breaking?  I think the compatibility stance so far, in DOM and
other key areas, has been that if IE5 and NS4 do different things, we should ape
IE5 because it has so many more users.

(I'm sure that localization folks would love to know how many people are using
their work, and I have no problem with that desire, but I don't want to turn the
UA string into about:credits.)
The argument that we've been doing this since 1996 doesn't sway me: browsers had
privacy-hostile cookie and image handling for years too, but I'm pretty sure
nobody's going to stand up and say that fixing it doesn't matter.

I'd support pulling UI language out of the UA because
 - people should be using Accept-Language to tailor content

 - I can't believe that we're functionally breaking that many sites if IE
doesn't do this, and the browser-number-tracking stuff really doesn't sway me,
because, again, IE is the majority of the browser population, and you can't do
this kind of tracking on it

 - there's too much crap in the UA anyway

Comment 47

16 years ago
>If IE 5.0 and 5.5 don't send the language in the UA string, how many sites can
>we really be breaking?  

Some websites have logic like this:

  if (Netscape) {
    // assumes locale info is present
    do some locale specific things...
  }
  else if (MSIE) {
    do nothing...
  } else {
    do nothing..
  }

The problem is that Netscape used to include locale info in the U-A and
some websites use it to do something for international users. Unless 
webmasters are advised of any upcoming change, their websites are doomed to
break. I won't be surprised to see them start advising people that their websites
work better with other browsers.
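
The logic described above could look roughly like the sketch below. It only illustrates the pattern: the function name, the way a "Netscape" UA is recognized, and the sample strings are assumptions, and the "en-US" fallback follows the default mentioned earlier in this bug.

```javascript
// Sketch of the server-side sniffing pattern described above. The function
// name and the Netscape check are hypothetical; real sites differ.
function localeForDownload(userAgent) {
  // Crude check: Netscape-family UAs start with "Mozilla/" and lack "MSIE".
  const isNetscapeFamily =
    userAgent.startsWith("Mozilla/") && !userAgent.includes("MSIE");
  if (!isNetscapeFamily) {
    // Other browsers carry no locale token, so the site does nothing special.
    return "en-US";
  }
  // Look for a token such as "; de-DE;" inside the UA comment.
  const match = /;\s*([a-z]{2}(?:-[A-Z]{2})?)\s*[;)]/.exec(userAgent);
  // If the token has been removed, the site silently falls back to en-US.
  return match ? match[1] : "en-US";
}

const german =
  "Mozilla/5.0 (X11; U; Linux i686; de-DE; rv:1.9pre) Gecko/2008033001";
const stripped =
  "Mozilla/5.0 (X11; U; Linux i686; rv:1.9pre) Gecko/2008033001";
console.log(localeForDownload(german));
console.log(localeForDownload(stripped));
```

A site written this way keeps working when the token disappears, but every user then gets the default page regardless of language, which is the kind of breakage being warned about.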


I agree with dveditz. People who care whether or not their UI language is
revealed in the UA string probably also don't want their browser and OS versions
revealed. We already have a (hidden) pref for overriding the UA string; why
don't we just encourage people to use that?

Comment 49

16 years ago
Mitchell: One of the reasons is bug 83376. It seems Sun Java uses the UA to
check if the browser is Netscape/Mozilla. It is either free choice of UA or
working Java :-(

Comment 50

16 years ago
please IGNORE the sun jvm problem, that's a bug which someone working on oji or 
the sun jre will fix, it shouldn't ask us about our spoofable useragent.  All 
things considered a simple pref could probably be exposed:

[x] Include system and locale information in useragent. checked by default.

unchecking it would strip out all information except very basic stuff.

Comment 51

16 years ago
Timeless: I'm actually for ignoring the Java bug. The thing that bothers me is
the reply from Sun edburns@acm.org posted on 7/17 as a comment to bug 83376:

> Java Plug-in depends on user-agent string for version information, no fix
> will be made.
>
> zhengyu.gu@sun.com

I believe this calls for applying some pressure on Sun.

Comment 52

16 years ago
Although IE5,IE5.5 do not include the language in the user agent string, they do
expose it in JavaScript through navigator.userLanguage and
navigator.systemLanguage. I believe navigator.language should continue to report
useful and accurate information in Mozilla. Perhaps the language returned can be
tailored to the accept language stuff under edit->prefs->navigator->language.
Perhaps report the preferred language. It seems that this is the only way the DOM
can configure to the language, which could be useful if not serving special
pages for each language.

I don't see any reason to remove this from UA by default, especially since we
allow changing the UA completely. I've always thought that this was one of the
nicer features of N4.
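
For reference, a page script that wants one language value across the browsers mentioned here might be written like the following sketch. It takes a navigator-like object as a parameter so that it can run outside a browser; the function name is hypothetical, and the property names are the ones listed above.

```javascript
// Illustrative helper (hypothetical name): pick the first language property
// that the given navigator-like object defines.
function pageLanguage(nav) {
  // navigator.language is the Netscape/Mozilla property; userLanguage and
  // systemLanguage are the IE properties mentioned above.
  return nav.language || nav.userLanguage || nav.systemLanguage || "";
}

console.log(pageLanguage({ language: "en-US" }));
console.log(pageLanguage({ userLanguage: "de", systemLanguage: "en-us" }));
console.log(pageLanguage({}));
```

In a real page the argument would be the global `navigator` object.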

Comment 53

16 years ago
> Perhaps the language returned can be
> tailored to the accept language stuff under edit->prefs->navigator->language.
> Perhaps report the preferred language.
These are different.  The string in the u-a indicates the browser localization
(i.e. the browser UI and some default settings).  The pref indicates the
user's preferred language(s) for the content.  E.g., a user may run a Japanese
browser, but prefer content in Arabic. Since the browser is enabled for many
more languages than it is currently localized into, this is not unusual.

Comment 54

16 years ago
Removing ME ---> barrowma (acting browser PM)
time marches on...retargeting to 0.9.6
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Moving to Moz1.0 as part of "paranoid mode" feature set
Target Milestone: mozilla0.9.6 → mozilla1.0

Comment 57

16 years ago
Bugs targeted at mozilla1.0 without the mozilla1.0 keyword moved to mozilla1.0.1 
(you can query for this string to delete spam or retrieve the list of bugs I've 
moved)
Target Milestone: mozilla1.0 → mozilla1.0.1
Futuring.
Target Milestone: mozilla1.0.1 → Future
(Assignee)

Updated

12 years ago
Assignee: security-bugs → ben.bucksch
Status: ASSIGNED → NEW
Target Milestone: Future → ---
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9pre) Gecko/2008033001 SeaMonkey/2.0a1pre

Workaround: Set the pref general.useragent.locale (in about:config) to the empty string (or even to any language you want to spoof as using). If set to the empty string, the semicolon and one surrounding space are removed too.

According to http://kb.mozillazine.org/General.useragent.locale , that pref was created on 2000-02-07.
Whiteboard: [defective-privacy] → [defective-privacy] [has workaround, comment #59]
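
The effect of the workaround on the UA string can be modeled with a small sketch. This is not Gecko's code; the function and field names are invented for illustration, and it only mimics the described behavior of dropping the locale token together with its separator.

```javascript
// Illustrative model only (not Gecko's implementation): build the UA string
// from its comment tokens, skipping any token that is the empty string.
function buildUserAgent({ platform, security, oscpu, locale, revision, trailer }) {
  const tokens = [platform, security, oscpu, locale, revision]
    // Dropping an empty token also drops its "; " separator.
    .filter((token) => token !== "");
  return `Mozilla/5.0 (${tokens.join("; ")}) ${trailer}`;
}

const parts = {
  platform: "X11",
  security: "U",
  oscpu: "Linux i686",
  locale: "en-US", // stands in for the general.useragent.locale pref
  revision: "rv:1.9pre",
  trailer: "Gecko/2008033001 SeaMonkey/2.0a1pre",
};
console.log(buildUserAgent(parts));
console.log(buildUserAgent({ ...parts, locale: "" }));
```

With the locale set to the empty string, the second line no longer contains the "en-US" token.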
(Assignee)

Comment 60

9 years ago
Per HTTP spec, "message which makes the user aware of the loss of privacy involved", i.e. not only user-configurable, but even an alert.
(This applies all the more if this is set by default based on UI or OS language, like Firefox does.)
(Assignee)

Updated

9 years ago
Assignee: ben.bucksch → mozilla
Keywords: privacy
Duplicate of this bug: 464414
QA Contact: ckritzer → toolkit
Duplicate of this bug: 525637

Comment 63

8 years ago
While I believe that my request (bug 525637) is a superset of this very old bug and not a duplicate I'd like to reactivate the topic here. Though it seems Mozilla is avoiding the topic.

Also, setting the locale to blank is not a workaround because it will make you stand out even more. Please remove "has workaround" keyword.
Duplicate of this bug: 543202
(Assignee)

Updated

7 years ago
Priority: P1 → P4
Bug 543202 is also a superset and not an exact duplicate. For example, the Gecko build date is absolutely useless for any non-b.m.o sniffing but it makes UA strings have more fingerprintable parts.

Updated

7 years ago
Flags: blocking1.9.0.19?
Note that CSS now exposes the directionality of the UI language. The HTML5 parser (via <isindex> prompt) exposes the UI language but not necessarily which regional variant when the string does not happen to vary by region.
(Assignee)

Comment 67

7 years ago
Both should instead use the *content* language (Prefs|Content|Language|Choose..., which in a default Firefox install is the same as the UI language). The content language is fine to expose to the website, that's what it's for, and it's changeable independently. It's also more correct to use that, because the site should adjust based on that, and it would have wrong effects to have an English-UI Firefox nightly set to Arabic report left-to-right and "en" in some places and Arabic in Accept-Language.
Are there _any_ known cases of a website using Accept-Language to infer someone's ethnicity and taking action against them, either electronic or physical?

The only way to avoid that possibility would be to remove all language-identifying features from what the browser sends - Accept, JS, everything. However, these features are used today to provide serious and measurable benefits to web users, 99.999% of whom don't care if a site knows what languages they speak.

Gerv
(In reply to comment #68)

> The only way to avoid that possibility would be to remove all
> language-identifying features from what the browser sends - Accept, JS,
> everything. However, these features are used today to provide serious and
> measurable benefits to web users, 99.999% of whom don't care if a site knows
> what languages they speak.

I don't understand some of the comments above Gerv's... Ok, the spec says something, but pragmatism sometimes helps if the spec is counter-productive.

There are many web sites out there that are tailored to serve me better, and that includes serving me in my own language if and when they support it. This is a hugely positive factor for all users around the world. Geolocating the IP address is not enough and many web sites have elaborate detection based on everything that's available today, including the user-agent string. Change that and you'll break their behaviour, and I won't call that "for the greatest benefit of all".

I am myself strongly in favor of a wontfix for this bug unless the solution implemented is a "never reveal my language" pref in an advanced preference panel with default "on". And I'm not sure it's worth the bloat, honestly.
(Assignee)

Comment 70

7 years ago
Gerv, I am not opposed to Accept-Language at all, which is user-configurable and defaulted to the UI language. I think I said as much in my last message.

I am opposed to the browser sending the *UI* language, where it's different from Accept-Language. Wherever we send the language, locale or country to the site, e.g. UserAgent string, it should be user-configurable value from Prefs|Content|Language|Choose... (which is already used for Accept-Language), not the UI language value.

There is no loss for the user here. On the contrary, if anything it's going to work better.
(Assignee)

Comment 71

7 years ago
Please note that the summary of this bug specifically says "UI language", i.e. browser locale/ package, not "content language" = Accept-Language = "Prefs|Content|Language|Choose...".

Updated

7 years ago
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
blocking2.0: --- → ?
Flags: blocking1.9.0.19?
Keywords: checkin-needed

Updated

7 years ago
Keywords: checkin-needed

Updated

7 years ago
blocking1.9.1: ? → ---
blocking1.9.2: ? → ---
FWIW, I'm not really concerned about using language per se for nefarious purposes. I'm more concerned about UI language being yet another piece of configuration entropy that can be used for fingerprinting. See https://panopticlick.eff.org/
(Assignee)

Comment 73

7 years ago
I'll update patch here as part of bug 57555, once I get to it.
Ben: so you are not arguing for this switch from a privacy point of view, but a functionality one? If so, that does make sense to me.

Gerv
(Assignee)

Comment 75

7 years ago
I argue from both perspectives. If the user is in control, there is no privacy issue, or at least no critical one. Also, reducing the number of permutations is good for functionality (consistent) and privacy (fingerprinting), in this case.
Not holding the 1.9.3 release for this bug.
blocking2.0: ? → -
Depends on: 572656

Comment 77

7 years ago
I don't understand why all locales but the English one should have something like

it-it
it;q=0.8
en-us;q=0.5
en;q=0.3

If I download italian, why is english there? And why two strings with different priorities (it-it and it)?

This, by the way gives more information for fingerprinting than a plain "it" (this is in general more common than the double it-it) set as default for every italian build: official or unofficial and for every platform.
(Assignee)

Comment 78

7 years ago
gionnico, you're in the wrong bug. You talk about the Accept-Language header, which is not the subject of this bug.

Updated

7 years ago
Depends on: 580032
(Assignee)

Comment 79

7 years ago
Created attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

After bug 572656 has fixed the UA string by removing the language part, this also fixes navigator.language. The property is retained (JS has no access to the Accept-Language header, to my knowledge), and retains the formal format, but uses the value from Accept-Language (which the user can freely configure in the pref window) instead of the UI language.

Asking biesi to review.
Attachment #460173 - Flags: review?(cbiesinger)
(Assignee)

Comment 80

7 years ago
http://browserspy.dk/language.php
http://browserspy.dk/showprop.php
(Assignee)

Updated

7 years ago
Attachment #43864 - Attachment is obsolete: true
(Assignee)

Comment 81

7 years ago
Comment on attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

Actually, bz reviewed bug 572656
Attachment #460173 - Flags: review?(cbiesinger) → review?(bzbarsky)

Comment 82

7 years ago
Comment on attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

Would this break on 

en-gb;q=0.8, en;q=0.7

Didn't dig into whitespace handling.

Not that I'm in sync with the rationale of these bugs, for the record.

Comment 83

7 years ago
Comment on attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

Yeah, Axel's right.  This needs to handle q values.  And things like spaces around the ',' chars, etc.  Probably best to just use nsCharSeparatedTokenizer and then deal with the q value thing.
Attachment #460173 - Flags: review?(bzbarsky) → review-
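
A minimal sketch of the parsing requested in this review, written in JavaScript rather than with nsCharSeparatedTokenizer. The function name firstAcceptLanguage is hypothetical and this is not the code from any of the attached patches. It splits on commas, drops any ;q= parameter, trims whitespace, and returns the first non-empty entry. It does not reorder entries by q value.

```javascript
// Hypothetical helper, not taken from the patch: returns the first language
// tag from an Accept-Language style list, or "" if there is none.
function firstAcceptLanguage(list) {
  for (const entry of String(list).split(",")) {
    // Drop parameters such as ";q=0.8", then surrounding whitespace.
    const tag = entry.split(";")[0].trim();
    if (tag !== "") {
      return tag;
    }
  }
  return "";
}

console.log(firstAcceptLanguage("en-gb;q=0.8, en;q=0.7")); // en-gb
console.log(firstAcceptLanguage(" de-de , en-us,en"));     // de-de
```

With the value from comment 82, the q parameter and the space after the comma no longer leak into the result.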
(Assignee)

Comment 84

7 years ago
> This needs to handle q values.

You mean I need to look for ";" in addition to ","? Yes, sure, sorry for the oversight. Will attach new patch.

(I do not care to support *manually* hacked prefs that have a lesser preferred language as the first entry, if that's what you meant.)

> nsCharSeparatedTokenizer

Will take a look, how big the code will be with that and with FindInReadable().
(Assignee)

Comment 85

7 years ago
Actually, the pref "intl.accept_languages" does not contain ;q= . The UI doesn't write q= in there, and if a user does, it's ignored and not sent in the HTTP header; the HTTP code re-calculates the q= values.

Also, while nsCharSeparatedTokenizer is useful here, it seems to skip the first token. Either I don't know how to use its API, or the implementation is broken.
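
To illustrate how q values can be derived from a plain ordered list such as the pref, here is a rough JavaScript sketch. The function name buildAcceptLanguage and the formula are assumptions (evenly spaced weights, rounded to one decimal place), not the actual networking code. For the four-entry Italian list from comment 77 it happens to reproduce the values quoted there.

```javascript
// Assumption: weights are spread evenly over the list and rounded to one
// decimal place. This is an illustrative formula, not the Necko source.
function buildAcceptLanguage(pref) {
  const langs = pref.split(",").map((s) => s.trim()).filter((s) => s !== "");
  return langs
    .map((lang, i) => {
      const q = Math.round((1 - i / langs.length) * 10) / 10;
      // The first entry has an implicit q of 1 and carries no parameter.
      return i === 0 ? lang : `${lang};q=${q}`;
    })
    .join(",");
}

console.log(buildAcceptLanguage("it-it,it,en-us,en"));
// it-it,it;q=0.8,en-us;q=0.5,en;q=0.3
```

The sketch is only meant for short lists. With many entries, one-decimal rounding would give several languages the same weight.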
(Assignee)

Comment 86

7 years ago
> nsCharSeparatedTokenizer ... seems to skip the first token

Nevermind, I was stupid.
(Assignee)

Comment 87

7 years ago
Created attachment 461774 [details] [diff] [review]
Patch 4: Change navigator.language to use Accept-Language
Attachment #460173 - Attachment is obsolete: true
Attachment #461774 - Flags: review?(bzbarsky)
I feel kinda bad for apparently slowing the approval of the patch down a bit, but I'd like to draw attention to a few things:
1. 3-char lang codes (need to be handled).
Based on url mentioned in another bug, there are 2 existing examples of them: http://mxr.mozilla.org/l10n-mozilla1.9.2/search?string=intl.accept_languages&find=global/intl.properties
(The same could be asked for 3-char country codes, but there are no examples of those, and they might not be probable (haven't looked into it).)
2. Default for messy accept-lang pref: should this be en-US rather than en for consistency w/ default en-US locale? I tend to think "yes".
3. Update urls: %LOCALE% in chrome prefs is resolved by navigator.language as of now, AFAIK. I'm 99.9% sure that the current UI language shouldn't be changed by an update based on preferred content language. If that's the case, then there needs to be an additional patch (in this bug or another), that accounts for necessary additional logic that is now required to substitute %LOCALE% in chrome app.update.url pref w/ UI lang, rather than navigator.language value.
(Plus, I'm not 100% sure that an update in language B for an installation whose UI is in language A works smoothly every time: i'm not saying it doesn't either, but it would probably be something worth verifying if the updates were going to be based on the current preferred language.)
4. empty pref special case (messed up, or user wishes to not specify accept-language in HTTP): 
4.1. should this result in navigator.language being empty, for consistency with HTTP Accept-Language? I tend to think "yes".
4.2. i'm also throwing this note in based on Java StringTokenizer's throwing an exception if nextElement() is called w/o hasMoreTokens() having returned true prior to that. Quick look at the source appears to suggest that nsCharSeparatedTokenizer works differently and (at least currently) returns an empty string instead. Forgive me for not having time to verify that, but i hope those who are already building FX 4 can take a quick look to make sure current empty pref handling wouldn't crash the app, for instance.

Accounting for all of this should be easy, except maybe item 3., which necessitates a patch for another class.

P.S. For the record, I can't wait for this fix.
Just noticed this as well:
5. Currently (FX <=3.6.x), the country code in navigator.language is upper-case, whereas the accept-language pref's country code is lower-case.
Should navigator.language value be:
5.a. backwards compatible
(this doesn't seem feasible, because some locales have country code in accept-lang, and no country code in navigator.language; either that's something sites will need to adjust to, or there'd need to be a map of first-accept-lang=ui-lang pairs based on FX 3.6.x used for 100% backwards-compatibility)
or
5.b. upper-case
or
5.c. lower-case

IMHO, ideally 5.a., but 5.b. would be easier to implement and more consistent, at the expense of additional country code for some locales and/or case change for some locales (sorry, i haven't analyzed other locales for letter-case). If sites can adjust to UA format change for ALL locales, they can also adjust to navigator.language values changing slightly for SOME locales (but it might  warrant a list of before-after values for affected locales).
(In reply to comment #89)
> 5.b. upper-case
> or
> 5.c. lower-case
Clarification, I meant:
5.b. using upper-case country code
or
5.c. using lower-case country code
That can be addressed in another bug if it needs to be changed.
(In reply to comment #88)
> 3. Update urls: %LOCALE% in chrome prefs is resolved by navigator.language as
> of now, AFAIK. I'm 99.9% sure that the current UI language shouldn't be changed
> by an update based on preferred content language. If that's the case, then
> there needs to be an additional patch (in this bug or another), that accounts
> for necessary additional logic that is now required to substitute %LOCALE% in
> chrome app.update.url pref w/ UI lang, rather than navigator.language value.

Please ignore item 3.: a couple of my memory wires got crossed, and in fact no changes need to be made to app.update.url handling, because %LOCALE% there appears to be replaced based on the locale in update.locale file in app directory, and NOT based on navigator.language value. Unlike elsewhere, I acknowledge having wastefully posted 10 lines in c88 for item 3, and 6 lines here to clear that up.

Also, to clarify, an obvious example for item 5 that i had in mind but didn't mention is en-US vs. en-us.
(Assignee)

Comment 93

7 years ago
Thanks for the comments, Reshat.
1. I was assuming only 2-char ISO lang codes were valid, but the HTTP spec explicitly gives examples with other codes, so I'll be more lax. I'll still check that the user didn't use the wrong Windows syntax of en_GB instead of the correct Internet syntax en-GB.

3. If we broke %LOCALE% in the updater, that'd be really bad. I'll double-check that this is not the case, as well as whether there are other stupid uses of navigator.language.

5. Yes, I was already thinking of casing, thanks for pointing out that the current navigator.language is en-US. I'll maybe just fix the casing, but only in cases of a 5-letter code (lowercase 2-letter, dash, uppercase 2-letter).
Whether this is important also depends on other browsers. If they use different casing, likely sites will be tolerant (using .toLowerCase()).

Specs:
<http://asg.web.cmu.edu/rfc/rfc2616.html#sec-14.4>
<https://developer.mozilla.org/en/Navigator.language>
(Assignee)

Comment 94

7 years ago
4. The fallback could be either "en" or "". I chose the former, but it's trivial to do the latter. Up to reviewer. *Informed* opinions welcome.
(Assignee)

Comment 95

7 years ago
Responding to self, I think it would indeed be better to use "" as fallback, esp. in light of "i-cherokee" as first accept-lang, and of sites in other countries which want to use the local language as fallback.

When using "" as fallback, we leave the fallback to the site. When using "en", the site cannot differentiate whether it's a fallback or the user really meant English. So, I'll use "" as fallback.

Comment 96

7 years ago
There's a good chance that we'll have script tags in language codes at some point, and maybe even x- stuff.
(In reply to comment #88)
> 2. Default for messy accept-lang pref: should this be en-US rather than en for
> consistency w/ default en-US locale? I tend to think "yes".
...
> 4. empty pref special case (messed up, or user wishes to not specify
> accept-language in HTTP): 
> 4.1. should this result in navigator.language being empty, for consistency with
> HTTP Accept-Language? I tend to think "yes".

IMHO, "" navigator.language for "" accept-language is good.
With regards to "" or afore-mentioned "en-US" navigator.language for messy/unrecognized accept-language, i'm not so sure, because accept-language header is not "" or "en-US" in this case as of now. I tend to think navigator.language should at least match the language in accept-language (allowing the (remote) possibility of not having a country code as it is the case in 3.6.x). So for "messed-up" intl.accept_languages, i'd either:
2.1. provide "messed-UP" as navigator.language, and keep HTTP accept-language as is (i.e., "messed-up")
or
2.2. provide "" for both navigator.language and HTTP accept-language
2.3. provide "" for both navigator.language, and log a bug to do the same for HTTP accept-language

Taking into account Axel's comment as well, 2.1 might be the best way to go in the context of this bug. If 2.2, or 2.3 were found desirable, IMHO we need a separate bug for that (especially, because exceptional manually set values are the subject of discussion here).
(Assignee)

Comment 98

7 years ago
Created attachment 462244 [details] [diff] [review]
Patch 5: Change navigator.language to use Accept-Language

- removed the check, so now also allowing i-cherokee
- return "", if the accept-lang pref is empty (or otherwise invalid)
- replace _ with - (only first one for now)
- return uppercase for en-US
Attachment #461774 - Attachment is obsolete: true
Attachment #462244 - Flags: review?(bzbarsky)
Attachment #461774 - Flags: review?(bzbarsky)
(Assignee)

Comment 99

7 years ago
Searching for "navigator.language" in source returns (only):
./extensions/reporter/resources/content/reporter/reportWizard.js:
  const gParamLanguage = window.navigator.language;
./testing/extensions/community/chrome/content/litmus.js:
  this.locale = navigator.language;
./testing/sisyphus/tests/mozilla.org/download-page/userhook.js:
  data['005 navigator.language']      = navigator.language;
I should fix these, too, but I will probably use bug 580032 as tracker for that.
Hmm.  Why do we want to do the bits about converting '_' to '-' (how would the '_' get there?) and uppercasing stuff?
IMHO, these points are worth another look:
4.    IMHO, no assert is needed for empty accept-language, since RFC says:
"If no Accept-Language header is present in the request, the server
    SHOULD assume that all languages are equally acceptable."
Only affects debug builds, but still...
5. country code upper-casing:
The following cases should be handled as well:
abc-XY
abc-XY-dialect
https://wiki.mozilla.org/L10n:Locale_Codes
https://wiki.mozilla.org/L10n:Teams
One way of pseudocoding would be: if first '-' (at index 2 or 3), if any, is followed by no more than 2 alpha chars in a row, uppercase those 2 chars.
That said, the whole item goes away if we are going to make country code in navigator.language lower-case, though that wouldn't be backwards-compatible.
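
One possible reading of the pseudocoded rule above, as a JavaScript sketch. The function name upperCaseCountry is hypothetical, and interpreting "no more than 2 alpha chars" as "a subtag of exactly 2 letters" is an assumption.

```javascript
// Hypothetical sketch of the rule above: if the first subtag has 2 or 3
// letters and is followed by a subtag of exactly 2 letters, uppercase
// that second subtag and leave everything else untouched.
function upperCaseCountry(tag) {
  return tag.replace(
    /^([A-Za-z]{2,3})-([A-Za-z]{2})(?=-|$)/,
    (match, lang, country) => `${lang}-${country.toUpperCase()}`
  );
}

console.log(upperCaseCountry("en-us"));          // en-US
console.log(upperCaseCountry("abc-xy"));         // abc-XY
console.log(upperCaseCountry("abc-xy-dialect")); // abc-XY-dialect
console.log(upperCaseCountry("zh-hant"));        // zh-hant
```

Tags whose second subtag is longer than two letters, or whose first subtag is a single letter such as i-cherokee, pass through unchanged.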

Comment 102

7 years ago
Please, don't use our wiki as a standard. The real document is http://tools.ietf.org/html/bcp47, which refers to http://tools.ietf.org/html/rfc4647 for the matching stuff.

Here, http://tools.ietf.org/html/bcp47#section-2.1.1 rules:

 At all times, language tags and their subtags, including private use
   and extensions, are to be treated as case insensitive: there exist
   conventions for the capitalization of some of the subtags, but these
   MUST NOT be taken to carry meaning.
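
For code that consumes navigator.language, the practical consequence of that rule is to compare tags case-insensitively. A small JavaScript sketch follows; the helper name sameLanguageTag is hypothetical.

```javascript
// Hypothetical helper: compares two language tags without regard to case.
function sameLanguageTag(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

console.log(sameLanguageTag("en-US", "en-us"));   // true
console.log(sameLanguageTag("ast-es", "AST-ES")); // true
console.log(sameLanguageTag("de-DE", "de-AT"));   // false
```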
Thanks, Axel. Well, i assumed we wanted consistency. If we are fine w/ en-US on one hand, and ast-es on the other, that's a different matter of course. If that is considered and decided to be OK, then that's the way it's gonna be. Still worth drawing attention to it, IMHO.
(Assignee)

Comment 104

7 years ago
> Hmm.  Why do we want to do the bits about converting '_' to '-'
> (how would the '_' get there?) and uppercasing stuff?

1. "-" vs. "_". The RFCs and ISO say that the separator is "-", e.g. "en-US". However, POSIX uses LANG="de_DE" in env vars, and Windows also uses the "en_US" notation. The latter is therefore common, and I've seen many people use "en_US" although the spec/protocol explicitly said "en-US", so it's a common error. How would it get there? By somebody editing the pref manually in <about:config>. The normal pref dialog does not allow to specify arbitrary tags. To avoid confusion and problems on the site's end, I want to prevent this, even if it's unlikely. I tried to make sure it's not a perf problem (just 2 int comparisons) nor a lot of code (just 2 lines).

2. uppercase: our pref contains e.g. "de-de,en-us,en", i.e. lower case. However, our locale codes are "de-DE", i.e. country part in upper case. BCP47 (mentioned by Axel above) also uses this notation in the examples. It's the convention. We used to return that as well, e.g. "en-US". If a site doesn't use navigator.language.toLowerCase(), the comparison will now fail. If our own code (comment 99) is any indication, this error is common, so if we don't adjust the casing to the convention, we may break stuff.

Now, my parsing here is primitive. It works fine for the 2-letter-dash-2-letter codes that we use for locales, and IIRC that's all that the pref dialog allows currently, so it should be sufficient for now. The only "failure" is indeed "ast-es" and similar codes with 3 letters as first part. BCP47 also gives examples: zh-Hant, de-Latn-DE, de-DE-x-goethe. For these, my code would return zh-HANT, de-LATN-DE, de-DE-X-GOETHE, which is not the conventional casing. If you think I should improve this, I would use the nsCharSeparatedTokenizer here as well, and then uppercase every 2-letter part, apart from the first part.
I guess my question is how far we're willing to go in terms of trying to canonicalize random input.  I think the only two sources of this pref are:

1)  What our prefs dialog generates.
2)  What our localizers set up as the default for their locale.
3)  about:config.

I claim we don't care about #3, can impose any reasonable rules we want on #2, and fully control #1.  So if it simplifies our code, we should just assume whatever we want and impose corresponding rules on #2.  Axel, thoughts?

Comment 106

7 years ago
rightly so
s/two sources/sources/, clearly.  ;)
Concise:
IMHO, Ben's suggestion of uppercasing the first 2-letter part, after the first part, if any, might be ideal (though a bit harder to implement), but restricting such uppercasing to just the assumption that this 2-letter part is the second subtag (following 2- or 3-char first subtag), as i understand Boris and Axel appear to be inclined to do, will probably be sufficient for a long time (and might save x ms on each access?). If these are the 2 operational choices to proceed with, then choosing between them is almost a coin-toss situation IMHO.

Verbose additional info:
One could also say that we have 1 model, which is the intl.accept_languages chrome pref, whose value is presented by several views (in MVC pattern lingo).

Paraphrasing Boris, the input can come from:
i. pre-shipped intl.accept_languages pref values based on intl.properties (1) and 2))
ii. random about:config entries by the user (3))

FYI, if i enter "messed1-up,messed2-up" as random input, both prefs dialog and about:config reflect the same value (prefs dialog just shows them as codes in [] without displaying a recognized lang name in front of each value, although some pre-shipped values also don't have recognized lang names).
(Assignee)

Comment 109

7 years ago
Created attachment 462638 [details] [diff] [review]
Patch 6: Change navigator.language to use Accept-Language, using tokerizer for uppercasing

Here's another patch with better uppercasing code. It uses the tokenizer for this as well, and uppercases all 2-letter parts, apart from the first one.
Attachment #462638 - Flags: review?(bzbarsky)
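
The uppercasing rule described for this patch can be sketched in JavaScript as follows. This is an illustration of the rule only, not the C++ from the attachment, and the function name upperCaseTwoLetterSubtags is hypothetical.

```javascript
// Sketch of the rule described for patch 6, not the C++ from the patch:
// uppercase every 2-letter subtag except the first one.
function upperCaseTwoLetterSubtags(tag) {
  return tag
    .split("-")
    .map((part, i) => (i > 0 && part.length === 2 ? part.toUpperCase() : part))
    .join("-");
}

console.log(upperCaseTwoLetterSubtags("de-de"));          // de-DE
console.log(upperCaseTwoLetterSubtags("ast-es"));         // ast-ES
console.log(upperCaseTwoLetterSubtags("de-latn-de"));     // de-latn-DE
console.log(upperCaseTwoLetterSubtags("de-de-x-goethe")); // de-DE-x-goethe
console.log(upperCaseTwoLetterSubtags("en"));             // en
```

Note that this rule leaves 4-letter script subtags as they are, so it does not produce the conventional title case (e.g. Latn).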
(Assignee)

Comment 110

7 years ago
> I claim we don't care about [<about:config>], can impose any reasonable rules
> we want on [our prefs UI], and fully control [the defaults of the
> localized builds].  So if it simplifies our code, we should just assume
> whatever we want and impose corresponding rules on [the prefs UI].

OK, great, works for me.

Only catch is: the intl.accept_languages pref may have existing values. So, even if we change the prefs UI, the old values are still there. And unfortunately, they are in "de-de" notation. So, unless you want to migrate, we have to work with that.

I attached another patch with better lang part parsing. Make your pick. I'd take patch 6.
Comment on attachment 462638 [details] [diff] [review]
Patch 6: Change navigator.language to use Accept-Language, using tokerizer for uppercasing

I still think the uppercasing is silly, but r=me if you s/PRBool/bool/ for that thing you assign bools into.
Attachment #462638 - Flags: review?(bzbarsky) → review+

Updated

7 years ago
Attachment #462244 - Flags: review?(bzbarsky) → review-
(Assignee)

Updated

7 years ago
Attachment #462244 - Attachment is obsolete: true
(Assignee)

Comment 112

7 years ago
Comment on attachment 462638 [details] [diff] [review]
Patch 6: Change navigator.language to use Accept-Language, using tokerizer for uppercasing

what happens with sr?
Attachment #462638 - Flags: superreview?(bzbarsky)
(Assignee)

Comment 113

7 years ago
Created attachment 462663 [details] [diff] [review]
Patch 7: Change navigator.language to use Accept-Language

Playing Boules with bool and PRBool.
Attachment #462638 - Attachment is obsolete: true
Attachment #462663 - Flags: superreview?(bzbarsky)
Attachment #462663 - Flags: review+
Attachment #462638 - Flags: superreview?(bzbarsky)
Comment on attachment 462663 [details] [diff] [review]
Patch 7: Change navigator.language to use Accept-Language

Let's have jst do that.
Attachment #462663 - Flags: superreview?(bzbarsky) → superreview?(jst)
Comment on attachment 462663 [details] [diff] [review]
Patch 7: Change navigator.language to use Accept-Language

+    while (localeTokenizer.hasMoreTokens())
+    {
+      const nsSubstring &code = localeTokenizer.nextToken();
+      if (code.Length() == 2 && !first)
+      {
+        nsAutoString upper(code);
+        ::ToUpperCase(upper);
+        aLanguage.Replace(pos, code.Length(), upper);
+      }
+      pos += code.Length() + 1; // 1 is the separator
+      if (first)
+        first = false;

Might as well lose the if check there, the result will be the same w/o it and with less branching and less code.

sr=jst
Attachment #462663 - Flags: superreview?(jst) → superreview+

Comment 116

7 years ago
That bug basically means decreased usability for people on our website, as we can't offer the correct locale download matching the browser version they are using right now any more. Thanks for breaking us.

Updated

7 years ago
Attachment #462663 - Flags: approval2.0?
(Assignee)

Comment 117

7 years ago
KaiRo, wrong. Just use Accept-Language. That's the standard and what you should have used anyway.

Comment 118

7 years ago
(In reply to comment #117)
> KaiRo, wrong. Just use Accept-Language. That's the standard and what you should
> have used anyway.

Completely wrong. I don't give a damn about the preferred language of *web sites* for that user (and that's what Accept-Language is), I only care about the *UI language* he is using. If I can't match that, I probably shouldn't even think about trying to give the user any specific preferred download, but send him through a hurdle run of clicks to select it himself. Very user friendly, but thank you for giving me no other choice.
(In reply to comment #118)

> Completely wrong. I don't give a damn about the preferred language of *web
> sites* for that user (and that's what Accept-Language is), I only care about
> the *UI language* he is using. If I can't match that, I probably should even
> think about trying about giving the user any specific preferred download but
> send him through a hurdle run of clicks to select it himself. Very user
> friendly, but thank you for giving me no other choice.

Guys, you are _both_ right. Ben wants to follow an IETF recommendation on
a privacy issue and KaiRo wants to match the UI language because that's the
only way he can serve correctly someone like me, ie someone browsing the web
in french when it's available but using only en-US software.

That said, KaiRo, I think the vast majority of internet users use a browser
UI locale matching the accept-language's topmost language, and only a small
minority of geeks have a different configuration.

Let me ask a naive question here: if I download a given localized version of
Firefox, is the Accept-Language set by default to match that language? If yes,
then at least KaiRo can rely on that for, again, the vast majority of users.
The minority of übergeeks will be annoyed a bit but hey we're always annoyed
by everything aren't we?

FWIW, I still think the war-on-privacy-issues goes too far here. Anyway...
(In reply to comment #119)
> Let me ask a naive question here: if I download a given localized version of
> Firefox, is the Accept-Language set by default to match that language?

Yes.

FWIW, this isn't just fixing a privacy issue. It's also about making navigator.language more useful when it's used on the client side very much like Accept-Language would be used on the server side.
(Assignee)

Comment 121

7 years ago
> if I download a given localized version of
> Firefox, is the Accept-Language set by default to match that language?

For the big locales, yes. For some small locales, there may be differences.

Yes, the download button detecting system and language is only for a good default, to make it easier for the majority of users.

I think a link/page "Other languages and systems", like Firefox has, should solve this for the most part.

Comment 122

7 years ago
(In reply to comment #119)
> Let me ask a naive question here: if I download a given localized version of
> Firefox, is the Accept-Language set by default to match that language?

Fully depends on the localizers, but usually, the UA language is *among* the value in Accept-Language, even if some carefully chosen variant of it might be the primary language listed there. Also, the user might change the Accept-Language at will, making it way more fingerprintable than the UI locale, and e.g. possibly making "ger-saxon" their primary Accept-Language header when using a German (de) build, i.e. having an Accept-Language header including all of ger-saxon, de-DE, de, and possibly even en-US and/or en.

> The minority of übergeeks will be annoyed a bit but hey we're always annoyed
> by everything aren't we?

At least by all the paranoia going on with some things like the so-called "privacy" or "fingerprinting" threats introduced by UA strings.

And I'm very sensitive about privacy matters usually, but there are clearly things where we are overdoing it while at the same time not working on stuff that ought to be much higher priority but may have higher impact overall, like the exposure of the plugin or installed font lists to the web, which are way more fingerprintable than ridiculous things like the UI languages or using a nightly.

Comment 123

7 years ago
In most cases, the first accept locale differs from the chosen locale. Maybe just in that it's de-de instead of de, or upper vs lower case things that folks reading locale-code specs should already deal with, but it's not exactly the same thing.

http://mxr.mozilla.org/l10n-mozilla1.9.2/search?string=intl.accept_languages&find=global/intl.properties for data.

Comment 124

7 years ago
In any case, as I can't be bothered to write a heuristic parser for that new stupid property, I'll just offer en-US builds when that locale doesn't give an exact match with one of the locales we can offer, everyone else needs to use the "other platforms and languages" link. Who needs usability anyhow.
(Assignee)

Comment 125

7 years ago
> usually, the UA language is *among* the value in Accept-Language

So, it's solvable. You can detect which UI locale the build is, and offer the right download, in most cases (98%? of users) at least. The others can fall back to "other languages".

Comment 126

7 years ago
(In reply to comment #125)
> > usually, the UA language is *among* the value in Accept-Language
> 
> So, it's solvable. You can detect which UI locale the build is, and offer the
> right download, in most cases (98%? of users) at least.

Only if I build a parser for the Accept-Languages list, which is quite some work for a simple download box...
(Assignee)

Comment 127

7 years ago
fairly simple:
var localeMapping = {
  "de-de" : "de",
  "fo-ba" : "ba",
  // ...
};
var useLocale = "unknown";
for (var entry of acceptLang.split(",")) { // separate langs
  // strip q values and all whitespace
  let lang = entry.replace(/;.*/, "").replace(/\s+/g, "").toLowerCase();
  if (localeMapping[lang]) {
    useLocale = localeMapping[lang];
    break;
  }
}
if (useLocale == "unknown") {
  showOtherLangsInBiggerFont(); // or directly on page
  useLocale = "en-US";
}
var downloadURL = mirror + "seamonkey-" + currentVersion + "-" + platformSpec + "-" + useLocale + platformExtension;

That's 12 lines of JS code, plus the mapping (which is fairly static). It won't be much more in PHP or whatever you use on the website, if you want to do it there instead.
kairo, for PHP you can use my locale detection class:
http://granary.stage.mozilla.com/libs/l10n-demos/localeDetectionDemo.php

Comment 129

7 years ago
FTR, both ignore script tags, which we'll apparently not get for 4.0, but that are totally fine to use (not that we have UI for those).

Comment 130

7 years ago
Should we get a followup bug for removing the 'general.useragent.locale' pref from all.js and nuking the corresponding code/API in nsHttpHandler?

Comment 131

7 years ago
We don't have to have some pref to select the chrome locale, and I don't see a good argument for dropping this one.

Comment 132

7 years ago
Sorry, I don't follow -- you're saying it's unnecessary but you want to keep it? What useful information does it provide?
(In reply to comment #130)
> Should we get a followup bug for [...] nuking the corresponding code/API in nsHttpHandler?
(In reply to comment #116)
> That bug basically means decreased usability for people on our website, as we
> can't offer the correct locale download matching the browser version they are
> using right now any more. 

It would also be more convenient if people had their name, and their preferred language written on their foreheads. Yet, somehow, i don't think i would want to be one of those people, and i think the majority of people wouldn't be either. It's not all about convenience and usability. There are trade-offs, and cost-benefit analyses involved.

(In reply to comment #130)
> Should we get a followup bug for removing the 'general.useragent.locale' pref
> from all.js and nuking the corresponding code/API in nsHttpHandler?

I think the pref might be used by some add-ons that provide UI for switching between more than 2 langpacks. Not sure if this is still workable in FF 4... That said, http should no longer have anything to do w/ this pref. If it stays, it should only be for manual or addon-based UI locale manipulation.

Comment 135

7 years ago
I don't understand comment 131 or comment 133.

Comment 136

7 years ago
(In reply to comment #131)
> We don't have to have some pref to select the chrome locale, and I don't see a
> good argument for dropping this one.

Can't type.

We do have to have some pref to select the chrome locale, and I don't see a good argument for dropping g.u.locale.

Comment 137

7 years ago
It's now misnamed, that's all. We can leave the name as-is but we should probably remove the API for reading it on nsHttpHandler, because that doesn't belong there anymore.
Please wait until after we branch.
Attachment #462663 - Flags: approval2.0? → approval2.0-

Updated

6 years ago
Depends on: 610267
I do not feel comfortable taking this on cedar. Please land this on mozilla-central when it's ready.
Whiteboard: [defective-privacy] [has workaround, comment #59] → [defective-privacy] [has workaround, comment #59][not-ready-for-cedar]
http://hg.mozilla.org/mozilla-central/rev/ead683169ef2
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Component: Security → DOM
QA Contact: toolkit → general
Resolution: --- → FIXED
Summary: Don't reveal UI language to site/page → Change navigator.language to use Accept-Language instead of the UI language
Target Milestone: --- → mozilla2.2

Updated

6 years ago
No longer depends on: 580032
Duplicate of this bug: 580032

Updated

6 years ago
Whiteboard: [defective-privacy] [has workaround, comment #59][not-ready-for-cedar] → [defective-privacy]

Updated

6 years ago
No longer blocks: 71569
(Assignee)

Updated

6 years ago
Summary: Change navigator.language to use Accept-Language instead of the UI language → Don't reveal UI language to site/page -- Change navigator.language to use Accept-Language instead of the UI language
(Assignee)

Comment 142

6 years ago
Thanks, Dao, for committing!
(Assignee)

Comment 143

6 years ago
(hihi. Almost 10 years after my first patch here. Do I get a prize? :) )

Updated

6 years ago
Blocks: 646428

Comment 144

6 years ago
@Ben: You get a Cookie (Brand name: Privacy Cookies). Congratz.
(Assignee)

Comment 145

6 years ago
nomnomnom
sheppy, I think this might be important for add-on developers and worth documenting. Thanks!
Keywords: dev-doc-needed
Documentation updated:

https://developer.mozilla.org/en/DOM/window.navigator.language

Also mentioned on Firefox 5 for developers.
Keywords: dev-doc-needed → dev-doc-complete
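
For add-on and page authors, a rough illustration of the new behaviour may help. The sketch below assumes that navigator.language now corresponds to the first language tag of the user's Accept-Language list; the helper name firstAcceptLanguage is hypothetical and this is not the Gecko implementation, which may normalize or fall back differently.

```typescript
// Hypothetical helper, NOT Gecko code: return the first usable language tag
// from an Accept-Language style list such as "de-DE, de;q=0.8, en;q=0.5".
// Assumption: this mirrors how navigator.language is derived after this bug.
function firstAcceptLanguage(acceptLanguages: string): string {
  for (const entry of acceptLanguages.split(",")) {
    // Drop any quality value (";q=0.8") and surrounding whitespace.
    const [rawTag = ""] = entry.split(";");
    const tag = rawTag.trim();
    // Skip empty entries and the wildcard, which names no language.
    if (tag !== "" && tag !== "*") {
      return tag;
    }
  }
  return ""; // no usable entry in the list
}
```

Under that assumption, a user whose UI is en-US but whose Accept-Language list starts with de-DE would report de-DE to the page, so the UI locale itself is no longer exposed through this property.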

Updated

6 years ago
tracking-firefox5: --- → ?

Comment 148

6 years ago
not interested for 5.
tracking-firefox5: ? → -
I thought this landed already on Aurora. Did it not?
Looks like it did, but release drivers still have no reason to track it in particular at this point.

Updated

6 years ago
tracking-firefox6: --- → ?

Updated

6 years ago
tracking-firefox6: ? → ---
Blocks: 418485