Closed Bug 58034 Opened 24 years ago Closed 23 years ago

Accept-Language header needs q values

Categories

(Core :: Networking, defect, P3)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9

People

(Reporter: havill, Assigned: havill)

References

Details

(Keywords: intl)

Attachments

(7 files)

You can change the priority of the languages within the preferences, but no q
values are attached to each language. Thus, true content negotiators such as
mod_negotiation in Apache view all languages as being preferred equally, as
according
to HTTP/1.1 <URL:http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4>
(IE properly adds q values to the languages)
Severity: minor → normal
OS: Linux → All
Hardware: PC → All
scc: do you have coding style objections ;-?

I think i don't like sprintf. but I'm not an expert.

Thank you for the diff, cvs diff -u is prefered.

Would someone review this patch?

BenB: isn't there a bug complaning that this sort of thing is a privacy 
violation? reporter: ignore this, it's a note to self.

Thanks again for working on mozilla.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: mozilla0.9, patch, review
timeless, no, Accept-Language is OK. You mean bug 55366
 (see explanation there, why Accept-Lanaguage is OK).
Related issues are discussed in Bug 55800.
The reporter wanted to have UI to generate Q values. I argued
against it and suggested a simple backend
solution which reflects the order of languages
chosen by the user. I also said in that bug that
severs will take the accept-language list sent
by a client even without Q values since Q values are
optional.

It's good to see q-values being added. This is, if I understand the patch right,
far better than nothing. I feel it is still suboptimal of a GUI where the user
may set q-values, since language preferences of an individual is hardly
something that can be modelled by a simple function in the general case. While
q-values are optional, skipping the q-value means that all listed languages are
equivalent for the user, and I suspect few users have language skills where they
are fluent in a bunch of languages, but cannot understand a word of any others.

Let me name an example where the use of q-values would be a great advantage:
You have a site, which is frequently updated and translated into other
languages. However, the translations may always lag behind the original to some
extent, and different translations may lag behind differently. The obvious thing
to do, is to serve the documents where the server q-value is smaller for an
older version of the document. In this case, it would be important for a user to
specify exactly how important it is to have the document in his or her language,
as quantified by the user-specified q-values. Say, the original is in English
but there exists a bit older version in e.g. Spanish with server q-value=0.8. A
Spanish user is able to read some English, and will accept languages es, en.
Without q-values, the server will serve the English version, since it is newer. 
With q-values, the user may specify e.g. es;q=1, en;q=0.2, the language
negotation will turn out with es;q=0.8, en;q=0.2 (product of user and server
q-values), and thus serve the Spanish version. Which is in accordance with the
user's wishes. Now, generated q-values might reflect this to a certain extent,
but not as much as if the user had been allowed to choose for themselves.
Kjetil,

Under Comm 4.x, the user can input "Q" values manually when
a language is added. So instead of choosing
Chinese (zh), the user can manually input

zh;q=0.75

and Communicator will remember the entry.

There is a bug filed for allowing manual input of
what is allowed in Accept-Language syntax. This is
Bug 54093. I have requested there that Q value input
such as:

es;q=0.84

be accommodated.

What this means for this bug is that the fix needs to
ignore an entry which already has a "Q" value assigned.
Since the fix seems to be generating arbitrary
descending values, then what we probably need to do
is to adjust to what has been manually inserted.
For example, if there are entries like the following:

ja
en;q=0.72    <-- manually inserted
zh

in this order, then the assignment mechanism needs
to re-adjust value for zh taking the the Q value or
en as the starting value.

If Bug 54093 gets fixed and allows "Q" value input,
then surely this bug fix needs to accommodate it.
If that is too complicated or undesirable for this
bug, then we may defer that decision and disallow
"Q" value input in Bug 54093.
Adrian and Shanjian may want to coordinate this.

As I argued elsewhere, creating "Q" value UI would be
too complicated and will not even serve the majority of users
since they don't know how to even use such mechanisms.
But code should be able to adjust to manually input values.
Katsuhiko Momoi wrote:
> What this means for this bug is that the fix needs to
> ignore an entry which already has a "Q" value assigned.

It already does this. See line 44 of the patch.

> Since the fix seems to be generating arbitrary
> descending values, then what we probably need to do
> is to adjust to what has been manually inserted.

Not exactly "arbitrary". 1 is divided amongst the languages present. For
example, if you have three languages, the first gets a q of 1.000, the second
gets 0.667, the third gets 0.333.
Adrian,

Thanks for the clarification.
Please clarify further if the fix simply removes
existing "q=..:" value and replaces it with
whatever your algorithm assigns. If so, that would
change what the user has entered as preferred Q values.
If Bug 54093 gets fixed to allow "Q" value input
manually, the user may not see that value unless the
algorithm takes that value into account and adjust any
values below that entry taking that value as the base.
Katsuhiko Momoi wrote:
> Please clarify further if the fix simply removes
> existing "q=..:" value and replaces it with
> whatever your algorithm assigns.

Yes, it removes and replaces with its own computed value based on the list
position and total list items.

> If so, that would
> change what the user has entered as preferred Q values.

In regards to Bug 54093, I stated that I don't think the ability to enter Q
values directly should be allowed as this provides the same functionality. The
patch attached to Bug 54093 add form input checking, and disallow the entering
of raw HTTP stuff like Q values and such.
Depends on: 65092
No longer depends on: 65092
Blocks: 65092
Hm, I must admit that I haven't had time to RTFM lately, I see it was a rather
lengthy discussion in Bug 54093, perhaps I should shut up, but I hope it is OK I
voice my concern about this:
>In regards to Bug 54093, I stated that I don't think the ability to enter Q
>values directly should be allowed as this provides the same functionality. The
>patch attached to Bug 54093 add form input checking, and disallow the entering
>of raw HTTP stuff like Q values and such.

I mean, it may be newbie-friendly, they may not really care, but I I'd really
like something that is expert-friendly as well, and completely disallowing raw
HTTP stuff doesn't sound good to me. And I don't understand how a simple model
can provide the same functionality as entering q-values by hand or by GUI.
For example, my current language string looks like this:
no-bok;q=1, nb;q=1, no;q=0.95, nn;q=0.94, da;q=0.7, en;q=0.68, sv;q=0.65,
de;q=0.4, is;q=0.25, nl;q=0.1, fr;q=0.05, eo;q=0.03, *;q=0.001
and that's hard to represent by a simple function, be it linear or exponential
or whatever, especially when the difference in ability to read certain languages
are marginal, like in my case.
Gagan,
This patch looks good to me. Could you review this too.

Neeti
Target Milestone: --- → mozilla0.9
I just have a few comments about this patch:

1) It would be nice if there were some comments describing what the input looks
   like and what the corresponding output will be.

2) Can we replace the sprintf's with PR_snprintf's?  It scares me to use
   sprintf since you have no guarantee that it won't scribble past the
   end of the buffer.  Using PR_snprintf will require that we keep track
   of the buffer size, but I don't think that's a big burden given the
   benefit of making the code more robust/durable.
We shouldn't be doing all this processing inside the getAcceptLanguages call.
Instead this should be in the setAcceptLanguages and the qvalues should then be
cached for subsequent calls to the get function. Not just would it save several
computing cycles, it will also be faster. Other than that, the PR_snprintf would
be more desirable. 
Oh shoot! Attached the right patch to the wrong bug (although Darin and Gagan's
comments still apply). The last patch was for Accept-Charset not Accept-Language.
Could you submit your patch using 

cvs diff -u , and not change the filename to nsHTTPHandler.cpp.new

Thanks,
Neeti
Couple of things about the new patch.
1. We do need mAcceptLanguagesPrepped, we can just use mAcceptLanguages.
2. We could put all the functionality of PrepareAcceptLanguages(..) in 
SetAcceptLanguages(..) itself, no need for a new function.
I meant We do NOT need mAcceptLanguagesPrepped, we can just use 
mAcceptLanguages, in my comments above.
seems like havill needs to submit an updated patch.
Assignee: gagan → havill
Neeti: I used the "...Prepped" value thinking that someone may want to retrieve
the original value that was put into SetAcceptLanguages with some kind of a
"Get" in the future. I made the changes you suggested against the latest tree...
merging the extra function all into the SetAcceptLanguages.

However, it appears that the SetAcceptCharset code that also uses the "Prepped"
(to hold the modified value) and it was put into the tree, so I'm not clear
which strategy Mozilla wants.

Anyway, moving the code in and out of a function and using or not using the
Prepped variable isn't rocket science, so I'll include two patches... one
keeping the Prepped variable like SetAcceptCharset does and the separate
function (updated against the current tree per Gagan's request), and another
reflecting Neeti's suggestions.

Need a reviewer to pick the one they want and check it into the tree please.
Keywords: intl
r=neeti for updated patch(Gagan) id = 28696. to be consistent with what's 
already in the tree.

Darin: could you do an sr= on this?
cc'ing darin for sr=
sr=darin
Checked in patch
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
verified
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: