Closed Bug 672448 Opened 13 years ago Closed 12 years ago

Clamp quality factor ('q') values to 2 decimal places

Categories

(Core :: Networking: HTTP, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla18

People

(Reporter: GPHemsley, Assigned: GPHemsley)

References

(Blocks 1 open bug, )

Details

(Keywords: dev-doc-complete, Whiteboard: [bcp47])

Attachments

(2 files, 7 obsolete files)

17.01 KB, patch
Biesinger
: review+
Details | Diff | Splinter Review
7.50 KB, patch
GPHemsley
: review+
Details | Diff | Splinter Review
The HTTP standard allows the quality ('q') value of a header (e.g. 'Accept-Language') to be up to 3 decimal places.[1] However, the Gecko code arbitrarily limits q values to 1 decimal place[2][3], which often creates situations where multiple items in a header ostensibly have the same q value.

For the sake of clarity, this value should be upped to the standard 3-decimal-place limit.

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.9
[2] http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttp.h#232
[3] http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.cpp#1152
Updates code to allow 3 decimal places in a q value, using the existing logic. The integer must be zero-padded to 3 digits.

Originally pushed to try without the zero-padding:
https://tbpl.mozilla.org/?tree=Try&rev=d173894afbdb

Pushed to try again, with the zero-padding added:
https://tbpl.mozilla.org/?tree=Try&rev=8c734e13342e
Assignee: nobody → gphemsley
Attachment #613059 - Flags: review?(cbiesinger)
Oh, I forget to mention. It's still not clear to me whether line 1328 in nsHttpHandler.cpp [1] needs to be updated in this patch, as I do know what the '11' value represents. But preliminary tests indicate that the code works as intended without changing it.

[1] https://hg.mozilla.org/mozilla-central/file/4721cf101ae6/netwerk/protocol/http/nsHttpHandler.cpp#l1328
Try run for 8c734e13342e is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=8c734e13342e
Results (out of 287 total builds):
    exception: 2
    success: 243
    warnings: 38
    failure: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gphemsley@gmail.com-8c734e13342e
(In reply to Mozilla RelEng Bot from comment #3)
> Results (out of 287 total builds):
>     exception: 2
>     success: 243
>     warnings: 38
>     failure: 4

These issues all appear to have been transient and unrelated, as most turned green when run again.

Question: How would I go about creating tests for this patch?
ok the 11 comes from the original checkin of this code, in attachment 19451 [details] [diff] [review] (yes, 5 digit attachment number, starting with a one). still trying to figure out the 11
Comment on attachment 613059 [details] [diff] [review]
Clamp quality factor ('q') values to 3 decimal places

this would've been easier to review if you hadn't done the reformatting as part of this patch :/

also relevant: attachment 118464 [details] [diff] [review]

OK my best guess is that the 11 is a remainder from when this function used to produce longer q= strings, so this should be fine.

I will give this an r+ when it has tests - it seems that you can just change the pref, create a channel, and call getRequestHeader on the new channel. no need to asyncOpen it.
Attachment #613059 - Flags: review?(cbiesinger) → review-
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #5)
> ok the 11 comes from the original checkin of this code, in attachment 19451 [details] [diff] [review]
> [details] [diff] [review] (yes, 5 digit attachment number, starting with a
> one). still trying to figure out the 11

If I'm not mistaken, the final patch from that bug was attachment 24745 [details] [diff] [review], no?

(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #6)
> hm... also https://bugzilla.mozilla.org/attachment.cgi?id=28697&action=edit

This is probably the more relevant one, though the comment history indicates they were basically just duplicates of each other.

I still don't understand what the 11 is. This code is so ancient...!

If understand this code correctly (and I may not), this adds up the length of the string o_Accept, then adds 11 * the number of language tags, then adds 1 more for the null byte?


    o_Accept = nsCRT::strdup(i_AcceptLanguages);
    if (!o_Accept)
        return NS_ERROR_OUT_OF_MEMORY;
    for (p = o_Accept, n = size = 0; '\0' != *p; p++) {
        if (*p == ',') n++;
            size++;
    }

    available = size + ++n * 11 + 1;
    q_Accept = new char[available];


But it then removes the length of each chunk added from that value as it writes the string. And then it checks to see if anything is left. Somehow, there is never anything left? Which means that the 11 is actually an important value somehow? I don't know...

    *q_Accept = '\0';
    q = 1.0;
    dec = q / (double) n;
    n = 0;
    p2 = q_Accept;
    for (token = nsCRT::strtok(o_Accept, ",", &p);
         token != (char *) 0;
         token = nsCRT::strtok(p, ",", &p))
    {
        token = net_FindCharNotInSet(token, HTTP_LWS);
        char* trim;
        trim = net_FindCharInSet(token, ";" HTTP_LWS);
        if (trim != (char*)0)  // remove "; q=..." if present
            *trim = '\0';

        if (*token != '\0') {
            comma = n++ != 0 ? "," : ""; // delimiter if not first item
            PRUint32 u = QVAL_TO_UINT(q);
            if (u < 10)
                wrote = PR_snprintf(p2, available, "%s%s;q=0.%u", comma, token, u);
            else
                wrote = PR_snprintf(p2, available, "%s%s", comma, token);
            q -= dec;
            p2 += wrote;
            available -= wrote;
            NS_ASSERTION(available > 0, "allocated string not long enough");
        }
    }


(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #7)
> Comment on attachment 613059 [details] [diff] [review]
> Clamp quality factor ('q') values to 3 decimal places
> 
> this would've been easier to review if you hadn't done the reformatting as
> part of this patch :/

Yeah, sorry about that. There were a lot of lines with trailing whitespace, which got automatically stripped upon saving. (I think I only changed about 3 real lines of code.)

> also relevant: attachment 118464 [details] [diff] [review]
> 
> OK my best guess is that the 11 is a remainder from when this function used
> to produce longer q= strings, so this should be fine.

Leave as-is, you mean?

> I will give this an r+ when it has tests - it seems that you can just change
> the pref, create a channel, and call getRequestHeader on the new channel. no
> need to asyncOpen it.

I'll need a bit of a walkthrough on that. I've never made tests for the C++ code before.

Thanks for all your help tracking stuff down!
This patch includes a test. I don't know if this is the best or cleanest way to do it, but it seems to work.

Pushed to try:
https://tbpl.mozilla.org/?tree=Try&rev=8fc0429f6644

(Apologies again for all the whitespace removal.)
Attachment #613059 - Attachment is obsolete: true
Attachment #617261 - Flags: review?(cbiesinger)
Try run for 8fc0429f6644 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=8fc0429f6644
Results (out of 304 total builds):
    exception: 3
    success: 262
    warnings: 38
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gphemsley@gmail.com-8fc0429f6644
Christian, what's the status of this?
Fix bitrot.
Attachment #617261 - Attachment is obsolete: true
Attachment #617261 - Flags: review?(cbiesinger)
Attachment #630249 - Flags: review?(cbiesinger)
Comment on attachment 630249 [details] [diff] [review]
Clamp quality factor ('q') values to 3 decimal places (v3)

Looks mostly good. You don't actually need any of the HTTP server stuff in the test though since you never open the channel, just delete all the httpserver stuff and the do_test_{pending,finished}
Attachment #630249 - Flags: review?(cbiesinger) → review+
Simplified the test per discussion on IRC.
Attachment #630249 - Attachment is obsolete: true
Attachment #631696 - Flags: checkin?
Keywords: checkin-needed
Comment on attachment 631696 [details] [diff] [review]
Clamp quality factor ('q') values to 3 decimal places (v4)

checkin-needed is enough
Attachment #631696 - Flags: checkin?
https://hg.mozilla.org/integration/mozilla-inbound/rev/d425c0c8b559
Flags: in-testsuite+
Keywords: checkin-needed
Target Milestone: --- → mozilla16
https://hg.mozilla.org/mozilla-central/rev/d425c0c8b559
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Blocks: 765048
No longer blocks: 765048
Depends on: 765048
This was backed out in bug 765048.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
To prevent the issue we had the last time with all the distracting whitespace changes, I put them into a separate patch. (Although I note that Splinter now highlights whitespace changes.)
Attachment #656035 - Flags: review?(cbiesinger)
These new changes ensure that trailing zeroes are not included in the Accept-Language header quality values, while also allowing up to 3 decimal places when necessary.

Apologies for the delay in getting this ready; I was working on another project.

Will this be eligible for uplift to Aurora?
Attachment #631696 - Attachment is obsolete: true
Attachment #656036 - Flags: review?(cbiesinger)
(In reply to Gordon P. Hemsley [:gphemsley] from comment #20)

> These new changes ensure that trailing zeroes are not included in the
> Accept-Language header quality values, while also allowing up to 3 decimal
> places when necessary.
> 

I think that's great, fwiw.

> Will this be eligible for uplift to Aurora?

once it is r+ for m-c you can always ask the drivers by setting the a? flags on the patch... they make that decision. Personally I wouldn't submit it - this is a feature not a bugfix and there are no top tier features that are depending on it, so it should ride the normal stability trains. But ymmv with the drivers.
Why should we add extra decimal places when the number of languages is low enough to rule out duplicate values? See also bug 765048 comment 9.
(In reply to Dão Gottwald [:dao] from comment #22)
> Why should we add extra decimal places when the number of languages is low
> enough to rule out duplicate values? See also bug 765048 comment 9.

I wouldn't let perfect be the enemy of the good here.. the base case sends the same # of bytes.

also q values have a magnitude, they aren't just ordinals for sorting - from 2616

"       Accept: audio/*; q=0.2, audio/basic

SHOULD be interpreted as "I prefer audio/basic, but send me any audio type if it is the best available after an 80% mark-down in quality."'
(In reply to Patrick McManus [:mcmanus] from comment #23)
> (In reply to Dão Gottwald [:dao] from comment #22)
> > Why should we add extra decimal places when the number of languages is low
> > enough to rule out duplicate values? See also bug 765048 comment 9.
> 
> I wouldn't let perfect be the enemy of the good here.. the base case sends
> the same # of bytes.

I'm confused -- What's the base case? My understanding is that we send reasonably few accept languages for most users, say 3 +/- 2, and that we'd start sending extra decimal places in these cases, which doesn't seem like a win by any measure.

> also q values have a magnitude, they aren't just ordinals for sorting - from
> 2616
> 
> "       Accept: audio/*; q=0.2, audio/basic
> 
> SHOULD be interpreted as "I prefer audio/basic, but send me any audio type
> if it is the best available after an 80% mark-down in quality."'

How would one interpret this for languages?
(In reply to Dão Gottwald [:dao] from comment #22)
> Why should we add extra decimal places when the number of languages is low
> enough to rule out duplicate values? See also bug 765048 comment 9.

Well, the main reason this patch doesn't do that is because I didn't reread those comments until after I already wrote it. :)

I suppose I could turn on 3 decimal places when n > 10, where n is the number of languages. But the main motivation for turning on 3 decimal places is to guard against multiple languages getting the same q-value. I'll have to check to make sure this doesn't occur with n < 10. (Or else, would n > 5 be enough to satisfy the majority case?)

(In reply to Dão Gottwald [:dao] from comment #24)
> (In reply to Patrick McManus [:mcmanus] from comment #23)
> > (In reply to Dão Gottwald [:dao] from comment #22)
> > > Why should we add extra decimal places when the number of languages is low
> > > enough to rule out duplicate values? See also bug 765048 comment 9.
> > 
> > I wouldn't let perfect be the enemy of the good here.. the base case sends
> > the same # of bytes.
> 
> I'm confused -- What's the base case? My understanding is that we send
> reasonably few accept languages for most users, say 3 +/- 2, and that we'd
> start sending extra decimal places in these cases, which doesn't seem like a
> win by any measure.

He was probably referring only to 'en-US', which currently has n = 2. However, a range of 1 <= n <= 5 is probably a fair measure for the majority of locale defaults. (Though, of course, we need to ensure we can handle n > 5, for user customization.)

> > also q values have a magnitude, they aren't just ordinals for sorting - from
> > 2616
> > 
> > "       Accept: audio/*; q=0.2, audio/basic
> > 
> > SHOULD be interpreted as "I prefer audio/basic, but send me any audio type
> > if it is the best available after an 80% mark-down in quality."'
> 
> How would one interpret this for languages?

Frankly, how would one interpret this for anything, in practice? It's easy to say, abstractly, that it's a "weight", but when it comes down to application, you can only send what you have.

Whether it be media types, languages, or something else, you'll basically only be sending which ever of your content has the highest quality value, regardless of whether it's 0.999 or 0.001.

So I think it's fine to disregard any abstract interpretation of "weightedness" beyond "better than the next one".

Unless I'm missing something?
(In reply to Gordon P. Hemsley [:gphemsley] from comment #25)
> I'll have
> to check to make sure this doesn't occur with n < 10.

Looks like it's safe to keep 1 decimal place for n < 10. It's only with n = 11 that duplicates begin to occur.

Now, theoretically, it might be good to then only use 2 decimal places until duplicates start to occur there. The problem is, I don't know what algorithm to use to determine that.

Would n < 10 be a satisfactory, simple constraint to use?
Two decimal places should be sufficient for n <= 100, as the difference between two values would be >= 0.01.
(In reply to Dão Gottwald [:dao] from comment #27)
> Two decimal places should be sufficient for n <= 100, as the difference
> between two values would be >= 0.01.

Ah, it seems you are correct.

n <= 10 : 1 decimal place
n <= 100 : 2 decimal places
n <= 1000 : 3 decimal places

Beyond that, and a valid q-value wouldn't be enough. (Of course, I'm fairly certain we'd hit the maximum length of a header field long before that point anyway.)

So what do you think? Just up it to 2 decimal places max? Because as I said in comment 25, the concept of "weight" doesn't mean much in practical applications.
Using 1 decimal place for n <= 10 and 2 decimal places after that seems good to me.
OK, this patch uses 1 decimal place when n < 10 (or when q is a factor of 0.1), and 2 decimal places otherwise.

I've also improved the test somewhat.
Attachment #656036 - Attachment is obsolete: true
Attachment #656036 - Flags: review?(cbiesinger)
Attachment #656242 - Flags: review?(cbiesinger)
(In reply to Gordon P. Hemsley [:gphemsley] from comment #30)
> Created attachment 656242 [details] [diff] [review]
> [2/2] Clamp quality factor ('q') values to 3 decimal places (v6)
> 
> OK, this patch uses 1 decimal place when n < 10 (or when q is a factor of
> 0.1), and 2 decimal places otherwise.

This should be n <= 10, right?
(In reply to Dão Gottwald [:dao] from comment #31)
> (In reply to Gordon P. Hemsley [:gphemsley] from comment #30)
> > Created attachment 656242 [details] [diff] [review]
> > [2/2] Clamp quality factor ('q') values to 3 decimal places (v6)
> > 
> > OK, this patch uses 1 decimal place when n < 10 (or when q is a factor of
> > 0.1), and 2 decimal places otherwise.
> 
> This should be n <= 10, right?

Well, it turns out not to matter, because when n == 10, all the q-values have trailing zeroes and become 1 decimal place anyway.

Also, since the Try autoposter seems to not be working:
(v5) https://tbpl.mozilla.org/?tree=Try&rev=de37bbbc8fc5
(v6) https://tbpl.mozilla.org/?tree=Try&rev=cd92f73ae223
Attachment #656035 - Flags: review?(cbiesinger) → review+
Comment on attachment 656242 [details] [diff] [review]
[2/2] Clamp quality factor ('q') values to 3 decimal places (v6)

+  // Save old value of preference for later.
+  let oldPref = intlPrefs.getCharPref( "accept_languages" );

please don't add those spaces around "accept_languages". I'll note you didn't use that style in getBranch, either.

also you don't really need to store the old value; you can just call clearUserPref at the end to reset to the default

+  for( let i = 0; i < acceptedLanguagesLength; i++ ) {

Codestyle is generally for (let i = ...; ...; i++) (note the spaces)

+    if( i == 0 ) {

similarly, if (i == 0) {

several occurrences of this, please fix all of them

+  chan.QueryInterface(Ci.nsIHttpChannel);
+  chan.requestMethod = "GET";

no need for these two lines
Attachment #656242 - Flags: review?(cbiesinger) → review+
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #33)
> Comment on attachment 656242 [details] [diff] [review]
> [2/2] Clamp quality factor ('q') values to 3 decimal places (v6)
> 
> +  // Save old value of preference for later.
> +  let oldPref = intlPrefs.getCharPref( "accept_languages" );
> 
> please don't add those spaces around "accept_languages". I'll note you
> didn't use that style in getBranch, either.
> 
> +  for( let i = 0; i < acceptedLanguagesLength; i++ ) {
> 
> Codestyle is generally for (let i = ...; ...; i++) (note the spaces)
> 
> +    if( i == 0 ) {
> 
> similarly, if (i == 0) {
> 
> several occurrences of this, please fix all of them

Sorry, I let my personal coding style seep in at some points. Fixed now.

> also you don't really need to store the old value; you can just call
> clearUserPref at the end to reset to the default

Well, I grab it in order to test it, and then I put it last in the list to save the step of restoring/resetting it.

> +  chan.QueryInterface(Ci.nsIHttpChannel);
> +  chan.requestMethod = "GET";
> 
> no need for these two lines

Deleting the first line prevents the test from running. But I can delete the second line safely, it seems, so I've done that.
Attachment #656242 - Attachment is obsolete: true
Attachment #656628 - Flags: review+
Keywords: checkin-needed
Unfortunately, this got caught by a recent change to make netwerk/ subject to warnings-as-errors. Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/2855dfc9e789

https://tbpl.mozilla.org/php/getParsedLog.php?id=14823227&tree=Mozilla-Inbound

../../../../netwerk/protocol/http/nsHttpHandler.cpp: In function 'nsresult PrepareAcceptLanguages(const char*, nsACString_internal&)':
../../../../netwerk/protocol/http/nsHttpHandler.cpp:1271:32: error: deprecated conversion from string constant to 'char*'
../../../../netwerk/protocol/http/nsHttpHandler.cpp:1274:32: error: deprecated conversion from string constant to 'char*'
Flags: in-testsuite+
Target Milestone: mozilla18 → ---
This likely just needs to replace:
char *qval_str;
With:
const char *qval_str;
Bit of a hiccup on inbound because of the following error:

../../../../netwerk/protocol/http/nsHttpHandler.cpp: In function 'nsresult PrepareAcceptLanguages(const char*, nsACString_internal&)':
../../../../netwerk/protocol/http/nsHttpHandler.cpp:1271:32: error: deprecated conversion from string constant to 'char*'
../../../../netwerk/protocol/http/nsHttpHandler.cpp:1274:32: error: deprecated conversion from string constant to 'char*'

Changed to 'const char*' per suggestion from jgilbert. Thanks to RyanVM for his patience. :)

Try Results:
https://tbpl.mozilla.org/?tree=Try&rev=766c2ef51437
Attachment #656628 - Attachment is obsolete: true
Attachment #656732 - Flags: review+
Keywords: checkin-needed
(In reply to Gordon P. Hemsley [:gphemsley] from comment #38)
> https://tbpl.mozilla.org/?tree=Try&rev=766c2ef51437

Green on Try (f'realz this time).

https://hg.mozilla.org/integration/mozilla-inbound/rev/963664f25295
https://hg.mozilla.org/integration/mozilla-inbound/rev/1418f9ce6f8b
Flags: in-testsuite+
Keywords: checkin-needed
Summary: Clamp quality factor ('q') values to 3 decimal places → Clamp quality factor ('q') values to 2 decimal places
https://hg.mozilla.org/mozilla-central/rev/963664f25295
https://hg.mozilla.org/mozilla-central/rev/1418f9ce6f8b
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in before you can comment on or make changes to this bug.