Closed Bug 468426 Opened 16 years ago Closed 14 years ago

Do not force validation when "Vary: Cookie" is in cached response and no Cookie header is in the request.

Categories

(Core :: Networking: HTTP, enhancement)

x86
All
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b8

People

(Reporter: sylvain.pasche, Assigned: bjarne)

References

Details

(Keywords: perf)

Attachments

(1 file, 9 obsolete files)

Let's start with an example:

 * Open http://www.djangoproject.com/
 * Click on Documentation
 * Click on Home

With Firefox, clicking on Home will trigger a network request.  Now try with Google Chrome, Opera or Safari.  Clicking on Home uses the cache and no network request is made.

The response header contains:
 Expires: Mon, 08 Dec 2008 11:45:39 GMT
 Vary: Cookie
 Last-Modified: Mon, 08 Dec 2008 11:40:39 GMT
 ETag: "54620b91cbe5f0a1df2a9d8d65e01a16"
 Cache-Control: max-age=300

So if you have no Cookie for that domain the cached version of that entity should be used without validation.

The reason is explained here:
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/src/nsHttpChannel.cpp?mark=1182-1195#1178

That's unfortunate because if you build a website that uses Cookie based authentication and want to use some caching, you will force every page to be validated even if you are not logged in (which is the case for most of the users).  Here's the frequency of that header from the Opera MAMA project:
http://dev.opera.com/articles/view/mama-http-headers/#vary

What could be done?

 * Hash the vary headers (including Cookie) instead of storing them in the cache.  That's what Chromium does.
 * Maybe more simple: Store the Cookie header in the cache without its value (so that there is no size or privacy issue).  When validating, do not require validation if the request has no Cookie header.
I'll grab this as a followup to bug #510359. See also bug #94123.
Assignee: nobody → bjarne
Suggested approach - see numerous TODOs in the patch.
+                                       Unfortunately we cannot store the
+                // cookie-values in the cache

Why not?


Your approach of storing this mapping in the HTTP handler doesn't survive restarts, so that's not really very useful...
(I'm thinking of nsICacheEntryDescriptor::setMetaDataElement)
I'm really just reading the comment in nsHttpChannel::ResponseWouldVary(). :) I couldn't find any log or discussion about this, but the comment is clear enough...

If we can use nsICacheEntryDescriptor::setMetaDataElement, life becomes a lot simpler...
You could do what comment 0 of this bug suggested and store a hash of the value.
Mmmm...  was that was what he meant? I misinterpreted the comment to be "store cookie-values in a (volatile) hashtable, not in the cache"...  But a proper cryptographic hash of the values could be stored safely in the cache, yes.

Would the functions in nsICryptoHash be suitable for this, or are there other, preferred functions in necko for this?
Yeah my comment could be misinterpreted. I meant to store the Cookie hash in the cache in a persistent way.
Well, good thing biesi caught my confusion. :)  I'll change it to use setMetaDataElement.

What about the strict way to compare cookies (including all attributes)? Do you think it will represent a problem in practice, or should I add a method to nsICookieService returning a string of cookies with only the attributes needed to compare them (and probably also listing cookies in a well-defined order)?
I think it's better not to try to be too smart here, also because extensions can set cookie headers on the channel (that the cookie service then wouldn't know about).
Right. But from the channel I can (afaics) only retrieve a string representing all cookies, including all attributes. The question is whether this string should be used as is, or if we should try to get a string containing only relevant attributes for comparison (in most cases a sub-set of the string we can obtain from the channel).

IMO, it will be smart to delegate generation of such a string to the cookie server which has access to parsed cookies and all the code for parsing cookie-strings anyway, (And from a design point-of-view, the cookie server may be the natural place to decide what is relevant when comparing two cookies for equality.)

The patch requires equality in all attributes and values of all cookies in order to re-use a cached variant of the url. This approach will not erroneously use a cached variant, and may work fine in practice, but I don't have enough experience with how cookies are used in real life to tell. Maybe we should try the strict approach and leave the rest as an optimization for later?
Yeah, I think that's fine. Anyway, RFC 2616 doesn't allow "smart" comparisons of headers for Vary:

   The selecting request-headers from two requests are defined to match
   if and only if the selecting request-headers in the first request can
   be transformed to the selecting request-headers in the second request
   by adding or removing linear white space (LWS) at places where this
   is allowed by the corresponding BNF, and/or combining multiple
   message-header fields with the same field name following the rules
   about message headers in section 4.2.
I agree that this is pretty clear, yes. However, I googled for "two cookies equal" and found that most software seems to use a subset of attributes when comparing, like name, value, domain and path.

Let's stick to RFC 2616 at the moment and maybe at some point later, someone will complain about us being too strict and we can reconsider. :)
This approach stores the MD5 hash of the cookie-values as metadata in the cache-entry.
Attachment #398774 - Attachment is obsolete: true
Attachment #412866 - Flags: review?(cbiesinger)
Not sure what happened, but the format in the previous patch toasted the try-server (applied nicely locally, though...)  This one should be ok.
Attachment #412866 - Attachment is obsolete: true
Attachment #412952 - Flags: review?(cbiesinger)
Attachment #412866 - Flags: review?(cbiesinger)
Comment on attachment 412952 [details] [diff] [review]
Updated to a proper patch-format...

This one only contains nsHttpChannel.h?
Attached patch Now contains it all... (obsolete) — Splinter Review
Jeez...  I must be getting too old for working late!
Here's the content...
Attachment #412952 - Attachment is obsolete: true
Attachment #413294 - Flags: review?(cbiesinger)
Attachment #412952 - Flags: review?(cbiesinger)
Comment on attachment 413294 [details] [diff] [review]
Now contains it all...

+                    // value for this header in cache, but no value in request
+                    if (!newVal) return PR_TRUE; // yes - response would vary

Interesting. So the old code screwed up this case?

Could you put the return statement on its own line, so that it's possible to set breakpoints on it?

+                        const char *hash = MD5Hash(newVal);

you're accessing free'd memory here. You need to add an out parameter to MD5Hash and pass an nsCAutoString or something to it.

+                            return PR_TRUE; // yes, response would vary

incorrect indentation

+            if (*token == '*') {
+                return PR_TRUE; // if we encounter this, just get out of here
             }
             else {

So now that you have an early return here, please remove the else so that you can avoid an indentation level for the rest of the block

+        // We will only use MD5

that doesn't seem like an informative comment

btw, any reason to use MD5 instead of a more secure hash (SHA-1)?

+    rv = mHasher->Update((unsigned char*)buf, strlen(buf));

please use C++-style casts

+    return hashString.get();

as mentioned above, you can't do that.

+                            // If hash failed, don't store anything

wouldn't it better to store _something_, so that we don't incorrectly assume that there was no cookie?
Attachment #413294 - Flags: review?(cbiesinger) → review-
(In reply to comment #18)
> (From update of attachment 413294 [details] [diff] [review])
> +                    // value for this header in cache, but no value in request
> +                    if (!newVal) return PR_TRUE; // yes - response would vary
> 
> Interesting. So the old code screwed up this case?

I think it would be caught by "if (newVal...)" and the break-statement, no?

> Could you put the return statement on its own line, so that it's possible to
> set breakpoints on it?

Ok.

> +                        const char *hash = MD5Hash(newVal);
> 
> you're accessing free'd memory here. You need to add an out parameter to
> MD5Hash and pass an nsCAutoString or something to it.

Of-course...  my bad!

> +                            return PR_TRUE; // yes, response would vary
> 
> incorrect indentation

Ok.

> +            if (*token == '*') {
> +                return PR_TRUE; // if we encounter this, just get out of here
>              }
>              else {
> 
> So now that you have an early return here, please remove the else so that you
> can avoid an indentation level for the rest of the block

Ok.

> 
> +        // We will only use MD5
> 
> that doesn't seem like an informative comment
> 
> btw, any reason to use MD5 instead of a more secure hash (SHA-1)?

Not really.  In fact, using SHA1 is probably better. I've changed the function to reflect this. (And removed the comment...)

> +    rv = mHasher->Update((unsigned char*)buf, strlen(buf));
> 
> please use C++-style casts

Ok. (But they are so looong... :)  )

> +    return hashString.get();
> 
> as mentioned above, you can't do that.

Right...

> +                            // If hash failed, don't store anything
> 
> wouldn't it better to store _something_, so that we don't incorrectly assume
> that there was no cookie?

Good point. That would handle any cases where this failure occurs and subsequent requests do not set this cookie (resulting in the cached entry being used). Will fix...
Attached patch Updated with latest comments (obsolete) — Splinter Review
Attachment #413294 - Attachment is obsolete: true
Attachment #413469 - Flags: review?(cbiesinger)
(In reply to comment #19)
> (In reply to comment #18)
> > (From update of attachment 413294 [details] [diff] [review] [details])
> > +                    // value for this header in cache, but no value in request
> > +                    if (!newVal) return PR_TRUE; // yes - response would vary
> > 
> > Interesting. So the old code screwed up this case?
> 
> I think it would be caught by "if (newVal...)" and the break-statement, no?

No? The old code assumed that validation was not necessary if oldVal != NULL but newVal == NULL. Right?
Attachment #413469 - Flags: review?(cbiesinger) → review-
Comment on attachment 413469 [details] [diff] [review]
Updated with latest comments

+                    if (NS_FAILED(rv)) return PR_TRUE;

please put the return on its own line here as well

+                    newVal = hash.get();

you still have the same problem. you need to move the nsCAutoString out of the if block.

+    unsigned char *tmp =
+        reinterpret_cast<unsigned char*>(const_cast<char *>(buf));

er, why are you casting away the const? the Update signature is:
  NS_SCRIPTABLE NS_IMETHOD Update(const PRUint8 *aData, PRUint32 aLen) = 0;

+                            nsCAutoString hash;

same problem - you need to move the string into the same scope as requestVal
Thanks for comment #22. Can only admit being guilty as charged... :)

(In reply to comment #21)
> No? The old code assumed that validation was not necessary if oldVal != NULL
> but newVal == NULL. Right?

You're right, of-course. That particular case would slip through using the old code...  new code is hopefully be safer in this respect.
Attachment #413469 - Attachment is obsolete: true
Attachment #414131 - Flags: review?(cbiesinger)
Comment on attachment 414131 [details] [diff] [review]
Updated according to latest comments

+    rv = mHasher->Finish(PR_TRUE, hash);

You need to call Init again after finish(). See http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsICryptoHash.idl#132

Also, this needs a unit test. Please add one.
Attachment #414131 - Flags: review?(cbiesinger) → review-
Attached patch Various updates (obsolete) — Splinter Review
Updated to latest comments by reviewer, and a unit-test has been added. The unit-test for bug #510359 is modified since it would fail with this fix.

Quite a bit of the logic has been fixed, including handling the fact that response-headers are named "Cookie" whereas request-headers are named "Set-Cookie", and making sure to clear metadata-values if the response do not contain a value for it.

If I have forgotten any cases in the unit-test, please let me know. It passes local tests - pushing to TryServer now...
Attachment #414131 - Attachment is obsolete: true
Attachment #414843 - Flags: review?(cbiesinger)
Comment on attachment 414843 [details] [diff] [review]
Various updates

+            // The response-header is named "Cookie", whereas the header
+            // we need to peek in the request is "Set-Cookie"
+            nsHttpAtom atom = nsHttp::ResolveAtom(token);
+            if (atom == nsHttp::Cookie)
+                atom = nsHttp::Set_Cookie;
+

Firstly, that's backwards. The request header is cookie, the response header is set-cookie.

Secondly, why does it matter what the response header is?

Thirdly, Set-Cookie can be and often is a subset of the Cookie header.
Thank god we have reviews...  :-0

> Secondly, why does it matter what the response header is?

I have misinterpreted the meaning of Vary: Cookie to take into account the cookies returned by the server! Jeez...   Reading it again, I see that it should only apply to cookies in subsequent requests.

Will fix asap...
Btw, wouldn't the last few lines of nsHttpChannel::AddCacheEntryHeaders() actually store cookies set by the server in cleartext in the cache...?
(In reply to comment #28)
> Btw, wouldn't the last few lines of nsHttpChannel::AddCacheEntryHeaders()
> actually store cookies set by the server in cleartext in the cache...?

Mmmm...  no. It probably only contains the first line, not the header-values?
It will not only store the first line, but there's a blacklist of headers not to include. See http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/src/nsHttpResponseHead.cpp#81
Attached patch Fixed my misinterpretation (obsolete) — Splinter Review
Here we go...
Attachment #414843 - Attachment is obsolete: true
Attachment #414851 - Flags: review?(cbiesinger)
Attachment #414843 - Flags: review?(cbiesinger)
Passes the TryServer afacs.
Comment on attachment 414851 [details] [diff] [review]
Fixed my misinterpretation

Just for the heck of it - changing reviewer and requesting approval for 2.0...

The patch implements proper (or at least way better) handling of Vary: Cookie headers, which we currently deal with in a rather harsh way. I can unbitrot it and verify it with tryserver if gets approved for 2.0
Attachment #414851 - Flags: review?(cbiesinger)
Attachment #414851 - Flags: review?(bzbarsky)
Attachment #414851 - Flags: approval2.0?
I'd really prefer for biesi or jduell to review this.
I'll change it when/if there is approval.
Comment on attachment 414851 [details] [diff] [review]
Fixed my misinterpretation

Changing reviewer
Attachment #414851 - Flags: review?(bzbarsky) → review?(cbiesinger)
Comment on attachment 414851 [details] [diff] [review]
Fixed my misinterpretation

For the two places where you hash the cookie header, please add a comment that the reason for doing it is for privacy reason & because they can be large (just like the old comment said)
Attachment #414851 - Flags: review?(cbiesinger) → review+
Carrying over r+ from previous version. Requesting approval for 2.0

Also pushed to tryserver since it's been some time...
Attachment #414851 - Attachment is obsolete: true
Attachment #491517 - Flags: review+
Attachment #491517 - Flags: approval2.0?
Attachment #414851 - Flags: approval2.0?
Comment on attachment 491517 [details] [diff] [review]
Unbitrotted, otherwise identical to previous

Seems to pass tryserver. Requesting approval...
Comment on attachment 491517 [details] [diff] [review]
Unbitrotted, otherwise identical to previous

This version of the patch doesn't test the condition, Bjarne says he'll attach a corrected version for approval.
Attachment #491517 - Flags: approval2.0? → approval2.0-
Looks like unit-test was lost somehow while unbitrotting previous version...  carried over r+ and requesting approval again
Attachment #491517 - Attachment is obsolete: true
Attachment #492649 - Flags: review+
Attachment #492649 - Flags: approval2.0?
Attachment #492649 - Flags: review+
Attachment #492649 - Flags: approval2.0? → approval2.0+
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/63c63bac58ec
Status: NEW → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: