Closed Bug 132957 Opened 23 years ago Closed 22 years ago

Implement full support for Vary header [was: Content negotiation prohibits caching]

Categories

(Core :: Networking: HTTP, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.4final

People

(Reporter: usch2000, Assigned: darin.moz)

References

()

Details

(Keywords: topembed+, Whiteboard: [http/1.1])

Attachments

(1 file, 3 obsolete files)

From Bugzilla Helper: User-Agent: Mozilla/5.0 (Windows; U; Win95; en-US; rv:0.9.9+) Gecko/20020322 BuildID: 2002032203 Documents that come with a "Vary: accept-charset" or "Vary: accept-language" response header are never cached, even if they have proper "Expires:", "Last-Modified:" and "Cache-Control:" headers. When the very same document is delivered without the "Vary:" header, caching works as expected. Reproducible: Always Steps to Reproduce: 1.Go to the URL given 2.Examine the cache related items in the "Page Info" box 3.Examine the "Cache entry information" page Actual Results: Document is marked as "not chached", "no expiration set", "size unknown". Expected Results: Document should be cached until it expires. It may expire early if the user changes the charset and/or language preferences, as indicated by the "vary:" response header. Caching negotiated documents did work in Version 0.9.5. It does not work in 0.9.8 and 0.9.9. I didn't try 0.9.6 and 0.9.7.
To HTTP. This is the "solution" we decided on in bug 37609....
Assignee: gordon → darin
Status: UNCONFIRMED → NEW
Component: Networking: Cache → Networking: HTTP
Ever confirmed: true
OS: Windows 95 → All
Hardware: PC → All
right, time to roll-up-the-sleeves and fix how we handle the Vary header.
Priority: -- → P3
Whiteboard: [http/1.1
Target Milestone: --- → mozilla1.1alpha
Status: NEW → ASSIGNED
Whiteboard: [http/1.1 → [http/1.1]
So we seem to have two separate problems here: 1. Page Info displays "not cached" although the page actually IS in cache. I submitted this as new bug 133015. 2. Handling of negotiated pages is not efficient as it could be, but fortunately not as bad as the page info indicates. The proper solution would be to store the request headers indicated by "Vary:" along with the cache entry, and revalidate only if they don't match the current set of headers, or if the entry is stale for some other reason.
Please assigned this bug to me.i'm sure i can fix it.
this bug is due to,mozilla cache page info in disk cache,but when mozilla return entry,the page's expires time is less than current time,then mozilla will clean the cache entry and don't retrun it.i will research it,i think i can avoid time's estimation.mozilla really cache the page info in disk cache.
Antonio, are you talking about fixing page info? Or about fixing the handling of the "Vary" header in the cache in general?
Target Milestone: mozilla1.1alpha → ---
mass futuring of untargeted bugs
Target Milestone: --- → Future
Blocks: 65092
Summary: Content negotiation prohibits caching → Implement full support for Vary header [was: Content negotiation prohibits caching]
darin, is this something we should have for http 1.1 support?
saari: yes and no. our current impl results in more server hits then probably necessary. however, server hits at least give websites control. i'd rate this low priority as far as HTTP/1.1 compliance bugs go. however, the fact that we don't fully support the Vary header most likely discourages sites from utilizing it, and from a standards point of view that's bad.
*** Bug 191781 has been marked as a duplicate of this bug. ***
this really should be fixed. we can fix it easily for: Vary: Accept-Language Vary: Accept-Charset fixing it for Vary: * is going to take a little more work. in the case of AL and AC, we can just check if our preferences have changed since the cache entry was filled. if so, then we can re-request the document. otherwise, we can just ignore the vary header. in the general case, we'd need to store all request headers. though, i suppose that could be done only when "Vary: *" appears in a cached response. incidentally devedge.netscape.com serves up most content with "Vary: Accept-Language", which triggers this bug in a severe way.
Severity: normal → major
Keywords: nsbeta1, topembed
Priority: P3 → P2
Target Milestone: Future → mozilla1.4alpha
plussing this for topembed, but recommend that we break out the "Vary: *" support into another bug as it does not appear to be as pervasive.
Keywords: topembedtopembed+
-> suresh
Assignee: darin → suresh
Status: ASSIGNED → NEW
moving to mozilla 1.4beta.
Status: NEW → ASSIGNED
Target Milestone: mozilla1.4alpha → mozilla1.4beta
Attached patch patch. (obsolete) — Splinter Review
Attachment #121556 - Flags: superreview?(darin)
Comment on attachment 121556 [details] [diff] [review] patch. >Index: nsHttp.h >+// helper fn to parse a header delimited by commas. >+static inline >+nsresult GetHeaderArray(const char *aHeader, nsCStringArray &aHeaderList) >+{ no reason why this can't be non-inline and defined in nsHttp.cpp instead :-) building up the full list of header values seems overkill to me. i think it'd better to try to just extract the next element of the comma delimited header value. the code that wishes to iterate over the list of Vary header values could do this instead: nsCAutoString buf; mResponseHead->GetHeader(nsHttp::Vary, buf); char *val = (char *) buf.get(); // going to wack |buf| char *tok = nsCRT::strtok(val, NS_HTTP_HEADER_SEPS, &val); while (tok) { // use |tok| // ... tok = nsCRT::strtok(val, NS_HTTP_HEADER_SEPS, &val); } it is probably sufficient to declare NS_HTTP_HEADER_SEPS like so: #define NS_HTTP_HEADER_SEPS ", \t" put this in nsHttp.h >Index: nsHttpChannel.cpp >+ for (PRInt32 i = 0; i < count; i++) { >+ varyheaderarray.CStringAt(i, element); >+ if (!PL_strcasestr(element.get(), "accept-language") && >+ !PL_strcasestr(element.get(), "accept-charset")) { strcasestr seems wrong. seems like you should be looking for string matches. so, strcasecmp would be better, assuming whitespace was properly trimmed. >+ processVary = PR_FALSE; >+ break; >+ } >+ } this seems to be overly costly. you are generating copies of each element of the header value and storing them in the array object. then you are copying each value out into the auto string |element|. first of all, i don't think there is any need to build up the array. instead, you could have a function that simply extracts the next element of the header string. >+ >+ if (!processVary) >+ return PR_FALSE; >+ >+ // if the value of any of the Vary headers has been changed, we must re-request >+ // the document. see bug 132957. Right now we just honor AL and/or AC headers. >+ for (PRInt32 j = 0; j < count; j++) { >+ varyheaderarray.CStringAt(j, element); >+ if (PL_strcasestr(element.get(), "accept-language")) { >+ nsXPIDLCString cachedal; >+ mCacheEntry->GetMetaDataElement("accept-language", getter_Copies(cachedal)); >+ if (!cachedal.IsEmpty()) { >+ nsCAutoString requestval; >+ mRequestHead.GetHeader(nsHttp::Accept_Language, requestval); >+ if (!requestval.Equals(cachedal)) avoid string copy... const char *requestval = mRequestHead.PeekHeader(nsHttp::Accept_Language); if (strcmp(cachedal.get(), requestval) != 0) same nit in the case of "accept-charset" >+ else if (ProcessVaryHeaders()) { >+ LOG(("Validating based on Vary headers returning TRUE\n")); >+ doValidation = PR_TRUE; nit: ProcessVaryHeader is too generic a name. maybe something more specific, like... ResponseWouldVary ??? >+ // Store Vary headers iff it contains AL and/or AC. >+ const char *val = mResponseHead->PeekHeader(nsHttp::Vary); >+ if (val) { ... >+ rv = GetHeaderArray(val, varyheaderarray); ... >+ for (PRInt32 i = 0; i < count; i++) { ... >+ } >+ >+ if (storeVary) { >+ //if we come here => the Vary header has either AL or AC or both. >+ for (PRInt32 j = 0; j < count; j++) { ... >+ } >+ } >+ } the above section of code seems more complicated then it needs to be. how about this instead: const char *vary = mResponseHead->PeekHeader(nsHttp::Vary); if (PL_strcasestr(vary, "accept-language")) { const char *al = mRequestHead.PeekHeader(nsHttp::Accept_Language); mCacheEntry->SetMetaDataElement("accept-language", al); } if (PL_strcasestr(vary, "accept-charset")) { const char *ac = mRequestHead.PeekHeader(nsHttp::Accept_Charset); mCacheEntry->SetMetaDataElement("accept-charset", ac); } >Index: nsHttpResponseHead.cpp >+ if (val) { >+ nsCStringArray varyheaderarray; >+ nsresult rv = GetHeaderArray(val, varyheaderarray); >+ if (NS_SUCCEEDED(rv)) { >+ PRInt32 count = varyheaderarray.Count(); >+ for (PRInt32 i = 0; i < count; i++) { >+ nsCAutoString element; >+ varyheaderarray.CStringAt(i, element); >+ //if Vary header contains other than AL or AC return true >+ if (!PL_strcasestr(element.get(), "accept-language") && >+ !PL_strcasestr(element.get(), "accept-charset")) { >+ maybe the code for testing if Vary contains more than AL or AC should be factored into a helper function. probably it would make sense to make this helper function be a member function of nsHttpResponseHead. that way you can just call it from nsHttpChannel. my appologies for not getting to this code review sooner!!
Attachment #121556 - Flags: superreview?(darin) → superreview-
topembed-
Keywords: topembed+topembed-
Attached patch updated patch (obsolete) — Splinter Review
Attachment #121556 - Attachment is obsolete: true
Attachment #122765 - Flags: superreview?(darin)
Attached patch v2 patch (obsolete) — Splinter Review
revised patch. i started reviewing this patch, got thinking about how some things could be refactored, and then i realized that we could easily add full support for the Vary header instead of just partial support. here's what i came up with.
Attachment #122765 - Attachment is obsolete: true
Attachment #122765 - Attachment is obsolete: false
Attachment #122765 - Flags: superreview?(darin)
btw: there's an internal testcase here, http://unagi.mcom.com/tests/vary/test.cgi the document changes every time it is loaded from the server. it can be reused from the cache provided the Accept-Language and Accept-Charset request headers do not vary.
Attachment #122788 - Flags: superreview?(alecf)
Attachment #122788 - Flags: review?(suresh)
Comment on attachment 122788 [details] [diff] [review] v2 patch >Index: nsHttpResponseHead.cpp >=================================================================== ... >- PL_strcasestr(val, "accept-charset") || >- PL_strcasestr(val, "accept-language"))) { >- LOG(("Must validate based on \"%s\" header\n", val)); >+ if (val && PL_strcasestr(val, "cookie")) { >+ LOG(("Must validate since Vary contains \"Cookie\"\n")); > return PR_TRUE; hmm..don't we need to parse the individual tokens here and check for "cookie"? Also, how about the * case (Vary: *). I think this patch/bug doesn't deal with this. right?
Attached patch v2.1 patchSplinter Review
add back support for "vary: *" ...thx suresh for catching that! i've also added some more testcases under http://unagi/tests/vary/
Attachment #122765 - Attachment is obsolete: true
Attachment #122788 - Attachment is obsolete: true
Attachment #122810 - Flags: superreview?(alecf)
Attachment #122810 - Flags: review?(suresh)
renominating for topembed status. this bug causes poor performance on websites employing server side content negotation, which is used to deliver localized content based on the user's language preferences as set in the browser. devedge performs poorly with gecko because of this bug, and i think it would help our evangelism efforts if we supported the Vary header. the patch is low enough risk that i think it should be considered for 1.4 final. momoi might be able to give examples of sites that use the Vary header to drive content negotiation. i know google does some content negotiation, but i'm not sure that it makes use of this particular header (yet!)
Assignee: suresh → darin
Status: ASSIGNED → NEW
Flags: blocking1.4?
Keywords: topembed-topembed
Attachment #122788 - Flags: superreview?(alecf)
Attachment #122788 - Flags: review?(suresh)
Comment on attachment 122810 [details] [diff] [review] v2.1 patch sweet! looks good to me. Thanks darin!! r-suresh
Attachment #122810 - Flags: review?(suresh) → review+
adt: nsbeta1- If this makes it for Buffy that's fine. Not a stop ship bug.
Keywords: nsbeta1nsbeta1-
Status: NEW → ASSIGNED
Target Milestone: mozilla1.4beta → mozilla1.4final
Comment on attachment 122810 [details] [diff] [review] v2.1 patch + // enumerate the elements of the Vary header... + char *val = (char *) buf.get(); make this a NS_CONST_CAST - nsCRT::strtok is going to whack "val" to make "buf" contain embedded nulls, watch out! there's one more case of this, same thing sr=alecf with the cast.
Attachment #122810 - Flags: superreview?(alecf) → superreview+
alecf: sure, NS_CONST_CAST is it, and i'll add some comments about the fact that we are intentionally munging the nsCAutoString's buffer.
Comment on attachment 122810 [details] [diff] [review] v2.1 patch seeking drivers approval for 1.4 final.. see comment #23 for a description of why this is important. patch is relatively low risk.
Attachment #122810 - Flags: approval1.4?
Discussed in top embed bug triage. Plussing.
Keywords: topembedtopembed+
Comment on attachment 122810 [details] [diff] [review] v2.1 patch a=asa (on behalf of drivers) for checkin to 1.4.
Attachment #122810 - Flags: approval1.4? → approval1.4+
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Flags: blocking1.4?
Blocks: 61682
QA Contact: tever → networking.http
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: