Closed Bug 1274917 Opened 4 years ago Closed 3 years ago

Decide how to store original headers in cache

Categories

(Core :: Networking: Cache, defect)

48 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: dragana, Assigned: dragana)

References

Details

(Whiteboard: [necko-active])

Attachments

(1 file, 3 obsolete files)

nsHttpResponseHead now holds original headers as well (headers in the form as they arrived from the network)

Original headers and our header representation are stored in the same array with corresponding flags:
 - eVarietyResponseNetOriginalAndResponse - it is original header and our representation
 - eVarietyResponseNetOriginal - it is only original header
 - eVarietyResponse - it is only our representation.

We could: 
1) store the flag above into the cache
2) only store original header to the cache and parse them again as if they came from the network. In this way we will lose headers that are set for example by addons.
3) save both separately and try to merge them (this is more work)

I am more for version 1 maybe also version 2

We should also shortly think about what to do with requests partially served from network and partially from the cache.
Honza, suggestions?
Flags: needinfo?(honzab.moz)
Assignee: nobody → dd.mozilla
Whiteboard: [necko-active]
Can you please give me some examples of how can original and (by the code we control - not addons) modified headers look like?  Are we also adding our own headers (not just modifying) we want to keep in the HTTP cache?  One example that comes to my mind is e.g. the X-Firefox-Spdy header, but that is something not vital for the functionality, just a (not very well IMO) exposed info about the HTTP transaction.

Which form are we storing to the HTTP cache now?  How is it backward compatible?

Anyway, I think simply storing the flag along in a new (preferred) or updated existing metadata would do.  How will the old code that has no notion of header type varieties behave in this case?  Thinking of compatibility in both ways.
Flags: needinfo?(honzab.moz) → needinfo?(dd.mozilla)
Example: 

if headers from the network have only one "Link: something" header we will store it as eVarietyResponseNetOriginalAndResponse

if headers arriving from the network have 2 Link header firefox use to merge them and now it does it as well:
so we will have in the header list 3 headers:
header=link value=something variety = eVarietyResponseNetOriginal
header=link value=something2 variety = eVarietyResponseNetOriginal
header=link value=something,something2 variety = eVarietyResponse

If we have also "Location: somethingElse" it will be save as
header=location value=somethingElse variety = eVarietyResponseNetOriginalAndresponse

Our internal code still sees only header=link value=something,something2 and header=location value=somethingElse (I have not change that)

I have seen that the streamconverter adds Content-Type header sometime. (I had a quick look and have not seen other SetHeader()).
this will be added to the header list as eVarietyResponse.

Into cache now we are storing only headers that firefox sees (the same as before). From the example above we will store (without variety parameter):
header=link value=something,something2 variety = eVarietyResponse
and also
header=location value=somethingElse variety = eVarietyResponseNetOriginalAndresponse
header=content-type value=plain/text variety = eVarietyResponse

They are flatten without variety parameter at:
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#2729
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHeaderArray.cpp#327

I think (correct me if I am wrong) the headers are read from cache using SetHeaderFromNet so the above stored header will all get variety = eVarietyResponseNetOriginalAndresponse. And this bug should fix that.

So your suggestion?
We could store "variety" a well and if we do not have it we interpreter it as a header coming from the network?
Flags: needinfo?(dd.mozilla)
Thanks.  Now I understand how this all works.

So, I'm for adding a new metadata "net-response-headers" (or alike) and set it a string flatten from only the eVarietyResponseNetOriginal headers (actually those you now skip at [1]).

When loading back, see if the meta is there.  

If not, we are loading the old cache entry and have to behave the same way as before this bug ever fixed (just go with eVarietyResponseNetOriginalAndResponse for all the headers unflatten from the cache, there is nothing better we could do).

If this new meta is there, then do the following:
- unflatten the net headers to a temp array
- walk one by one:
 - if the header already exists in the response head
  - and the value is the same 
   => skip this header (the existing one already has eVarietyResponseNetOriginalAndresponse, OTOH, this should probably never happen, as I'm thinking of it?)
  - and the value is different 
   => change variety on the existing header to eVarietyResponse and add the net header with eVarietyResponseNetOriginal
 - if the header doesn't exist in the response head
  => add it as eVarietyResponseNetOriginal (question is, can this ever happen too?)

This way you are mostly compatible.  It may break when headers are updated by an older necko version and the net-response-headers being possibly out-of-sync.  Though, that is not very likely and I would not bother about it.  This is no critical info we are trying to hold here, it's just for devtools.


[1] https://hg.mozilla.org/mozilla-central/annotate/5511d54a3f17/netwerk/protocol/http/nsHttpHeaderArray.cpp#l334
Attached patch bug_1274917_v1.patch (obsolete) — Splinter Review
Original header stored in the cache will not be updated. If devtools decide that they want something else we can change that behavior.
Attachment #8757840 - Flags: review?(honzab.moz)
(In reply to Dragana Damjanovic [:dragana] from comment #6)
> Created attachment 8757840 [details] [diff] [review]
> bug_1274917_v1.patch
> 
> Original header stored in the cache will not be updated. If devtools decide
> that they want something else we can change that behavior.

I'll ask again - please describe the patch, what id does, what it changes, how.  Please add reasonable level of details so I don't have to try hard to follow the code changes myself.  Thanks.
(In reply to Honza Bambas (:mayhemer) from comment #7)
> (In reply to Dragana Damjanovic [:dragana] from comment #6)
> > Created attachment 8757840 [details] [diff] [review]
> > bug_1274917_v1.patch
> > 
> > Original header stored in the cache will not be updated. If devtools decide
> > that they want something else we can change that behavior.
> 
> I'll ask again - please describe the patch, what id does, what it changes,
> how.  Please add reasonable level of details so I don't have to try hard to
> follow the code changes myself.  Thanks.

Added new meta data entry "orignal-response-head" (simple flatten "header: value \r\n...", similar to old response header)
It will be set on a new entry, it will not be changed if headers are updated after 304 or partial content 206)

nsresult Parse(char *block); change to nsresult ParseCachedHead(char *block); because it is a more clear name.
and nsresult ParseCachedOriginalHeader(char *block); is added.

When we are parsing meta data, original headers, if present, are parsed first so that we can keep there proper order. and they are added to header array as eVarietyResponseNetOriginal.
After original header, the old response header are parsed (in the same way as before) and added to the header array as ea VarietyResponse if there is not a corresponding eVarietyResponseNetOriginal header already or as eVarietyResponseNetOriginalAndResponse if a header is already present.
Comment on attachment 8757840 [details] [diff] [review]
bug_1274917_v1.patch

Review of attachment 8757840 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't check the test.

mainly naming issues.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +3357,5 @@
>  
>      // Get the cached HTTP response headers
> +    mCachedResponseHead = new nsHttpResponseHead();
> +
> +    // Get original headers first, because we need to keep there in order.

"keep there in order" ?

@@ +3358,5 @@
>      // Get the cached HTTP response headers
> +    mCachedResponseHead = new nsHttpResponseHead();
> +
> +    // Get original headers first, because we need to keep there in order.
> +    rv = entry->GetMetaDataElement("orignal-response-head", getter_Copies(buf));

it's not the whole "head", it's just headers.  the name must reflect that.

@@ +3368,5 @@
>      rv = entry->GetMetaDataElement("response-head", getter_Copies(buf));
>      NS_ENSURE_SUCCESS(rv, rv);
>  
>      // Parse the cached HTTP response headers
> +    rv = mCachedResponseHead->ParseCachedHead((char *) buf.get());

maybe update the comment above this method

@@ +4588,5 @@
>      responseHead->Flatten(head, true);
>      rv = entry->SetMetaDataElement("response-head", head.get());
>      if (NS_FAILED(rv)) return rv;
> +    head.Truncate();
> +    responseHead->FlattenOriginalHeader(head);

This has to be renamed to FlattenNetworkOriginalHeaders <- add s

::: netwerk/protocol/http/nsHttpHeaderArray.cpp
@@ +184,5 @@
> +    if (variety == eVarietyResponseNetOriginal) {
> +        return SetHeader_internal(header, value,
> +                                  eVarietyResponseNetOriginal);
> +    } else {
> +        uint32_t index = 0;

nsTArray::IndexOf returns size_t (use nsTArray<nsEntry>::index_type)

@@ +185,5 @@
> +        return SetHeader_internal(header, value,
> +                                  eVarietyResponseNetOriginal);
> +    } else {
> +        uint32_t index = 0;
> +        while (index != UINT32_MAX) {

The correct way is:

index != mHeaders.NoIndex

@@ +190,5 @@
> +            index = mHeaders.IndexOf(header, index, nsEntry::MatchHeader());
> +            if (index != UINT32_MAX) {
> +                nsEntry &entry = mHeaders[index];
> +                if (value.Equals(entry.value)) {
> +                    MOZ_ASSERT(entry.variety == eVarietyResponseNetOriginal);

something tells me to extend this assert to also accept eVarietyResponseNetOriginalAndResponse

::: netwerk/protocol/http/nsHttpResponseHead.cpp
@@ +553,4 @@
>  }
>  
>  nsresult
> +nsHttpResponseHead::ParseHeaderLine_locked(const char *line, bool fromCache)

the argument "fromCache" is wrongly named IMO... 
it should more be reflecting what kind of header we are parsing here.  there can be processed cached headers or original network raw headers.  this all effort is about special casing network raw headers, so the arg should be about that special case - bool originalNetwork or so (and revert the logic in the method).

::: netwerk/protocol/http/nsHttpResponseHead.h
@@ +85,5 @@
>  
>      // parse flattened response head. block must be null terminated. parsing is
>      // destructive.
> +    nsresult ParseCachedHead(char *block);
> +    nsresult ParseCachedOriginalHeader(char *block);

both need comments what they do and also a big fat note in what order they are expected to be called!

ParseCachedOriginalHeaders <- add s

but even better name would be ParseOriginalNetworkHeaders to express what it does more precisely


I would remove the word "Cached" because I believe these might be used not just for something we loaded from cache, but generally anything that's been produced with Flatten.  _Those_ names should mirror.

::: netwerk/test/unit/test_original_sent_received_head.js
@@ +7,5 @@
>  //  The "original header" is introduced to hold the header array in the order
>  //  and the form as they have been received from the network.
>  //  Here, the "original headers" are tested.
>  //
> +//  Original headers will be store in the cache as well. This test checks

stored
Attachment #8757840 - Flags: review?(honzab.moz) → feedback+
Comment on attachment 8757840 [details] [diff] [review]
bug_1274917_v1.patch

Review of attachment 8757840 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpHeaderArray.cpp
@@ +185,5 @@
> +        return SetHeader_internal(header, value,
> +                                  eVarietyResponseNetOriginal);
> +    } else {
> +        uint32_t index = 0;
> +        while (index != UINT32_MAX) {

and maybe do { } while (index != NoIndex); would do better here?
Attached patch bug_1274917_v2.patch (obsolete) — Splinter Review
Attachment #8757840 - Attachment is obsolete: true
Attachment #8758174 - Flags: review?(honzab.moz)
I kept "Cached" in the names because these functions are only used for reading from the cache and since they do something specific to the cache I would keep them in the name. If they start to be used for other stuff we can change the name.

Flatten... are use for other stuff too.
Comment on attachment 8758174 [details] [diff] [review]
bug_1274917_v2.patch

Review of attachment 8758174 [details] [diff] [review]:
-----------------------------------------------------------------

r+ but please only with all the comments and typos fixed.  thanks.

::: netwerk/protocol/http/nsHttpHeaderArray.cpp
@@ +184,5 @@
> +    if (variety == eVarietyResponseNetOriginal) {
> +        return SetHeader_internal(header, value,
> +                                  eVarietyResponseNetOriginal);
> +    } else {
> +        size_t index = 0;

s/size_t/nsTArray<nsEntry>::index_type

::: netwerk/protocol/http/nsHttpResponseHead.h
@@ +85,3 @@
>  
> +    // The next 2 functions parse flattened response head and original net headers.
> +    // They are use when we are reading an entry from the cache.

They are used

@@ +86,5 @@
> +    // The next 2 functions parse flattened response head and original net headers.
> +    // They are use when we are reading an entry from the cache.
> +    //
> +    // To keep proper order of the original headers we MUST call
> +    // ParseCachedOriginalHeader FIRST and than ParseCachedHead.

and then

thinking of putting them to a single method with two args (where the network orig can be null).  or having an assertion on call order.  but both are just nice to have ;)

@@ +88,5 @@
> +    //
> +    // To keep proper order of the original headers we MUST call
> +    // ParseCachedOriginalHeader FIRST and than ParseCachedHead.
> +    //
> +    //block must be null terminated. parsing is destructive.

// block

@@ +90,5 @@
> +    // ParseCachedOriginalHeader FIRST and than ParseCachedHead.
> +    //
> +    //block must be null terminated. parsing is destructive.
> +    nsresult ParseCachedHead(char *block);
> +    nsresult ParseCachedOriginalHeader(char *block);

ParseCachedOriginalHeaders
Attachment #8758174 - Flags: review?(honzab.moz) → review+
Attached patch bug_1274917_v2.patch (obsolete) — Splinter Review
Attachment #8758174 - Attachment is obsolete: true
Attachment #8758712 - Flags: review+
A typo.
Attachment #8758712 - Attachment is obsolete: true
Attachment #8758713 - Flags: review+
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c2ac62acf83
Add net original headers to cache.r=mayhemer
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/3c2ac62acf83
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
The code introduced by this bug appears to have resulted in a change in the histograms NETWORK_CACHE_METADATA{_FIRST_READ}_SIZE [1] [2]. The median value has increased from 2.45K to 2.75K since this code's introduction [3].

Is this change expected? Is this to be the new normal? Or is it a regression?

[1] http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/1166/alerts/?from=2016-06-02&to=2016-06-02
[2] http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/1165/alerts/?from=2016-06-02&to=2016-06-02
[3] http://mzl.la/1tc4C9s
Flags: needinfo?(dd.mozilla)
(In reply to Chris H-C :chutten from comment #19)
> The code introduced by this bug appears to have resulted in a change in the
> histograms NETWORK_CACHE_METADATA{_FIRST_READ}_SIZE [1] [2]. The median
> value has increased from 2.45K to 2.75K since this code's introduction [3].
> 
> Is this change expected? Is this to be the new normal? Or is it a regression?
> 
> [1]
> http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/1166/
> alerts/?from=2016-06-02&to=2016-06-02
> [2]
> http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/1165/
> alerts/?from=2016-06-02&to=2016-06-02
> [3] http://mzl.la/1tc4C9s

This should be the new normal. For backwards compatibility this was the best way.
Thanks.
Flags: needinfo?(dd.mozilla)
We added a new metadata original-response-headers which are headers in the form received fro the network.
You need to log in before you can comment on or make changes to this bug.