Closed
Bug 1377223
Opened 7 years ago
Closed 7 years ago
RCWN - Should we revalidate when racing and the cache wins
Categories
(Core :: Networking, enhancement)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla56
Tracking | Status | |
---|---|---|
firefox56 | --- | fixed |
People
(Reporter: valentin, Assigned: michal)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-active])
Attachments
(1 file, 1 obsolete file)
6.07 KB,
patch
|
michal
:
review+
benjamin
:
review+
|
Details | Diff | Splinter Review |
In the case of immediate racing, where the network request is sent immediately after asyncOpen, we might get to a situation where the cache wins, but needs to revalidate the entry. Since the network has already been triggered, no conditional request is sent, so we have to wait for the initial one to complete. Another path would be to cancel the initial network request, and issue a second one. But since we only race for small resources, this might not be necessary. Also, another BUG is that we should not ignore the network response that comes back after the cache wins.
Comment 1•7 years ago
|
||
We had a plan in the past to add some info needed for the reval/don't reval decision to the cache index. I even believe there is a bug filed for this. If there is an ETag, you only need to wait for the headers. If the server responses with the same ETag, go from the cache (it will probably be faster than from the cache *)) I think the same applies when only the Last-Modified header is present for revalidation. If identical, you can go from cache. And there is always the trivial solution - ignore the cached content completely if it's small enough *), since it is very likely the content comes quickly after the headers from the net. *) depends... :D
Reporter | ||
Comment 2•7 years ago
|
||
> Also, another BUG is that we should not ignore the network response that
> comes back after the cache wins.
Incomplete description: the case in which we might ignore the network response is for partial cache entries - and the request would get stuck.
Comment 3•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #2) > > Also, another BUG is that we should not ignore the network response that > > comes back after the cache wins. > > Incomplete description: the case in which we might ignore the network > response is for partial cache entries - and the request would get stuck. Ah! Yes.. that's a problem. Maybe at the first stage of this feature evolution always use the network response when you find the cache entry being partial (for the sake of simplicity and according how many partial requests we usually have cached - I believe a very small amount?) The more correct solution IMO is to add some more info to the index, like L-M, ETag (which is tho an arbitrary length string), is-partial, and do early decision based on it.
Reporter | ||
Comment 4•7 years ago
|
||
Michal, do you think you could take this bug?
Flags: needinfo?(michal.novotny)
Assignee | ||
Updated•7 years ago
|
Assignee: valentin.gosu → michal.novotny
Flags: needinfo?(michal.novotny)
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #2) > > Also, another BUG is that we should not ignore the network response that > > comes back after the cache wins. > > Incomplete description: the case in which we might ignore the network > response is for partial cache entries - and the request would get stuck. How this can happen? I'm trying to reproduce the problem but it's IMO not possible. In case the cache wins and the network was already triggered, we ignore the cache entry. In case the cache wins and the network was not yet triggered, we set up the range request and trigger the network, but we're not racing (mRaceCacheWithNetwork == false) so the network response isn't ignored. Did I miss some other possibility?
Flags: needinfo?(valentin.gosu)
Assignee | ||
Comment 6•7 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #3) > The more correct solution IMO is to add some more info to the index, like > L-M, ETag (which is tho an arbitrary length string), is-partial, and do > early decision based on it. This would be nice but definitely not something that can be done for release 57. I don't like canceling already sent non-conditional request to be able to send a conditional one. We should probably get some telemetry to know how often this happens. We could also have some minimum delay time in MaybeRaceCacheWithNetwork, which would eliminate this problem in case we think the cache is slow but it isn't.
Reporter | ||
Comment 7•7 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #5) > (In reply to Valentin Gosu [:valentin] from comment #2) > In case the cache wins and the network was already triggered, we ignore the > cache entry. I think this case was the reason behind this bug. The scenario was: We race cache and network. Cache wins, so we set mFirstResponseSource = cache Cache entry is partial, so we try issue a request for the full content, but we don't, since we already triggered the network. Raced network request comes back, but we ignore it since mFirstResponseSource == cache. Right now I can't seem to figure out how we ignore the cache entry if the network was already triggered.
Flags: needinfo?(valentin.gosu) → needinfo?(michal.novotny)
Assignee | ||
Comment 8•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #7) > (In reply to Michal Novotny (:michal) from comment #5) > > (In reply to Valentin Gosu [:valentin] from comment #2) > > > In case the cache wins and the network was already triggered, we ignore the > > cache entry. > > I think this case was the reason behind this bug. > The scenario was: > We race cache and network. > Cache wins, so we set mFirstResponseSource = cache > Cache entry is partial, so we try issue a request for the full content, but > we don't, since we already triggered the network. > Raced network request comes back, but we ignore it since > mFirstResponseSource == cache. > > Right now I can't seem to figure out how we ignore the cache entry if the > network was already triggered. When the cache wins mFirstResponseSource = cache is set in ReadFromCache. When the cache entry is partial we first send the range request and we start to read from the cache entry in ProcessPartialContent() after the server responds. Honza, is is possible that both mCachedContentIsValid and mCachedContentIsPartial are true? I don't think so, OTOH this check would be then useless http://searchfox.org/mozilla-central/rev/a83a4b68974aecaaacdf25145420e0fe97b7aa22/netwerk/protocol/http/nsHttpChannel.cpp#747
Flags: needinfo?(michal.novotny) → needinfo?(honzab.moz)
Comment 9•7 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #8) > Honza, is is possible that both mCachedContentIsValid and > mCachedContentIsPartial are true? Yes, they can. In the most common case when we receive 206 and start reading from the cache. I don't follow what's the big deal here. From the nature of this feature, it's possible we send a non-range request for the content and later discover a cached entry that is partial. There are, IIUC, only two outcomes: - network response comes before OnCacheEntryCheck call (which means also before ReadFromCache()) -> we ignore any cached content - OnCacheEntryCheck comes before OnStartRequest from the transaction and only then we find out the entry is partial -> we ignore cache since we are about to get a complete request from the server (it's already on its way) I think it's OK for the first impl to it this way. Doing a second request (best via a redirect to itself, while the cache entry will be delivered synchronously to the target channel) is doable but IMO more only an optimization.
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 10•7 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #9) > (In reply to Michal Novotny (:michal) from comment #8) > > Honza, is is possible that both mCachedContentIsValid and > > mCachedContentIsPartial are true? > > Yes, they can. In the most common case when we receive 206 and start > reading from the cache. Sure, I meant before we sent the range request. > I don't follow what's the big deal here. From the nature of this feature, > it's possible we send a non-range request for the content and later discover > a cached entry that is partial. There are, IIUC, only two outcomes: > > - network response comes before OnCacheEntryCheck call (which means also > before ReadFromCache()) -> we ignore any cached content > - OnCacheEntryCheck comes before OnStartRequest from the transaction and > only then we find out the entry is partial -> we ignore cache since we are > about to get a complete request from the server (it's already on its way) > > I think it's OK for the first impl to it this way. Doing a second request > (best via a redirect to itself, while the cache entry will be delivered > synchronously to the target channel) is doable but IMO more only an > optimization. This confirms what I wrote in comment #5. I.e. no problem mentioned in comments #2 and #3 exists in the current code.
Comment 11•7 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #10) > (In reply to Honza Bambas (:mayhemer) from comment #9) > > (In reply to Michal Novotny (:michal) from comment #8) > > > Honza, is is possible that both mCachedContentIsValid and > > > mCachedContentIsPartial are true? > > > > Yes, they can. In the most common case when we receive 206 and start > > reading from the cache. > > Sure, I meant before we sent the range request. Then I think no. If they were, would it break? > > > I don't follow what's the big deal here. From the nature of this feature, > > it's possible we send a non-range request for the content and later discover > > a cached entry that is partial. There are, IIUC, only two outcomes: > > > > - network response comes before OnCacheEntryCheck call (which means also > > before ReadFromCache()) -> we ignore any cached content > > - OnCacheEntryCheck comes before OnStartRequest from the transaction and > > only then we find out the entry is partial -> we ignore cache since we are > > about to get a complete request from the server (it's already on its way) > > > > I think it's OK for the first impl to it this way. Doing a second request > > (best via a redirect to itself, while the cache entry will be delivered > > synchronously to the target channel) is doable but IMO more only an > > optimization. > > This confirms what I wrote in comment #5. I.e. no problem mentioned in > comments #2 and #3 exists in the current code. Good!
Assignee | ||
Comment 12•7 years ago
|
||
Valentin, note that I also changed the condition when calling Unused << ReadFromCache() in this patch, but it is just a cosmetic change. They should be identical but I think this is easier to understand at the first look.
Attachment #8888502 -
Flags: review?(valentin.gosu)
Attachment #8888502 -
Flags: review?(benjamin)
Assignee | ||
Comment 13•7 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #11) > (In reply to Michal Novotny (:michal) from comment #10) > > (In reply to Honza Bambas (:mayhemer) from comment #9) > > > (In reply to Michal Novotny (:michal) from comment #8) > > > > Honza, is is possible that both mCachedContentIsValid and > > > > mCachedContentIsPartial are true? > > > > > > Yes, they can. In the most common case when we receive 206 and start > > > reading from the cache. > > > > Sure, I meant before we sent the range request. > > Then I think no. If they were, would it break? Then it could break, but this would mean that ReadFromCache could be called before validating the partial content with the server. It doesn't make sense to me and AFAICS nsHttpChannel cannot handle this.
Reporter | ||
Comment 14•7 years ago
|
||
Comment on attachment 8888502 [details] [diff] [review] fix Review of attachment 8888502 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/protocol/http/nsHttpChannel.cpp @@ +9353,5 @@ > } > > + if (mRaceDelay < sRCWNMinWaitMs) { > + mRaceDelay = sRCWNMinWaitMs; > + } As discussed on IRC, let's wait for telemetry, and we can land the minimum wait in a separate bug. ::: toolkit/components/telemetry/Histograms.json @@ +2445,5 @@ > "alert_emails": ["necko@mozilla.com", "bsmedberg@mozilla.com"], > "bug_numbers": [1354405] > }, > + "NETWORK_RACE_CACHE_VALIDATION": { > + "record_in_processes": ["main", "content"], Since nsHttpChannel does not run in the content process, and we only accumulate there, is there any point in keeping the content process here?
Attachment #8888502 -
Flags: review?(valentin.gosu) → review+
Assignee | ||
Comment 15•7 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #14) > ::: toolkit/components/telemetry/Histograms.json > @@ +2445,5 @@ > > "alert_emails": ["necko@mozilla.com", "bsmedberg@mozilla.com"], > > "bug_numbers": [1354405] > > }, > > + "NETWORK_RACE_CACHE_VALIDATION": { > > + "record_in_processes": ["main", "content"], > > Since nsHttpChannel does not run in the content process, and we only > accumulate there, is there any point in keeping the content process here? No, I just copied another existing probe. Recording in content and main to all existing probes was added in bug 1335343. I will remove it.
Assignee | ||
Comment 16•7 years ago
|
||
Attachment #8888502 -
Attachment is obsolete: true
Attachment #8888502 -
Flags: review?(benjamin)
Attachment #8888556 -
Flags: review+
Assignee | ||
Updated•7 years ago
|
Attachment #8888556 -
Flags: review?(benjamin)
Assignee | ||
Comment 17•7 years ago
|
||
Comment on attachment 8888556 [details] [diff] [review] patch v2 - removed the minimum delay Could we get the data review today, please? We need to land this before a shield study starts on Monday.
Attachment #8888556 -
Flags: feedback?(francois)
Comment 18•7 years ago
|
||
Comment on attachment 8888556 [details] [diff] [review] patch v2 - removed the minimum delay data-r=me - I apologize for not getting to this Friday
Attachment #8888556 -
Flags: review?(benjamin)
Attachment #8888556 -
Flags: review+
Attachment #8888556 -
Flags: feedback?(francois)
Comment 19•7 years ago
|
||
Pushed by mnovotny@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/69e573abaccb RCWN - Should we revalidate when racing and the cache wins, r=valentin, data-r=bsmedberg
Comment 20•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/69e573abaccb
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox56:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in
before you can comment on or make changes to this bug.
Description
•