Closed Bug 1395760 Opened 5 years ago Closed 4 years ago

Stop serving partially cached content in offline mode

Categories

(Core :: Networking: Cache, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: janakaud+mozilla, Assigned: mayhemer)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files)

User Agent: Mobile Safari
Build ID: 20170524100215

Steps to reproduce:

Loaded the page at https://logging.apache.org/log4j/2.x/manual/async.html (or refreshed the already open page, with Offline Mode disabled).
Place the browser in offline mode (File > Work Offline) without closing the tab.
Refreshed the page.


Actual results:

Only a portion of the page gets loaded (specifically, up to the text "There are a" on the paragraph that precedes the heading "Mixing Synchronous and Asynchronous Loggers")


Expected results:

The full page should be loaded from the cache
Response headers for the request:

Date: Thu, 31 Aug 2017 14:12:11 GMT
Server: Apache/2.4.7 (Ubuntu)
Last-Modified: Wed, 30 Aug 2017 06:13:57 GMT
ETag: "f5a7-557f26edce35d-gzip"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 14234
Content-Type: text/html
The Network tab of FF developer toolbox shows a "Transfer Size" of 13.90 KB (consistent with Content-Length) for the online request and a "Size" of 61.41 KB. For the cached version the displayed "Size" is only 31.38 KB.

I have been experiencing this issue for this page since FF 45.0 (couldn't try earlier versions).

Latest verification was using a fully clean profile on a recent Nightly: 57.0a1 (2017-08-27) (64-bit)
Issue was migrated from WebCompat (https://github.com/webcompat/web-bugs/issues/9493)
Component: Untriaged → Networking: Cache
Product: Firefox → Core
I cannot reproduce it. Could you please provide HTTP log? See https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

Use this list of modules to log: timestamp,rotate:50,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,cache2:5
Flags: needinfo?(janakaud)
Michal, Can you spot any additional clues in these logs?
Flags: needinfo?(michal.novotny)
Priority: -- → P2
Whiteboard: [necko-triaged]
What happens here is:

1) The channel is canceled due to charset switch (http://searchfox.org/mozilla-central/rev/31606bbabc50b08895d843b9f5f3da938ccdfbbf/parser/html/nsHtml5TreeOpExecutor.cpp#742). If the channel hasn't been loaded completely before Cancel() is called, we end up with partial entry even if it's not resumable because nsHttpChannel::Cancel() calls CloseCacheEntry() with doomOnFailure == false. The logic when we pass doomOnFailure == true and when false isn't clear to me.

2) The channel is opened again, we find the partial entry and because the channel is not resumable nsHttpChannel::OnCacheEntryCheck() fails at http://searchfox.org/mozilla-central/rev/31606bbabc50b08895d843b9f5f3da938ccdfbbf/netwerk/protocol/http/nsHttpChannel.cpp#4164. Then nsHttpChannel::OnCacheEntryAvailable() is called without an entry, so we won't cache the content again.

3) When Firefox is in offline mode, we start to read from the cache at http://searchfox.org/mozilla-central/rev/31606bbabc50b08895d843b9f5f3da938ccdfbbf/netwerk/protocol/http/nsHttpChannel.cpp#4096 regardless if the entry is partial or complete.

Honza, it seems to me that in every step there is some problem. In (1) we should IMO doom the partial entry. In (2) when OnCacheEntryCheck() fails, shouldn't we rather pass an empty new entry to OnCacheEntryAvailable() so the content can be cached? And finally in (3), is it correct that we use partial entries in offline mode?
Flags: needinfo?(michal.novotny) → needinfo?(honzab.moz)
(In reply to Michal Novotny (:michal) from comment #8)
> What happens here is:
> 
> 1) The channel is canceled due to charset switch
> (http://searchfox.org/mozilla-central/rev/
> 31606bbabc50b08895d843b9f5f3da938ccdfbbf/parser/html/nsHtml5TreeOpExecutor.
> cpp#742). If the channel hasn't been loaded completely before Cancel() is
> called, we end up with partial entry even if it's not resumable because
> nsHttpChannel::Cancel() calls CloseCacheEntry() with doomOnFailure == false.
> The logic when we pass doomOnFailure == true and when false isn't clear to
> me.
> 
> 2) The channel is opened again, we find the partial entry and because the
> channel is not resumable nsHttpChannel::OnCacheEntryCheck() fails at
> http://searchfox.org/mozilla-central/rev/
> 31606bbabc50b08895d843b9f5f3da938ccdfbbf/netwerk/protocol/http/nsHttpChannel.
> cpp#4164. Then nsHttpChannel::OnCacheEntryAvailable() is called without an
> entry, so we won't cache the content again.
> 
> 3) When Firefox is in offline mode, we start to read from the cache at
> http://searchfox.org/mozilla-central/rev/
> 31606bbabc50b08895d843b9f5f3da938ccdfbbf/netwerk/protocol/http/nsHttpChannel.
> cpp#4096 regardless if the entry is partial or complete.
> 
> Honza, it seems to me that in every step there is some problem. 

I agree with that conclusion, thanks.

> In (1) we
> should IMO doom the partial entry. 

I was once pretty much involved in what the value of the doom-on-fail arg should be in each case we call closecacheentry.  I have to find the bug, since there was an assessment of the whole thing (leaving ni?me for that)

> In (2) when OnCacheEntryCheck() fails,
> shouldn't we rather pass an empty new entry to OnCacheEntryAvailable() so
> the content can be cached? 

Yes, I have to think first, we may have a new result value, like WANT_EMPTY_ENTRY or something, to this only explicitly.

> And finally in (3), is it correct that we use
> partial entries in offline mode?

That's bad, yes.
(In reply to Honza Bambas (:mayhemer) from comment #9)
> I was once pretty much involved in what the value of the doom-on-fail arg
> should be in each case we call closecacheentry.  I have to find the bug,
> since there was an assessment of the whole thing (leaving ni?me for that)

https://bugzilla.mozilla.org/show_bug.cgi?id=1276673#c0

> 
> > In (2) when OnCacheEntryCheck() fails,
> > shouldn't we rather pass an empty new entry to OnCacheEntryAvailable() so
> > the content can be cached? 
> 
> Yes, I have to think first, we may have a new result value, like
> WANT_EMPTY_ENTRY or something, to this only explicitly.

Michal, do you remember what exactly meant by "when OnCacheEntryCheck() fails" ?

> 
> > And finally in (3), is it correct that we use
> > partial entries in offline mode?
> 
> That's bad, yes.

I'll file bugs for all these 3 cases.
Flags: needinfo?(honzab.moz) → needinfo?(michal.novotny)
Depends on: 1439656
(In reply to Honza Bambas (:mayhemer) from comment #10)
> Michal, do you remember what exactly meant by "when OnCacheEntryCheck()
> fails" ?

I don't understand your question. Comment #8 points to an exact line where nsHttpChannel::OnCacheEntryCheck fails.
Flags: needinfo?(michal.novotny)
(In reply to Michal Novotny (:michal) from comment #8)
> What happens here is:
> 
> 1) The channel is canceled due to charset switch
> (http://searchfox.org/mozilla-central/rev/
> 31606bbabc50b08895d843b9f5f3da938ccdfbbf/parser/html/nsHtml5TreeOpExecutor.
> cpp#742). If the channel hasn't been loaded completely before Cancel() is
> called, we end up with partial entry even if it's not resumable because
> nsHttpChannel::Cancel() calls CloseCacheEntry() with doomOnFailure == false.

that happens only when the channel in tail-blocked (the state in which there has note even the request been made, hence cache entry is still EMPTY, hence can't be partial), otherwise we don't call clocasecacheentry at all from cancel.

see also https://bugzilla.mozilla.org/show_bug.cgi?id=1439656#c1
(In reply to Michal Novotny (:michal) from comment #8)
> 3) When Firefox is in offline mode, we start to read from the cache at
> http://searchfox.org/mozilla-central/rev/
> 31606bbabc50b08895d843b9f5f3da938ccdfbbf/netwerk/protocol/http/nsHttpChannel.
> cpp#4096 regardless if the entry is partial or complete.

I think it's somewhat OK to serve a partial entry here for display content (images).  But could be a problem for active content (html/js/css) which is the case here.

- we must make sure we don't cache partial entries that are non-resumable (can't be recognized as partial) what I believe it IS ensured with the current code -> nothing to do here
- at code ref of (3) there should be a check for an entry being partial and fail when so

Hence, fixing the concern (3) is the only thing to do here.
Assignee: nobody → honzab.moz
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Page gets cached only partially (possibly generating a corrupted cache entry?) → Stop serving partially cached content in offline mode
Status: NEW → ASSIGNED
Attached patch v1Splinter Review
simply copying the code for partiality check from few lines below.  note that when a channel is a fallback channel, it always has appcache assigned (simplified the condition).  appcache data are allowed to go regardless potential partiality, just in case.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf5d4cf6eed8ac4ef743add9bf754845360c03a2
Attachment #8955670 - Flags: review?(michal.novotny)
michal, ping on review
Attachment #8955670 - Flags: review?(michal.novotny) → review+
thanks!
Keywords: checkin-needed
Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5e50ce0cce02
In offline mode, fail HTTP load when a cache entry is partial, r=michal
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/5e50ce0cce02
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
(In reply to Pulsebot from comment #18)
> In offline mode, fail HTTP load when a cache entry is partial, r=michal

Verified the behaviour in Nightly 61.0a1 (2018-04-13).

(In reply to Michal Novotny (:michal) from comment #8)
> What happens here is:
> 
> 1) The channel is canceled due to charset switch
> (http://searchfox.org/mozilla-central/rev/
> 31606bbabc50b08895d843b9f5f3da938ccdfbbf/parser/html/nsHtml5TreeOpExecutor.
> cpp#742). 

Is it not possible to remedy this cancellation behaviour (which AFAIU causes the cache 'corruption' in the first place)? Just wondering :)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.