HTTP cache v2: nsHttpChannel must bypass concurrent read when request is not resumable

RESOLVED FIXED in mozilla28

Status

()

--
major
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: mayhemer, Assigned: mayhemer)

Tracking

unspecified
mozilla28
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [cache2])

Attachments

(2 attachments, 4 obsolete attachments)

(Assignee)

Description

5 years ago
Created attachment 812665 [details]
log [search for 10bbdd000 and 1097b1000, the first and the second channel]

Scenario:
- osx, new cache on
- load http://www.bbc.co.uk/news/technology-24179644
- the entry has max-age=0 so it always goes from net
- but during the page load something (no idea what!) cancels the page load channel with NS_BINDING_ABORTED
- at the same time it restarts the load again
- the new channel sees the entry as in-write-progress since it has not been closed by the first channel yet (happens in OnStopRequest triggered by the transaction, the entry is doomed because of the cancel)
- the new channel loads the content so far stored by the first channel from the cache (regardless the max-age=0 because of LOAD_FROM_CACHE flag)
- then it hits the end of the entry < content-length
- and cannot resume the load (the request is not resumable)

=> page doesn't load completely

Solution is one of:
1. don't do concurrent r/ of non-resumable responses
2. don't do concurrent r/w at all
(Assignee)

Comment 1

5 years ago
For the record, the cancel is called from:

>	xul.dll!nsHtml5TreeOpExecutor::NeedsCharsetSwitchTo(const char * aEncoding=0x101038d8, int aSource=0x00000009, unsigned int aLineNumber=0x00000050)  Line 840	C++
 	xul.dll!nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor * aBuilder=0x11fec958, nsIContent * * aScriptElement=0x0042f54c)  Line 628	C++
 	xul.dll!nsHtml5TreeOpExecutor::RunFlushLoop()  Line 560	C++
 	xul.dll!nsHtml5ExecutorFlusher::Run()  Line 133	C++

  // ask the webshellservice to load the URL
> if (NS_SUCCEEDED(wss->StopDocumentLoad())) {
    wss->ReloadDocument(aEncoding, aSource);
  }
(Assignee)

Comment 2

5 years ago
Created attachment 812693 [details] [diff] [review]
v1

- when write in progress is detected on an entry, and the request is not resumable, don't use the entry
- this is also disabled, GetDataSize cannot return -1 for old entries
Attachment #812693 - Flags: review?(michal.novotny)
(Assignee)

Comment 3

5 years ago
Comment on attachment 812693 [details] [diff] [review]
v1

This fix is not complete.  I'll introduce a new return code ENTRY_WANTED_COMPLETE in onCacheEntryCheck that will cause CacheEntry to call onCacheEntryAvailable() after the output stream is written/closed by a concurrent writer.

This seems more reasonable then to just bypass the cache.
Attachment #812693 - Flags: review?(michal.novotny)
(Assignee)

Updated

5 years ago
Depends on: 922741
(Assignee)

Comment 4

5 years ago
Created attachment 813147 [details] [diff] [review]
v2

- when write of the entry is still in progress and the response is not resumable -> bypass concurrent read and rather block until the entry's output stream is closed

These nsHttpChannel changes need a bit more careful review, affect however only the new cache.

https://tbpl.mozilla.org/?tree=Try&rev=ca5b5fb00799
Attachment #812693 - Attachment is obsolete: true
Attachment #813147 - Flags: review?(michal.novotny)
(Assignee)

Comment 5

5 years ago
The ENTRY_WANTED_COMPLETE result has a side affect: when the entry is doomed between calls to OnCheck and OnAvail, the consumer (channel) will get ENTRY_NOT_FOUND.  However, I think the consumer still wants to store to a new cached entry, so we should provide it.

Filed a new bug on that, since this one blogs actual issues.

Patch for this bug needs a small update.
(Assignee)

Comment 6

5 years ago
> Filed a new bug on that, since this one blogs actual issues.
bug 923688
(Assignee)

Comment 8

5 years ago
Created attachment 827978 [details] [diff] [review]
v3 [merged]

- just remerged to m-c
Attachment #813147 - Attachment is obsolete: true
Attachment #813776 - Attachment is obsolete: true
Attachment #813147 - Flags: review?(michal.novotny)
Attachment #827978 - Flags: review?(michal.novotny)
(Assignee)

Updated

5 years ago
Depends on: 935595
(Assignee)

Comment 9

5 years ago
Created attachment 829269 [details] [diff] [review]
v3.1

- patch is using the ENTRY_WANTED_COMPLETE result of onCacheEntryCheck
- it is returned when write to the entry is still in progress but the response is not resumable (we cannot resume when the writer is interrupted in the middle)
- the |mCachedContentIsValid = false;| bit at nsHttpChannel::OnNormalCacheEntryAvailable: before this patch onCheck was always immediately proceeded with onAvail ; with this patch onAvail can be delayed and state of the entry (dooming) may change between onCheck and onAvail
- IsResumable(entrySize, contentLength) must ignore that entrySize == -1 when called from onCacheEntryCheck, it's not at that moment critical for the condition and makes it just always-false
Attachment #827978 - Attachment is obsolete: true
Attachment #827978 - Flags: review?(michal.novotny)
Attachment #829269 - Flags: review?(michal.novotny)
(Assignee)

Comment 10

5 years ago
Try runs of all involved patches queue:

https://tbpl.mozilla.org/?tree=Try&rev=695c59519cf4 (cache1)
https://tbpl.mozilla.org/?tree=Try&rev=7739e390c433 (cache2)

Here are additional xpcshell-test only runs for fixed 14b test:
https://tbpl.mozilla.org/?tree=Try&rev=475f9e1e84ef (cache1)
https://tbpl.mozilla.org/?tree=Try&rev=93b33603f8a3 (cache2)
Comment on attachment 829269 [details] [diff] [review]
v3.1

Looks good, just update ENTRY_WANTED_COMPLETE return value and wantCompleteEntry variable according to the change in patch v2 in bug #922741.
Attachment #829269 - Flags: review?(michal.novotny) → review+
(Assignee)

Updated

5 years ago
Attachment #829269 - Flags: checkin+
https://hg.mozilla.org/mozilla-central/rev/9ab3228988c2
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28

Updated

5 years ago
Keywords: verifyme

Comment 14

5 years ago
With browser.cache.use_new_backend pref set to 1 on Nightly from 2013-10-29, http://www.bbc.co.uk/news/technology-24179644 doesn't load completely and with Nightly debug build from 2013-09-30 it doesn't load at all. Do I need to use some tool to reproduce and verify this issue?

On Firefox 28 beta 8 (Build ID: 20140303165517), the URL properly loads with the new cache enabled on Windows 7 64bit, Mac OS X 10.9 and Ubuntu 13.10 32bit:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0
Mozilla/5.0 (X11; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0
Flags: needinfo?(honzab.moz)
(Assignee)

Comment 15

5 years ago
I don't understand what the problem with the verification exactly is.  However, to answer the question: no, you don't need any tools, just a lot of luck :)

Unfortunatelly I so far was not able to write an automated test (just a question of time).
Flags: needinfo?(honzab.moz) → in-testsuite-

Comment 16

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #15)
> However, to answer the question: no, you don't need any tools, just a lot of
> luck :)
> 
> Unfortunatelly I so far was not able to write an automated test (just a
> question of time).

Dropping verifyme keyword since QA couldn't reproduce this issue and an automated test will be added.
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.