As background: by default, two simultaneous network requests for the same potentially cacheable resource will serialize until we learn the cache disposition. Most things are potentially cacheable until you see the headers and response code, but this can be an unbounded wait while the server prepares a response. A request can opt out of this with the BYPASS_LOCAL_CACHE_IF_BUSY load flag; in that case such a conflict results in one channel getting hold of the cache entry and the other one proceeding to the network independent of the cache (but in parallel). Honza and I have some differences of opinion about how much of the code should opt out via that flag, but we agree that XHR should do so. Basically, XHR tends to be a little less cache friendly and more API driven (which uses a lot of repeated URLs) than other necko patterns. The attached patch makes that change. To be clear, this doesn't disable caching for XHR; it just makes an optimization change of favoring the network over the cache in the case of a race condition. The BYPASS_LOCAL_CACHE_IF_BUSY flag is already used in sync XHRs, but that is for deadlock prevention rather than as a performance motivation.
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Comment on attachment 8556560 [details] [diff] [review]
XHR can set BYPASS_LOCAL_CACHE_IF_BUSY all the time

>+    // Don't block on the URI's cache entry if it is busy - favoring parllelism

"parallelism". Please document that if someone ever changes this code they MUST leave the flag on for the sync case, and why. That is, keep most of the old comment as a warning to future modifiers of this code.

r=me, but do watch out for web compat issues.
Attachment #8556560 - Flags: review?(bzbarsky) → review+
Based on the try run from comment 2, test_bug482935 is unhappy with this patch, as is test_bug475156. They have the same flow: the test does an XHR GET; from its "done" handler it does another for the same URL and makes sure the second one responds with cached content. (One of the tests also aborts the first XHR after it receives its "done", but I think that's irrelevant here.) Given that these queries are serialized, I'm surprised this patch impacts them. Is it a matter of the cache thread being async and not complete when the second request is made?
(In reply to Patrick McManus [:mcmanus] from comment #4) > based on try from comment 2 test_bug482935 is unhappy with this patch - as > is test_bug475156 > > They have the same flow.. > > 1 test does an xhr get.. out of "done" it does another for the same url and > makes sure the second one responds with cached content. > > (one of the tests actually also does an abort of the first xhr after it > receives the done for it, but I think that's irrelevant here). > > given that these queries are serialized I'm surprised this patch impacts > them. Is it a matter of the cache thread being async and not complete when > the second request is made? That sounds odd; just try running it with cache2:5 logging. Feel free to give me the log; pointing at the line to look at would help.
Honza, here is a try run with the failure and NSPR cache2 and HTTP logging. Linux64 debug failed test_bug475156.html as described in comment 4. The NSPR log is linked from the Treeherder report: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a9ee29c29313 The log covers all of mochitest-1, of course, so I just searched for the name of the test. Thanks!
ni comment 6
It's not clear to me why XMLHttpRequest should follow a different path here. If all the input parameters are the same, why can the request not be shared?
Shared in what sense? Say we make a request to url X and another request to url X. The first request misses the cache and hits the network. The second request has two options: it can either ignore the cache (which is now in the process of being written to) and hit the network immediately, or it can wait until the first one finishes and puts stuff in the cache, after which it can decide whether it can use that stuff from the cache or not.
Put like that, reading from the cache. It's not clear why XMLHttpRequest needs to be different from <script> or @font-face.
It's not clear that we wouldn't want to bypass the cache in the case of <script> too, in this case. But more importantly, comment 0 explains why XHR should be different from <script> here: it's _much_ more common for the things XHR is fetching to be marked "not cacheable", so a heuristic that guesses that the response from the first request won't be usable by the second request anyway is a lot more likely to be correct in the XHR case than in the <script> case.
And note that either way the behavior is spec-compliant, since the spec makes no requirements or guarantees about when something is in fact cached...
So HTML's fetch algorithm allows fetches to be shared. I need to port that to Fetch still. However, I'm mainly wondering if we then need to explicitly exclude a bunch of request contexts from this down the line because the heuristic is unworkable for them (or relied on not to be present).
What's the scope of the sharing? The one case I know of in Gecko that effectively shares fetches is images, and we've been considering getting rid of it because it causes various problems (the per-document cache would remain, of course). I'm 99% sure sharing fetches for XHR is not viable in web compat terms.
Specifically, if a fetch gets a no-cache response, then what does sharing it mean? It basically means that you get racy behavior depending on whether your second fetch starts before the first one ends, no? So on a fast network you might get updated data and on a slow network you wouldn't.
Per the HTML specification the scope is all fetches. Images have a very specific document-bound cache (that can be copied across documents) that allows for synchronous lookups too. They are very much a special case that I think is needed for compatibility still.
> Per the HTML specification the scope is all fetches. Yeah, I'm not sure how useful that is. It means that a fetch on one site is affected by what other sites are doing, which seems pretty odd.
Re-triggering (logs gone).
(In reply to Honza Bambas (:mayhemer) from comment #18) > Re-triggering (logs gone). Failed :( Sorry. If you can, please resubmit.
The rebased build in comment 20 still has the test_bug475156 failure, and a fresh NSPR log.
The thing here is that we write and close the cache output stream on the cache I/O thread. This is a leftover from the cache1 days. We no longer need to go async here, since the output stream implementation allows writing/closing on the main thread (a blocking buffered stream, except that it never blocks). We can remove the tee->InitAsync call and we're done. We also save some memory copying, which is always good. Filing a bug.  http://hg.mozilla.org/mozilla-central/diff/0c91d9aa9476/netwerk/protocol/http/nsHttpChannel.cpp#l1.2526  http://hg.mozilla.org/mozilla-central/annotate/986e840a2979/xpcom/io/nsInputStreamTee.cpp#l79
Just confirmed the patch from bug 1134735 fixed this failure.
A green try run now that bug 1134735 has landed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d341b4962511