Last Comment Bug 722034 - Cache entry check--CheckCache()--is done on the main thread, causing the main thread to wait on the cache service lock, which may be blocked on disk I/O
: Cache entry check--CheckCache()--is done on the main thread, causing the main...
Status: RESOLVED FIXED
[Snappy:P1]
: main-thread-io, perf
Product: Core
Classification: Components
Component: Networking: HTTP (show other bugs)
: Trunk
: All All
: P1 normal with 4 votes (vote)
: mozilla16
Assigned To: Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
:
Mentors:
: 758358 762743 (view as bug list)
Depends on: 722033 742610 746018 758358 760955 763342 765665
Blocks: 477578 717761 736844 741678 759886 759889 759928 760380 761736 762743
  Show dependency treegraph
 
Reported: 2012-01-28 03:51 PST by Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
Modified: 2016-02-17 17:27 PST (History)
43 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
-


Attachments
[WIP] Part 1: Classify cache locks (47.30 KB, patch)
2012-01-31 15:28 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 2: Split Connect into two functions (10.25 KB, patch)
2012-01-31 15:29 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 3: Classify more bad locks (5.52 KB, patch)
2012-01-31 15:29 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 4: Remove IsStorageEnabledForPolicy (4.10 KB, patch)
2012-01-31 15:30 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 5: Remove about:cache (20.92 KB, patch)
2012-01-31 15:30 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 6: Simplify nsCacheEntryDescriptor (17.66 KB, patch)
2012-01-31 15:31 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 7: Remove visitEntries (17.93 KB, patch)
2012-01-31 15:31 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
[WIP] Part 8: Do CheckCache on the cache thread (229.10 KB, patch)
2012-01-31 15:36 PST, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Split nsHttpChannel::Connect into two functions (7.41 KB, patch)
2012-04-02 15:21 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
nsHttpRequestHead & nsHttpResponseHead const-correctness and constructor/destructor/assignment cleanup (24.17 KB, patch)
2012-04-16 18:45 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
cbiesinger: review+
Details | Diff | Review
Factor out nsHttpChannel::Hash into a standalone function (5.77 KB, patch)
2012-04-16 18:50 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
cbiesinger: review+
Details | Diff | Review
Make HttpBaseChannel::mRedirectedCachekeys an nsAutoPtr<nsTArray<nsCString> > (3.50 KB, patch)
2012-04-16 18:53 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
cbiesinger: review+
Details | Diff | Review
Factor out ShouldUpdateOfflineCacheEntry (7.47 KB, patch)
2012-04-17 16:24 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Factor out CheckCache() and functios it calls into a separate class (31.61 KB, patch)
2012-04-17 16:26 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Move HttpCacheQuery logic (in particular, CheckCache) to the cache service thread (28.46 KB, patch)
2012-04-17 16:30 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Cache changes necessary to build the previous patch (11.58 KB, patch)
2012-04-17 16:40 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Stop accessing cache descriptor on main thread for telemetry purposes (10.65 KB, patch)
2012-04-17 19:24 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
hurley: feedback+
Details | Diff | Review
Part 1 - Make AsyncOpenCacheEntry return NS_ERROR_CACHE_WAIT_FOR_VALIDATION (3.07 KB, patch)
2012-05-15 11:41 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Part 2 - Make nsHttpResponseHead and nsHttpRequestHead copy-constructable and assignable (24.07 KB, patch)
2012-05-15 11:43 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 3 - Factor out nsHttpChannel::Hash into a standalone function (6.19 KB, patch)
2012-05-15 11:47 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 4 - Convert HttpBaseChannel::mRedirectedCachekeys to nsAutoPtr<nsTArray<nsCString> > (3.60 KB, patch)
2012-05-15 11:48 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 5 - Factor out nsHttpChannel::ShouldUpdateOfflineCacheEntry into a standalone function (7.60 KB, patch)
2012-05-15 11:51 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Part 6 - Make a copy of the information needed during cache validation in preparation for moving cache validation to the cache thread (32.80 KB, patch)
2012-05-15 11:54 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
honzab.moz: review+
Details | Diff | Review
Part 7 - Move cache validation to the cache thread (28.89 KB, patch)
2012-05-15 12:04 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
honzab.moz: review+
Details | Diff | Review
Part 8 - Stop accessing cache descriptor on main thread for telemetry purposes (11.10 KB, patch)
2012-05-15 12:07 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
hurley: review+
Details | Diff | Review
Part 3 - Factor out nsHttpChannel::Hash into a standalone function (3.60 KB, patch)
2012-05-30 13:26 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 5 - Move cache validation to the cache thread (30.68 KB, patch)
2012-05-30 13:30 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Part 1 - Make AsyncOpenCacheEntry call the listener for non-blocking async requests from background threads (1.24 KB, patch)
2012-05-30 13:32 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
hurley: review+
Details | Diff | Review
Part 2 - Factor out nsHttpChannel::Hash into a standalone function (6.17 KB, patch)
2012-05-30 13:33 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 3 - Convert HttpBaseChannel::mRedirectedCachekeys to nsAutoPtr<nsTArray<nsCString> > (3.60 KB, patch)
2012-05-30 13:34 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 4 - Make a copy of the information needed during cache validation in preparation for moving cache validation to the cache thread (32.36 KB, patch)
2012-05-30 13:35 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
honzab.moz: review+
Details | Diff | Review
Part 5 - Move cache validation to the cache thread (30.68 KB, patch)
2012-05-30 13:38 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
honzab.moz: review+
Details | Diff | Review
Part 6 - Stop accessing cache descriptor on main thread for telemetry purposes (11.13 KB, patch)
2012-05-30 13:39 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
brian: review+
Details | Diff | Review
Part 7 - Stop calling mOfflineCacheEntry->GetLastModified when processing response from normal cache, r?honzab (3.12 KB, patch)
2012-05-30 15:31 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
no flags Details | Diff | Review
Part 8 - Work around bug 759805 (2.54 KB, patch)
2012-05-30 15:32 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
michal.novotny: review+
Details | Diff | Review
Back out Bug 722034 and Bug 746018 from mozilla-aurora (246 bytes, patch)
2012-07-03 15:38 PDT, Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
akeybl: approval‑mozilla‑aurora+
Details | Diff | Review

Description Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-28 03:51:08 PST
+++ This bug was initially created as a clone of Bug #717761 +++

See the description of bug 717761. This bug is about one possible solution to the particular instance description in the summary of that bug: Move the CheckCache() logic from the main thread to the cache service thread.

The idea is:

1.  Create a new variant of AsyncOpenCacheEntry that calls the cache listener's OnCacheEntryAvailable method on the cache thread.
2. Implement OnCacheEntryAvailable in a new class. That implementation will call CheckCache(), then dispatch an event that calls nsHttpChannel::OnCacheEntryAvailable on the main thread.
3. Remove the CheckCache call from nsHttpChannel::Connect
4. Profit.

Unfortunately, this cannot be done completely until bug 722033 is fixed.
Comment 1 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:28:42 PST
Created attachment 593238 [details] [diff] [review]
[WIP] Part 1: Classify cache locks
Comment 2 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:29:15 PST
Created attachment 593239 [details] [diff] [review]
[WIP] Part 2: Split Connect into two functions
Comment 3 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:29:44 PST
Created attachment 593240 [details] [diff] [review]
[WIP] Part 3: Classify more bad locks
Comment 4 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:30:22 PST
Created attachment 593241 [details] [diff] [review]
[WIP] Part 4: Remove IsStorageEnabledForPolicy
Comment 5 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:30:49 PST
Created attachment 593242 [details] [diff] [review]
[WIP] Part 5: Remove about:cache
Comment 6 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:31:20 PST
Created attachment 593243 [details] [diff] [review]
[WIP] Part 6: Simplify nsCacheEntryDescriptor
Comment 7 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:31:58 PST
Created attachment 593244 [details] [diff] [review]
[WIP] Part 7: Remove visitEntries
Comment 8 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-31 15:36:21 PST
Created attachment 593249 [details] [diff] [review]
[WIP] Part 8: Do CheckCache on the cache thread
Comment 9 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-02-03 15:52:39 PST
Comment on attachment 593240 [details] [diff] [review]
[WIP] Part 3: Classify more bad locks

There is an updated version of the lock classification patch in bug 717761.
Comment 10 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-02-03 15:55:52 PST
Comment on attachment 593242 [details] [diff] [review]
[WIP] Part 5: Remove about:cache

The new lock classification patch uses the information from this patch to determine which nsCacheEntryDescriptor methods are only used by about:cache, labeling them "BAD". Also, Nick is working on making most getter/setter methods of nsCacheEntryDescriptor work without taking the cache service lock.
Comment 11 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-02-03 16:00:01 PST
Comment on attachment 593241 [details] [diff] [review]
[WIP] Part 4: Remove IsStorageEnabledForPolicy

This patch is now attached to bug 717761 where it belongs.
Comment 12 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-02 15:21:13 PDT
Created attachment 611615 [details] [diff] [review]
Split nsHttpChannel::Connect into two functions

This patch splits nsHttpChannel::Connect() into two functions: one that runs before the cache is checked (which will always be run on the main thread), and one that runs after the cache is checked (which may, eventually, be run on a background thread).
Comment 13 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-02 20:31:02 PDT
Comment on attachment 611615 [details] [diff] [review]
Split nsHttpChannel::Connect into two functions

Review of attachment 611615 [details] [diff] [review]:
-----------------------------------------------------------------

Canceling review. I will have a better version of the patch soon.
Comment 14 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-16 18:45:36 PDT
Created attachment 615588 [details] [diff] [review]
nsHttpRequestHead & nsHttpResponseHead const-correctness and constructor/destructor/assignment cleanup

1. When CheckCache() is running on the background thread, it needs a consistent copy of the request headers and response headers. In order to ensure that the main thread does not change the headers while ChechCache() is using/modifying them, I make a copy of the headers.

2. Some of the code I have written takes const pointers to the nsRequestHead/nsResponseHead objects, so I fixed some const-related issues.

3. Some of the destructors/constructors are unnecessary and so I removed them.
Comment 15 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-16 18:50:33 PDT
Created attachment 615589 [details] [diff] [review]
Factor out nsHttpChannel::Hash into a standalone function

I will use Hash() from outside of nsHttpChannel, so I factored it out. The overhead of creating the nsICryptoHash instance should not be significant enough for this to be a performance problem.
Comment 16 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-16 18:53:29 PDT
Created attachment 615591 [details] [diff] [review]
Make HttpBaseChannel::mRedirectedCachekeys an nsAutoPtr<nsTArray<nsCString> >

This simplifies the main patch. Note that we must still use the old "X<Y<Z> >" syntax for nested template parameters because GCC on MacOS X rejects the modern "X<Y<Z>>" syntax.
Comment 17 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 00:31:33 PDT
Verifying that my refactoring and rebasing of the patches didn't break anything:

https://tbpl.mozilla.org/?tree=Try&rev=80a2916786e0
Comment 18 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 16:24:11 PDT
Created attachment 615939 [details] [diff] [review]
Factor out ShouldUpdateOfflineCacheEntry
Comment 19 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 16:26:24 PDT
Created attachment 615941 [details] [diff] [review]
Factor out CheckCache() and functios it calls into a separate class

This class will eventually extend nsRunnable and its Run() method will call CheckCache on the cache service thread.

In order to be absolutely sure there are no concurrency issues, we copy all the needed data from the channel and the new object. This copying is needed because nsHttpRequestHead and nsHttpResponseHead, and probably other things, are not thread-safe.
Comment 20 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 16:30:14 PDT
Created attachment 615942 [details] [diff] [review]
Move HttpCacheQuery logic (in particular, CheckCache) to the cache service thread
Comment 21 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 16:40:01 PDT
Created attachment 615947 [details] [diff] [review]
Cache changes necessary to build the previous patch

This patch:

1. Fixes NS_ERROR_CACHE_WAIT_FOR_VALIDATION processing for AsyncOpenCacheEntry when it is called on the cache I/O thread. In particular, we cannot have AsyncOpenCacheEntry return NS_OK unless it will later call the listener's OnCacheEntryAvailable method. This part replaces the patch in bug 739307, which I will WONTFIX.

2. Reduces #includes in nsCacheService.h (and headers it includes) so that nsHttpChannel can #include "nsCacheService.h" without us needing to export any other headers from netwerk/cache.

3. Moves class nsCacheEntryHashTable to nsCacheService so that nsCacheService.h does not need to include nsCacheEntry.h
Comment 22 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 16:47:42 PDT
Here is the try run for all the patches in bug 746018:
https://tbpl.mozilla.org/?tree=Try&rev=07eee97362e6

Here is the try run for this bug:
https://tbpl.mozilla.org/?tree=Try&rev=b3c952a1d018

And, for comparison purposes, here is a run without any of these patches:
https://hg.mozilla.org/try/rev/ef7cf6591031

And, here is what your patch queue should look like, should you try to build these yourself:

From bug 746018:
check-for-opened-channel-in-more-places
split-connect-1
CachePump-buffering-cache
746018-p4-more-cache-logging
746018-p5-start-buffering-earlier

From this bug:
cache-changes
copyable-and-assignable-nsHttpRequesteHead-and-nsHttpResponseHead
factor-out-Hash
mRedirectedCachekeys
factor-out-ShouldUpdateOfflineCacheEntry
HttpCacheQuery
OnCacheThread
Comment 23 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 19:24:12 PDT
Created attachment 615988 [details] [diff] [review]
Stop accessing cache descriptor on main thread for telemetry purposes

This patch moves the call of nsCacheEntryDescriptor::GetDeviceID() to the cache service thread, and also fixes the leak filed as bug 736844.
Comment 24 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-04-17 19:35:51 PDT
Comment on attachment 615988 [details] [diff] [review]
Stop accessing cache descriptor on main thread for telemetry purposes

Switching the reviewer since this is probably too difficult to review without understanding the other patches.
Comment 25 Honza Bambas (:mayhemer) 2012-04-18 03:45:56 PDT
Brian, please give the patches order numbers (Part1,2 etc...), thanks.  I may be hit by need of knowing the order you want to land them in when reviewing.
Comment 26 Honza Bambas (:mayhemer) 2012-04-18 03:53:56 PDT
OK, Brian please add "Part N, .." to the patches description.  I don't want to spend time studding what patch has what file name.  Sorry.  I demanded this from you ones already.
Comment 27 Michal Novotny (:michal) 2012-04-19 07:34:16 PDT
Comment on attachment 615947 [details] [diff] [review]
Cache changes necessary to build the previous patch

(In reply to Brian Smith (:bsmith) from comment #21)
> 1. Fixes NS_ERROR_CACHE_WAIT_FOR_VALIDATION processing for
> AsyncOpenCacheEntry when it is called on the cache I/O thread. In
> particular, we cannot have AsyncOpenCacheEntry return NS_OK unless it will
> later call the listener's OnCacheEntryAvailable method.

When exactly happen that we return NS_OK from AsyncOpenCacheEntry() and then don't call OnCacheEntryAvailable()? I'm not aware of this behavior. IMO your patch causes that we return NS_ERROR_CACHE_WAIT_FOR_VALIDATION in case of async non-blocking request that can't be satisfied and then we call the listener with the same result.


> +    static void AssertIsCacheIOThread()
> +    {
> +      MOZ_ASSERT(IsCacheIOThread());
> +    }

AFAICS this method is used only by HttpCacheQuery that lives outside cache directory where you should probably access cache service only via interface methods.
Comment 28 Nicholas Hurley [:nwgh][:hurley] 2012-04-19 10:45:01 PDT
Comment on attachment 615988 [details] [diff] [review]
Stop accessing cache descriptor on main thread for telemetry purposes

Review of attachment 615988 [details] [diff] [review]:
-----------------------------------------------------------------

This all looks sane to me.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ -68,5 @@
>  #include "nsDNSPrefetch.h"
>  #include "nsChannelClassifier.h"
>  #include "nsIRedirectResultListener.h"
>  #include "mozilla/TimeStamp.h"
> -#include "mozilla/Telemetry.h"

I might keep this in (I like to be able to know easily how things get included instead of having to track down through a bunch of #include directives), but we don't seem to do that in general, so whichever you prefer is fine by me.
Comment 29 Christian :Biesinger (don't email me, ping me on IRC) 2012-05-09 14:28:57 PDT
Comment on attachment 615589 [details] [diff] [review]
Factor out nsHttpChannel::Hash into a standalone function

Please keep that comment with the new Hash() function
Comment 30 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:41:47 PDT
Created attachment 624123 [details] [diff] [review]
Part 1 - Make AsyncOpenCacheEntry return NS_ERROR_CACHE_WAIT_FOR_VALIDATION

Rebased/renamed version of patch 615588 r+d by :biesinger
Comment 31 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:43:18 PDT
Created attachment 624126 [details] [diff] [review]
Part 2 - Make nsHttpResponseHead and nsHttpRequestHead copy-constructable and assignable

Updated/rebased/renamed version of attachment 615589 [details] [diff] [review] that addresses cbiesinger's review comment.
Comment 32 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:45:31 PDT
Comment on attachment 624123 [details] [diff] [review]
Part 1 - Make AsyncOpenCacheEntry return NS_ERROR_CACHE_WAIT_FOR_VALIDATION

Numbering was off in my head. This patch is the patch I need to discuss with Michal regarding the handling of NS_ERROR_CACHE_WAIT_FOR_VALIDATION.
Comment 33 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:47:27 PDT
Created attachment 624130 [details] [diff] [review]
Part 3 - Factor out nsHttpChannel::Hash into a standalone function

Updated/rebased/renamed version of attachment 615589 [details] [diff] [review] that addresses cbiesinger's review comment.
Comment 34 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:48:37 PDT
Created attachment 624131 [details] [diff] [review]
Part 4 - Convert HttpBaseChannel::mRedirectedCachekeys to nsAutoPtr<nsTArray<nsCString> >

Rebased/renamed version of attachment 615591 [details] [diff] [review] r+d by cbiesinger
Comment 35 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:51:47 PDT
Created attachment 624132 [details] [diff] [review]
Part 5 - Factor out nsHttpChannel::ShouldUpdateOfflineCacheEntry into a standalone function

Renamed version of attachment 615939 [details] [diff] [review]
Comment 36 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 11:54:05 PDT
Created attachment 624133 [details] [diff] [review]
Part 6 - Make a copy of the information needed during cache validation in preparation for moving cache validation to the cache thread
Comment 37 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 12:04:56 PDT
Created attachment 624139 [details] [diff] [review]
Part 7 - Move cache validation to the cache thread
Comment 38 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 12:07:01 PDT
Created attachment 624141 [details] [diff] [review]
Part 8 - Stop accessing cache descriptor on main thread for telemetry purposes

Renamed/rebased version of attachment 615988 [details] [diff] [review].

Nick, I asked you for feedback on this patch before. I think it is OK to change your feedback+ to r+ so somebody else does not have to review it.
Comment 39 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 12:53:10 PDT
This patch series is going through try now:
https://tbpl.mozilla.org/?tree=Try&rev=10b6ced92518

This requires the patches from bug 746018, so that the patch queue looks like this

pfc-1-v2-check-for-opened-channel-in-more-places.patch
pfc-2-v2-split-connect.patch
pfc-3-v1-do-more-logging-in-nsInputStreamWrapper.patch
pfc-4-v1-more-cache-logging.patch
pfc-5-v1-start-buffering-earlier.patch
ctcc-1-v2
ctcc-2-v1
ctcc-3-v1
ctcc-4-v1
ctcc-5-v1
ctcc-6-v1
ctcc-7-v2
ctcc-8-v2

(In reply to Michal Novotny (:michal) from comment #27)
> (In reply to Brian Smith (:bsmith) from comment #21)
> > 1. Fixes NS_ERROR_CACHE_WAIT_FOR_VALIDATION processing for
> > AsyncOpenCacheEntry when it is called on the cache I/O thread. In
> > particular, we cannot have AsyncOpenCacheEntry return NS_OK unless it will
> > later call the listener's OnCacheEntryAvailable method.
> 
> When exactly happen that we return NS_OK from AsyncOpenCacheEntry() and then
> don't call OnCacheEntryAvailable()? I'm not aware of this behavior. IMO your
> patch causes that we return NS_ERROR_CACHE_WAIT_FOR_VALIDATION in case of
> async non-blocking request that can't be satisfied and then we call the
> listener with the same result.

First, I am very open to the possibility that my patch isn't good here, because I don't understand the cache code well.

If I apply this entire patch series except for NS_ERROR_CACHE_WAIT_FOR_VALIDATION change (which is in Part 1, ctcc-1-v2), then the test netwerk\test\unit\test_reentrancy.js hangs. 

> AFAICS this method is used only by HttpCacheQuery that lives outside cache
> directory where you should probably access cache service only via interface
> methods.

I have made this change, which makes the patch much simpler. Thank you for the pointer.
Comment 40 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 12:55:31 PDT
Comment on attachment 624123 [details] [diff] [review]
Part 1 - Make AsyncOpenCacheEntry return NS_ERROR_CACHE_WAIT_FOR_VALIDATION

Michal, see comment 39 for my reply to your earlier review comments.
Comment 41 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-15 12:57:22 PDT
I am not sure why I added bug 725993 as blocking this bug, so I am removing it. Also, I changed my patch so that fixing bug 742218 is not necessary as of now.
Comment 42 Nicholas Hurley [:nwgh][:hurley] 2012-05-15 13:20:48 PDT
Comment on attachment 624141 [details] [diff] [review]
Part 8 - Stop accessing cache descriptor on main thread for telemetry purposes

Review of attachment 624141 [details] [diff] [review]:
-----------------------------------------------------------------

/me wishes for a "Ship It!" button. I'll have to settle for r+ and publish, I guess.
Comment 43 Michal Novotny (:michal) 2012-05-18 13:54:08 PDT
Comment on attachment 624123 [details] [diff] [review]
Part 1 - Make AsyncOpenCacheEntry return NS_ERROR_CACHE_WAIT_FOR_VALIDATION

(In reply to Brian Smith (:bsmith) from comment #39)
> > When exactly happen that we return NS_OK from AsyncOpenCacheEntry() and then
> > don't call OnCacheEntryAvailable()? I'm not aware of this behavior. IMO your
> > patch causes that we return NS_ERROR_CACHE_WAIT_FOR_VALIDATION in case of
> > async non-blocking request that can't be satisfied and then we call the
> > listener with the same result.
> 
> First, I am very open to the possibility that my patch isn't good here,
> because I don't understand the cache code well.
> 
> If I apply this entire patch series except for
> NS_ERROR_CACHE_WAIT_FOR_VALIDATION change (which is in Part 1, ctcc-1-v2),
> then the test netwerk\test\unit\test_reentrancy.js hangs. 

There is really a problem that we don't call the listener in case of non-blocking async request called on a background thread when the cache entry is in use. But the correct solution would be to call the listener instead of returning NS_ERROR_CACHE_WAIT_FOR_VALIDATION from AsyncOpenCacheEntry(). At least this is the behavior described in nsICacheSession.idl. So the change should be IMO in nsCacheService::ProcessRequest()

-        if (NS_FAILED(rv) && calledFromOpenCacheEntry)
+        if (NS_FAILED(rv) && calledFromOpenCacheEntry && request->IsBlocking())
Comment 44 Honza Bambas (:mayhemer) 2012-05-21 14:04:32 PDT
Comment on attachment 624132 [details] [diff] [review]
Part 5 - Factor out nsHttpChannel::ShouldUpdateOfflineCacheEntry into a standalone function

Review of attachment 624132 [details] [diff] [review]:
-----------------------------------------------------------------

Shouldn't this be part of the patch for bug 746018?  I now can see that ShouldUpdateOfflineCacheEntry works with a bad mResponseHead when called from StartBufferingCachedEntity()

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +908,5 @@
> +            if (mOfflineCacheEntry) {
> +                rv = mOfflineCacheEntry->GetLastModified(&offlineLastModified);
> +                NS_ENSURE_SUCCESS(rv, rv);
> +            }
> +            if (ShouldUpdateOfflineCacheEntry(offlineLastModified,

So, when there is no mOfflineCacheEntry, you pass 0, that probably passes if (docLastModifiedTime > offlineLastModifiedTime) and returns true.  But the original implementation returns false for case of no mOfflineCacheEntry.
Comment 45 Honza Bambas (:mayhemer) 2012-05-22 14:33:23 PDT
Comment on attachment 624141 [details] [diff] [review]
Part 8 - Stop accessing cache descriptor on main thread for telemetry purposes

Review of attachment 624141 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.h
@@ +377,5 @@
>      nsRefPtr<nsInputStreamPump>       mCachePump;
>      nsAutoPtr<nsHttpResponseHead>     mCachedResponseHead;
>      nsCOMPtr<nsISupports>             mCachedSecurityInfo;
>      nsCacheAccessMode                 mCacheAccess;
> +    mozilla::Telemetry::ID            mCacheEntryDeviceID;

Can you please rename this to mCacheEntryDeviceIDTelemetry or so?  It's very misleading name when actually the member carries a telemetry graph.
Comment 46 Honza Bambas (:mayhemer) 2012-05-22 15:30:33 PDT
Comment on attachment 624139 [details] [diff] [review]
Part 7 - Move cache validation to the cache thread

Review of attachment 624139 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't check very deeply this patch logic, but merged patch (both bugs) looks good to me and works well.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +634,5 @@
>      if (mCacheEntry) {
>          if (NS_FAILED(rv))
>              mCacheEntry->Doom();
> +    }
> +    CloseCacheEntry(false);

Maybe add a comment why you do this even there is no cache entry (on all places).

@@ +5366,5 @@
>  
>      // if the channel's already fired onStopRequest, then we should ignore
>      // this event.
> +    if (!mIsPending) {
> +        mCacheAsyncInputStream.CloseAndRelease();

This should be lower in the patch queue I think, but up to you.

::: netwerk/protocol/http/nsHttpHandler.cpp
@@ -449,5 @@
>  }
>  
> -nsresult
> -nsHttpHandler::GetCacheSession(nsCacheStoragePolicy storagePolicy,
> -                               nsICacheSession **result)

Thanks!
Comment 47 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-24 13:47:10 PDT
(In reply to Michal Novotny (:michal) from comment #43)
> -        if (NS_FAILED(rv) && calledFromOpenCacheEntry)
> +        if (NS_FAILED(rv) && calledFromOpenCacheEntry &&
> request->IsBlocking())

This causes several tests to fail. I filed bug 758358 about this issue.

(In reply to Honza Bambas (:mayhemer) from comment #44)
> Comment on attachment 624132 [details] [diff] [review]
> Part 5 - Factor out nsHttpChannel::ShouldUpdateOfflineCacheEntry into a
> standalone function
> 
> Shouldn't this be part of the patch for bug 746018?  I now can see that
> ShouldUpdateOfflineCacheEntry works with a bad mResponseHead when called
> from StartBufferingCachedEntity()

Yes, good catch. 

I have put this patch (Part 5), and in the other series and rebases the other patches in that series on it. I also am moving Part 2 to the other series because the patch for part 5 depends on it.
Comment 48 Honza Bambas (:mayhemer) 2012-05-29 04:21:26 PDT
Comment on attachment 624132 [details] [diff] [review]
Part 5 - Factor out nsHttpChannel::ShouldUpdateOfflineCacheEntry into a standalone function

Dropping the r flag, the patch has been moved and r+ in another bug.
Comment 49 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:26:53 PDT
Created attachment 628440 [details] [diff] [review]
Part 3 - Factor out nsHttpChannel::Hash into a standalone function

rebased.
Comment 50 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:30:11 PDT
Created attachment 628444 [details] [diff] [review]
Part 5 - Move cache validation to the cache thread

These two patches were moved to other bugs.
Comment 51 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:32:37 PDT
Created attachment 628447 [details] [diff] [review]
Part 1 - Make AsyncOpenCacheEntry call the listener for non-blocking async requests from background threads

This is Michal's patch for AsyncOpenCacheEntry. Nick, please see the discussion above.
Comment 52 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:33:20 PDT
Created attachment 628448 [details] [diff] [review]
Part 2 - Factor out nsHttpChannel::Hash into a standalone function

rebased, no changes.
Comment 53 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:34:00 PDT
Created attachment 628449 [details] [diff] [review]
Part 3 - Convert HttpBaseChannel::mRedirectedCachekeys to nsAutoPtr<nsTArray<nsCString> >

rebased.
Comment 54 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:35:52 PDT
Created attachment 628452 [details] [diff] [review]
Part 4 - Make a copy of the information needed during cache validation in preparation for moving cache validation to the cache thread

This is basically the same as the previous version you reviewed.
Comment 55 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:38:06 PDT
Created attachment 628454 [details] [diff] [review]
Part 5 - Move cache validation to the cache thread

Note that Part 4 and Part 5 have some changes to merge in the private browsing changes that landed since the previous time I posted these patches.
Comment 56 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 13:39:07 PDT
Created attachment 628455 [details] [diff] [review]
Part 6 - Stop accessing cache descriptor on main thread for telemetry purposes

rebased, r=hurley. I also addressed Honza's request to rename the telemetry ID member variable.
Comment 57 Nicholas Hurley [:nwgh][:hurley] 2012-05-30 13:46:53 PDT
Comment on attachment 628447 [details] [diff] [review]
Part 1 - Make AsyncOpenCacheEntry call the listener for non-blocking async requests from background threads

This looks good to me. Does this mean we can close bug 758358 as invalid now? (Based on the follow-up comment you put in, Brian, that seems to be the case.)
Comment 58 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 15:31:14 PDT
Created attachment 628515 [details] [diff] [review]
Part 7 - Stop calling mOfflineCacheEntry->GetLastModified when processing response from normal cache, r?honzab

This patch simply moves the call to mOfflineCacheEntry->GetLastModified to OnOfflineCacheEntryForWritingAvailable, to remove the last non-error, pre-validation access to cache entries from the processing of normal cache entries. We We will still do parts of the offline cache entry processing on the the main thread temporarily (see bug 759886 and bug 759889) but this is the best place to put this access for now.
Comment 59 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 15:32:39 PDT
Created attachment 628516 [details] [diff] [review]
Part 8 - Work around bug 759805

This patch is simply a workaround for bug 759805.
Comment 60 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-30 15:33:09 PDT
This is the try run for this set of patches:
https://tbpl.mozilla.org/?tree=Try&rev=2e878335f924
Comment 61 Michal Novotny (:michal) 2012-05-31 07:52:17 PDT
Comment on attachment 628516 [details] [diff] [review]
Part 8 - Work around bug 759805

(In reply to Brian Smith (:bsmith) from comment #59)
> This patch is simply a workaround for bug 759805.

The workaround looks good, but I'd rather fix the bug 759805. Is this still an issue with the current patch part1?
Comment 62 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-31 10:27:19 PDT
(In reply to Michal Novotny (:michal) from comment #61)
> Comment on attachment 628516 [details] [diff] [review]
> Part 8 - Work around bug 759805
> 
> (In reply to Brian Smith (:bsmith) from comment #59)
> > This patch is simply a workaround for bug 759805.
> 
> The workaround looks good, but I'd rather fix the bug 759805. Is this still
> an issue with the current patch part1?

Yes. I think it may actually be *caused* by patch part1.
Comment 63 Nicholas Hurley [:nwgh][:hurley] 2012-05-31 10:51:21 PDT
Brian, are both of the unresolved bugs in the depends field hard-blockers for this one? I'm trying to get a handle on what *actually* needs to be done before this can all be landed.
Comment 64 Honza Bambas (:mayhemer) 2012-05-31 11:39:08 PDT
Comment on attachment 628452 [details] [diff] [review]
Part 4 - Make a copy of the information needed during cache validation in preparation for moving cache validation to the cache thread

Review of attachment 628452 [details] [diff] [review]:
-----------------------------------------------------------------

I checked just the interdiff with an already r+'ed patch.

r=honzab

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +5260,5 @@
> +    if (mCacheQuery) {
> +        mRequestHead = mCacheQuery->mRequestHead;
> +        mRedirectedCachekeys = mCacheQuery->mRedirectedCachekeys.forget();
> +        mCacheAsyncInputStream.takeOver(mCacheQuery->mCacheAsyncInputStream);
> +        mCacheAsyncFunc = mCacheQuery->mCacheAsyncFunc;

mCacheAsyncFunc seems to be unused.
Comment 65 Honza Bambas (:mayhemer) 2012-05-31 11:45:29 PDT
Comment on attachment 628454 [details] [diff] [review]
Part 5 - Move cache validation to the cache thread

Review of attachment 628454 [details] [diff] [review]:
-----------------------------------------------------------------

Checked the diff only.

r=honzab
Comment 66 Honza Bambas (:mayhemer) 2012-05-31 11:52:41 PDT
Comment on attachment 628515 [details] [diff] [review]
Part 7 - Stop calling mOfflineCacheEntry->GetLastModified when processing response from normal cache, r?honzab

Review of attachment 628515 [details] [diff] [review]:
-----------------------------------------------------------------

Why is this patch needed?
Comment 67 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-31 12:07:56 PDT
*** Bug 758358 has been marked as a duplicate of this bug. ***
Comment 68 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-31 12:13:47 PDT
(In reply to Honza Bambas (:mayhemer) from comment #66)
> Comment on attachment 628515 [details] [diff] [review]
> Part 7 - Stop calling mOfflineCacheEntry->GetLastModified when processing
> response from normal cache, r?honzab
> 
> Why is this patch needed?

I tried to explain in comment 58. It isn't strictly needed, because I am just moving it from one place executed on the main thread to another place executed on the main thread. But, long term, we cannot call mOfflineCacheEntry->GetLastModified in ShouldUpdateOfflineCachEntry because ShouldUpdateOfflineCachEntry will always be called on the main thread, and mOfflineCacheEntry->GetLastModified takes the cache service lock. And, after bug 759889 is fixed, we will stop calling nsHttpChannel::OnOfflineCacheEntryForWritingAvailable on the main thread.
Comment 69 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-31 22:23:51 PDT
Comment on attachment 628516 [details] [diff] [review]
Part 8 - Work around bug 759805

Review of attachment 628516 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +2840,4 @@
>                                        nsresult status)
>  
>  {
>      AssertOnCacheThread();

In HttpCacheQuery::Run() we set mCacheThread = nsnull when we're on the main thread, right before we call nsHttpChannel::OnCacheEntryAvailable.

This creates a race condition that can result in a crash here, because AssertOnCacheThread() accesses mCacheThread; if our Run() method sets mCacheThread = null before this line executes on the cache thread, then we will crash with a null pointer access.

I have fix this by moving the mRunCount check below to be above the call to AssertOnCacheThread.
Comment 71 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-05-31 23:54:41 PDT
Comment on attachment 628515 [details] [diff] [review]
Part 7 - Stop calling mOfflineCacheEntry->GetLastModified when processing response from normal cache, r?honzab

I moved part 7 to its own bug, bug 760380, because it wasn't strictly needed for fixing this one.
Comment 72 (dormant account) 2012-06-04 09:14:47 PDT
We should redo the test in https://wiki.mozilla.org/Performance/Snappy/Testing:BFCache_Sprint, include google Chrome this time.
Comment 73 (dormant account) 2012-06-04 09:33:47 PDT
(In reply to Taras Glek (:taras) from comment #72)
> We should redo the test in
> https://wiki.mozilla.org/Performance/Snappy/Testing:BFCache_Sprint, include
> google Chrome this time.

Actually this is premature. This bug only fixes cache reads. I think the proper way to test this is to run an ioheavy workload(like bonnie++ io benchmark) in the background while visiting previously viewed webpages and then check the CACHE_SERVICE_LOCK_WAIT_MAINTHREAD sum in about:telemetry.
Comment 74 (dormant account) 2012-06-04 15:11:32 PDT
Can someone explain the difference in telemetry for CACHE_SERVICE_LOCK_WAIT that coincided with this landing? http://dl.dropbox.com/u/5961467/moz/CACHE_SERVICE_LOCK_WAIT.PNG

The last 2 builds are reporting more time spent in CACHE_SERVICE_LOCK_WAIT. The data is new so the population size is 3-5x smaller than that of older builds, but it's something to keep in eye on and if it persists for a week, we might have a problem.
Comment 75 Nicholas Hurley [:nwgh][:hurley] 2012-06-13 14:00:07 PDT
This telemetry remains high almost two weeks after these patches landed, see http://dl.dropbox.com/u/16486235/cache_lock_mainthread_crap.jpg

Brian, I was just talking to Josh, and we decided these patches need backed out until we can either determine the cause and fix these patches, or fix the blocking of the main thread in a different way. Can you get the backout done, please? Thanks!
Comment 76 (dormant account) 2012-06-13 15:46:32 PDT
(In reply to Nick Hurley [:hurley] from comment #75)
> This telemetry remains high almost two weeks after these patches landed, see
> http://dl.dropbox.com/u/16486235/cache_lock_mainthread_crap.jpg
> 
> Brian, I was just talking to Josh, and we decided these patches need backed
> out until we can either determine the cause and fix these patches, or fix
> the blocking of the main thread in a different way. Can you get the backout
> done, please? Thanks!

See https://bugzilla.mozilla.org/show_bug.cgi?id=762576#c10. There evidence that the patches may not be a regression.
Comment 77 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-07-03 15:38:36 PDT
Created attachment 638894 [details] [diff] [review]
Back out Bug 722034 	and Bug 746018 from mozilla-aurora

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 746018 and Bug 722034

User impact if declined: Probable performance regression

Testing completed (on m-c, etc.): none

Risk to taking this patch (and alternatives if risky): Leave the changes in, and accept that we may have regressed performance.

String or UUID changes made by this patch: none.
Comment 78 Alex Keybl [:akeybl] 2012-07-05 14:35:29 PDT
Comment on attachment 638894 [details] [diff] [review]
Back out Bug 722034 	and Bug 746018 from mozilla-aurora

[Triage Comment]
Sounds right - this is suspected of causing 2 issues on desktop and mobile. Approved for Aurora 15.
Comment 79 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-07-10 11:42:39 PDT
(In reply to Alex Keybl [:akeybl] from comment #78)
> Comment on attachment 638894 [details] [diff] [review]
> Back out Bug 722034 	and Bug 746018 from mozilla-aurora
> 
> [Triage Comment]
> Sounds right - this is suspected of causing 2 issues on desktop and mobile.
> Approved for Aurora 15.

Backed out of mozilla-aurora:
https://hg.mozilla.org/releases/mozilla-aurora/rev/c576c2b8d9ba
https://hg.mozilla.org/releases/mozilla-aurora/rev/deec51854e4e
Comment 80 Alex Keybl [:akeybl] 2012-08-22 17:14:04 PDT
Given bug 770243, bug 760955, and bug 539040, should we consider doing the same for Aurora 17?
Comment 81 Alex Keybl [:akeybl] 2012-08-27 18:08:55 PDT
and Beta 16?
Comment 82 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-08-28 13:38:38 PDT
Taras, is there still a need for testing on this bug, RE:qawanted? Please advise.
Comment 83 (dormant account) 2012-08-28 14:23:25 PDT
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #82)
> Taras, is there still a need for testing on this bug, RE:qawanted? Please
> advise.

that was a stale one
Comment 84 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-09-19 06:26:45 PDT
(In reply to Alex Keybl [:akeybl] from comment #80)
> Given bug 770243, bug 760955, and bug 539040, should we consider doing the
> same for Aurora 17?

We have fixes for all those bugs, and I think the fixes are less risky than trying to do a backout because some patches landed on top of this, and it looks non-trivial to detangle them because these patches significantly refactored the code.
Comment 85 Patrick McManus [:mcmanus] 2016-02-16 11:15:07 PST
*** Bug 762743 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.