Last Comment Bug 695399 - Remove calls to nsICacheSession::openCacheEntry on the main thread
: Remove calls to nsICacheSession::openCacheEntry on the main thread
Status: RESOLVED FIXED
[Snappy:P3]
: addon-compat, APIchange, dev-doc-needed
Product: Core
Classification: Components
Component: Networking: Cache (show other bugs)
: Trunk
: All All
: P2 normal (vote)
: mozilla18
Assigned To: Michal Novotny (:michal)
:
: Patrick McManus [:mcmanus]
Mentors:
Depends on: 801930 513008 513074 722033 737614 737615 738128 739473 792804
Blocks: 792735
  Show dependency treegraph
 
Reported: 2011-10-18 11:08 PDT by Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
Modified: 2012-10-16 06:27 PDT (History)
30 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
browser part (17.27 KB, patch)
2012-07-27 17:30 PDT, Michal Novotny (:michal)
gavin.sharp: review-
Details | Diff | Splinter Review
necko part (15.35 KB, patch)
2012-07-27 17:31 PDT, Michal Novotny (:michal)
hurley: review+
Details | Diff | Splinter Review
necko part - unbitrotted patch, r=hurley (15.36 KB, patch)
2012-08-15 05:19 PDT, Michal Novotny (:michal)
michal.novotny: review+
Details | Diff | Splinter Review
update API use in pageInfo.js (18.89 KB, patch)
2012-09-13 03:46 PDT, :Margaret Leibovic
gavin.sharp: review+
Details | Diff | Splinter Review
update API use in pageInfo.js (for check-in) (19.03 KB, patch)
2012-09-13 09:26 PDT, :Margaret Leibovic
no flags Details | Diff | Splinter Review

Description Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2011-10-18 11:08:56 PDT
If we can remove openCacheEntry then I can avoid writing extra code to support it when I add certificate validation for cache entries in bug 660749.
Comment 1 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2011-10-18 11:20:19 PDT
Seems like nsICache::NON_BLOCKING and nsICache::BLOCKING could be removed too at the same time.

Some non-test callers:

  nsAboutCacheEntry::GetContentStream -> nsAboutCacheEntry::OpenCacheEntry
      * This wouldn't be affected by my SSL cert validation changes

  nsFtpState::CheckCache

  nsHttpChannel::OpenCacheEntry
      * // must use synchronous open for
        // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY
  nsHttpChannel::OpenNormalCacheEntry
      * // must use synchronous open for
        // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY
  nsHttpChannel::OpenOfflineCacheEntryForWriting
  nsHttpChannel::DoInvalidateCacheEntry

  nsWyciwygChannel::AsyncOpen (via nsWyciwygChannel::OpenCacheEntry)
  nsWyciwygChannel::WriteToCacheEntryInternal
     (via nsWyciwygChannel::OpenCacheEntry)
Comment 2 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2011-10-18 11:38:47 PDT
My changes for bug 660749 will affect *reading* from the cache only, not *writing*.

Bug 513008 is the bug where we removed almost all the synchronous reads.

Bug 551447 is about avoiding asynchrnous writes to the memory cache. If that needs to be done, maybe we need some *write-only* *memory-cache-only* synchronous API.
Comment 3 Michal Novotny (:michal) 2011-10-18 17:09:17 PDT
(In reply to Brian Smith (:bsmith) from comment #1)
>   nsWyciwygChannel::AsyncOpen (via nsWyciwygChannel::OpenCacheEntry)

nsWyciwygChannel::AsyncOpen calls nsWyciwygChannel::OpenCacheEntry with access mode nsICache::ACCESS_READ, so async API is used.


>   nsWyciwygChannel::WriteToCacheEntryInternal
>      (via nsWyciwygChannel::OpenCacheEntry)

This uses synchronous API because it runs on the cache IO thread, so using async API would just add some delay. We would also need to implement some queuing of the writes because there are probably already other pending nsWyciwygWriteEvents when we open the cache entry during the first write.
Comment 4 Michal Novotny (:michal) 2011-12-08 08:17:52 PST
(In reply to Brian Smith (:bsmith) from comment #1)
>   nsHttpChannel::OpenCacheEntry
>       * // must use synchronous open for
>         // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY
>   nsHttpChannel::OpenNormalCacheEntry
>       * // must use synchronous open for
>         // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY

These calls can't be easily switched to AsyncOpenCacheEntry() because it doesn't support non-blocking mode. We could add non-blocking mode to the async method, but it is IMO weird to have "blockingMode" argument in case of asynchronous method.

Taras mentioned that we should implement a timeout for caching. I.e. if we cannot get the cached data within certain time, we would bypass the cache and hit the net. Should we do it in the cache code or rather in http channel? If we implement it in the cache, the timeout will limit just the waiting for the entry, not the reading from it. Taras, any opinion on this?

If we decide to do it in the cache, I'd suggest to add a timeout argument to AsyncOpenCacheEntry(), so passing a zero would be equal to the "non-blocking" async request.


>   nsHttpChannel::DoInvalidateCacheEntry

We can make this asynchronous, but I'd like to avoid adding extra logic for this in nsHttpChannel::OnCacheEntryAvailable(), so I'd suggest to create a helper class or to add doom(in ACString key) method to nsCacheSession. I don't have a strong preference but I tend to do the latter.
Comment 5 Christian :Biesinger (don't email me, ping me on IRC) 2012-01-15 20:14:27 PST
So, a quick comment on why the sync version exists (of this and other APIs). The HTTP channel first tries a synchronous open, and falls back to async if sync would block. The reason for that is that it is faster to do it that way than to do all the async stuff with the associated context switching.

However, I don't know if that is still a relevant concern, and will hand over this bug back to y'all :)
Comment 6 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-15 23:11:01 PST
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #5)
> The HTTP channel first tries a synchronous open, and falls back to async if
> sync would block.

The HTTP channel falls back to async if the sync request would block on *socket* I/O (i.e. revalidating with the server). However, the very first step of sync openCacheEntry is acquiring the cache service lock:

   nsCacheServiceAutoLock lock;
   rv = gService->ProcessRequest(request, true, result);

Which means it is likely to be blocked on disk I/O from other requests or (my understanding is a little hazy here) even disk I/O for *this* request, right?

So, it might be worth keeping the synchronous API *if* we can eliminate all cases where the cache service lock is held during disk I/O. However, I do not know how feasible such a change would be and if we would be able to find *all* cases where the lock is held during disk I/O.
Comment 7 Michal Novotny (:michal) 2012-01-17 07:35:13 PST
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #5)
> So, a quick comment on why the sync version exists (of this and other APIs).
> The HTTP channel first tries a synchronous open, and falls back to async if
> sync would block.

This is no longer true. It was changed by bug #513008 to eliminate I/O on the mainthread. Sync API is used only in few specific cases mentioned above.
Comment 8 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-03-03 16:59:48 PST
Almost all of the addons on AMO that are using this API are using this API for the following purposes:

1. To doom an entry. We should add a method to nsICacheService that asynchronously dooms an entry given a URL and calls a callback. Many addons, including LiveHTTPHeaders and AdBlockPlus are doing this.

2. To access the entry's dataSize attribute, because nsIDocument and friends do not expose the information about the size of the document. We should add such a dataSize attribute to nsIDocument and/or some other place that an addon can get it, navigating from a document. Many addons are doing this, including StumbleUpon.

3. To load an entry out of the cache, without revalidating, without checking for validation, and without touching the network. If loadURI doesn't support this already, then we should add it. Many video/image saving addons are doing this.

4. To add some metadata, e.g. whether the favicon could be loaded. Not many addons are doing this.

5. To check to see whether a cache entry exists for a given URL.

AFAICT, nothing horrific would happen to these addons, as many of them do a reasonable job of wrapping access to the catch in try...catch because they have to deal with the case where the cache doesn't have the entry. However, the video/image downloading addons are likely to stop working entirely and/or partially. :(
Comment 9 Michal Novotny (:michal) 2012-07-27 17:30:17 PDT
Created attachment 646777 [details] [diff] [review]
browser part
Comment 10 Michal Novotny (:michal) 2012-07-27 17:31:38 PDT
Created attachment 646780 [details] [diff] [review]
necko part
Comment 11 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-07-27 19:18:49 PDT
(In reply to Michal Novotny (:michal) from comment #9)
> Created attachment 646777 [details] [diff] [review]
> browser part

In bug 737642 we had all agreed (I think) to just remove all usage of the cache from the page info dialog box. If it is possible to get the page info dialog box to completely stop using the cache, I think that is very much preferable to rewriting the code to be async. (If you look at the dependencies on bug 737642, you can see that the way the page info dialog box is using the cache causes a lot of problems, besides main thread I/O.)
Comment 12 Nicholas Hurley [:nwgh][:hurley] 2012-08-14 16:05:35 PDT
Comment on attachment 646780 [details] [diff] [review]
necko part

Review of attachment 646780 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just one question/clarification below. r=me otherwise

::: netwerk/protocol/about/nsAboutCacheEntry.cpp
@@ +140,3 @@
>      if (NS_FAILED(rv)) return rv;
>  
> +    *result = inputStream.forget().get();

I'm probably missing something, but why do a get on a forgotten pointer?
Comment 13 Michal Novotny (:michal) 2012-08-15 05:19:26 PDT
Created attachment 652072 [details] [diff] [review]
necko part - unbitrotted patch, r=hurley

> I'm probably missing something, but why do a get on a forgotten pointer?

forget() returns already_AddRefed<nsIAsyncInputStream> so the get() is necessary.
Comment 14 Josh Aas 2012-08-15 07:36:28 PDT
Comment on attachment 646777 [details] [diff] [review]
browser part

Gavin - can you review this or find a reviewer? This is important for our cache performance work.
Comment 15 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-08-20 13:32:42 PDT
Comment on attachment 646777 [details] [diff] [review]
browser part

>diff --git a/browser/base/content/pageinfo/pageInfo.js b/browser/base/content/pageinfo/pageInfo.js

> function makeGeneralTab()

>+  openCacheEntry(cacheKey, function(cacheEntry) {

>+    securityOnLoad();

It doesn't seem necessary (or useful) to put this inside the openCacheEntry callback - none of the security info depends on the cache entry info being retrieved, AFAICT.

> function addImage(url, type, alt, elem, isBg)

>   if (!gImageHash[url][type].hasOwnProperty(alt)) {
>     gImageHash[url][type][alt] = gImageView.data.length;
>+    openCacheEntry(url, function(cacheEntry) {
...
>+    });
>   }
>   else {
>     var i = gImageHash[url][type][alt];
>     gImageView.data[i][COL_IMAGE_COUNT]++;

I don't understand exactly what all of this code is doing, but this looks problematic - this code seems to assume that gImageView.data will be updated synchronously for each addImage() call, and putting the gImageView.addRow() a callback invoked asynchronously will break that. addImage() is called in a loop for all images in the page, and will be called multiple times before any of the openCacheEntry callbacks can be invoked.

I noticed that this patch causes browser_bug460146.js and browser_bug517902.js to fail, but didn't look into why (it might be because of the issue I point out above).
Comment 16 :Margaret Leibovic 2012-09-13 03:46:47 PDT
Created attachment 660781 [details] [diff] [review]
update API use in pageInfo.js

This patch aims to minimize what we put in the openCacheEntry callback functions. Unfortunately, there are a chain of dependencies inside makePreview that force us to put a lot of logic inside the callback.

With this patch browser_bug517902.js was still failing because it wasn't accounting for the async image preview. I added a new hook to execute a callback after the image preview is shown, and this fixes the failure. I considered using a notification, but this is more similar to what pageInfo.js is already doing.
Comment 17 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-09-13 08:42:28 PDT
Comment on attachment 660781 [details] [diff] [review]
update API use in pageInfo.js

>diff --git a/browser/base/content/pageinfo/pageInfo.js b/browser/base/content/pageinfo/pageInfo.js

>+    // Fill in cache data asynchronously
>+    openCacheEntry(url, function(cacheEntry) {
>+      // The data at row[2] corresponds to the data size.
>+      if (cacheEntry)
>+        row[2] = cacheEntry.dataSize;

Don't you need to call gImageView.invalidate() here for the tree to repaint itself accordingly, now that you've changed the underlying data? AFAIK it won't repaint otherwise (unless you cause it to by e.g. scrolling).

Actually, it would be more efficient to call gImageView.tree.invalidateRow(gImageView.rows.indexOf(row)), since this callback will be called for each row in the tree. Alternatively you could just call invalidate() once all callbacks have been called, I guess, but that seems more annoying to track.

r=me with that addressed.
Comment 18 :Margaret Leibovic 2012-09-13 09:26:13 PDT
Created attachment 660868 [details] [diff] [review]
update API use in pageInfo.js (for check-in)

(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #17)
> Comment on attachment 660781 [details] [diff] [review]
> update API use in pageInfo.js
> 
> >diff --git a/browser/base/content/pageinfo/pageInfo.js b/browser/base/content/pageinfo/pageInfo.js
> 
> >+    // Fill in cache data asynchronously
> >+    openCacheEntry(url, function(cacheEntry) {
> >+      // The data at row[2] corresponds to the data size.
> >+      if (cacheEntry)
> >+        row[2] = cacheEntry.dataSize;
> 
> Don't you need to call gImageView.invalidate() here for the tree to repaint
> itself accordingly, now that you've changed the underlying data? AFAIK it
> won't repaint otherwise (unless you cause it to by e.g. scrolling).

Yes, good catch.

> Actually, it would be more efficient to call
> gImageView.tree.invalidateRow(gImageView.rows.indexOf(row)), since this
> callback will be called for each row in the tree.

What you really meant is gImageView.data.indexOf(row), since gImageView.rows is an integer (totally intuitive).

Michal, I'll let you check this in with your work, since this is your bug :)
Comment 19 Wesley Johnston (:wesj) 2012-09-18 11:09:55 PDT
This is missing a catch in browser.js code. browser.js fails to load and it does not load pages.
Comment 20 Wesley Johnston (:wesj) 2012-09-18 11:18:53 PDT
Whoops. I misread blame. This is fine. Sorry about that.
Comment 22 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-09-18 14:44:23 PDT
I think Michal will post a link to an example of converting from OpenCacheEntry to AsyncOpenCacheEntry, which means making the callers work asynchronously instead of synchronously. This will affect some aspects of mailnews. Sorry for the belated notification:

http://mxr.mozilla.org/comm-central/search?string=OpenCacheEntry&find=%2F&findi=&filter=^[^\0]*%24&hitlimit=&tree=comm-central
Comment 23 Michal Novotny (:michal) 2012-09-18 15:13:05 PDT
Please have a look at the patch #652072. In case you need to replace OpenCacheEntry() with AsyncOpenCacheEntry(), see the change in nsAboutCacheEntry.cpp. But in most cases we used to use OpenCacheEntry() followed by AsyncOpenCacheEntry() if the synchronous call would block. In this case just remove the call to OpenCacheEntry() just like we did in nsFtpConnectionThread.cpp.

Note You need to log in before you can comment on or make changes to this bug.