Closed Bug 695399 Opened 13 years ago Closed 12 years ago

Remove calls to nsICacheSession::openCacheEntry on the main thread

Categories

(Core :: Networking: Cache, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla18

People

(Reporter: briansmith, Assigned: michal)

References

(Depends on 1 open bug)

Details

(Keywords: addon-compat, APIchange, dev-doc-needed, Whiteboard: [Snappy:P3])

Attachments

(2 files, 3 obsolete files)

If we can remove openCacheEntry then I can avoid writing extra code to support it when I add certificate validation for cache entries in bug 660749.
Seems like nsICache::NON_BLOCKING and nsICache::BLOCKING could be removed too at the same time.

Some non-test callers:

  nsAboutCacheEntry::GetContentStream -> nsAboutCacheEntry::OpenCacheEntry
      * This wouldn't be affected by my SSL cert validation changes

  nsFtpState::CheckCache

  nsHttpChannel::OpenCacheEntry
      * // must use synchronous open for
        // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY
  nsHttpChannel::OpenNormalCacheEntry
      * // must use synchronous open for
        // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY
  nsHttpChannel::OpenOfflineCacheEntryForWriting
  nsHttpChannel::DoInvalidateCacheEntry

  nsWyciwygChannel::AsyncOpen (via nsWyciwygChannel::OpenCacheEntry)
  nsWyciwygChannel::WriteToCacheEntryInternal
     (via nsWyciwygChannel::OpenCacheEntry)
My changes for bug 660749 will affect *reading* from the cache only, not *writing*.

Bug 513008 is the bug where we removed almost all the synchronous reads.

Bug 551447 is about avoiding asynchrnous writes to the memory cache. If that needs to be done, maybe we need some *write-only* *memory-cache-only* synchronous API.
Depends on: 513008, 513074
(In reply to Brian Smith (:bsmith) from comment #1)
>   nsWyciwygChannel::AsyncOpen (via nsWyciwygChannel::OpenCacheEntry)

nsWyciwygChannel::AsyncOpen calls nsWyciwygChannel::OpenCacheEntry with access mode nsICache::ACCESS_READ, so async API is used.


>   nsWyciwygChannel::WriteToCacheEntryInternal
>      (via nsWyciwygChannel::OpenCacheEntry)

This uses synchronous API because it runs on the cache IO thread, so using async API would just add some delay. We would also need to implement some queuing of the writes because there are probably already other pending nsWyciwygWriteEvents when we open the cache entry during the first write.
(In reply to Brian Smith (:bsmith) from comment #1)
>   nsHttpChannel::OpenCacheEntry
>       * // must use synchronous open for
>         // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY
>   nsHttpChannel::OpenNormalCacheEntry
>       * // must use synchronous open for
>         // LOAD_BYPASS_LOCAL_CACHE_IF_BUSY

These calls can't be easily switched to AsyncOpenCacheEntry() because it doesn't support non-blocking mode. We could add non-blocking mode to the async method, but it is IMO weird to have "blockingMode" argument in case of asynchronous method.

Taras mentioned that we should implement a timeout for caching. I.e. if we cannot get the cached data within certain time, we would bypass the cache and hit the net. Should we do it in the cache code or rather in http channel? If we implement it in the cache, the timeout will limit just the waiting for the entry, not the reading from it. Taras, any opinion on this?

If we decide to do it in the cache, I'd suggest to add a timeout argument to AsyncOpenCacheEntry(), so passing a zero would be equal to the "non-blocking" async request.


>   nsHttpChannel::DoInvalidateCacheEntry

We can make this asynchronous, but I'd like to avoid adding extra logic for this in nsHttpChannel::OnCacheEntryAvailable(), so I'd suggest to create a helper class or to add doom(in ACString key) method to nsCacheSession. I don't have a strong preference but I tend to do the latter.
So, a quick comment on why the sync version exists (of this and other APIs). The HTTP channel first tries a synchronous open, and falls back to async if sync would block. The reason for that is that it is faster to do it that way than to do all the async stuff with the associated context switching.

However, I don't know if that is still a relevant concern, and will hand over this bug back to y'all :)
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #5)
> The HTTP channel first tries a synchronous open, and falls back to async if
> sync would block.

The HTTP channel falls back to async if the sync request would block on *socket* I/O (i.e. revalidating with the server). However, the very first step of sync openCacheEntry is acquiring the cache service lock:

   nsCacheServiceAutoLock lock;
   rv = gService->ProcessRequest(request, true, result);

Which means it is likely to be blocked on disk I/O from other requests or (my understanding is a little hazy here) even disk I/O for *this* request, right?

So, it might be worth keeping the synchronous API *if* we can eliminate all cases where the cache service lock is held during disk I/O. However, I do not know how feasible such a change would be and if we would be able to find *all* cases where the lock is held during disk I/O.
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #5)
> So, a quick comment on why the sync version exists (of this and other APIs).
> The HTTP channel first tries a synchronous open, and falls back to async if
> sync would block.

This is no longer true. It was changed by bug #513008 to eliminate I/O on the mainthread. Sync API is used only in few specific cases mentioned above.
Whiteboard: [Snappy:P3]
Almost all of the addons on AMO that are using this API are using this API for the following purposes:

1. To doom an entry. We should add a method to nsICacheService that asynchronously dooms an entry given a URL and calls a callback. Many addons, including LiveHTTPHeaders and AdBlockPlus are doing this.

2. To access the entry's dataSize attribute, because nsIDocument and friends do not expose the information about the size of the document. We should add such a dataSize attribute to nsIDocument and/or some other place that an addon can get it, navigating from a document. Many addons are doing this, including StumbleUpon.

3. To load an entry out of the cache, without revalidating, without checking for validation, and without touching the network. If loadURI doesn't support this already, then we should add it. Many video/image saving addons are doing this.

4. To add some metadata, e.g. whether the favicon could be loaded. Not many addons are doing this.

5. To check to see whether a cache entry exists for a given URL.

AFAICT, nothing horrific would happen to these addons, as many of them do a reasonable job of wrapping access to the catch in try...catch because they have to deal with the case where the cache doesn't have the entry. However, the video/image downloading addons are likely to stop working entirely and/or partially. :(
Depends on: 737614
Depends on: 737615
Depends on: 737642
Depends on: 739473
Summary: Remove synchronous cache API (nsICacheSession::openCacheEntry) → Remove calls to nsICacheSession::openCacheEntry on the main thread
Attached patch browser part (obsolete) — Splinter Review
Attachment #646777 - Flags: review?(db48x)
Attached patch necko part (obsolete) — Splinter Review
Attachment #646780 - Flags: review?(jduell.mcbugs)
(In reply to Michal Novotny (:michal) from comment #9)
> Created attachment 646777 [details] [diff] [review]
> browser part

In bug 737642 we had all agreed (I think) to just remove all usage of the cache from the page info dialog box. If it is possible to get the page info dialog box to completely stop using the cache, I think that is very much preferable to rewriting the code to be async. (If you look at the dependencies on bug 737642, you can see that the way the page info dialog box is using the cache causes a lot of problems, besides main thread I/O.)
Attachment #646780 - Flags: review?(jduell.mcbugs) → review?(hurley)
Comment on attachment 646780 [details] [diff] [review]
necko part

Review of attachment 646780 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just one question/clarification below. r=me otherwise

::: netwerk/protocol/about/nsAboutCacheEntry.cpp
@@ +140,3 @@
>      if (NS_FAILED(rv)) return rv;
>  
> +    *result = inputStream.forget().get();

I'm probably missing something, but why do a get on a forgotten pointer?
Attachment #646780 - Flags: review?(hurley) → review+
> I'm probably missing something, but why do a get on a forgotten pointer?

forget() returns already_AddRefed<nsIAsyncInputStream> so the get() is necessary.
Attachment #646780 - Attachment is obsolete: true
Attachment #652072 - Flags: review+
Comment on attachment 646777 [details] [diff] [review]
browser part

Gavin - can you review this or find a reviewer? This is important for our cache performance work.
Attachment #646777 - Flags: review?(db48x) → review?(gavin.sharp)
Comment on attachment 646777 [details] [diff] [review]
browser part

>diff --git a/browser/base/content/pageinfo/pageInfo.js b/browser/base/content/pageinfo/pageInfo.js

> function makeGeneralTab()

>+  openCacheEntry(cacheKey, function(cacheEntry) {

>+    securityOnLoad();

It doesn't seem necessary (or useful) to put this inside the openCacheEntry callback - none of the security info depends on the cache entry info being retrieved, AFAICT.

> function addImage(url, type, alt, elem, isBg)

>   if (!gImageHash[url][type].hasOwnProperty(alt)) {
>     gImageHash[url][type][alt] = gImageView.data.length;
>+    openCacheEntry(url, function(cacheEntry) {
...
>+    });
>   }
>   else {
>     var i = gImageHash[url][type][alt];
>     gImageView.data[i][COL_IMAGE_COUNT]++;

I don't understand exactly what all of this code is doing, but this looks problematic - this code seems to assume that gImageView.data will be updated synchronously for each addImage() call, and putting the gImageView.addRow() a callback invoked asynchronously will break that. addImage() is called in a loop for all images in the page, and will be called multiple times before any of the openCacheEntry callbacks can be invoked.

I noticed that this patch causes browser_bug460146.js and browser_bug517902.js to fail, but didn't look into why (it might be because of the issue I point out above).
Attachment #646777 - Flags: review?(gavin.sharp) → review-
Attachment #646777 - Flags: review?(margaret.leibovic)
Attached patch update API use in pageInfo.js (obsolete) — Splinter Review
This patch aims to minimize what we put in the openCacheEntry callback functions. Unfortunately, there are a chain of dependencies inside makePreview that force us to put a lot of logic inside the callback.

With this patch browser_bug517902.js was still failing because it wasn't accounting for the async image preview. I added a new hook to execute a callback after the image preview is shown, and this fixes the failure. I considered using a notification, but this is more similar to what pageInfo.js is already doing.
Attachment #646777 - Attachment is obsolete: true
Attachment #646777 - Flags: review?(margaret.leibovic)
Attachment #660781 - Flags: review?(gavin.sharp)
Comment on attachment 660781 [details] [diff] [review]
update API use in pageInfo.js

>diff --git a/browser/base/content/pageinfo/pageInfo.js b/browser/base/content/pageinfo/pageInfo.js

>+    // Fill in cache data asynchronously
>+    openCacheEntry(url, function(cacheEntry) {
>+      // The data at row[2] corresponds to the data size.
>+      if (cacheEntry)
>+        row[2] = cacheEntry.dataSize;

Don't you need to call gImageView.invalidate() here for the tree to repaint itself accordingly, now that you've changed the underlying data? AFAIK it won't repaint otherwise (unless you cause it to by e.g. scrolling).

Actually, it would be more efficient to call gImageView.tree.invalidateRow(gImageView.rows.indexOf(row)), since this callback will be called for each row in the tree. Alternatively you could just call invalidate() once all callbacks have been called, I guess, but that seems more annoying to track.

r=me with that addressed.
Attachment #660781 - Flags: review?(gavin.sharp) → review+
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #17)
> Comment on attachment 660781 [details] [diff] [review]
> update API use in pageInfo.js
> 
> >diff --git a/browser/base/content/pageinfo/pageInfo.js b/browser/base/content/pageinfo/pageInfo.js
> 
> >+    // Fill in cache data asynchronously
> >+    openCacheEntry(url, function(cacheEntry) {
> >+      // The data at row[2] corresponds to the data size.
> >+      if (cacheEntry)
> >+        row[2] = cacheEntry.dataSize;
> 
> Don't you need to call gImageView.invalidate() here for the tree to repaint
> itself accordingly, now that you've changed the underlying data? AFAIK it
> won't repaint otherwise (unless you cause it to by e.g. scrolling).

Yes, good catch.

> Actually, it would be more efficient to call
> gImageView.tree.invalidateRow(gImageView.rows.indexOf(row)), since this
> callback will be called for each row in the tree.

What you really meant is gImageView.data.indexOf(row), since gImageView.rows is an integer (totally intuitive).

Michal, I'll let you check this in with your work, since this is your bug :)
Attachment #660781 - Attachment is obsolete: true
This is missing a catch in browser.js code. browser.js fails to load and it does not load pages.
Whoops. I misread blame. This is fine. Sorry about that.
I think Michal will post a link to an example of converting from OpenCacheEntry to AsyncOpenCacheEntry, which means making the callers work asynchronously instead of synchronously. This will affect some aspects of mailnews. Sorry for the belated notification:

http://mxr.mozilla.org/comm-central/search?string=OpenCacheEntry&find=%2F&findi=&filter=^[^\0]*%24&hitlimit=&tree=comm-central
Please have a look at the patch #652072. In case you need to replace OpenCacheEntry() with AsyncOpenCacheEntry(), see the change in nsAboutCacheEntry.cpp. But in most cases we used to use OpenCacheEntry() followed by AsyncOpenCacheEntry() if the synchronous call would block. In this case just remove the call to OpenCacheEntry() just like we did in nsFtpConnectionThread.cpp.
https://hg.mozilla.org/mozilla-central/rev/07bb197f28f1
https://hg.mozilla.org/mozilla-central/rev/a9fc8c6f0223
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Blocks: 792735
Depends on: 792804
No longer depends on: 737642
Depends on: 801930
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: