Closed Bug 446876 Opened 16 years ago Closed 8 years ago

Need to teach necko to deal with channels reading and writing to a cache entry simultaneously

Categories

(Core :: Networking: Cache, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking
RESOLVED DUPLICATE of bug 913807

People

(Reporter: sicking, Assigned: michal)

References

(Blocks 2 open bugs)

Details

(Keywords: perf, Whiteboard: [necko-would-take])

Attachments

(1 file)

Currently, if you start two loads of the same HTTP URI and we end up caching the result as it comes in, the second load is stalled until the first one finishes.

This is very evident if you load a long mxr page in two tabs at the same time. The first one loads fine and incrementally, but the second doesn't get any data at all until the first is done.

This also leads to problems when doing sync loading. Any sync load is basically forced to bypass the cache if the cache is busy, since the sync load might be happening inside a necko callback originating from the same URI. In that case the sync load is blocked until the callback returns, but the callback can't return until the sync load finishes.

This can happen if the page foo.html tries to sync-load foo.html from its onload handler using XHR. We currently deal with this by making the sync load not get the cached data, which of course is suboptimal.
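Reduced to a sketch, the deadlock has the shape of a non-reentrant resource being re-acquired from within its own holder. Everything below is illustrative (a Python lock standing in for the busy cache entry), not necko code:

    import threading

    cache_entry_lock = threading.Lock()  # stands in for the busy cache entry

    def sync_load_foo():
        # The sync XHR: needs exclusive access to the entry before it can
        # return any data.
        if not cache_entry_lock.acquire(timeout=1):
            raise RuntimeError("deadlock: entry is held by our own caller")
        try:
            return b"<html>...</html>"
        finally:
            cache_entry_lock.release()

    def onload_handler():
        # Runs inside a necko callback for foo.html; the callback already
        # holds the entry and can't release it until this handler returns.
        sync_load_foo()

    with cache_entry_lock:   # necko delivering foo.html to the page
        onload_handler()     # raises after the timeout: the self-deadlock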

Finally, this probably blocks the ability to do speculative parse-ahead (bug 364315) properly: whenever a speculative load is started, it will block the "actual load" for as long as the speculative load is in progress.
Assignee: nobody → dcamp
Michal, want to take a crack?
Assignee: dcamp → michal
The summary is IMHO misleading, since there is no problem when multiple channels read from the cache. The problem arises only when one channel is writing the data to the cache.

I'm looking into the code and I'm not sure what the right solution is. I was thinking about changing the cache so that data in the cache is readable even before it is complete and validated; the second channel would then read data from the cache as it is written. But I found some problems with this approach.

Any channel reading data from the cache depends on the channel that is writing the data to the cache. If the user cancels loading in the tab with the writing channel, the other channels will stop too. From the user's perspective this is unexpected behaviour.

Also, the amount of data that can be cached is limited: a single item may use at most cachesize/2, and a cache entry is doomed when it reaches this limit (25MB by default). If two channels were downloading a 30MB image, one would succeed but the second would fail after reaching 25MB.
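As a sketch of that limit, with hypothetical names and a hypothetical 50MB capacity (the real checks live in the cache service):

    CACHE_CAPACITY = 50 * 1024 * 1024     # e.g. a 50 MB cache...
    MAX_ENTRY_SIZE = CACHE_CAPACITY // 2  # ...allows at most 25 MB per entry

    class EntryDoomed(Exception):
        pass

    class CacheEntry:
        def __init__(self):
            self.data = bytearray()
            self.doomed = False

        def append(self, chunk):
            if self.doomed or len(self.data) + len(chunk) > MAX_ENTRY_SIZE:
                self.doomed = True   # the entry is doomed at the limit, so a
                raise EntryDoomed()  # second reader of a 30MB image fails at 25MB
            self.data += chunk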
Changed the summary, feel free to change again as needed.

> Any channel reading data from the cache depends on the channel that is
> writing the data to the cache. If the user cancels loading in the tab with
> the writing channel, the other channels will stop too. From the user's
> perspective this is unexpected behaviour.

Indeed. Could we cancel just the nsIChannel object in this case, but transfer ownership of the network connection over to the channel that remains open?

> Also, the amount of data that can be cached is limited: a single item may
> use at most cachesize/2, and a cache entry is doomed when it reaches this
> limit (25MB by default). If two channels were downloading a 30MB image, one
> would succeed but the second would fail after reaching 25MB.

Tricky indeed. What do we currently do if the network serves us 30MB of data, but the reading channel hasn't read more than a few kB?
Summary: Need to teach necko to deal with multiple channels reading from the cache at the same time → Need to teach necko to deal with channels reading and writing to a cache entry simultaneously
The link above in Comment #4 is wrong; what is meant is https://bugzilla.mozilla.org/show_bug.cgi?id=408901
Flags: blocking1.9.1?
Any progress here?
(In reply to comment #3)
> Indeed. Could we cancel just the nsIChannel object in this case, but
> transfer ownership of the network connection over to the channel that
> remains open?

Yes, of course, this can be done. BTW, is this bug about HTTP channels only, or should the solution be generic for any channel?

> Tricky indeed. What do we currently do if the network serves us 30MB of
> data, but the reading channel hasn't read more than a few kB?

There are two cases. If the channel is opened synchronously, then an nsSyncStreamListener is created to handle data as it comes from the server. The consumer can read the data as slowly as it wants, but the cache is filled as fast as the data comes from the network. Data that the consumer hasn't read yet is stored in the pipe nsSyncStreamListener::mPipeIn. The pipe is created without any limit, so this can potentially take a lot of memory.

If the channel is opened asynchronously, the consumer must read all available data in nsIStreamListener::onDataAvailable(). In this case the cache is filled as fast as the consumer handles the data. If the channel is cancelled before all data is read from the server, the cache entry stays in the cache but doesn't contain all of the data.
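To make the contrast concrete, here is a Python sketch of the two delivery models just described. The class names are illustrative; only nsSyncStreamListener::mPipeIn and onDataAvailable() are the real names:

    import queue

    class SyncStyleListener:
        """Sketch of the synchronous case: incoming data is buffered without
        limit (like nsSyncStreamListener::mPipeIn), so the cache fills at
        network speed while a slow consumer can cause unbounded memory use."""
        def __init__(self, cache_entry):
            self.cache_entry = cache_entry
            self.pipe = queue.Queue()       # unbounded, as described above

        def on_network_data(self, chunk):
            self.cache_entry.append(chunk)  # cache fills at network speed
            self.pipe.put(chunk)            # held here until the consumer reads

        def read(self):
            return self.pipe.get()          # the consumer's (possibly slow) reads

    class AsyncStyleListener:
        """Sketch of the asynchronous case: the consumer must drain each chunk
        inside onDataAvailable, so the cache fills only as fast as the
        consumer handles the data."""
        def __init__(self, cache_entry, consumer):
            self.cache_entry = cache_entry
            self.consumer = consumer

        def on_data_available(self, chunk):
            self.consumer(chunk)            # must consume everything now
            self.cache_entry.append(chunk)  # write rate tracks the consumer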
The more I think about the solution, the more complex it seems to me. We can ignore for now the fact that not all HTTP requests are cached and just discuss whether the following approach is reasonable.

nsHttpChannel can be split into two parts, nsHttpChannelCacheWriter and nsHttpChannelCacheReader.
nsHttpChannelCacheWriter (just "CacheWriter" from now on) would be responsible for getting data from the server and putting it into the cache.
nsHttpChannelCacheReader (just "CacheReader" from now on) would read data only from the cache; this would be the class that implements nsIChannel.

If there is a valid cache entry for the request, then CacheReader will read the data from the cache.
If there is no cache entry for the request, then CacheReader will create a new instance of CacheWriter to fill the cache. The cache entry is accessible immediately and is marked as incomplete. CacheReader will continuously pass the data from the cache to the consumer.
If another request is started before CacheWriter finishes, then the new CacheReader will read the data from the cache and register itself with the existing CacheWriter, so that CacheWriter knows about all existing CacheReaders.
CacheWriter is cancelled only when the last registered CacheReader has been cancelled too.

At this time the cache entry is removed immediately after its size reaches the maximum, and there is no notification to the writer about that. This should be changed so that at this point:

- the cache entry isn't destroyed immediately, so all existing CacheReaders can read it up to the maximum, but no newly created CacheReader can see the cache entry
- CacheWriter is notified about this and creates an nsPipe for every CacheReader, so the CacheReaders can continue to receive data after they have handled all the data from the cache

When a new request is started after this point, a new CacheWriter and a new cache entry are created...
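A minimal Python sketch of the registration and cancellation rules described above. CacheWriter and CacheReader are the proposed class names; everything else here (the bytearray entry, the method names) is illustrative:

    class CacheWriter:
        """Sketch of the proposed writer: fetches from the network, fills the
        (immediately readable, incomplete) entry, and tracks every reader."""
        def __init__(self, entry):
            self.entry = entry              # a bytearray standing in for the entry
            self.readers = set()

        def register(self, reader):
            self.readers.add(reader)

        def unregister(self, reader):
            self.readers.discard(reader)
            if not self.readers:            # last registered reader cancelled:
                self.cancel_network_load()  # only now stop the transfer

        def on_network_data(self, chunk):
            self.entry += chunk             # readable even though incomplete
            for reader in list(self.readers):
                reader.on_new_data()        # wake readers waiting for more data

        def cancel_network_load(self):
            pass                            # abort the underlying transfer

    class CacheReader:
        """Sketch of the proposed reader: implements the nsIChannel side,
        reading only from the cache entry."""
        def __init__(self, entry, writer=None):
            self.entry = entry
            self.writer = writer
            self.pos = 0
            if writer is not None:
                writer.register(self)       # writer knows all existing readers

        def on_new_data(self):
            data = bytes(self.entry[self.pos:])
            self.pos = len(self.entry)
            self.deliver(data)              # pass new data on to the consumer

        def deliver(self, data):
            pass

        def cancel(self):
            if self.writer is not None:
                self.writer.unregister(self)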

I'm looking forward to your comments.
This sucks, but we're not going to block on it.
Flags: blocking1.9.1? → blocking1.9.1-
This Python webserver reproduces the problem in a frameset if you navigate to http://localhost:8080/
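For reference, a minimal sketch of such a reproduction server. This is not the attachment itself; the paths and timings are my own:

    # Sketch in the spirit of the attached server: / serves a frameset whose
    # two frames both load /slow, which streams its body slowly so that two
    # loads of the same URL overlap.
    import time
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    FRAMESET = (b'<frameset cols="50%,50%">'
                b'<frame src="/slow"><frame src="/slow"></frameset>')

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            if self.path == "/slow":
                for i in range(30):                   # drip the body out slowly
                    self.wfile.write(b"<p>chunk %d</p>\n" % i)
                    self.wfile.flush()
                    time.sleep(1)
            else:
                self.wfile.write(FRAMESET)

    # ThreadingHTTPServer so the server itself never serializes the two
    # requests; any stall observed is on the browser side.
    ThreadingHTTPServer(("localhost", 8080), Handler).serve_forever()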
And one other note:

This might be an issue with prefetch: imagine a page contains a LINK rel=prefetch to a resource that's a bit slow to load. If a user navigates to that resource while the prefetch is ongoing, they won't get any rendering or parsing at all until the resource is fully retrieved. This means that even without network contention, prefetch can slow down the UI.
Maybe Bjarne can look at this after he finishes with the async write stuff?

(Sup, Gavin!)
Michal, I like your idea. I think it is pretty much the same thing I was thinking, but your phrasing is very cache-oriented whereas I have been thinking about this as something that should happen in the HTTP layer.

Imagine that we had two kinds of channels: mother and daughter. If there is no channel currently open for a URI, then that first channel for the URI becomes the mother; otherwise, if there is already a master channel for a URI, then all new URIs for that URI become daughters for that mother. Only mother channels will ever do any network I/O and only mother channels would read or write to the cache. When a mother receives a chunk of a response from the network, it appends the data to an internal buffer and then notifies all of her daughters. When a daughter is finished processing a chunk, it notifies its mother. When the mother receives all the notifications from all her daughters, for a given chunk, she discards that chunk from her internal buffer, once that internal buffer reaches some threshold (e.g. 1MB).

It will often be the case that a new daughter channel attempts to attach to a mother channel after the mother has discarded some chunks from her buffer. Perhaps it would be OK to simply refuse the daughter's attempt to attach, so that the daughter becomes her own mother channel; in this case the daughter channel would automatically load the resource off the network, bypassing the cache completely (basically, automatically applying the LOAD_BYPASS_LOCAL_CACHE_IF_BUSY semantics).
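In Python, a sketch of that chunk accounting and the attach-refusal fallback. The 1MB threshold is from the description above; the class and method names are illustrative:

    class MotherChannel:
        """Sketch: the only channel doing network I/O for a URI. A buffered
        chunk is discarded only after every daughter has acknowledged it,
        and only once the buffer exceeds the threshold."""
        BUFFER_THRESHOLD = 1 << 20      # 1MB, per the description above

        def __init__(self):
            self.daughters = {}         # daughter -> index of her next unacked chunk
            self.base = 0               # absolute index of the oldest buffered chunk
            self.chunks = []
            self.buffered = 0

        def attach(self, daughter):
            if self.base > 0:
                return False            # chunks already discarded: refuse, so the
                                        # daughter becomes her own mother channel
            self.daughters[daughter] = 0
            return True

        def on_network_chunk(self, chunk):
            self.chunks.append(chunk)
            self.buffered += len(chunk)
            for daughter in self.daughters:
                daughter.on_chunk(chunk)   # notify every daughter

        def ack(self, daughter):
            """Called when a daughter finishes processing her next pending chunk."""
            self.daughters[daughter] += 1
            self._trim()

        def _trim(self):
            # Drop the oldest chunk while every daughter has acknowledged it
            # and the buffer is over the threshold.
            while (self.buffered > self.BUFFER_THRESHOLD and self.chunks
                   and all(nxt > self.base for nxt in self.daughters.values())):
                self.buffered -= len(self.chunks.pop(0))
                self.base += 1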

Obviously, the LOAD_BYPASS_LOCAL_CACHE_IF_BUSY semantics are pretty wasteful. However, it is probably VERY rare for the user to simultaneously start two loads of the same >1MB resource, so IMO it is not worth optimizing for that immediately. But, later, we could implement a scheme where the cache really does support simultaneous readers and writers, so that a daughter channel could catch up to her mother by reading from the cache before responding to any of the mother's notifications of new chunks, provided there is a cache entry available to actually catch up from (as you noted, there might not be, if the resource isn't cacheable, whether due to cache-control policy or due to its size); otherwise, she would continue on with the LOAD_BYPASS_LOCAL_CACHE_IF_BUSY semantics.

AFAICT, it seems like a good idea to implement this mother/daughter channel scheme soon, then do our cache disk format changes, and then implement the simultaneous cache entry reader/writer mechanism once we know what the new disk format is going to look like.

For FTP channels, or any other non-HTTP, non-WYCIWYG channels, we should just punt and always implement automatic LOAD_BYPASS_LOCAL_CACHE_IF_BUSY semantics. I don't know yet what is sensible for WYCIWYG channels, but, AFAICT, only LOAD_BYPASS_LOCAL_CACHE_IF_BUSY semantics make sense for it.

Thoughts?
(In reply to Brian Smith (:bsmith) from comment #15)
> otherwise, if there is already a master channel for a
> URI, then all new URIs for that URI become daughters for that mother. 

...then all new CHANNELS for that URI become daughters for that mother.

> I don't know yet what is sensible for WYCIWYG channels, but,
> AFAICT, only LOAD_BYPASS_LOCAL_CACHE_IF_BUSY semantics make
> sense for it.

...for THEM.
Blocks: http_cache
Blocks: 814285
Keywords: perf
This is actually fixed in the new cache. However, it's not yet possible to read/write concurrently in all cases. There is some room to cover more cases, but I think not all of them are possible.
Whiteboard: [necko-would-take]
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Duplicating, since what we have is currently good enough: the cache is capable. Comment 17 talks about the HTTP channel (a cache consumer) not supporting concurrency in all cases. I don't think we need to improve on that immediately.