Open Bug 578541 Opened 10 years ago Updated 3 years ago

Add priority levels to cache

Categories

(Core :: Networking: Cache, defect, P3)

People

(Reporter: jduell.mcbugs, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-backlog])

For various purposes we are going to want to flag cache entries for certain pages (ex: app tabs) or content-types (JS and CSS) as having higher priority in the HTTP cache (i.e. evict lower priority elements before higher ones).

We currently appear to have no mechanism for having a cache "priority" (it's all LRU).  We need to add one.  But there are some subtleties here.  In particular, we don't want to allow high priority content to crowd out regular content completely, nor do we want a particular domain to fill up all app tab content, etc.

After some discussion on IRC with biesi and byronm, here's what we've cooked up so far:

1) Keep at least three different eviction queues: regular, JS/CSS, and app tab content (in increasing order of priority).

2) Put a % limit on how much space each higher-priority cache can use (to guarantee that LRU works for regular content, with at least some reasonable working set size).  If the size would be exceeded, demote oldest entries to the regular cache.

3) For the app tab queue only, keep a per-domain tally of how much space each site is using in the high-priority cache, and if a site hits the limit, demote its oldest (i.e. least-recently used) entry into the regular cache.

4) Periodically sweep the higher priority caches for stale entries (not accessed for longer than N days, perhaps tempered by how much the browser has been used, so I don't lose my items because I go on vacation), demoting them to the regular cache, or dropping them if they're older than the oldest regular cache entries.
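
To make steps 1-3 concrete, here is a rough Python sketch of the bookkeeping. It is a toy model under stated assumptions, not the Necko cache code: the class name `TieredCache`, the queue constants, the byte-based sizes, and the choice to place demoted entries at the newest end of the regular queue are all made up for illustration. The step 4 sweep is left out.

```python
from collections import OrderedDict

# Hypothetical queue ids, in increasing order of priority.
REGULAR, JS_CSS, APP_TAB = 0, 1, 2


class TieredCache:
    """Toy model: one LRU queue per priority level, sizes in bytes."""

    def __init__(self, capacity, limits, domain_limit):
        self.capacity = capacity          # total bytes across all queues
        self.limits = limits              # {queue: max fraction of capacity}
        self.domain_limit = domain_limit  # per-domain bytes in the APP_TAB queue
        # Each queue maps key -> (domain, size), least recently used first.
        self.queues = {q: OrderedDict() for q in (REGULAR, JS_CSS, APP_TAB)}

    def queue_size(self, q):
        return sum(size for _, size in self.queues[q].values())

    def total_size(self):
        return sum(self.queue_size(q) for q in self.queues)

    def domain_size(self, domain):
        return sum(s for d, s in self.queues[APP_TAB].values() if d == domain)

    def _demote(self, q, key):
        # Assumption: a demoted entry becomes the newest regular entry.
        self.queues[REGULAR][key] = self.queues[q].pop(key)

    def get(self, key):
        for queue in self.queues.values():
            if key in queue:
                queue.move_to_end(key)    # mark as most recently used
                return queue[key]
        return None

    def put(self, key, domain, size, q=REGULAR):
        for queue in self.queues.values():
            queue.pop(key, None)          # drop any existing copy
        self.queues[q][key] = (domain, size)
        if q == APP_TAB:
            # Step 3: per-domain cap, demote that domain's LRU entries.
            while self.domain_size(domain) > self.domain_limit:
                oldest = next(k for k, (d, _) in self.queues[q].items()
                              if d == domain)
                self._demote(q, oldest)
        if q != REGULAR:
            # Step 2: cap on the share of the cache this queue may use.
            while self.queue_size(q) > self.limits[q] * self.capacity:
                self._demote(q, next(iter(self.queues[q])))
        # Plain LRU eviction from the regular queue when over capacity.
        while self.total_size() > self.capacity and self.queues[REGULAR]:
            self.queues[REGULAR].popitem(last=False)


cache = TieredCache(capacity=100, limits={JS_CSS: 0.2, APP_TAB: 0.3},
                    domain_limit=20)
cache.put("a.js", "x.example", 15, JS_CSS)
cache.put("b.js", "x.example", 10, JS_CSS)  # JS/CSS queue over 20%: a.js demoted
```

Recomputing sizes on every insert is O(n) and only acceptable in a sketch; a real implementation would keep running totals per queue and per domain.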

This can and quite possibly will get more complex.  We might want to have separate queues for app tabs (ex: keep JS/CSS for them more aggressively than images or videos).  We would probably want to allow sites to mark content as "Cache-priority: low" (so an app-tabbed google maps can put image tiles in the regular cache, but keep more reused items in the app tab queue).  It's tempting to offer a "Cache-priority: high" HTTP header (which could get lumped into the JS/CSS queue, or its own queue), but to ensure fairness we'd have to start doing per-domain limit checks there, too.  Should be doable, but the exact policies and limits will obviously need to be carefully examined.  Having priorities and the ability to set domain limits on a per-queue basis seems like a good start.
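
As an illustration of how an entry might be assigned to a queue, here is a small Python sketch. The "Cache-priority" header is only the proposal from this comment, not an existing HTTP header, and the function name `choose_queue`, the queue names, and the MIME type set are hypothetical. It assumes the header names in `headers` have already been lowercased, and it lumps "high" into the JS/CSS queue, which is just one of the options mentioned above.

```python
REGULAR, JS_CSS, APP_TAB = "regular", "js_css", "app_tab"

# Assumption: the content types that count as "JS and CSS".
SCRIPT_STYLE_TYPES = {"text/css", "text/javascript", "application/javascript"}


def choose_queue(content_type, in_app_tab, headers):
    """Pick an eviction queue for a response (header names lowercased)."""
    hint = headers.get("cache-priority", "").strip().lower()
    if hint == "low":
        return REGULAR   # site opted out, e.g. map image tiles
    if in_app_tab:
        return APP_TAB
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in SCRIPT_STYLE_TYPES or hint == "high":
        return JS_CSS    # "high" lumped in with JS/CSS here
    return REGULAR


print(choose_queue("image/png", True, {"cache-priority": "low"}))
print(choose_queue("text/css; charset=utf-8", False, {}))
```

Per-domain limit checks for "high" entries would still be needed for fairness; they are not shown here.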
Blocks: 578544
We might want to also bump top-level document loads into a higher than default priority.  (Keeping index.html around means we can skip a round-trip and start loading images, JS, etc, right away).

I'm not sure how large files (see bug 81640) fit into this.  In some sense we could either keep them as a lowest priority, or keep a separate "large files" area of the cache.  (Note that if we set them as lowest priority, we'd have to put a % of cache size limit on them, as the whole point of separating out large files would be to prevent them from evicting too many small files from the cache).
I'm in NYC with Tom Hughes-Croucher of Yahoo! who is gonna comment shortly. Just a heads-up and cc'ing myself.

/be
Hi Jason,

I've been chatting with Brendan on a similar topic. So I wanted to offer some thoughts:

As a site we'd really like to retain certain parts of the site in a priority cache, for example CSS and JS, but also certain "site chrome" images such as logos and sprites. Since certain sites like Facebook, Flickr, etc. can easily provide enough content to cycle a cache, it would be good to have a priority cache for these things. Having that segmented by domain would be ideal so our priority cache is not at risk from others.

I think those requirements fit closely with your suggestion here. I'd love to see an x-cache-priority header or some other mechanism for indicating cache preferences. Headers might be somewhat difficult for developers to deploy because of the amount of integration they require with application platforms.

It seems like the biggest gains would be from letting sites self-declare what needs to be cached, but the most immediate gains would be from the browser figuring it out.

I'd love to help turn this into a spec that could be implemented. I think this could have some really serious performance gains.
Tom,

> I'd love to see an x-cache-priority header... segmented by domain...

Agreed.

Meanwhile, we've made a lot of nuts and bolts improvements in our HTTP cache for Firefox 4 (cache will be 10x bigger for most users; a 5 MB max cache object size has fixed some groaner bugs where YouTube and other vids caused massive cache eviction; some other fixes: see http://tinyurl.com/2cefu2m for details), so I'm hoping you'll find that even our current, plain old LRU algorithm captures the working set of most users w/o commonly reused items getting evicted.
Another thing we might want to prioritize cache entries on is "how hard they were to get", i.e. prefer keeping items from slow or high-latency connections.  Of course that would mean extra bookkeeping and an algorithm that could sensibly approximate "network quality" when real-life conditions may often be in flux.
(In reply to comment #5)
>
> sensibly approximate "network quality" when real-life conditions may often be
> in flux.

I'm a big proponent of this strategy. While conditions can change, all else being equal we would rather flush things out of the cache that can be refilled from the LAN rather than the WAN, for example.
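
One possible way to express "all else being equal, flush what is cheap to refetch" is to weigh an entry's age against how long it took to fetch. The Python sketch below is only an assumption about how such a score could look: the `Entry` fields, the `pick_victim` name, the age-divided-by-cost formula, and the `min_cost` floor are all made up for illustration, and it uses the recorded fetch time as a crude stand-in for "network quality".

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    last_access: float    # timestamp of last use, in seconds
    fetch_seconds: float  # how long the original fetch took


def pick_victim(entries, now, min_cost=0.01):
    """Return the entry to evict: the oldest relative to its refetch cost."""
    def score(e):
        # A higher score means a better eviction candidate.
        return (now - e.last_access) / max(e.fetch_seconds, min_cost)
    return max(entries, key=score)


lan = Entry("lan.css", last_access=100.0, fetch_seconds=0.05)
wan = Entry("wan.css", last_access=50.0, fetch_seconds=2.0)
# The LAN entry goes first even though it was used more recently.
print(pick_victim([lan, wan], now=200.0).key)
```

When all entries cost the same to fetch, the score reduces to plain LRU order.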
Whiteboard: [necko-backlog]
Duplicate of this bug: 578544
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3