Closed Bug 1154974 Opened 5 years ago Closed 5 years ago

Merge image cache entries for blob URIs with the same underlying blob

Categories

(Core :: ImageLib, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: seth, Assigned: seth)

References

Details

Attachments

(2 files, 5 obsolete files)

It's an unfortunately common pattern on B2G that developers are holding references to blobs and calling createObjectUrl() and revokeObjectUrl() over and over depending on whether or not the blobs are currently needed.

This causes extremely bad behavior in some cases - for example, it defeats all image caching when flicking between images in the B2G Gallery app, because as soon as you've moved two images away from an image, its MediaFrame object is destroyed and revokeObjectUrl() is called on the blob URI that was used to draw. When you flick back to the image, a new MediaFrame object is created which calls createObjectUrl() again, producing a new URI. This misses in the ImageLib cache, forcing us to load the image's data again and redecode it again.

We could fix this in the Gaia code, certainly, but I actually think this is a trap that's easy to fall into for any developer. There's no reason this shouldn't "just work", so let's make it work.

In this bug, I'll make it so that ImageLib cache entries for blob URIs are keyed based upon the underlying blob's identity, rather than the URI. This should be totally safe, particularly since blobs are immutable.
Depends on: 1155429
Here's the blob portion of the patch. This adds a global serial number for all
FileImpl objects and records the serial number in each FileImplBase instances.

I also threw in a utility function necessary for part 2 that lets us get the
blob underlying a blob URI based on the spec, even if we don't have an nsIURI.
Attachment #8593712 - Flags: review?(bent.mozilla)
Here's the patch that actually implements the change I talked about in comment
0. When it represents a blob URI, ImageCacheKey now uses the underlying blob's
serial number for its identity, instead of the URI spec.

This should fix the performance issues I've seen in the B2G Gallery app.
Attachment #8593713 - Flags: review?(amarchesini)
Comment on attachment 8593712 [details] [diff] [review]
(Part 1) - Give blobs serial numbers

Review of attachment 8593712 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/File.cpp
@@ +735,5 @@
>  
> +/* static */ uint32_t
> +FileImpl::NextSerialNumber()
> +{
> +  static Atomic<uint32_t> nextSerialNumber;

Hrm, I really think this should be a 64-bit counter to be safe. I know we don't have 64-bit atomic increment on x86 but I think we should still try to avoid 32-bit numbers here.

Let's just use a StaticMutex and increment a 64-bit value here on 32-bit platforms (keep using atomics on 64-bit platforms I suppose). If the mutex ever shows up as a perf problem (which I would really doubt) then we can try to get more clever with a spin lock or something.
Attachment #8593712 - Flags: review?(bent.mozilla) → review-
(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #4)
> Comment on attachment 8593712 [details] [diff] [review]
> (Part 1) - Give blobs serial numbers
> 
> Review of attachment 8593712 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/base/File.cpp
> @@ +735,5 @@
> >  
> > +/* static */ uint32_t
> > +FileImpl::NextSerialNumber()
> > +{
> > +  static Atomic<uint32_t> nextSerialNumber;
> 
> Hrm, I really think this should be a 64-bit counter to be safe. I know we
> don't have 64-bit atomic increment on x86 but I think we should still try to
> avoid 32-bit numbers here.

Yeah, I'd vastly prefer a 64-bit counter as well. Maybe the best bet here is actually to implement Atomic<uint64_t> with a lock internally on platforms where we don't have a 64-bit fetch-and-add. That'd be better than reimplementing a mutex-based fallback every time we need this functionality.

I'll ask Waldo if he'd be OK with that.
Depends on: 1155864
OK, looks like we should shortly have Atomic support for 64-bit types everywhere. I'll update the patches.
Comment on attachment 8593713 [details] [diff] [review]
(Part 2) - Merge image cache entries for blobs URIs with the same underlying blob

Review of attachment 8593713 [details] [diff] [review]:
-----------------------------------------------------------------

As usual, questions:

Why do you need the blob serial? It seems to me that this imgCache is per process.
From a blob URI you can have the FileImpl. Can you its pointer has blob serial ID?
Attachment #8593713 - Flags: review?(amarchesini)
OK, we can now use Atomic<uint64_t> everywhere, so here's an updated version of
part 1 that uses 64-bit serial numbers.
Attachment #8598394 - Flags: review?(bent.mozilla)
Attachment #8593712 - Attachment is obsolete: true
Here's a rebased version of part 2.
Attachment #8598396 - Flags: review?(amarchesini)
Attachment #8593713 - Attachment is obsolete: true
(In reply to Andrea Marchesini (:baku) from comment #7)
> As usual, questions:

Of course. =)

> Why do you need the blob serial? It seems to me that this imgCache is per
> process.
> From a blob URI you can have the FileImpl. Can you its pointer has blob
> serial ID?

My original design *did* use a FileImpl pointer as the key, but unfortunately there are problems. If we try to do that, we have two options:

(1) We store a FileImpl* in the ImageCacheKey. Then there's nothing preventing that FileImpl from being freed and another being allocated at the same address, which would resulting in a false cache hit. We could try purging the FileImpl* from all image caches when it's freed, but that is likely to be nasty because image caches are main-thread-only and FileImpl's aren't, and it would introduce coupling between the blob code and ImageLib that I think everyone involved would really like to avoid.

(2) We could store a nsRefPtr<FileImpl> in the ImageCacheKey. Since image cache entries hang around for a potentially unlimited amount of time, that means that we may extend the lifetime of blob resources for a very long time. We don't want to do that. It's not useful at all to make blobs last beyond the time when revokeObjectUrl() is called, since JS can't access them anymore at that point.

It might seem like there's a third option:

(3) We could store a WeakPtr<FileImpl> in the ImageCacheKey. Unfortunately, WeakPtr's are not threadsafe, and FileImpl's need to be threadsafe, so we can't use weak pointers to solve this problem.

Given that all of the alternatives have serious problems, adding a per-FileImpl serial number seemed like the least bad solution. It gives us a way to uniquely identify FileImpl instances where collisions are very unlikely, it doesn't affect the lifetime of the FileImpl instances at all, and it doesn't introduce any undesirable coupling between unrelated pieces of code.

So right now this seems like the best way forward.
Tell me if I'm wrong:

If we are using FileImpl in imgCache is because there is a blob URI somewhere. And imgCache is main-thread only. If these 2 sentences are correct (and not from an english point of point), can we change nsHostObjectProtocolHandler::RemoveDataEntry() and purge the imgCache at that point?

Note: also nsHostObjectProtocolHandler is main-thread only. Recently I have worked on bug 1156875, centralizing the blob/media object URIs into nsIGlobalObject, so, if you like this approach maybe you want to change the code on top of my patches.
Flags: needinfo?(seth)
(In reply to Andrea Marchesini (:baku) from comment #11)
> Tell me if I'm wrong:
> 
> If we are using FileImpl in imgCache is because there is a blob URI
> somewhere. And imgCache is main-thread only. If these 2 sentences are
> correct (and not from an english point of point), can we change
> nsHostObjectProtocolHandler::RemoveDataEntry() and purge the imgCache at
> that point?

Unless I'm missing something, that won't work. =\ It's my mistake - I made it sound like this would work, I think, because I made a misstatement in comment 10. Here's where I went wrong:

> It's not useful at all to make blobs last beyond the time when revokeObjectUrl() is called, since JS can't access them anymore at that point.

That's not true; actually, it's critical that we support that use-case! What I meant to say was that it's not useful at all to make blobs last beyond the time when revokeObjectUrl() is called *and the last reference to the blob has been dropped*. That's a really important difference.

Let me back up, and explain the B2G Gallery App's usage model. (Actually, it's the Gaia MediaFrame library, which is used in many places, but the Gallery serves as a good example.) My motivation for this patch is to support this kind of usage better, so I think it's important to explain.

The B2G Gallery App stores most images via the storage service, and accesses them as blobs. It *doesn't* keep blob URIs alive for long periods; instead, it keeps the underlying blobs alive, and calls createObjectUrl() when the an image is either visible onscreen or is about to become visible. It calls revokeObjectUrl() as soon as the user flicks past the image.

With the existing image cache, that totally defeats our ability to cache decoded versions of the images, because each time the user returns to the image, createObjectUrl() has been called again and ImageLib sees a different URI.

So what we need to do is merge image cache entries for the same FileImpl, even if revokeObjectUrl() has been called and there are no longer any outstanding blob URIs for the FileImpl. We really care about the FileImpl's lifetime here, not the blob URI lifetime. That's critical to the goals of this patch.

So any solution cannot be implemented in terms of nsHostObjectProtocolHandler::RemoveDataEntry(); we need a FileImpl-level solution.
Flags: needinfo?(seth)
Comment on attachment 8598396 [details] [diff] [review]
(Part 2) - Merge image cache entries for blobs URIs with the same underlying blob

Review of attachment 8598396 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/src/imgLoader.cpp
@@ +1106,5 @@
> +template <typename T> bool
> +URISchemeIs(T* aURI, const char* aScheme)
> +{
> +  bool schemeMatches = false;
> +  return NS_SUCCEEDED(aURI->SchemeIs(aScheme, &schemeMatches)) && schemeMatches;

Just because SchemeIs should not fail, would be nice to have this:

return !NS_WARN_IF(NS_FAILED(aURI->SchemeIs(aScheme, &schemeMatches))) && schemeMatches;

but it's hard to read :)

@@ +1115,5 @@
> +{
> +  nsRefPtr<FileImpl> blob;
> +  if (NS_SUCCEEDED(NS_GetBlobForBlobURISpec(aSpec, getter_AddRefs(blob))) &&
> +      blob) {
> +    return Some(blob->GetSerialNumber());

What about if GetSerialNumber returns a value > 0. Then you get rid of Maybe and return 0 instead Some/Nothing.
Attachment #8598396 - Flags: review?(amarchesini) → review+
Thanks for the review!

(In reply to Andrea Marchesini (:baku) from comment #13)
> Just because SchemeIs should not fail, would be nice to have this:
> 
> return !NS_WARN_IF(NS_FAILED(aURI->SchemeIs(aScheme, &schemeMatches))) &&
> schemeMatches;
> 
> but it's hard to read :)

Haha yeah. =) I agree that'd be better, though. We should be able to the same effect if we break it up into multiple statements, and it wouldn't be so hard to read then. I'll make that change.

> What about if GetSerialNumber returns a value > 0. Then you get rid of Maybe
> and return 0 instead Some/Nothing.

I think this is partly a matter of taste. =) One reason I like Maybe is that it gives you a lot of built-in assertions. It's hard to misuse it without triggering an assert or a compilation failure.
Comment on attachment 8598394 [details] [diff] [review]
(Part 1) - Give blobs serial numbers

Review of attachment 8598394 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/File.h
@@ +307,5 @@
> +   * Returns a new, effectively-unique serial number. This should be used
> +   * by implementations to obtain a serial number for GetSerialNumber().
> +   * The implementation is thread safe.
> +   */
> +  static uint64_t NextSerialNumber();

My only quibble would be that this maybe this should live on FileImplBase since it seems like FileImpl is really trying to just be an interface. But I don't care strongly either way.
Attachment #8598394 - Flags: review?(bent.mozilla) → review+
Thanks for the review, Ben!

I've moved NextSerialNumber() onto FileImplBase in this version of the patch.
And in this version of part 2, I've added the NS_WARN_IF check for SchemeIs().
Attachment #8598394 - Attachment is obsolete: true
Attachment #8598396 - Attachment is obsolete: true
So the try job revealed a new problem that was previously masked by other bugs. =\

Basically we are currently recomputing the ImageCacheKey for an imgRequest from its URI every time we need it. The problem is that, for blob URIs, we end up with a different ImageCacheKey depending on whether revokeObjectUrl() has been called for a blob URI or not - if the blob URI is still around, we based the ImageCacheKey on the FileImpl's serial number, but if it's not, we just use the URI. This results in issues where we try to delete an imgRequest from the cache, but fail to do so, because the new ImageCacheKey we create to try to remove it from the cache doesn't match the ImageCacheKey it's stored in the cache with.

The fix is basically straightforward: we need imgRequests to store their ImageCacheKeys, so we can always use the same ImageCacheKey with the same imgRequest.

I just pushed a try job to make sure this fixes the problem. I want to restructure the patch a bit before uploading it for review, though.
Depends on: 1160703
Attachment #8600196 - Attachment is obsolete: true
^ So now that bug 1160703 has been pushed, hopefully this one is now ready to land too. It needed some significant rebasing, though. Let's see what try turns up...
https://hg.mozilla.org/mozilla-central/rev/4ee8035b043d
https://hg.mozilla.org/mozilla-central/rev/fc204743547e
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Depends on: 1169847
You need to log in before you can comment on or make changes to this bug.