Closed Bug 1063369 Opened 7 years ago Closed 7 years ago

CSS ImageLoader cloned imgRequestProxy will point to old imgRequest if it fails to validate

Categories

(Core :: ImageLib, defect)

x86
macOS
defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: tnikkel, Assigned: tnikkel)

References

Details

Attachments

(1 file)

When we load an image and there is an entry in the image cache for its URI, but we have to validate that entry, we hand back a proxy to the old request that's in the cache while the network request to validate it is still pending. If the CSS ImageLoader was the caller, it then clones the request, stores the original in a hash table as the "canonical" request with a nullptr key (which is only used for print/print preview documents), and stores the clone in the hash table using the document's pointer as the key (the clone is the one that gets used in all other cases). Now suppose the validation fails and we need to re-fetch the image. When the validator's network request determines this, we change the owner of the original proxy that we handed back so that it points at the new network request. However, the cloned proxy (the one we use in most situations) stays with the old owner (the old request).
To fix this, perhaps we should record in a proxy whether it is subject to a validator and could have its owner changed. Then, when cloning such a proxy, we would ensure that the imgCacheValidator gets a pointer to the new clone and adds it to the imgCacheValidator's list of proxies whose owner it needs to (potentially) change.
So the plan is to eliminate the current imgCacheValidator / ChangeOwner dance entirely and switch to always performing a normal load from Necko. This whole mess is an artifact of ImageLib even having a separate cache, and it's not clear that it should have one. The ImageLib cache currently buys us two things:

1. Merging the ImageLib-level objects associated with multiple requests for the same resource.
2. Storing the ImageLib-level objects associated with a request for a period of time even after other references to those objects have been dropped, so we can quickly recreate them.

So we need an approach to get those benefits from the Necko cache directly, and indeed I have such an approach. I'm not having luck finding the bug right now, but I'll either find it or file a new one later today and make it block this bug.

(Note that if you want to solve this particular issue in another way beforehand, that's fine. I don't anticipate beginning work on this until Q2. But we need this for lots of reasons.)
As long as we preserve the web-compat-required (and spec-required) invariant that once an <img src="foo"> has been loaded in a document loading another <img src="foo"> with the same "foo" will synchronously have the image available...  Note that "synchronously" is key there.
> loading another <img src="foo">

In the same document, of course.
(In reply to Boris Zbarsky [:bz] from comment #3)
> As long as we preserve the web-compat-required (and spec-required) invariant
> that once an <img src="foo"> has been loaded in a document loading another
> <img src="foo"> with the same "foo" will synchronously have the image
> available...  Note that "synchronously" is key there.

Absolutely. It's probably not clear but point #1 in comment 2 includes that case.
(In reply to Seth Fowler [:seth] - PTO until 2/19 from comment #5)
> (In reply to Boris Zbarsky [:bz] from comment #3)
> > As long as we preserve the web-compat-required (and spec-required) invariant
> > that once an <img src="foo"> has been loaded in a document loading another
> > <img src="foo"> with the same "foo" will synchronously have the image
> > available...  Note that "synchronously" is key there.

Hmm.. actually considering this more, is this even possible in general? What if "foo"'s HTTP headers specify Cache-Control: No-Cache? I need to think about this more...
> Hmm.. actually considering this more, is this even possible in general? 

We implement it right now, via the image cache.

> What if "foo"'s HTTP headers specify Cache-Control: No-Cache? 

Doesn't matter.  The per-document in-memory object cache for images explicitly doesn't obey HTTP caching semantics. 

Now one option is to hoist the per-document cache up to documents or something, of course, and rely on something built on top of the HTTP cache for the "navigate to another page on the same site, would be nice if we get the images quickly in the common case that they have not changed" case.
So I found the place where this is discussed in the spec (though unfortunately I can't link it very precisely):

https://html.spec.whatwg.org/multipage/embedded-content.html#the-img-element

The spec says "The list of available images is intended to enable synchronous switching when changing the src attribute to a URL that has previously been loaded, and to avoid re-downloading images in the same document even when they don't allow caching per HTTP." 

It also says "User agents may also remove images from such lists at any time (e.g. to save memory)." This sounds to me like a conforming implementation may keep no list of available images at all. So I don't think this language constrains us much, but it sounds like web compatibility does.

What's not clear at all to me is how the existing code ensures this invariant. Where does this "list of available images" actually exist? It doesn't seem like the image cache behaves in the way this spec requires (it doesn't capture the per-document nature of the list of available images, and it doesn't seem to have a concept of the "ignore higher-layer caching" flag). So do we implement this behavior *now*?
I didn't see comment 7 before posting comment 8, sorry.

(In reply to Boris Zbarsky [:bz] from comment #7)
> We implement it right now, via the image cache.

I guess I don't understand how it works...

> Now one option is to hoist the per-document cache up to documents or
> something, of course, and rely on something built on top of the HTTP cache
> for the "navigate to another page on the same site, would be nice if we get
> the images quickly in the common case that they have not changed" case.

That is an option, definitely. I haven't thought about it much, but one thing that might be nice is that it might make it easier to do memory reporting of images per-document, which is much-requested.
> So I don't think this language constrains us much, but it sounds like web
> compatibility does.

Agreed.  Obviously on low memory we can and should throw away stuff including preloaded images.

> it doesn't capture the per-document nature of the list of available images, and it
> doesn't seem to have a concept of the "ignore higher-layer caching" flag

The per-document bit is captured via the "aCX" argument to imgLoader::LoadImage, which is the document pointer.  We end up in imgLoader::ValidateEntry which does, with a bunch of irrelevant stuff trimmed:

1627   bool validateRequest = false;
1635   void *key = (void *)aCX;
1636   if (request->mLoadId != key) {
...
1643     validateRequest = ShouldRevalidateEntry(aEntry, aLoadFlags, hasExpired);
...
1648   }
...
1678   if (validateRequest && aCanMakeNewChannel) {
...
1681     return ValidateRequestWithNewChannel(request, aURI, aInitialDocumentURI,
...
1687   }
1689   return !validateRequest;

so as long as aCx == request->mLoadId we never end up in ValidateRequestWithNewChannel and just use the entry as-is without ever hitting the HTTP layer in any way.

This doesn't quite match the spec, since if multiple documents all try to use the same image we end up thrashing the image cache for it (we have bugs on this).  But within a single document this does give us sync access to the image with no attention paid to HTTP semantics.
(In reply to Boris Zbarsky [:bz] from comment #10)
> so as long as aCx == request->mLoadId we never end up in
> ValidateRequestWithNewChannel and just use the entry as-is without ever
> hitting the HTTP layer in any way.

Thanks; that explains it!
OK, so based upon this discussion I think that before implementing the plan I mentioned in comment 2, we should actually implement a per-document "list of available images" as discussed in the spec. This seems to have a number of advantages:

- Gives us the synchronous behavior bz mentions in comment 3, even once we make the cache changes I'd like to make.
- Avoids the cache churn issue when multiple documents try to use the same image, as mentioned in comment 10.
- Makes per-document image memory reporting much easier.

This change will work (and add value) even without the cache refactoring.

I'll file some bugs later today.
Attached patch patch (Splinter Review)
I had spent enough time debugging and dealing with this bug, and the patch was relatively simple, so I just implemented comment 1.
Assignee: nobody → tnikkel
Attachment #8586527 - Flags: review?(seth)
Comment on attachment 8586527 [details] [diff] [review]
patch

Review of attachment 8586527 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, but I have a question before r+'ing this version of the patch (see below).

::: image/src/imgLoader.cpp
@@ +2608,5 @@
>    }
> +
> +  uint32_t count = mProxies.Count();
> +  for (int32_t i = count-1; i>=0; i--) {
> +    imgRequestProxy *proxy = static_cast<imgRequestProxy *>(mProxies[i]);

Nit: It's current mozilla style for the '*' to hug the type, as in 'imgRequestProxy*'. (You're probably just trying to conform to local style here, but a patch is about to land that will update this entire file to the current Gecko style.)

::: image/src/imgRequestProxy.cpp
@@ +264,5 @@
>  }
>  
> +void imgRequestProxy::SetValidator(imgCacheValidator* aValidator)
> +{
> +  mValidator = aValidator;

I may be missing something. Is this going to be a different value from |GetOwner()->GetValidator()|?
(In reply to Seth Fowler [:seth] from comment #15)
> ::: image/src/imgRequestProxy.cpp
> @@ +264,5 @@
> >  }
> >  
> > +void imgRequestProxy::SetValidator(imgCacheValidator* aValidator)
> > +{
> > +  mValidator = aValidator;
> 
> I may be missing something. Is this going to be a different value from
> |GetOwner()->GetValidator()|?

You are correct. When writing the patch I confused the various request members on imgCacheValidator and mistakenly thought that only the new request got a pointer to the validator. I also saw that the patch on bug 432391 uses this.

However, this patch does have one significant difference from bz's patch in bug 432391. The bug 432391 patch calls SetNotificationsDeferred and adds the proxy to the img cache validator _after_ calling SyncNotifyListener. This triggers orange on try server. The attached patch here, which does the AddProxy before SyncNotifyListener but doesn't call SetNotificationsDeferred, is green on try. Calling AddProxy and SetNotificationsDeferred before SyncNotifyListener is also green. So this is very close to some sensitive areas. It's good that we can land it at the start of a release cycle now and see what regressions shake out.
(In reply to Timothy Nikkel (:tn) from comment #16)
> The bug 432391 patch calls SetNotificationsDeferred and add's
> the proxy to the img cache validator _after_ calling SyncNotifyListener.
> This triggers orange on try server.

Yeah, that makes sense. Presumably we don't want to send out notifications while we're still validating.
Comment on attachment 8586527 [details] [diff] [review]
patch

Sounds like after a switch to GetOwner()->GetValidator() this patch will be good to go. r+!
Attachment #8586527 - Flags: review?(seth) → review+
Pushed. I had a green try run from this before on at least linux64, b2g, and android.

https://hg.mozilla.org/integration/mozilla-inbound/rev/43af95e75ece
Sorry, had to back this out; it seems this was causing perma-failures like https://treeherder.mozilla.org/logviewer.html#?job_id=8689018&repo=mozilla-inbound
Flags: needinfo?(tnikkel)
Depends on: 1155940
Bug 1155940 has a patch for that failure. It seems the test is just flaky and this tickled it.
Flags: needinfo?(tnikkel)
https://hg.mozilla.org/mozilla-central/rev/2e6b34e4afa8
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40