Intermittent image/test/mochitest/test_bug1217571.html | containers for identical images in different iframes should be identical

RESOLVED FIXED in Firefox 59

Status

()

defect
P3
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: intermittent-bug-filer, Assigned: aosmond)

Tracking

({intermittent-failure})

unspecified
mozilla59
Points:
---

Firefox Tracking Flags

(firefox59 fixed)

Details

(Whiteboard: [gfx-noted][stockwell fixed:product])

Attachments

(1 attachment)

Your recent landings made this worse maybe?
Flags: needinfo?(aosmond)
Hmm I landed bug 1366097 on the 17th and bug 1183378 early on the 23rd, but these failures seem to have ramped up on the 22nd before the latter landed. Nothing I landed should have changed the matching of the imgRequest/imgRequestProxy/imgIContainer themselves. Scanning the commit log doesn't point out any obvious candidates.
It have been unable to reproduce locally and a try [1] with MOZ_LOG for imgRequest does not appear to hit the issue. Bug 1406253 did land around the time the frequency increased, but I can't see any relation there either.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=bfc8cfdb2ab2106f65ef0f1fd3ba06b1e9c77403&selectedJob=147425707
Failed to reproduce with another try where I just slightly modified the test to dump the request and image values if the test fails. I may need to actually land it and wait for it to happen organically.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=63811d4c3638bf44679b663e267a0a2028457925
All the failures seem to be in opt builds, and your try pushes are debug only, so maybe try adding logging to an opt build?
(In reply to Timothy Nikkel (:tnikkel) from comment #6)
> All the failures seem to be in opt builds, and your try pushes are debug
> only, so maybe try adding logging to an opt build?

I haven't seen any failures on optimized builds.
Assignee: nobody → aosmond
Flags: needinfo?(aosmond)
From the reproductions with the MOZ_LOG logging, I had an idea of what is going wrong.

We get a few LoadImage calls (notably two for the iframes, plus another for some other reason). The first LoadImage call seems to be some sort of preloading request perhaps, as it cancels its own proxy shortly after. They all bind to the same imgRequest as expected, waiting for the cache to resolve. We get some data in, and it decodes the metadata, and signals imgRequestProxy::OnLoadComplete to the object from the *second* LoadImage call. Presumably that is one of our iframes that is somehow not being blocked in the cache validator. Then the cache validator returns and decides we need a new cache -- it reassigns the proxy for the *third* LoadImage call to the new request, decodes the metadata and also signals imgRequestProxy::OnLoadComplete. Presumably that is the other iframe.

We should not have issued the load complete for the second imgRequestProxy. Why that happened is what I am investigating now.
Actually it is a bit more obvious than I thought. The first LoadImage was a cache miss, and started loading. The second LoadImage attached to the request created in the first. The third LoadImage forces us to validate the cache (even though we haven't loaded anything yet) and adds its own requests to the cache validator -- but the first and second LoadImage request proxies are missing from that cache validator! Since they don't know about the validator, they also aren't deferring notifications, and don't get their request reassigned when the request itself changes.
I believe the problem is that the original cache miss has yet to return, and so the imgCacheEntry::GetExpiryTime() is still zero. The second LoadImage request has the same load ID as the first and so avoids the problem. The third LoadImage request has a different load ID, and so triggers the revalidation -- even though we haven't finished the initial miss yet.
Status: NEW → ASSIGNED
Priority: -- → P3
Whiteboard: [stockwell needswork:owner] → [gfx-noted][stockwell needswork:owner]
Attachment #8932111 - Flags: review?(tnikkel)
Hi,

Any update on this bug?
Flags: needinfo?(aosmond)
Comment on attachment 8932111 [details] [diff] [review]
0001-Bug-1419889-Don-t-force-the-image-cache-to-validate-.patch

>+      if (expiration == 0) {
>+        // If the channel doesn't support caching, then ensure this expires the
>+        // next time it is used.
>+        expiration = uint32_t(int64_t(PR_Now()) / int64_t(PR_USEC_PER_SEC));
>       }

Can't we just use SecondsFromPRTime here?

Also, won't this mean that we will not validate if we issue another new load within the same second of this line running? And we would want to validate in that situation, no? So to preserve the old behaviour we should maybe subtract one from this value?
Attachment #8932111 - Flags: review?(tnikkel) → review+
(In reply to Timothy Nikkel (:tnikkel) from comment #16)
> Comment on attachment 8932111 [details] [diff] [review]
> 0001-Bug-1419889-Don-t-force-the-image-cache-to-validate-.patch
> 
> >+      if (expiration == 0) {
> >+        // If the channel doesn't support caching, then ensure this expires the
> >+        // next time it is used.
> >+        expiration = uint32_t(int64_t(PR_Now()) / int64_t(PR_USEC_PER_SEC));
> >       }
> 
> Can't we just use SecondsFromPRTime here?
> 

It's a static non-class member method. I can attach it to imgLoader though to expose it.

> Also, won't this mean that we will not validate if we issue another new load
> within the same second of this line running? And we would want to validate
> in that situation, no? So to preserve the old behaviour we should maybe
> subtract one from this value?

Good point, I will add this.
Flags: needinfo?(aosmond)
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a1ebee6fb8c
Don't force the image cache to validate if it hasn't started yet. r=tnikkel
(In reply to Pulsebot from comment #18)
> Pushed by aosmond@gmail.com:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/1a1ebee6fb8c
> Don't force the image cache to validate if it hasn't started yet. r=tnikkel

It made more sense to attach it to imgCacheEntry.
https://hg.mozilla.org/mozilla-central/rev/1a1ebee6fb8c
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Whiteboard: [gfx-noted][stockwell needswork:owner] → [gfx-noted][stockwell fixed:product]
You need to log in before you can comment on or make changes to this bug.