Closed Bug 1019840 Opened 6 years ago Closed 5 years ago

Intermittent width-special-values-image.html,width-special-values-image-block.html | image comparison (==), max difference: 255, number of differing pixels: 14779

Categories

(Core :: Layout, defect)

x86
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- wontfix
firefox38 --- fixed
firefox39 --- fixed
firefox-esr31 --- unaffected
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: RyanVM, Assigned: seth)

References

Details

(Keywords: intermittent-failure)

Attachments

(15 files, 1 obsolete file)

5.39 KB, image/png
Details
5.38 KB, image/png
Details
3.20 KB, text/plain
Details
10.22 KB, text/plain
Details
10.22 KB, text/plain
Details
11.15 KB, text/plain
Details
2.03 KB, text/plain
Details
11.53 KB, text/x-log
Details
10.59 KB, text/x-log
Details
396.77 KB, patch
Details | Diff | Splinter Review
1.79 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
13.59 KB, text/plain
Details
13.49 KB, text/plain
Details
1.88 KB, patch
philor
: review+
Details | Diff | Splinter Review
4.96 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
Attached image test screenshot
There's probably a better Layout sub-component for this to be under.

https://tbpl.mozilla.org/php/getParsedLog.php?id=40965252&tree=Mozilla-Central

Ubuntu VM 12.04 mozilla-central debug test reftest on 2014-06-03 07:29:13 PDT for push 78245b8d422d
slave: tst-linux32-spot-046

07:51:10     INFO -  REFTEST TEST-START | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html
07:51:10     INFO -  REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html | 1940 / 10982 (17%)
07:51:10     INFO -  ++DOMWINDOW == 44 (0x9d2c8a40) [pid = 2356] [serial = 5206] [outer = 0xa3c614f0]
07:51:10     INFO -  REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image-ref.html | 1940 / 10982 (17%)
07:51:10     INFO -  ++DOMWINDOW == 45 (0x9d2c8c30) [pid = 2356] [serial = 5207] [outer = 0xa3c614f0]
07:51:11     INFO -  REFTEST TEST-UNEXPECTED-FAIL | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html | image comparison (==), max difference: 255, number of differing pixels: 14779
07:51:11     INFO -  REFTEST   IMAGE 1 (TEST): 
07:51:11     INFO -  REFTEST   IMAGE 2 (REFERENCE): 
07:51:11     INFO -  REFTEST INFO | Saved log: START file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html
07:51:11     INFO -  REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
07:51:11     INFO -  REFTEST INFO | Saved log: Initializing canvas snapshot
07:51:11     INFO -  REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
07:51:11     INFO -  REFTEST INFO | Saved log: [CONTENT] RecordResult fired
07:51:11     INFO -  REFTEST INFO | Saved log: RecordResult fired
07:51:11     INFO -  REFTEST INFO | Saved log: START file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image-ref.html
07:51:11     INFO -  REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
07:51:11     INFO -  REFTEST INFO | Saved log: Initializing canvas snapshot
07:51:11     INFO -  REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
07:51:11     INFO -  REFTEST INFO | Saved log: [CONTENT] RecordResult fired
07:51:11     INFO -  REFTEST INFO | Saved log: RecordResult fired
07:51:11     INFO -  REFTEST INFO | Loading a blank page
07:51:11     INFO -  ++DOMWINDOW == 46 (0x9d2c8e20) [pid = 2356] [serial = 5208] [outer = 0xa3c614f0]
07:51:11     INFO -  REFTEST TEST-END | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html
Attached image reference screenshot
Something seems to have gotten much worse here just before 2/16 (starting with comment 7).

(Note that comment 6 was posted on 2/16 but was about a failure on 2/13.)

I count 5 failures yesterday (on 2/16) and 4 failures *so far* today. (and it's not even noon yet)
Duplicate of this bug: 1133658
Summary: Intermittent width-special-values-image.html | image comparison (==), max difference: 255, number of differing pixels: 14779 → Intermittent width-special-values-image.html,width-special-values-image-block.html | image comparison (==), max difference: 255, number of differing pixels: 14779
Retriggers are strongly pointing at this merge for when it regressed:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=a7c177546ca0
Conveniently, it appears that backing out bug 1123762 took care of this spike too.
See Also: → 1134459
Interestingly, the reference screenshot actually looks broken here.
Assignee: nobody → mchang
Status: NEW → ASSIGNED
This one test never uses sync image decoding which probably creates the race condition. Digging more.
From https://treeherder.mozilla.org/#/jobs?repo=try&revision=c8fda839fb36

In the non-ref version, we successfully call RasterImage::Draw, which successfully calls and returns imgFrame draw 12 times. This is the correct version as there are 12 boxes. In the failing test case, it only gets called 3 times before canvas.DrawWindow gets called. While the images are being drawn, the flag FLAG_SYNC_DECODE is always false until canvas.DrawWindow, which is called from the reftest framework, sets it to true. At this point we get a snapshot. There aren't any errors during the RasterImage::Draw phase.

From talking with Seth, if we have a series of images that are async (A, B, C), then a final image (D) to render that is called synchronously, there is no guarantee that the previous images (A, B, C) will be drawn even though the final image (D) was drawn synchronously. Still digging.
See Also: → 1128229
So once we get a sync paint request, we should actually hold and invalidate the area that needs to be painted. The invalidated area should then be repainted synchronously and so the images should be painted. This seems to be the case on mac where I artificially hold async decodes to take a very very long time.

I'm double checking the invalidated region to see if during the sync decode and we haven't painted the images, is the invalidated region correct. However, adding logging seems to have made the bug go away both locally and on my try loaner machine, so waiting on try to see if this still reproduces.
See Also: → 1125055
From https://treeherder.mozilla.org/#/jobs?repo=try&revision=539a92db1a91

Never layerizing images doesn't seem to be the problem, so bug 1125055 is ruled out. On the plus side, it seems to make the error occur more often which is great! Checking if I can reproduce locally now.
See Also: 1125055
Attached file Rendering Log
In the reference case, we aren't even getting to the point of adding the missing images to the display list, hence we don't draw them.
So we're getting to nsImageFrame::BuildDisplayList but deciding not to add a nsDisplayImage? Why not?
Well we want to paint 12 images, but we only get to nsImageFrame::BuildDisplayList for 3 of those images. In the passing case, we call nsImageFrame::BuildDisplayList for all 12 images. 

Why aren't we calling it for the other 9 cases I'm still trying to figure out why.
Still trying to get a frame tree of the ref image in the failing case, I blew through the log limit in the failing case :(.
In the ref image failed case(only show 3 instead of 12 images), We still process  nsImageFrame::BuildDisplayList 12 times. But we skip 9 image because of the zero size at [1].


[1]
https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l1618
That gets set when we reflow the image frame. Presumably the page gets reflowed to the correct sizes eventually, so maybe all pending reflows aren't getting flushed before taking the snapshot? Check mDirtyRoots on the presshell when we his the computed size is 0 case.
Thanks, Timothy.
We update the mComputedSize in nsImageFrame::Reflow[1]. Will add more log for these failed cases.

[1]
https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l883
For the failed case, we still call nsImageFrame::Reflow for every image(12 images). Only 3 images have non-zero size aReflowState[1], and we have 9 zero-size Reflow calls. I haven't checked the length of mDirtyRoots yet, but I think we do the flush for every image. Otherwise, we will not see 12 images' Reflow calls.

[1]
https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l857
In general we get more than one reflow call per frame, one reflow call does not mean the frame is fully reflowed and at the right position and size.

I think if mIntrinsicRatio/mIntrinsicSize don't get updated then that could also result in being zero sized.
Frame tree of the failing test case. In 9/12 image frames, I see these dimensions in app units:

{60,1980,1800,0}

Because it is 0 height, we skip over them at [1]. In the good case, each image should have these dimensions:

{60,1800,3300,1500}

Each successive image should have a y coordinate be +1800 in the good case, but it is 1,140 app units in the bad case. These coordinates and measurements stay the same from the beginning to the end of the test.

[1] https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l1618
Another odd thing about the bad case is that is that the ReftestContent::OnDocumentLoad [1] occurs much earlier in the good case. The OnDocumentLoad executes after the "load" event fires. In the good case, the "load" event fires, then we start building the display list and paint the images. 

In the bad case, right after RasterImage finishes decoding, we start building the display list and begin painting. From comment 71, I probably still have to check to ensure that we reflow correctly. After we paint 3 images, we get a call to Sync Decode Images[2] due to canvas.DrawWindow in the reftest framework.  I don't think the order that we call Set Sync Decode Images matters since in the good case, we paint 12 images then call canvas.DrawWindow. I think this means by the time the "load" event fires and we do a canvas.DrawWindow, the error has already occurred and we aren't able to recover. I think the sync decode part is doing it's job but it doesn't matter in this case.

[1] https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js?from=reftest-content.js&case=true#571
[2] https://dxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.h#456
One last final note is that the aDirtyRect[1] passed into nsImageFrame::BuildDisplayList is the same on both the good and base case. The dirty rect is generally: (-480, -480) width: 48000 height: 60000 

[1] https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l1603
(In reply to Mason Chang [:mchang] from comment #74)
> Another odd thing about the bad case is that is that the
> ReftestContent::OnDocumentLoad [1] occurs much earlier in the good case. The
> OnDocumentLoad executes after the "load" event fires. In the good case, the
> "load" event fires, then we start building the display list and paint the
> images. 
> 
> In the bad case, right after RasterImage finishes decoding, we start
> building the display list and begin painting. From comment 71, I probably
> still have to check to ensure that we reflow correctly. After we paint 3
> images, we get a call to Sync Decode Images[2] due to canvas.DrawWindow in
> the reftest framework.  I don't think the order that we call Set Sync Decode
> Images matters since in the good case, we paint 12 images then call
> canvas.DrawWindow. I think this means by the time the "load" event fires and
> we do a canvas.DrawWindow, the error has already occurred and we aren't able
> to recover. I think the sync decode part is doing it's job but it doesn't
> matter in this case.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-
> content.js?from=reftest-content.js&case=true#571
> [2]
> https://dxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.
> h#456

Just a clarification, in the base case, we load the test, but do not fire the "load" event. We decode the images, paint the three images, then the "load" event fires which causes canvas.drawWindow to fire. In the good case, the "load" event fires, then we build the display list and paint 12 images, then canvas.drawWindow fires. I don't think that's what's going on here, but just more information. Seems like we should find out why the image sizes are so weird.
Hey Seth,

In the frame tree of the bad case, I'm seeing this:

line 453aed18: count=2 state=inline,clean,prevmarginclean,not impacted,wrapped,before:nobr,after:nobr[0x120] {0,1140,3780,1140} {0,1140,3780,1140;cw=6000} vis-overflow=0,1140,3780,2280 scr-overflow=0,1140,3780,1140 <
  ImageFrame(img)(3)@45030378 next=450303e8 {60,1980,1800,0} vis-overflow=0,0,1800,1440 scr-overflow=0,0,1800,0 [state=0000000000200000] [content=45301020] [sc=45028088] [src=http://10.0.2.2:8888/tests/layout/reftests/pixel-rounding/green-25x25.png]
  Text(4)"\n"@450303e8 next=45030430 {3780,1140,0,1140} [state=40000000a0400000] [content=447fa2c0] [sc=44fad888:-moz-non-element] [run=44873510][0,1,T]
>


So the ImageFrame has an incorrect size but it still has the correct source, which means it isn't a placeholder image. From RasterImage::Progress, I see FLAG_SIZE_AVAILABLE called already and getting to the point of RasterImage::FLAG_ONLOAD_UNBLOCKED as well, which I think means the whole decode process is complete. Any hints on what else to look at? I'm going to try to find where the decoder determines the image size to see if that helps, but if you have any other tips I'd appreciate it!
Flags: needinfo?(seth)
I try to wait the image onload event before calling drawWindow, but the img reflow size is still zero.

<script>
  var imageNum = 0;

  function onLoadImage() {
    dump("bignose image load event");
    imageNum++;
    checkStatus();
  }

  function checkStatus() {
    if (imageNum==12) {
      document.documentElement.removeAttribute("class");
    }
  }

  window.addEventListener("MozReftestInvalidate", checkStatus, false);

</script>
...
<img src="../pixel-rounding/green-25x25.png" alt="[]" class="v1 s1" onload="onLoadImage()">
...
From chatting with :tn over the weekend, the intrinsic size and ratio in nsImageFrame are the same in both the good and bad cases. In addition, the values in Decoder::PostSize are the same for both cases. The execution order up until painting is also the same. 

Only in the bad case, after we decode the images and decide their sizes, do we start failing due to the mComputedSize.height being 0 at nsImageFrame::BuildDisplayList.
More logging - https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#857

The aReflowState passed into nsImageFrame::Reflow is always 0, 0. I'm not sure why the height in the frame tree is 0 but the width is non-zero now. Also, the image does not have a fixed size in both the good and the case case. [1]

[1] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#871
If the computed size thats passed down is wrong, then it looks like it comes from nsImageFrame::ComputeSize, so logging that would be useful.
I saw a different size for blockHtmlRS.ComputedWidth() and blockHtmlRS.ComputedHeight() passing at [1].
The size in good case is (1500,1500), and the failed case is (0,0).
Still add more log in nsBlockFrame::ReflowBlockFrame().

[1]
https://hg.mozilla.org/mozilla-central/annotate/0b3c520002ad/layout/generic/nsBlockFrame.cpp#l3358
Also from https://hg.mozilla.org/try/rev/8a76f48233da

Changing the reftest itself to only be 1 image fails to reproduce the intermittent :/
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #82)
> I saw a different size for blockHtmlRS.ComputedWidth() and
> blockHtmlRS.ComputedHeight() passing at [1].
> The size in good case is (1500,1500), and the failed case is (0,0).
> Still add more log in nsBlockFrame::ReflowBlockFrame().
> 
> [1]
> https://hg.mozilla.org/mozilla-central/annotate/0b3c520002ad/layout/generic/
> nsBlockFrame.cpp#l3358

In both the block and inline case that comes from nsImageFrame::ComputeSize. I added logging there and did a push but it didn't get any failures any many retries https://treeherder.mozilla.org/#/jobs?repo=try&revision=21f576e6ddd3

I did another push and I'm still waiting to get more results for it
https://treeherder.mozilla.org/#/jobs?repo=try&revision=317d01083a66
(In reply to Timothy Nikkel (:tn) from comment #86)
> (In reply to Jerry Shih[:jerry] (UTC+8) from comment #82)
> > I saw a different size for blockHtmlRS.ComputedWidth() and
> > blockHtmlRS.ComputedHeight() passing at [1].
> > The size in good case is (1500,1500), and the failed case is (0,0).
> > Still add more log in nsBlockFrame::ReflowBlockFrame().
> > 
> > [1]
> > https://hg.mozilla.org/mozilla-central/annotate/0b3c520002ad/layout/generic/
> > nsBlockFrame.cpp#l3358
> 
> In both the block and inline case that comes from nsImageFrame::ComputeSize.
> I added logging there and did a push but it didn't get any failures any many
> retries https://treeherder.mozilla.org/#/jobs?repo=try&revision=21f576e6ddd3
> 
> I did another push and I'm still waiting to get more results for it
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=317d01083a66

The failure really only occurs with the vsync refresh driver. Add the patch from https://bug1123762.bugzilla.mozilla.org/attachment.cgi?id=8557494. It just sets the pref "gfx.vsync.refreshdriver" to true on b2g.
(In reply to Mason Chang [:mchang] from comment #87)
> The failure really only occurs with the vsync refresh driver. Add the patch
> from https://bug1123762.bugzilla.mozilla.org/attachment.cgi?id=8557494. It
> just sets the pref "gfx.vsync.refreshdriver" to true on b2g.

That push has that patch, it just doesn't show up because I pushed that same changeset on an earlier try push (which failed to build).
Attached patch reftest_list.patch (obsolete) — Splinter Review
duplicate the failed test case to reduce the try re-trigger time.
Attachment #8572447 - Attachment is obsolete: true
(In reply to Timothy Nikkel (:tn) from comment #88)
> (In reply to Mason Chang [:mchang] from comment #87)
> > The failure really only occurs with the vsync refresh driver. Add the patch
> > from https://bug1123762.bugzilla.mozilla.org/attachment.cgi?id=8557494. It
> > just sets the pref "gfx.vsync.refreshdriver" to true on b2g.
> 
> That push has that patch, it just doesn't show up because I pushed that same
> changeset on an earlier try push (which failed to build).

Oops, I had the patch that flips those prefs in the main all.js, which actually only flips them for windows and mac because even the false version of the prefs are inside a windows or mac ifdef. I needed the b2g.js change that you pointed out here.
We got computed size (0,0) because we can't get a image from imgRequestProxy::GetImage().

The call stack for width are:
nsImageFrame::ComputeSize()
  ->nsImageLoading::GetNaturalWidth()
    ->imgRequestProxy::GetImage()

In imgRequestProxy::GetImage(), we go through [1], but we still get null image at [2]. Then we got reflow size (0,0).

Seth, I saw the Decoder::PostSize() has already called and the size is correct. What's the reason we can't get the image in GetImage()? Does it mean the image is still under decoding?

[1]
https://hg.mozilla.org/mozilla-central/annotate/56492f7244a9/image/src/imgRequestProxy.cpp#l500
[2]
https://hg.mozilla.org/mozilla-central/annotate/56492f7244a9/dom/base/nsImageLoadingContent.cpp#l999
We discussed this on IRC. I don't know for sure what the explanation is, but there's definitely a data race on imgRequest::mImage happening in that imgRequestProxy code. It's possible that the race is what is causing this test failure.

I'm working on fixing this in bug 1137002 and the bugs blocking it.
Flags: needinfo?(seth)
(In reply to Seth Fowler [:seth] from comment #94)
> We discussed this on IRC. I don't know for sure what the explanation is, but
> there's definitely a data race on imgRequest::mImage happening in that
> imgRequestProxy code. It's possible that the race is what is causing this
> test failure.
> 
> I'm working on fixing this in bug 1137002 and the bugs blocking it.

If this is the case, can we disable this test for now until it is fixed?
Disable these tests until the image proxy code is fixed so we can land silk.
Attachment #8573291 - Flags: review?(seth)
Attachment #8573291 - Flags: review?(seth) → review?(tnikkel)
Attachment #8573291 - Flags: review?(tnikkel) → review+
From irc: :tn and I agree that since this is blocking silk, and while we have yet to confirm that this is a problem in the image proxy code, we both agree that this is independent of the vsync refresh driver. Therefore, we can disable this test until the image code is fixed as it is blocking silk.
Also note, since the test is passing right now, I will only land in conjunction with bug 1123762	.
I now have an explanation for this that seems quite plausible and matches all printf logs & observations.

1) the image file is loaded by a previous test, so it's in the image cache.
2) the failing test loads the image file, it gets a proxy to the imgRequest that's already in the cache, but it also kicks off an async imgCacheValidator to make sure the image is still valid.
3) we init the image frame, it grabs the image (the one that was cached from a previous test) and caches it locally.
4) the async imgCacheValidator comes back and says that we can't use the old image request, it changes the proxy it handed back before to a new request and starts the process of creating loading the image anew.
5) the image frame reflows, we still have the mImage cached on the image frame, but it's the old image. ComputeSize ignores mImage and mIntrinsicSize/Ratio and asks the nsImageLoadingContent for the natural width/height. nsImageLoadingContent goes through the imgRequestProxy which now points to an imgRequest that doesn't have an image yet, it returns success and 0/0. We get reflowed at 0,0.
6) the new image is created and size available is sent to the image frame. We update the cached mImage and cached mIntrinsicSize/Ratio, size it's the same image on disk the intinsic size/ratio do not change, no reflow is triggered.

There are some silly things going on here:
1) nsImageLoadingContent::GetNaturalWidth/Height should probably return error if they have no image.
2) nsImageFrame::ComputeSize calculates the intrinsic size differently from other reflow metrics like GetMinISize, GetPrefISize, GetIntrinsicSize
3) it's not clear to me how the values returned from nsImageLoadingContent::GetNaturalWidth/Height can ever differ from the stored mIntrinsic size when nsImageLoadingContent can get ahold of an image.
4) file uris always have to validate (when loaded in a different doc) because of this http://mxr.mozilla.org/mozilla-central/source/image/src/imgRequest.cpp?rev=be76c67b791a#519 So we kick off a new imgCacheValidator to load the file again, in OnStartRequest we check if we can re-use the existing image by checking if the channel is a caching channel (only htttp channels are caching channels). So since the uri is a file uri we know the channel cannot be a caching channel and hence we will not be able to use the existing image, all before we kick off the imgCacheValidator. We should just deny using the cached entry in this case if this is the img cache behaviour we want for file uris.
https://hg.mozilla.org/mozilla-central/rev/4bf6d48b8090
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
I now have a try server log that confirms that comment 103 is indeed what is happening.
(In reply to Timothy Nikkel (:tn) from comment #105)
> I now have a try server log that confirms that comment 103 is indeed what is
> happening.

Very nice! Should we spin up another bug or keep this one open?
Attachment #8574403 - Flags: review?(tnikkel) → review+
We should actually have a bug on fixing the problem.  Reopening this one for now, but filing a new one is ok too.
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Re-enabling the tests due to backout of bug 1123762.
Thanks to everyone who helped investigate this one! I'm glad we've figured it out now.

(In reply to Timothy Nikkel (:tn) from comment #103)
> 3) we init the image frame, it grabs the image (the one that was cached from
> a previous test) and caches it locally.

After bug 1103157, there's no need to cache the image in nsImageFrame::Init. SIZE_AVAILABLE is always delivered now. It was an oversight that this code wasn't removed in that bug.

> 2) nsImageFrame::ComputeSize calculates the intrinsic size differently from
> other reflow metrics like GetMinISize, GetPrefISize, GetIntrinsicSize

Yeah, I'm skeptical of that code. We really need to cut down on the number of places where we compute image-related sizes.

> 3) it's not clear to me how the values returned from
> nsImageLoadingContent::GetNaturalWidth/Height can ever differ from the
> stored mIntrinsic size when nsImageLoadingContent can get ahold of an image.

It looks to me like there is a difference indeed, in that nsImageLoadingContent::GetNaturalWidth/Height does not take image-orientation into account. I am actually not totally sure what the spec intends here. I've sent an email to the what-wg mailing list asking for clarification.

> 4) We should just deny
> using the cached entry in this case if this is the img cache behaviour we
> want for file uris.

Agreed.
(In reply to Seth Fowler [:seth] from comment #112)
> > 3) it's not clear to me how the values returned from
> > nsImageLoadingContent::GetNaturalWidth/Height can ever differ from the
> > stored mIntrinsic size when nsImageLoadingContent can get ahold of an image.
> 
> It looks to me like there is a difference indeed, in that
> nsImageLoadingContent::GetNaturalWidth/Height does not take
> image-orientation into account. I am actually not totally sure what the spec
> intends here. I've sent an email to the what-wg mailing list asking for
> clarification.

Actually I later realized that HTMLImageElement extends nsImageLoadingContent so the code that actually gets called here is HTMLImageElement::GetNaturalWidth/Height which multiplies the image size by a density factor, so it is different. (And indeed using the stored mIntrinsicSize here instead of asking the HTMLImageElement/nsImageLoadingContent does cause some srcset reftests to fail.)
Yeah, I noticed that too. The structure of the code actually makes it impossible
for us to make GetNaturalWidth/Height return an error in this situation, because
it seems that that error value would be exposed to the web via HTMLImageElement.

I've taken an alternative approach here: we check if the nsImageLoadingContent's
current request has a size before calling GetNaturalWidth/Height. This is ugly,
and I don't like it, but I think a clean solution is nontrivial and needs its
own bug. (Which I've filed - it's bug 1141395.)

I've also filed bug 1141376 and bug 1141380 to fix some other things that got
uncovered during this investigation.
Attachment #8575004 - Flags: review?(tnikkel)
Assignee: mchang → seth
Status: REOPENED → ASSIGNED
Oh yes, I also filed bug 1141398 about the file URI issue.
Attachment #8575004 - Flags: review?(tnikkel) → review+
Thanks for the review!

As a reminder to myself, we also need to make validation mean "assume validation will fail and just fetch again" for non-HTTP channels. That's another issue Timothy uncovered above that still needs to be fixed.
Blocks: 1083072
Went ahead and pushed now that some recent churn in ImageLib has settled down:

https://hg.mozilla.org/integration/mozilla-inbound/rev/1b724ad95f55
I don't think this should be leave-open anymore.
Keywords: leave-open
Status: ASSIGNED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
I'm guessing this needs Aurora and b2g37 nominations.
Flags: needinfo?(seth)
Yep, planning to nominate as part of bug 1083072, but I want to make sure the fix is verified first.
Flags: needinfo?(seth)
Comment on attachment 8575004 [details] [diff] [review]
Use cached intrinsic size in nsImageFrame::ComputeSize unless the image loader has a size

Approval Request Comment
[Feature/regressing bug #]: srcset
[User impact if declined]: Images intermittently fail to display. This is bug 1083072, which is tracked. This patch is the main fix, but we want this to come together with bug 1141376. Note that in addition to fixing bug 1083072, this also fixes an intermittent orange, which the sheriffs will certainly appreciate.
[Describe test coverage new/current, TreeHerder]: On m-c for days.
[Risks and why]: Some risk but this code gets run very frequently, so a bug that results in crashes or other serious issues should've showed up very quickly.
[String/UUID change made/needed]: The nsIImageLoadingContent interface gained a new method, which resulted in a UUID change. The new method is not exposed to addons, so it's unlikely to cause a problem.
Attachment #8575004 - Flags: approval-mozilla-b2g37?
Attachment #8575004 - Flags: approval-mozilla-aurora?
Comment on attachment 8575004 [details] [diff] [review]
Use cached intrinsic size in nsImageFrame::ComputeSize unless the image loader has a size

Taking as we will have time in case of regression.
Attachment #8575004 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8575004 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
You need to log in before you can comment on or make changes to this bug.