Closed
Bug 1019840
Opened 11 years ago
Closed 10 years ago
Intermittent width-special-values-image.html,width-special-values-image-block.html | image comparison (==), max difference: 255, number of differing pixels: 14779
Categories
(Core :: Layout, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
People
(Reporter: RyanVM, Assigned: seth)
References
Details
(Keywords: intermittent-failure)
Attachments
(15 files, 1 obsolete file)
|
5.39 KB,
image/png
|
Details | |
|
5.38 KB,
image/png
|
Details | |
|
3.20 KB,
text/plain
|
Details | |
|
10.22 KB,
text/plain
|
Details | |
|
10.22 KB,
text/plain
|
Details | |
|
11.15 KB,
text/plain
|
Details | |
|
2.03 KB,
text/plain
|
Details | |
|
11.53 KB,
text/x-log
|
Details | |
|
10.59 KB,
text/x-log
|
Details | |
|
396.77 KB,
patch
|
Details | Diff | Splinter Review | |
|
1.79 KB,
patch
|
tnikkel
:
review+
|
Details | Diff | Splinter Review |
|
13.59 KB,
text/plain
|
Details | |
|
13.49 KB,
text/plain
|
Details | |
|
1.88 KB,
patch
|
philor
:
review+
|
Details | Diff | Splinter Review |
|
4.96 KB,
patch
|
tnikkel
:
review+
Sylvestre
:
approval-mozilla-aurora+
bajaj
:
approval-mozilla-b2g37+
|
Details | Diff | Splinter Review |
There's probably a better Layout sub-component for this to be under.
https://tbpl.mozilla.org/php/getParsedLog.php?id=40965252&tree=Mozilla-Central
Ubuntu VM 12.04 mozilla-central debug test reftest on 2014-06-03 07:29:13 PDT for push 78245b8d422d
slave: tst-linux32-spot-046
07:51:10 INFO - REFTEST TEST-START | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html
07:51:10 INFO - REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html | 1940 / 10982 (17%)
07:51:10 INFO - ++DOMWINDOW == 44 (0x9d2c8a40) [pid = 2356] [serial = 5206] [outer = 0xa3c614f0]
07:51:10 INFO - REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image-ref.html | 1940 / 10982 (17%)
07:51:10 INFO - ++DOMWINDOW == 45 (0x9d2c8c30) [pid = 2356] [serial = 5207] [outer = 0xa3c614f0]
07:51:11 INFO - REFTEST TEST-UNEXPECTED-FAIL | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html | image comparison (==), max difference: 255, number of differing pixels: 14779
07:51:11 INFO - REFTEST IMAGE 1 (TEST):
07:51:11 INFO - REFTEST IMAGE 2 (REFERENCE):
07:51:11 INFO - REFTEST INFO | Saved log: START file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html
07:51:11 INFO - REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
07:51:11 INFO - REFTEST INFO | Saved log: Initializing canvas snapshot
07:51:11 INFO - REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
07:51:11 INFO - REFTEST INFO | Saved log: [CONTENT] RecordResult fired
07:51:11 INFO - REFTEST INFO | Saved log: RecordResult fired
07:51:11 INFO - REFTEST INFO | Saved log: START file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image-ref.html
07:51:11 INFO - REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
07:51:11 INFO - REFTEST INFO | Saved log: Initializing canvas snapshot
07:51:11 INFO - REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
07:51:11 INFO - REFTEST INFO | Saved log: [CONTENT] RecordResult fired
07:51:11 INFO - REFTEST INFO | Saved log: RecordResult fired
07:51:11 INFO - REFTEST INFO | Loading a blank page
07:51:11 INFO - ++DOMWINDOW == 46 (0x9d2c8e20) [pid = 2356] [serial = 5208] [outer = 0xa3c614f0]
07:51:11 INFO - REFTEST TEST-END | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/box-properties/width-special-values-image.html
| Reporter | ||
Comment 1•11 years ago
|
||
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 16•10 years ago
|
||
| Reporter | ||
Updated•10 years ago
|
Summary: Intermittent width-special-values-image.html | image comparison (==), max difference: 255, number of differing pixels: 14779 → Intermittent width-special-values-image.html,width-special-values-image-block.html | image comparison (==), max difference: 255, number of differing pixels: 14779
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Reporter | ||
Comment 53•10 years ago
|
||
Retriggers are strongly pointing at this merge for when it regressed:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=a7c177546ca0
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Reporter | ||
Comment 57•10 years ago
|
||
Conveniently, it appears that backing out bug 1123762 took care of this spike too.
Comment 58•10 years ago
|
||
Interestingly, the reference screenshot actually looks broken here.
Assignee: nobody → mchang
Status: NEW → ASSIGNED
Comment 59•10 years ago
|
||
This one test never uses sync image decoding which probably creates the race condition. Digging more.
Comment 60•10 years ago
|
||
From https://treeherder.mozilla.org/#/jobs?repo=try&revision=c8fda839fb36
In the non-ref version, we successfully call RasterImage::Draw, which successfully calls and returns imgFrame draw 12 times. This is the correct version as there are 12 boxes. In the failing test case, it only gets called 3 times before canvas.DrawWindow gets called. While the images are being drawn, the flag FLAG_SYNC_DECODE is always false until canvas.DrawWindow, which is called from the reftest framework, sets it to true. At this point we get a snapshot. There aren't any errors during the RasterImage::Draw phase.
From talking with Seth, if we have a series of images that are async (A, B, C), then a final image (D) to render that is called synchronously, there is no guarantee that the previous images (A, B, C) will be drawn even though the final image (D) was drawn synchronously. Still digging.
Comment 61•10 years ago
|
||
So once we get a sync paint request, we should actually hold and invalidate the area that needs to be painted. The invalidated area should then be repainted synchronously and so the images should be painted. This seems to be the case on mac where I artificially hold async decodes to take a very very long time.
I'm double checking the invalidated region to see if during the sync decode and we haven't painted the images, is the invalidated region correct. However, adding logging seems to have made the bug go away both locally and on my try loaner machine, so waiting on try to see if this still reproduces.
Comment 62•10 years ago
|
||
From https://treeherder.mozilla.org/#/jobs?repo=try&revision=539a92db1a91
Never layerizing images doesn't seem to be the problem, so bug 1125055 is ruled out. On the plus side, it seems to make the error occur more often which is great! Checking if I can reproduce locally now.
See Also: 1125055 →
Comment 63•10 years ago
|
||
In the reference case, we aren't even getting to the point of adding the missing images to the display list, hence we don't draw them.
Comment 64•10 years ago
|
||
So we're getting to nsImageFrame::BuildDisplayList but deciding not to add a nsDisplayImage? Why not?
Comment 65•10 years ago
|
||
Well we want to paint 12 images, but we only get to nsImageFrame::BuildDisplayList for 3 of those images. In the passing case, we call nsImageFrame::BuildDisplayList for all 12 images.
Why aren't we calling it for the other 9 cases I'm still trying to figure out why.
Comment 66•10 years ago
|
||
Comment 67•10 years ago
|
||
Still trying to get a frame tree of the ref image in the failing case, I blew through the log limit in the failing case :(.
Comment 68•10 years ago
|
||
In the ref image failed case(only show 3 instead of 12 images), We still process nsImageFrame::BuildDisplayList 12 times. But we skip 9 image because of the zero size at [1].
[1]
https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l1618
Comment 69•10 years ago
|
||
That gets set when we reflow the image frame. Presumably the page gets reflowed to the correct sizes eventually, so maybe all pending reflows aren't getting flushed before taking the snapshot? Check mDirtyRoots on the presshell when we his the computed size is 0 case.
Comment 70•10 years ago
|
||
Thanks, Timothy.
We update the mComputedSize in nsImageFrame::Reflow[1]. Will add more log for these failed cases.
[1]
https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l883
Comment 71•10 years ago
|
||
For the failed case, we still call nsImageFrame::Reflow for every image(12 images). Only 3 images have non-zero size aReflowState[1], and we have 9 zero-size Reflow calls. I haven't checked the length of mDirtyRoots yet, but I think we do the flush for every image. Otherwise, we will not see 12 images' Reflow calls.
[1]
https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l857
Comment 72•10 years ago
|
||
In general we get more than one reflow call per frame, one reflow call does not mean the frame is fully reflowed and at the right position and size.
I think if mIntrinsicRatio/mIntrinsicSize don't get updated then that could also result in being zero sized.
Comment 73•10 years ago
|
||
Frame tree of the failing test case. In 9/12 image frames, I see these dimensions in app units:
{60,1980,1800,0}
Because it is 0 height, we skip over them at [1]. In the good case, each image should have these dimensions:
{60,1800,3300,1500}
Each successive image should have a y coordinate be +1800 in the good case, but it is 1,140 app units in the bad case. These coordinates and measurements stay the same from the beginning to the end of the test.
[1] https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l1618
Comment 74•10 years ago
|
||
Another odd thing about the bad case is that is that the ReftestContent::OnDocumentLoad [1] occurs much earlier in the good case. The OnDocumentLoad executes after the "load" event fires. In the good case, the "load" event fires, then we start building the display list and paint the images.
In the bad case, right after RasterImage finishes decoding, we start building the display list and begin painting. From comment 71, I probably still have to check to ensure that we reflow correctly. After we paint 3 images, we get a call to Sync Decode Images[2] due to canvas.DrawWindow in the reftest framework. I don't think the order that we call Set Sync Decode Images matters since in the good case, we paint 12 images then call canvas.DrawWindow. I think this means by the time the "load" event fires and we do a canvas.DrawWindow, the error has already occurred and we aren't able to recover. I think the sync decode part is doing it's job but it doesn't matter in this case.
[1] https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js?from=reftest-content.js&case=true#571
[2] https://dxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.h#456
Comment 75•10 years ago
|
||
One last final note is that the aDirtyRect[1] passed into nsImageFrame::BuildDisplayList is the same on both the good and base case. The dirty rect is generally: (-480, -480) width: 48000 height: 60000
[1] https://hg.mozilla.org/mozilla-central/annotate/eea6188b9b05/layout/generic/nsImageFrame.cpp#l1603
Comment 76•10 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #74)
> Another odd thing about the bad case is that is that the
> ReftestContent::OnDocumentLoad [1] occurs much earlier in the good case. The
> OnDocumentLoad executes after the "load" event fires. In the good case, the
> "load" event fires, then we start building the display list and paint the
> images.
>
> In the bad case, right after RasterImage finishes decoding, we start
> building the display list and begin painting. From comment 71, I probably
> still have to check to ensure that we reflow correctly. After we paint 3
> images, we get a call to Sync Decode Images[2] due to canvas.DrawWindow in
> the reftest framework. I don't think the order that we call Set Sync Decode
> Images matters since in the good case, we paint 12 images then call
> canvas.DrawWindow. I think this means by the time the "load" event fires and
> we do a canvas.DrawWindow, the error has already occurred and we aren't able
> to recover. I think the sync decode part is doing it's job but it doesn't
> matter in this case.
>
> [1]
> https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-
> content.js?from=reftest-content.js&case=true#571
> [2]
> https://dxr.mozilla.org/mozilla-central/source/layout/base/nsDisplayList.
> h#456
Just a clarification, in the base case, we load the test, but do not fire the "load" event. We decode the images, paint the three images, then the "load" event fires which causes canvas.drawWindow to fire. In the good case, the "load" event fires, then we build the display list and paint 12 images, then canvas.drawWindow fires. I don't think that's what's going on here, but just more information. Seems like we should find out why the image sizes are so weird.
Comment 77•10 years ago
|
||
Hey Seth,
In the frame tree of the bad case, I'm seeing this:
line 453aed18: count=2 state=inline,clean,prevmarginclean,not impacted,wrapped,before:nobr,after:nobr[0x120] {0,1140,3780,1140} {0,1140,3780,1140;cw=6000} vis-overflow=0,1140,3780,2280 scr-overflow=0,1140,3780,1140 <
ImageFrame(img)(3)@45030378 next=450303e8 {60,1980,1800,0} vis-overflow=0,0,1800,1440 scr-overflow=0,0,1800,0 [state=0000000000200000] [content=45301020] [sc=45028088] [src=http://10.0.2.2:8888/tests/layout/reftests/pixel-rounding/green-25x25.png]
Text(4)"\n"@450303e8 next=45030430 {3780,1140,0,1140} [state=40000000a0400000] [content=447fa2c0] [sc=44fad888:-moz-non-element] [run=44873510][0,1,T]
>
So the ImageFrame has an incorrect size but it still has the correct source, which means it isn't a placeholder image. From RasterImage::Progress, I see FLAG_SIZE_AVAILABLE called already and getting to the point of RasterImage::FLAG_ONLOAD_UNBLOCKED as well, which I think means the whole decode process is complete. Any hints on what else to look at? I'm going to try to find where the decoder determines the image size to see if that helps, but if you have any other tips I'd appreciate it!
Flags: needinfo?(seth)
Comment 78•10 years ago
|
||
I try to wait the image onload event before calling drawWindow, but the img reflow size is still zero.
<script>
var imageNum = 0;
function onLoadImage() {
dump("bignose image load event");
imageNum++;
checkStatus();
}
function checkStatus() {
if (imageNum==12) {
document.documentElement.removeAttribute("class");
}
}
window.addEventListener("MozReftestInvalidate", checkStatus, false);
</script>
...
<img src="../pixel-rounding/green-25x25.png" alt="[]" class="v1 s1" onload="onLoadImage()">
...
Comment 79•10 years ago
|
||
From chatting with :tn over the weekend, the intrinsic size and ratio in nsImageFrame are the same in both the good and bad cases. In addition, the values in Decoder::PostSize are the same for both cases. The execution order up until painting is also the same.
Only in the bad case, after we decode the images and decide their sizes, do we start failing due to the mComputedSize.height being 0 at nsImageFrame::BuildDisplayList.
Comment 80•10 years ago
|
||
More logging - https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#857
The aReflowState passed into nsImageFrame::Reflow is always 0, 0. I'm not sure why the height in the frame tree is 0 but the width is non-zero now. Also, the image does not have a fixed size in both the good and the case case. [1]
[1] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#871
Comment 81•10 years ago
|
||
If the computed size thats passed down is wrong, then it looks like it comes from nsImageFrame::ComputeSize, so logging that would be useful.
Comment 82•10 years ago
|
||
I saw a different size for blockHtmlRS.ComputedWidth() and blockHtmlRS.ComputedHeight() passing at [1].
The size in good case is (1500,1500), and the failed case is (0,0).
Still add more log in nsBlockFrame::ReflowBlockFrame().
[1]
https://hg.mozilla.org/mozilla-central/annotate/0b3c520002ad/layout/generic/nsBlockFrame.cpp#l3358
Comment 83•10 years ago
|
||
From irc, changing nsImageFrame to reflow if mIntrinsicSize changed at [1][2] did not change the results.
Neither did always invalidating the frame at [3].
[1] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#585
[2] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#694
[3] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsImageFrame.cpp?from=nsImageFrame.cpp#702
Comment 84•10 years ago
|
||
Corresponding good version to attachment 8572173 [details].
Try https://treeherder.mozilla.org/#/jobs?repo=try&revision=93ae10a73c0f
Comment 85•10 years ago
|
||
Also from https://hg.mozilla.org/try/rev/8a76f48233da
Changing the reftest itself to only be 1 image fails to reproduce the intermittent :/
Comment 86•10 years ago
|
||
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #82)
> I saw a different size for blockHtmlRS.ComputedWidth() and
> blockHtmlRS.ComputedHeight() passing at [1].
> The size in good case is (1500,1500), and the failed case is (0,0).
> Still add more log in nsBlockFrame::ReflowBlockFrame().
>
> [1]
> https://hg.mozilla.org/mozilla-central/annotate/0b3c520002ad/layout/generic/
> nsBlockFrame.cpp#l3358
In both the block and inline case that comes from nsImageFrame::ComputeSize. I added logging there and did a push but it didn't get any failures any many retries https://treeherder.mozilla.org/#/jobs?repo=try&revision=21f576e6ddd3
I did another push and I'm still waiting to get more results for it
https://treeherder.mozilla.org/#/jobs?repo=try&revision=317d01083a66
Comment 87•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #86)
> (In reply to Jerry Shih[:jerry] (UTC+8) from comment #82)
> > I saw a different size for blockHtmlRS.ComputedWidth() and
> > blockHtmlRS.ComputedHeight() passing at [1].
> > The size in good case is (1500,1500), and the failed case is (0,0).
> > Still add more log in nsBlockFrame::ReflowBlockFrame().
> >
> > [1]
> > https://hg.mozilla.org/mozilla-central/annotate/0b3c520002ad/layout/generic/
> > nsBlockFrame.cpp#l3358
>
> In both the block and inline case that comes from nsImageFrame::ComputeSize.
> I added logging there and did a push but it didn't get any failures any many
> retries https://treeherder.mozilla.org/#/jobs?repo=try&revision=21f576e6ddd3
>
> I did another push and I'm still waiting to get more results for it
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=317d01083a66
The failure really only occurs with the vsync refresh driver. Add the patch from https://bug1123762.bugzilla.mozilla.org/attachment.cgi?id=8557494. It just sets the pref "gfx.vsync.refreshdriver" to true on b2g.
Comment 88•10 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #87)
> The failure really only occurs with the vsync refresh driver. Add the patch
> from https://bug1123762.bugzilla.mozilla.org/attachment.cgi?id=8557494. It
> just sets the pref "gfx.vsync.refreshdriver" to true on b2g.
That push has that patch, it just doesn't show up because I pushed that same changeset on an earlier try push (which failed to build).
Comment 89•10 years ago
|
||
duplicate the failed test case to reduce the try re-trigger time.
Comment 90•10 years ago
|
||
Attachment #8572447 -
Attachment is obsolete: true
Comment 91•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #88)
> (In reply to Mason Chang [:mchang] from comment #87)
> > The failure really only occurs with the vsync refresh driver. Add the patch
> > from https://bug1123762.bugzilla.mozilla.org/attachment.cgi?id=8557494. It
> > just sets the pref "gfx.vsync.refreshdriver" to true on b2g.
>
> That push has that patch, it just doesn't show up because I pushed that same
> changeset on an earlier try push (which failed to build).
Oops, I had the patch that flips those prefs in the main all.js, which actually only flips them for windows and mac because even the false version of the prefs are inside a windows or mac ifdef. I needed the b2g.js change that you pointed out here.
Comment 92•10 years ago
|
||
In failed case, I got (appWidth, appHeight)=>(0,0) at [1].
Try to add more log at [2]
[1]
https://hg.mozilla.org/mozilla-central/annotate/c5b90c003be8/layout/generic/nsImageFrame.cpp#l763
[2]
https://hg.mozilla.org/mozilla-central/annotate/c5b90c003be8/dom/base/nsImageLoadingContent.cpp#l989
Comment 93•10 years ago
|
||
We got computed size (0,0) because we can't get a image from imgRequestProxy::GetImage().
The call stack for width are:
nsImageFrame::ComputeSize()
->nsImageLoading::GetNaturalWidth()
->imgRequestProxy::GetImage()
In imgRequestProxy::GetImage(), we go through [1], but we still get null image at [2]. Then we got reflow size (0,0).
Seth, I saw the Decoder::PostSize() has already called and the size is correct. What's the reason we can't get the image in GetImage()? Does it mean the image is still under decoding?
[1]
https://hg.mozilla.org/mozilla-central/annotate/56492f7244a9/image/src/imgRequestProxy.cpp#l500
[2]
https://hg.mozilla.org/mozilla-central/annotate/56492f7244a9/dom/base/nsImageLoadingContent.cpp#l999
| Assignee | ||
Comment 94•10 years ago
|
||
We discussed this on IRC. I don't know for sure what the explanation is, but there's definitely a data race on imgRequest::mImage happening in that imgRequestProxy code. It's possible that the race is what is causing this test failure.
I'm working on fixing this in bug 1137002 and the bugs blocking it.
Flags: needinfo?(seth)
Comment 95•10 years ago
|
||
Comment 96•10 years ago
|
||
(In reply to Seth Fowler [:seth] from comment #94)
> We discussed this on IRC. I don't know for sure what the explanation is, but
> there's definitely a data race on imgRequest::mImage happening in that
> imgRequestProxy code. It's possible that the race is what is causing this
> test failure.
>
> I'm working on fixing this in bug 1137002 and the bugs blocking it.
If this is the case, can we disable this test for now until it is fixed?
Comment 97•10 years ago
|
||
Disable these tests until the image proxy code is fixed so we can land silk.
Attachment #8573291 -
Flags: review?(seth)
Comment 98•10 years ago
|
||
In case someone needs this, the good log from comment 86 - https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0c3043a6e9d
Comment 99•10 years ago
|
||
Updated•10 years ago
|
Attachment #8573291 -
Flags: review?(seth) → review?(tnikkel)
Updated•10 years ago
|
Attachment #8573291 -
Flags: review?(tnikkel) → review+
Comment 100•10 years ago
|
||
From irc: :tn and I agree that since this is blocking silk, and while we have yet to confirm that this is a problem in the image proxy code, we both agree that this is independent of the vsync refresh driver. Therefore, we can disable this test until the image code is fixed as it is blocking silk.
Comment 101•10 years ago
|
||
Also note, since the test is passing right now, I will only land in conjunction with bug 1123762 .
Comment 102•10 years ago
|
||
Comment 103•10 years ago
|
||
I now have an explanation for this that seems quite plausible and matches all printf logs & observations.
1) the image file is loaded by a previous test, so it's in the image cache.
2) the failing test loads the image file, it gets a proxy to the imgRequest that's already in the cache, but it also kicks off an async imgCacheValidator to make sure the image is still valid.
3) we init the image frame, it grabs the image (the one that was cached from a previous test) and caches it locally.
4) the async imgCacheValidator comes back and says that we can't use the old image request, it changes the proxy it handed back before to a new request and starts the process of creating loading the image anew.
5) the image frame reflows, we still have the mImage cached on the image frame, but it's the old image. ComputeSize ignores mImage and mIntrinsicSize/Ratio and asks the nsImageLoadingContent for the natural width/height. nsImageLoadingContent goes through the imgRequestProxy which now points to an imgRequest that doesn't have an image yet, it returns success and 0/0. We get reflowed at 0,0.
6) the new image is created and size available is sent to the image frame. We update the cached mImage and cached mIntrinsicSize/Ratio, size it's the same image on disk the intinsic size/ratio do not change, no reflow is triggered.
There are some silly things going on here:
1) nsImageLoadingContent::GetNaturalWidth/Height should probably return error if they have no image.
2) nsImageFrame::ComputeSize calculates the intrinsic size differently from other reflow metrics like GetMinISize, GetPrefISize, GetIntrinsicSize
3) it's not clear to me how the values returned from nsImageLoadingContent::GetNaturalWidth/Height can ever differ from the stored mIntrinsic size when nsImageLoadingContent can get ahold of an image.
4) file uris always have to validate (when loaded in a different doc) because of this http://mxr.mozilla.org/mozilla-central/source/image/src/imgRequest.cpp?rev=be76c67b791a#519 So we kick off a new imgCacheValidator to load the file again, in OnStartRequest we check if we can re-use the existing image by checking if the channel is a caching channel (only htttp channels are caching channels). So since the uri is a file uri we know the channel cannot be a caching channel and hence we will not be able to use the existing image, all before we kick off the imgCacheValidator. We should just deny using the cached entry in this case if this is the img cache behaviour we want for file uris.
Comment 104•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Comment 105•10 years ago
|
||
I now have a try server log that confirms that comment 103 is indeed what is happening.
Comment 106•10 years ago
|
||
From inbound - https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=3d6cd86c1791
Forgot to disable one more of the failing tests, it has the same signature as the others (http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-inbound-emulator/1425816509/mozilla-inbound_ubuntu64_vm-b2g-emulator_test-reftest-4-bm121-tests1-linux64-build53.txt.gz).
Attachment #8574403 -
Flags: review?(tnikkel)
Comment 107•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #105)
> I now have a try server log that confirms that comment 103 is indeed what is
> happening.
Very nice! Should we spin up another bug or keep this one open?
Updated•10 years ago
|
Attachment #8574403 -
Flags: review?(tnikkel) → review+
Comment 108•10 years ago
|
||
We should actually have a bug on fixing the problem. Reopening this one for now, but filing a new one is ok too.
Comment 110•10 years ago
|
||
Comment 111•10 years ago
|
||
Re-enabling the tests due to backout of bug 1123762.
| Assignee | ||
Comment 112•10 years ago
|
||
Thanks to everyone who helped investigate this one! I'm glad we've figured it out now.
(In reply to Timothy Nikkel (:tn) from comment #103)
> 3) we init the image frame, it grabs the image (the one that was cached from
> a previous test) and caches it locally.
After bug 1103157, there's no need to cache the image in nsImageFrame::Init. SIZE_AVAILABLE is always delivered now. It was an oversight that this code wasn't removed in that bug.
> 2) nsImageFrame::ComputeSize calculates the intrinsic size differently from
> other reflow metrics like GetMinISize, GetPrefISize, GetIntrinsicSize
Yeah, I'm skeptical of that code. We really need to cut down on the number of places where we compute image-related sizes.
> 3) it's not clear to me how the values returned from
> nsImageLoadingContent::GetNaturalWidth/Height can ever differ from the
> stored mIntrinsic size when nsImageLoadingContent can get ahold of an image.
It looks to me like there is a difference indeed, in that nsImageLoadingContent::GetNaturalWidth/Height does not take image-orientation into account. I am actually not totally sure what the spec intends here. I've sent an email to the what-wg mailing list asking for clarification.
> 4) We should just deny
> using the cached entry in this case if this is the img cache behaviour we
> want for file uris.
Agreed.
Comment 113•10 years ago
|
||
(In reply to Seth Fowler [:seth] from comment #112)
> > 3) it's not clear to me how the values returned from
> > nsImageLoadingContent::GetNaturalWidth/Height can ever differ from the
> > stored mIntrinsic size when nsImageLoadingContent can get ahold of an image.
>
> It looks to me like there is a difference indeed, in that
> nsImageLoadingContent::GetNaturalWidth/Height does not take
> image-orientation into account. I am actually not totally sure what the spec
> intends here. I've sent an email to the what-wg mailing list asking for
> clarification.
Actually I later realized that HTMLImageElement extends nsImageLoadingContent so the code that actually gets called here is HTMLImageElement::GetNaturalWidth/Height which multiplies the image size by a density factor, so it is different. (And indeed using the stored mIntrinsicSize here instead of asking the HTMLImageElement/nsImageLoadingContent does cause some srcset reftests to fail.)
| Assignee | ||
Comment 114•10 years ago
|
||
Yeah, I noticed that too. The structure of the code actually makes it impossible
for us to make GetNaturalWidth/Height return an error in this situation, because
it seems that that error value would be exposed to the web via HTMLImageElement.
I've taken an alternative approach here: we check if the nsImageLoadingContent's
current request has a size before calling GetNaturalWidth/Height. This is ugly,
and I don't like it, but I think a clean solution is nontrivial and needs its
own bug. (Which I've filed - it's bug 1141395.)
I've also filed bug 1141376 and bug 1141380 to fix some other things that got
uncovered during this investigation.
Attachment #8575004 -
Flags: review?(tnikkel)
| Assignee | ||
Updated•10 years ago
|
Assignee: mchang → seth
Status: REOPENED → ASSIGNED
| Assignee | ||
Comment 115•10 years ago
|
||
Oh yes, I also filed bug 1141398 about the file URI issue.
Updated•10 years ago
|
Attachment #8575004 -
Flags: review?(tnikkel) → review+
| Assignee | ||
Comment 116•10 years ago
|
||
Thanks for the review!
As a reminder to myself, we also need to make validation mean "assume validation will fail and just fetch again" for non-HTTP channels. That's another issue Timothy uncovered above that still needs to be fixed.
Comment 117•10 years ago
|
||
Re-enabled tests are on b2g - https://hg.mozilla.org/mozilla-central/rev/e54521cd1c32
| Assignee | ||
Comment 118•10 years ago
|
||
Went ahead and pushed now that some recent churn in ImageLib has settled down:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1b724ad95f55
Comment 119•10 years ago
|
||
| Assignee | ||
Comment 120•10 years ago
|
||
I don't think this should be leave-open anymore.
Keywords: leave-open
| Assignee | ||
Updated•10 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
| Reporter | ||
Comment 121•10 years ago
|
||
I'm guessing this needs Aurora and b2g37 nominations.
status-b2g-v2.2:
--- → affected
status-b2g-master:
--- → fixed
status-firefox37:
--- → wontfix
status-firefox38:
--- → affected
Flags: needinfo?(seth)
| Assignee | ||
Comment 122•10 years ago
|
||
Yep, planning to nominate as part of bug 1083072, but I want to make sure the fix is verified first.
Flags: needinfo?(seth)
| Reporter | ||
Updated•10 years ago
|
status-firefox-esr31:
--- → unaffected
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Assignee | ||
Comment 125•10 years ago
|
||
Comment on attachment 8575004 [details] [diff] [review]
Use cached intrinsic size in nsImageFrame::ComputeSize unless the image loader has a size
Approval Request Comment
[Feature/regressing bug #]: srcset
[User impact if declined]: Images intermittently fail to display. This is bug 1083072, which is tracked. This patch is the main fix, but we want this to come together with bug 1141376. Note that in addition to fixing bug 1083072, this also fixes an intermittent orange, which the sheriffs will certainly appreciate.
[Describe test coverage new/current, TreeHerder]: On m-c for days.
[Risks and why]: Some risk but this code gets run very frequently, so a bug that results in crashes or other serious issues should've showed up very quickly.
[String/UUID change made/needed]: The nsIImageLoadingContent interface gained a new method, which resulted in a UUID change. The new method is not exposed to addons, so it's unlikely to cause a problem.
Attachment #8575004 -
Flags: approval-mozilla-b2g37?
Attachment #8575004 -
Flags: approval-mozilla-aurora?
| Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 127•10 years ago
|
||
Comment on attachment 8575004 [details] [diff] [review]
Use cached intrinsic size in nsImageFrame::ComputeSize unless the image loader has a size
Taking as we will have time in case of regression.
Attachment #8575004 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
| Comment hidden (Legacy TBPL/Treeherder Robot) |
| Reporter | ||
Comment 129•10 years ago
|
||
Flags: in-testsuite+
Updated•10 years ago
|
Attachment #8575004 -
Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
| Reporter | ||
Comment 130•10 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•