Closed Bug 1194837 Opened 9 years ago Closed 9 years ago

Photo of potch is missing the top half when it's first decoded

Categories

(Core :: Graphics: ImageLib, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox40 --- unaffected
firefox41 + wontfix
firefox42 + fixed
firefox43 - fixed
firefox44 --- fixed

People

(Reporter: dholbert, Assigned: tnikkel)

References

Details

(Keywords: regression, testcase)

Attachments

(5 files)

STR:
 1. Load attached photo of potch. (or shift+reload it, if you've loaded it before)

EXPECTED RESULTS:
 Full photo loads (including cupcake).

ACTUAL RESULTS:
 Only the bottom 2/3 of the photo loads. (up to about the edge of potch's face) The cupcake and the people in the background are not visible.
 - If you resize the window or reload, then the rest of the photo becomes visible. But a shift-reload shows the bug again.

Mozregression gives this range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=51168e14388c&tochange=a79cb8e688cd

--> regression from bug 1174923
[Tracking Requested - why for this release]: Functional regression in Firefox 41.

ni=seth since this seems to be a regression from a patch of his.
Flags: needinfo?(seth)
I can't reproduce this as reliably with the bugzilla-hosted version of the image. (On shift+reload, sometimes only an upper-left chunk of the image is missing, instead of the whole top strip. And I don't actually hit the bug on most shift+reloads.)

BUT, I can reproduce 100% reliably with a locally saved copy of the testcase. (using shift+reload)
Attached video screencast of bug
I'm using latest Nightly 43.0a1 (2015-08-14), and I hit this bug regardless of whether e10s is enabled, btw.
Keywords: testcase
Daniel, I'm not able to reproduce this on OS X, but I think I have an idea of the cause of the bug anyway.

Just to verify my theory: am I correct in thinking that you can only reproduce this when loading the image as an image document? Can you reproduce if it's embedded in a page via an <img> element or similar?
Flags: needinfo?(seth) → needinfo?(dholbert)
I can reproduce using <img>, *if* I use "image-orientation: from-image".

e.g. this trivial HTML reproduces the bug (saved alongside potch.jpg locally):
 <img src="potch.jpg" style="image-orientation: from-image; width: 600px">

If I remove either "image-orientation" or "width" from that <img> element, then the bug goes away.
Flags: needinfo?(dholbert)
Thanks. That suggests the race is not in ImageDocument. I thought that nsImageFrame was in better shape, but I suppose that previous fixes have just made the problem harder to reproduce - until bug 1174923 landed, which puts greater stress on our invalidation logic.
My just-attached "testcase 2" demonstrates this failing with <img>, as discussed in comment 6, with a standalone file. (It also reproduces the bug reliably on reload; no need to shift-reload. Probably due to the use of data URI which maybe bypasses some cache, or something.)

The delayed paint (making the img 'display:none' until you click the link) seems to be useful in making the bug reproduce more reliably, too.
(For reference, the context of this Potch-photo is https://blog.mozilla.org/places/2015/07/30/firefox-growth-campaign/ . I originally encountered this bug when clicking the photo at the end of that post.)
(And it seems that all of the tall-and-skinny photos in that blog post trigger the same issue.)
Tracked for 41+, I am able to repro this bug on FF41 and 42.
I see it on OS X.  If I go away from the tab and spend long enough time away from it, coming back to the tab doesn't fix the problem.  Immediate away and back to the tab does fix it.  I guess the image cache expires.
Probably won't get to this in 41, we should wontfix.
With downscale during decode we get an image frame size that is smaller than the size of the original image. The frame update notification sends a rect relative to this smaller frame size. When we call GetImageSpaceInvalidationRect on the OrientedImage, the matrix that OrientedImage uses is based on the original sized image (it has a translate equal to the height of the original image).
We also create a decoder for the image as the following sizes: 1800 x 1350, 0 x 0 (so original size), and 1350 x 1800. So there is a mix up with the image orientation when requesting decodes as well.
Given that resizing the windows helps to workaround this issue and comment 14, updating FF41 status to wontfix.
Timothy, let me know if you're not the right person to deal with this.
Assignee: nobody → tnikkel
Attached patch patchSplinter Review
Actually comment 15 is wrong, not sure how I got that.

The problem is that we are using the inverse of the orientation matrix to convert the invalid rect from decoded image space to oriented image space. This is backwards since the orientation matrix converts from decoded image space to oriented image space.
Attachment #8664703 - Flags: review?(seth)
Comment on attachment 8664703 [details] [diff] [review]
patch

Review of attachment 8664703 [details] [diff] [review]:
-----------------------------------------------------------------

Good catch!
Attachment #8664703 - Flags: review?(seth) → review+
Tim, would you mind following a followup bug about adding a gtest for this? I'm going to set up a gtest suite for testing invalidation rects and this is something we should be testing.
Sure. It should be easy to write a reftest based on the testcase too.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2e052f6f899f
Attached patch reftestSplinter Review
I spent a ridiculously amount of time making a reftest for this that reproduces the problem consistently on my local machine. But it never reproduced the problem on try server (not on any platform, not even once).

The problem is that when we remove the "display: none" from the image that causes use to repaint the whole bounds of the image frame. When we do that the image hasn't been decoded yet so nothing is painted. The reftest harness then detects this paint via MozAfterPaint and updates the rect of the canvas that the MozAfterPaint event says was painted. But by this time the image is decoded, so when the reftest updates it's canvas it gets the full image. This is despite the fact that the MozAfterPaint comes right after we paint.

I then tried to make the image be served by a sjs script that would send enough of the image to decode the size, but then pause for a second (so the paint to the screen and the subsequent reftest canvas update both have very little of the image to draw after the "display: none" of the image is removed). But this doesn't work because we have to download the whole image before the load event of the page can fire, so all the image data is there when we remove "display: none". So then I tried making the image have no src and setting the src when removing the "display: none". This fails because now we have an nsImageFrame when the load complete notification is sent and nsImageFrame::OnLoadComplete calls NotifyNewCurrentRequest which invalidates the whole frame. Thus we redraw the whole image, thus the problem gets fixed.

So then I tried never finishing the request for the image from the sjs, so the load complete handler doesn't run (we finish it after the snapshot has been taken). This works locally for me, but never on try server. I'm not sure why.

Here is the failed attempt at a reftest.

I guess we'll just need to go the gtest route.
(In reply to Seth Fowler [:seth] [:s2h] from comment #21)
> Tim, would you mind following a followup bug about adding a gtest for this?
> I'm going to set up a gtest suite for testing invalidation rects and this is
> something we should be testing.

Bug 1209340.
https://hg.mozilla.org/mozilla-central/rev/d3362879b8b1
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Timothy, could you fill the uplift request to aurora & beta? Thanks
Flags: needinfo?(tnikkel)
I feel complete.
Comment on attachment 8664703 [details] [diff] [review]
patch

Approval Request Comment
[Feature/regressing bug #]: the bug has a regression range of bug 1174923, but that likely just changed the order of some operations to expose an existing bug. The line touched by this patch was introduced by bug 1043560, but it also made other massive changes, so it's not clear if it introduced the bug or propagated the existin behaviour.
[User impact if declined]: images with exif info that rotates the image may not display properly the first time in some circumstances, anything that causes a repaint will fix the problem
[Describe test coverage new/current, TreeHerder]: none currently (despite a huge amount of effort), Seth plans to write gtests for this area in the future.
[Risks and why]: this is pretty low risk
[String/UUID change made/needed]:none
Flags: needinfo?(tnikkel)
Attachment #8664703 - Flags: approval-mozilla-beta?
Attachment #8664703 - Flags: approval-mozilla-aurora?
Comment on attachment 8664703 [details] [diff] [review]
patch

Fix a regression, taking it. Should be in 42 beta 5.
Attachment #8664703 - Flags: approval-mozilla-beta?
Attachment #8664703 - Flags: approval-mozilla-beta+
Attachment #8664703 - Flags: approval-mozilla-aurora?
Attachment #8664703 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: