Horizontal "blank" lines when decoding some large images

NEW
Unassigned

Status


defect
P3
major
4 years ago
2 months ago

People

(Reporter: Virtual, Unassigned)

Tracking

({nightly-community})

42 Branch
x86_64
Windows 7
Points:
---

Firefox Tracking Flags

(firefox41 unaffected, firefox42+ wontfix, firefox43+ wontfix, firefox44+ wontfix, firefox45- wontfix, firefox46- wontfix, firefox47 wontfix, firefox48 wontfix, firefox49 wontfix, firefox-esr38 unaffected, firefox-esr45 wontfix, firefox50 wontfix, firefox51 wontfix, firefox52 wontfix, firefox-esr52 wontfix, firefox-esr60 wontfix, firefox53 wontfix, firefox54 wontfix, firefox55 wontfix, firefox56 wontfix, firefox57 wontfix, firefox58 wontfix, firefox59 wontfix, firefox60 wontfix, firefox61 wontfix, firefox62 wontfix, firefox63 wontfix, firefox64 wontfix, firefox65 wontfix, firefox66 wontfix, firefox67 wontfix, firefox67.0.1 wontfix, firefox68 wontfix, firefox69 affected)

Details

(Whiteboard: [gfx-noted])

Attachments

(3 attachments)

[Tracking Requested - why for this release]: Regression

STR:
1. Open a page with large images, such as photographs
2. Open an image in a new tab
3. Force-reload the page with Shift+Ctrl+R a few times to see horizontal "blank" lines while the image is decoding
Flags: needinfo?(seth)
Whiteboard: [gfx-noted]
Using a trunk build from today, I haven't been able to reproduce this yet on my Win10 system. I'm trying to load http://i.4cdn.org/g/1442224840164.jpg as shown in the attached screenshot.

Virtual_ManPL, do you happen to know how recently this was working? Would you be interested in trying out the mozregression tool to help narrow it down?
Flags: needinfo?(bernesb)
Looks like an invalidation problem. This is for sure a platform-dependent problem.

Virtual_ManPL, could you let us know the details of your machine (what CPU and what OS) and the graphics info from about:support?
Flags: needinfo?(seth)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #2)
> Virtual_ManPL, do you happen to know how recently this was working? Would
> you be interested in trying out the mozregression tool to help narrow it
> down?

This is almost certainly a regression from bug 1060609.
Blocks: 1060609
(In reply to Seth Fowler [:seth] [:s2h] from comment #4)
> This is almost certainly a regression from bug 1060609.

No wait, the fact that the screenshot is a PNG threw me off. Looks like we are hitting this problem on JPEGs, which have been using DDD for six months, so it's probably not DDD.

Gotta agree with Ryan on this one: Virtual_ManPL, if you're willing, running mozregression would be a huge help in tracking down the source of the bug.
No longer blocks: 1060609
I always post regression ranges in the bugs that I report,
but unfortunately, I won't be having time to find a regression range in about 3 weeks.
I think the regression started a few weeks ago and version 42 is unaffected, but it needs to be diagnosed further.
(In reply to Virtual_ManPL [:Virtual] from comment #8)
> but unfortunately, I won't be having time to find a regression range in
> about 3 weeks.

Any chance you're able to bisect this now? :)
Flags: needinfo?(bernesb)
Severity: normal → major
Flags: needinfo?(bernesb)
This regression search was one hell of a ride: the bug was very hard to reproduce the normal way, which made the search very time-consuming, but in the end I found a way to reproduce it 100% of the time. The key was to disable all caches (RAM, disk, etc.) in about:config.


So let's go to the main part:

Regression window (mozilla-central)
Good:
https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2015/07/2015-07-19-03-02-19-mozilla-central/

Bad:
https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2015/07/2015-07-20-03-02-13-mozilla-central/

Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9c919ce631ea&tochange=5df788c56ae7

Probably caused by:
Bug #1151359 - Predict size at which nsImageFrame's images will be drawn for downscale-during-decode
or
Bug #1176124 - Add placeholders in the SurfaceCache to track when we've started decoding a frame, even if we haven't allocated it yet
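
For anyone repeating this kind of search, the pushlog link above follows a simple pattern: the changeset of the last good build goes into `fromchange` and the changeset of the first bad build into `tochange`. A minimal Python sketch, assuming only the URL shape seen in this comment (the function name `pushlog_url` is made up for illustration):

```python
from urllib.parse import urlencode

# Base URL copied from the pushlog link in this comment.
PUSHLOG_BASE = "https://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(good_rev: str, bad_rev: str) -> str:
    """Build a pushlog URL listing the pushes between two changesets."""
    # Dicts keep insertion order, so fromchange comes before tochange.
    query = urlencode({"fromchange": good_rev, "tochange": bad_rev})
    return f"{PUSHLOG_BASE}?{query}"


if __name__ == "__main__":
    print(pushlog_url("9c919ce631ea", "5df788c56ae7"))
```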



[Tracking Requested - why for this release]: Regression
Blocks: 1151359
Flags: needinfo?(bernesb) → needinfo?(seth)
Version: 43 Branch → 42 Branch
Thanks for tracking this down! A Herculean effort :)
Thanks so much for the regression range! The cause isn't immediately obvious to me, but since you also posted instructions on how to reproduce reliably, hopefully this is enough information to get this tracked down and fixed.
Based on my simple observation, I think that "bug-free" Firefox builds decoded images in partial steps, whereas "bugged" Firefox builds decode images faster and produce results continuously. So maybe decoding is now too fast and at times there is no data to render yet. That might be why these horizontal "blank" areas sometimes appear.
Seth, I guess you are going to be the one working it.
Seems that 42 is going to be released with this bug.
Assignee: nobody → seth
(In reply to Virtual_ManPL [:Virtual] from comment #14)
> FYI - nearly identical issue was in Bug #1145560

The visual effect was the same there, but the cause is something different. (And unfortunately, we still don't know what the cause is.)

(In reply to Sylvestre Ledru [:sylvestre] from comment #15)
> Seth, I guess you are going to be the one working it.
> Seems that 42 is going to be released with this bug.

Yep. There wasn't enough time to address this issue during this cycle.
Flags: needinfo?(seth)
(In reply to Seth Fowler [:seth] [:s2h] from comment #16)
> (In reply to Virtual_ManPL [:Virtual] from comment #14)
> > FYI - nearly identical issue was in Bug #1145560
> 
> The visual effect was the same there, but the cause is something different.
> (And unfortunately, we still don't know what the cause is.)
Any more things that I can do to help diagnosing this?
Maybe some debug build with console that shows what's going on?
(In reply to Virtual_ManPL [:Virtual] from comment #17)
> Any more things that I can do to help diagnosing this?
> Maybe some debug build with console that shows what's going on?

If you're able to make builds yourself, finding the exact regressing commit would be a huge help. Both of the image-related patches in the regression range you found so far are pretty complicated, so it'd be really helpful to know which one to investigate more closely.
(In reply to Seth Fowler [:seth] [:s2h] from comment #18)
> If you're able to make builds yourself, finding the exact regressing commit
> would be a huge help. Both of the image-related patches in the regression
> range you found so far are pretty complicated, so it'd be really helpful to
> know which one to investigate more closely.

Unfortunately I'm unable to make builds by myself and I will need some help with it.


(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #19)
> Builds for rev f52a8f3b15ed (bug 1176124):
> https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/ryanvm@gmail.com-b23f5b7ad19d
>
> Builds for rev 03be986cf1aa (bug 1176124 + bug 1151359):
> https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/ryanvm@gmail.com-56784829a8e5

These URLs aren't valid, even if I will add "/" at the end.
Looking deeply even inside
https://ftp-ssl.mozilla.org/pub/firefox/try-builds/ryanvm@gmail.com-56784829a8e57906695f18120e6a656a1bee9b8e/try-win32/
and
https://ftp-ssl.mozilla.org/pub/firefox/try-builds/ryanvm@gmail.com-b23f5b7ad19dbef9f8c43a2beb96df0e1b55a633/try-win32/
I'm seeing no builds.
(In reply to Virtual_ManPL [:Virtual] from comment #20)
> These URLs aren't valid, even if I will add "/" at the end.
> Looking deeply even inside

Looks like our Try post-push commit hook is broken, whee. Try these links instead :)

Rev f52a8f3b15ed (bug 1176124):
https://queue.taskcluster.net/v1/task/sv28loNTSxmRFC_1dC9PDg/artifacts/public/build/firefox-42.0a1.en-US.win32.zip

Rev 03be986cf1aa (bug 1176124 + bug 1151359):
https://queue.taskcluster.net/v1/task/DT3SdeN1RU-ghSik47ujIA/artifacts/public/build/firefox-42.0a1.en-US.win32.zip
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #21)
> Rev f52a8f3b15ed (bug 1176124):
> https://queue.taskcluster.net/v1/task/sv28loNTSxmRFC_1dC9PDg/artifacts/public/build/firefox-42.0a1.en-US.win32.zip

Unaffected


(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #21)
> Rev 03be986cf1aa (bug 1176124 + bug 1151359):
> https://queue.taskcluster.net/v1/task/DT3SdeN1RU-ghSik47ujIA/artifacts/public/build/firefox-42.0a1.en-US.win32.zip

Affected
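
The two results above amount to one manual bisection step inside the regression window. As a rough sketch of that reasoning in Python: the function `first_bad` and the `observed` table are hypothetical, and the ordering of the four changesets (window start, the two try-build revisions, window end) is an assumption based on how the builds are described in this thread. The search only gives a meaningful answer if the bug, once introduced, shows up in every later revision.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_affected: Callable[[str], bool]) -> str:
    """Return the earliest affected revision.

    Assumes revisions are ordered oldest to newest, the newest one is
    affected, and every revision after the first bad one is also affected.
    """
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_affected(revisions[mid]):
            hi = mid          # first bad revision is at mid or earlier
        else:
            lo = mid + 1      # first bad revision is after mid
    return revisions[lo]


# Test results reported in this bug; the ordering is an assumption.
observed = {
    "9c919ce631ea": False,  # start of the regression window (good nightly)
    "f52a8f3b15ed": False,  # bug 1176124 only: "Unaffected"
    "03be986cf1aa": True,   # bug 1176124 + bug 1151359: "Affected"
    "5df788c56ae7": True,   # end of the regression window (bad nightly)
}

if __name__ == "__main__":
    print(first_bad(list(observed), observed.__getitem__))
```

A result like this only identifies the first build in which the symptom shows up, which is not necessarily the change that contains the faulty code.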
No longer blocks: 1176124
Seth, should we be backing out bug 1151359, or at least putting that behaviour behind a pref so that we can verify this is the cause?  if it is, this got introduced during 42, which we just released; I'd like not to have too many releases with this bug, even if it is difficult to reproduce.
Flags: needinfo?(seth)
Looks like 43 will ship with this, pinging seth again for his thoughts on where to go from here. 
We could likely still take a patch for 44.
(In reply to Milan Sreckovic [:milan] from comment #23)
> Seth, should we be backing out bug 1151359, or at least putting that
> behaviour behind a pref so that we can verify this is the cause?  if it is,
> this got introduced during 42, which we just released; I'd like not to have
> too many releases with this bug, even if it is difficult to reproduce.

That bug cannot actually be the cause, because it has nothing to do with invalidation or drawing the image. Regression ranges are very misleading here. The bug is definitely in painting or invalidation, and it's just being tickled because timing is different.
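
To illustrate what an invalidation problem of this kind could look like, here is a toy Python model. It is not Gecko code and every name in it is made up: the decoder is assumed to deliver rows in chunks, each chunk is assumed to trigger an invalidation for its row range, and the painter only copies invalidated rows to the screen. If one invalidation is lost, the rows are decoded correctly but never painted, which shows up as a horizontal blank band.

```python
def paint(decoded_rows, invalidations, height):
    """Copy only the invalidated row ranges from the decoded image to the screen."""
    screen = [None] * height  # None stands for a row that was never painted
    for start, end in invalidations:
        for row in range(start, end):
            screen[row] = decoded_rows[row]
    return screen


def blank_bands(screen):
    """Return (start, end) row ranges that were never painted."""
    bands = []
    start = None
    for i, row in enumerate(screen):
        if row is None and start is None:
            start = i
        elif row is not None and start is not None:
            bands.append((start, i))
            start = None
    if start is not None:
        bands.append((start, len(screen)))
    return bands


if __name__ == "__main__":
    height = 12
    decoded = [f"row {i}" for i in range(height)]   # decoding itself succeeds
    all_chunks = [(0, 4), (4, 8), (8, 12)]          # one invalidation per chunk
    lost_one = [(0, 4), (8, 12)]                    # the middle invalidation is lost

    print(blank_bands(paint(decoded, all_chunks, height)))
    print(blank_bands(paint(decoded, lost_one, height)))
```

In this model the decoded data is identical in both runs; only the set of delivered invalidations differs, which matches the idea that a timing change in decoding can expose a bug that lives elsewhere.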
Flags: needinfo?(seth)
Depends on: 1225934
Given the inactivity on this bug and the fact that we haven't had more end-users complaining about this, I do not think this is release blocking for FF44, wontfixing as such. If a fix is ready sometime soon, I'd be happy to uplift. Also wondering if we need to track this any more given that this is a wontfix for over 3 releases now. (!)
Sorry but I don't see the point of tracking it... We released 2 major releases with it and the impact has not been important. Not tracking.
Virtual_ManPL, can you still reproduce this on Nightly? The original image on 4chan is gone, so I can't test with that image. I've clicked around a lot on imgur and haven't been able to reproduce. If you can still reproduce, could you post a new URL? It's not clear to me whether bug 1145560 fixed this, and so much code has changed in the last few months in this area that it may have been fixed by something else even if bug 1145560 didn't do the job.
Flags: needinfo?(bernesb)
Note that I'm seeing no dupes for this bug, so people do not seem to be hitting this frequently, if at all, which suggests to me that it's probably fixed, but I'd like to know for sure.
Yes, I could still reproduce the issue 2 days ago.
I will try to upload the image here and URL, if I see that it happens with it frequently.
Flags: needinfo?(bernesb)
(In reply to Virtual_ManPL [:Virtual] - (ni? me) from comment #30)
> Yes, I could still reproduce the issue 2 days ago.
> I will try to upload the image here and URL, if I see that it happens with
> it frequently.

It's been over a month, do you have a status update?
Flags: needinfo?(bernesb)
I can still reproduce it, but it's very hard to reproduce it on the same file.

So my STR:
1. disable all caches (RAM, disk, image, etc.) in about:config.
2. go to a web page with many images, for example http://boards.4chan.org/p
3. go to the bottom of the page and click "All" to toggle infinite scroll
4. while the page is loading, press the "Page Down" key to get all following pages loaded into this one
5. click on some thumbnails to open the bigger images; in about 1 in 20 cases or fewer you will probably be able to reproduce the issue
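
Step 1 can also be applied through a `user.js` file in a test profile instead of editing about:config by hand. The Python sketch below only generates the file contents. The two pref names are an assumption about which settings are meant, since the steps above do not name them, and the image cache is not covered here for the same reason. `render_user_js` and `CACHE_PREFS` are illustrative names.

```python
import json

# Assumed pref names: the STR only says "disable all caches".
CACHE_PREFS = {
    "browser.cache.disk.enable": False,
    "browser.cache.memory.enable": False,
}


def render_user_js(prefs):
    """Render a dict of prefs as user_pref(...) lines for a user.js file."""
    lines = []
    for name, value in sorted(prefs.items()):
        # json.dumps yields JavaScript-compatible literals (false, 0, "text").
        lines.append(f"user_pref({json.dumps(name)}, {json.dumps(value)});")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_js(CACHE_PREFS), end="")
```

The output would be saved as `user.js` in a throwaway profile directory, so that the regular profile keeps its caches enabled.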
Flags: needinfo?(bernesb)
(In reply to Virtual_ManPL [:Virtual] - (ni? me) from comment #32)
> I can still reproduce it, but it's very hard to reproduce it on the same file.

Thanks.

Milan, it looks like this issue is still valid. Is Seth or someone else able to work on this?
Flags: needinfo?(milan)
Eventually :)
Flags: needinfo?(milan)
Assignee: seth.bugzilla → nobody
Status: ASSIGNED → NEW
Assignee: nobody → seth.bugzilla
Has Regression Range: --- → yes
Has STR: --- → yes
Too late for firefox 52, mass-wontfix.
This is a lot of wontfixes in a row without any progress. Dropping it from regression triage.
I'm gonna remove the regression keyword since it doesn't seem useful to track it as such (and keeps getting punted from release to release).
Keywords: regression
(going to remove regression keyword -- we've been shipping this so long it should just be considered a product bug)
Keywords: regression