Closed Bug 1392076 Opened 4 years ago Closed 4 years ago

Regression: WebGL canvas does not display when vertical scroll position is at zero.

Categories

(Core :: Canvas: WebGL, defect, P2)

57 Branch
x86_64
Windows 10

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- fixed

People

(Reporter: jujjyl, Assigned: dvander)

Details

(Keywords: regression, Whiteboard: [gfx-noted])

Attachments

(3 files)

STR:

Visit https://clbri.com/OffscreenCanvas/Cube/Cube.html

Expected: The canvas should show a spinning cube rendered with WebGL.

Observed: The canvas does not render the cube and shows only a light blue background. Curiously, after resizing the browser window to be short enough that the vertical page scroll bar appears, and then scrolling the page to a Y scroll position > 0, the contents of the canvas appear. Scrolling back to Y=0 makes the canvas contents disappear again, so the cube is visible only when the scroll position is y > 0.

This looks like a regression: it does not occur on stable Firefox, only on the Nightly channel.
Mozregression points to

13:17.55 INFO: Running autoland build built on 2017-08-14 11:43:44.677000, revision 8f507b2b
13:18.61 INFO: Launching c:\Users\clb\AppData\Local\Temp\tmpl7gm4h\firefox\firefox.exe
13:18.61 INFO: Application command: c:\Users\clb\AppData\Local\Temp\tmpl7gm4h\firefox\firefox.exe -profile c:\users\clb\appdata\local\temp\tmpbgv9lj.mozrunner
13:18.62 INFO: application_buildid: 20170814104532
13:18.62 INFO: application_changeset: 8f507b2b4981c5d92f6a57037e66b4b8822f0260
13:18.62 INFO: application_name: Firefox
13:18.62 INFO: application_repository: https://hg.mozilla.org/integration/autoland
13:18.62 INFO: application_version: 57.0a1
Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): good
13:25.80 INFO: Narrowed inbound regression window from [f667fdab, f7124b57] (20 builds) to [8f507b2b, f7124b57] (10 builds) (~3 steps left)
13:25.80 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=8f507b2b4981c5d92f6a57037e66b4b8822f0260&tochange=f7124b5734fac723e987a3f95801cfd094a2e2c5

13:25.80 INFO: Downloading build from: https://queue.taskcluster.net/v1/task/H7KfJWjhToe8vHxrIk0kSQ/runs/0/artifacts/public%2Fbuild%2Ftarget.zip
===== Downloaded 100% =====
13:41.09 INFO: Running autoland build built on 2017-08-14 12:37:05.777000, revision 5beaa5c8
13:42.23 INFO: Launching c:\Users\clb\AppData\Local\Temp\tmp3t6isp\firefox\firefox.exe
13:42.23 INFO: Application command: c:\Users\clb\AppData\Local\Temp\tmp3t6isp\firefox\firefox.exe -profile c:\users\clb\appdata\local\temp\tmpkaw4yx.mozrunner
13:42.23 INFO: application_buildid: 20170814114232
13:42.23 INFO: application_changeset: 5beaa5c88e2008fb5cad46fe7c3d77928298a227
13:42.24 INFO: application_name: Firefox
13:42.24 INFO: application_repository: https://hg.mozilla.org/integration/autoland
13:42.24 INFO: application_version: 57.0a1
Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): bad
13:49.67 INFO: Narrowed inbound regression window from [8f507b2b, f7124b57] (10 builds) to [8f507b2b, 5beaa5c8] (6 builds) (~2 steps left)
13:49.67 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=8f507b2b4981c5d92f6a57037e66b4b8822f0260&tochange=5beaa5c88e2008fb5cad46fe7c3d77928298a227

13:49.67 INFO: Downloading build from: https://queue.taskcluster.net/v1/task/Eo1DqFh9QzWMBuSZKwHPXA/runs/0/artifacts/public%2Fbuild%2Ftarget.zip
===== Downloaded 100% =====
14:15.59 INFO: Running autoland build built on 2017-08-14 12:10:41.326000, revision 27adfa4a
14:16.64 INFO: Launching c:\Users\clb\AppData\Local\Temp\tmpp8csed\firefox\firefox.exe
14:16.64 INFO: Application command: c:\Users\clb\AppData\Local\Temp\tmpp8csed\firefox\firefox.exe -profile c:\users\clb\appdata\local\temp\tmp0jua8q.mozrunner
14:16.64 INFO: application_buildid: 20170814110717
14:16.65 INFO: application_changeset: 27adfa4ab2d882afaa3d2c9403855973d78d8c00
14:16.65 INFO: application_name: Firefox
14:16.65 INFO: application_repository: https://hg.mozilla.org/integration/autoland
14:16.65 INFO: application_version: 57.0a1
Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): good
14:24.36 INFO: Narrowed inbound regression window from [8f507b2b, 5beaa5c8] (6 builds) to [27adfa4a, 5beaa5c8] (3 builds) (~1 steps left)
14:24.36 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=27adfa4ab2d882afaa3d2c9403855973d78d8c00&tochange=5beaa5c88e2008fb5cad46fe7c3d77928298a227

14:24.36 INFO: Downloading build from: https://queue.taskcluster.net/v1/task/R2fvaOgxREiwJDGj8yae2w/runs/0/artifacts/public%2Fbuild%2Ftarget.zip
===== Downloaded 100% =====
14:37.98 INFO: Running autoland build built on 2017-08-14 12:10:16.486000, revision 37ba4f93
14:39.04 INFO: Launching c:\Users\clb\AppData\Local\Temp\tmps5vqb0\firefox\firefox.exe
14:39.04 INFO: Application command: c:\Users\clb\AppData\Local\Temp\tmps5vqb0\firefox\firefox.exe -profile c:\users\clb\appdata\local\temp\tmpkw1yin.mozrunner
14:39.05 INFO: application_buildid: 20170814111123
14:39.05 INFO: application_changeset: 37ba4f932f57fc60bf27554b5d36df3c721df75f
14:39.05 INFO: application_name: Firefox
14:39.05 INFO: application_repository: https://hg.mozilla.org/integration/autoland
14:39.05 INFO: application_version: 57.0a1
Was this inbound build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): bad
14:45.95 INFO: Narrowed inbound regression window from [27adfa4a, 5beaa5c8] (3 builds) to [27adfa4a, 37ba4f93] (2 builds) (~1 steps left)
14:45.95 INFO: No more inbound revisions, bisection finished.
14:45.95 INFO: Last good revision: 27adfa4ab2d882afaa3d2c9403855973d78d8c00
14:45.95 INFO: First bad revision: 37ba4f932f57fc60bf27554b5d36df3c721df75f
14:45.95 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=27adfa4ab2d882afaa3d2c9403855973d78d8c00&tochange=37ba4f932f57fc60bf27554b5d36df3c721df75f

Which gives

> jhofmann@mozilla.com Mon Aug 14 11:11:23 2017 +0000	37ba4f932f57	Johann Hofmann — Bug 1375335 - Fix window control height calculation on Windows 10. r=dao

Johann, could you take a peek at this?
Flags: needinfo?(jhofmann)
(Note that I'll be on PTO from tomorrow)

I can't reproduce this in the latest Nightly (2017-08-20) on either Windows 10 or Mac OS.

Most of the changes in bug 1375335 were reverted in bug 1390448. Can you try updating your Nightly?

In general, those were changes in browser chrome UI that did not touch WebGL code and should accordingly not affect WebGL display. Even if this is fixed in the latest Nightly, it feels like WebGL code should really not break because of browser frontend changes in any case.
Flags: needinfo?(jhofmann)
Attached file bisection_log.txt
Trying out on Firefox Nightly 57.0a1 (2017-08-20) (64-bit) on the Windows 10 box, the issue does still persist.

Here is a video showing the issue: http://clbri.com/dump/webgl_vertical_scroll.mp4

Reran the bisection, attached the whole mozregression log, which still led to the same commit.
Attached file about_support.txt
about:support from the affected PC.
Poking further, I notice that zooming the page in or out with Ctrl-Wheel affects the visibility of the cube as well. The bug appears at the default 100% page zoom, but zooming in to just 110%, for example, makes the cube show up properly.

window.devicePixelRatio and Windows system DPI scaling ("Change the size of text, apps, and other items" in Display Settings) also affect the bug. The bug appears on a 3840x2160 resolution display when display scaling is set to 150%, which is my usual setting. In that mode, Firefox reports window.devicePixelRatio as 1.5 at 100% zoom.

Running on the 3840x2160 display with Windows display scaling set to 100%, and viewing the page at any zoom level, the cube shows up normally.
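The combinations reported so far can be tabulated with a small sketch. It assumes (this is not confirmed anywhere in this bug) that window.devicePixelRatio is roughly the OS scale factor multiplied by the page zoom factor; the only value actually observed above is 1.5 at 150% scaling and 100% zoom. The function name is hypothetical.

```python
from fractions import Fraction


def effective_device_pixel_ratio(os_scale_percent: int, page_zoom_percent: int) -> Fraction:
    """Assumed model (not taken from Firefox source):
    devicePixelRatio = OS scale factor * page zoom factor."""
    return Fraction(os_scale_percent, 100) * Fraction(page_zoom_percent, 100)


# (OS scaling %, page zoom %, what this report observed for that combination)
cases = [
    (150, 100, "cube missing at Y=0"),
    (150, 110, "cube shows"),
    (100, 100, "cube shows"),
    (100, 110, "cube shows"),
]

for os_scale, zoom, observed in cases:
    dpr = effective_device_pixel_ratio(os_scale, zoom)
    print(f"OS scaling {os_scale}%, zoom {zoom}%: devicePixelRatio ~ {float(dpr):.2f} ({observed})")
```

Under that assumption, the only failing combination in the list is the one where the ratio is exactly 1.5.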
Pinging on this. The issue still occurs, retested today on Firefox Nightly 57.0a1 (2017-09-12) (64-bit) on Windows.
Flags: needinfo?(jhofmann)
This seems to be related to device resolution and display scaling. My patch is entirely unrelated to WebGL and should not be able to break it. I don't see how I can help you here.
Flags: needinfo?(jhofmann)
Milan, can you find someone to take a look at this and find the actual issue?
Flags: needinfo?(milan)
Kats, what's special about Y=0 with the scrollbar around?  See comment 0.
Flags: needinfo?(milan) → needinfo?(bugmail)
Priority: -- → P3
Version: Trunk → 57 Branch
Whiteboard: [gfx-noted]
Offhand I'm not sure why being at y=0 on a scrollable page would affect this.

I'm not able to reproduce this problem on my Windows 10 machine on the latest nightly. Since we have a regression window it seems like we should continue down that path. The next step would be to make a build with the offending changeset backed out (it sounds like it was already partially reverted, but we should revert the rest of it) and see if that fixes the problem. We can bisect the offending changes to narrow it down if that works. Jukka, can you try doing this?
Flags: needinfo?(bugmail)
Sorry, I can't get into developing this myself. I would vote for reverting the offending commits, and finding a non-offending way to introduce the features there.
(In reply to Jukka Jylänki from comment #11)
> Sorry, I can't get into developing this myself. I would vote for reverting
> the offending commits, and finding a non-offending way to introduce the
> features there.

Do you volunteer to do the non-offending work so that this part of Photon lands in 57? Since the "offending" patches are part of Photon, I highly recommend analyzing what is happening here before backing out. Again, these patches have _nothing_ to do with WebGL. This is a bug in WebGL code. None of us here has even been able to reproduce your problem.
Since the code seems to be flaky about screen dimensions/resolution, you should try looking into what happens when you revert this change:

https://hg.mozilla.org/mozilla-central/rev/fdc6247f8fca#l3.18

That should have changed your viewport dimensions by a pixel. Maybe there's e.g. a rounding error when dealing with certain viewports.
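To illustrate the kind of rounding problem hypothesized here (a sketch only, not a diagnosis): at a device pixel ratio of 1.5, a one-pixel change in a CSS-pixel offset moves an edge from a whole device pixel to a half device pixel. The offsets 80 and 81 are made-up values, and the helper names are hypothetical.

```python
from fractions import Fraction


def device_pixel_offset(css_px: int, dpr: Fraction) -> Fraction:
    """Convert a CSS-pixel offset to device pixels, exactly (no float error)."""
    return css_px * dpr


def is_pixel_aligned(css_px: int, dpr: Fraction) -> bool:
    """True if the CSS-pixel offset lands on a whole device pixel."""
    return device_pixel_offset(css_px, dpr).denominator == 1


dpr = Fraction(3, 2)  # 150% Windows scaling at 100% page zoom, as reported above

# Hypothetical content offsets that differ by one CSS pixel.
for y in (80, 81):
    offset = device_pixel_offset(y, dpr)
    print(f"y={y} CSS px -> {float(offset)} device px, aligned={is_pixel_aligned(y, dpr)}")
```

At a ratio of 1.0 every integer CSS offset is aligned, which would be consistent with the bug not appearing at 100% scaling, but nothing in this bug confirms that this is the actual mechanism.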
(In reply to Johann Hofmann [:johannh] from comment #13)
> Since the code seems to be flaky about screen dimensions/resolution, you
> should try looking into what happens when you revert this change:
> 
> https://hg.mozilla.org/mozilla-central/rev/fdc6247f8fca#l3.18
> 
> That should have changed your viewport dimensions by a pixel. Maybe there's
> e.g. a rounding error when dealing with certain viewports.

Actually, never mind. This specific change just caused tabs to grow by a pixel, if I'm not mistaken. I still have the feeling this is due to flakiness with different viewports.
Turns out I can reproduce this after all. I missed the step in comment 5 about changing the display scale to 150% in the Windows settings.

I'll do some local builds to see if I can bisect the offending patches.
So I reproduced the problem on a local m-c build and then backed out bug 1375335 and bug 1390448, but the problem was still there. Then I ran mozregression and got a result (different from what Jukka got), but when I tried to verify that result by re-running the good/bad builds I was unable to. In short, it seems like the problem is intermittent and so it's hard to get a reliable regression range. We'll probably need to debug this without the benefit of a known regressing patch.
No longer blocks: 1375335
Hey Milan, any chance this might be breaking lots of sites in the wild? P3, unassigned, hard to repro, I'm thinking fix-optional for 57. Appreciate if you could sign off on this change.
Flags: needinfo?(milan)
New regression, and I don't know how many people run with zoom, so I'd rather find out a bit more about this.  Jeff, any thoughts?
Assignee: nobody → jgilbert
Flags: needinfo?(milan)
Priority: P3 → P2
To be precise, the issue does not require using browser page zoom, but users that have Windows system DPI scaling enabled are affected. For example, in my case it appears on a 3840x2160 display when one has Windows DPI scaling set to 150% and browser page zoom at default 100%.

It seems that if one has Windows DPI scaling set to 100%, the bug is avoided. So users running 1440p and 4K displays are probably the affected audience, since users who have a <=27" 1440p display or a <=32" 4K display are likely to run with Windows DPI scaling enabled, because otherwise operating system fonts can appear too small.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #16)
> So I reproduced the problem on a local m-c build and then backed out bug
> 1375335 and bug 1390448, but the problem was still there. Then I ran
> mozregression and got a result (different from what Jukka got), but when I
> tried to verify that result by re-running the good/bad builds I was unable
> to. In short, it seems like the problem is intermittent and so it's hard to
> get a reliable regression range. We'll probably need to debug this without
> the benefit of a known regressing patch.

I ran the regression range twice on this, after Johann first insisted that his change had no way to affect any of this, and arrived at the same result on both runs. The second time, I verified the good/bad state of each commit twice, to make sure it was not nondeterministic. The logs are in comments 1 and 3. Any chance you have a log around from your run? If one starts with the same command line, mozregression should go through same builds, until finding one where the runs disagree.

Do we currently have a QA team that could do more precise regression range searching? Having some nondeterminism in bisection does not mean it would be impossible.
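As a rough sketch of what bisecting with repeated verification could look like when reproduction is intermittent: each candidate build is tested several times, and a build counts as bad if the bug shows up in any run. This is not how mozregression itself works; the function names, the integer build list and the `repeats` parameter are all hypothetical.

```python
from typing import Callable, Sequence, TypeVar

Build = TypeVar("Build")


def is_bad(build: Build, shows_bug: Callable[[Build], bool], repeats: int) -> bool:
    """One failing run proves the build is bad; calling it good needs every run to pass."""
    return any(shows_bug(build) for _ in range(repeats))


def bisect_first_bad(
    builds: Sequence[Build],
    shows_bug: Callable[[Build], bool],
    repeats: int = 3,
) -> Build:
    """Return the first bad build.

    Assumes builds are ordered oldest to newest, builds[0] is known good,
    and builds[-1] is known bad.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid], shows_bug, repeats):
            bad = mid
        else:
            good = mid
    return builds[bad]


# Usage with a deterministic stand-in for "launch the build and look at the canvas".
hypothetical_builds = list(range(10))  # placeholders for real changesets
first_bad = bisect_first_bad(hypothetical_builds, lambda build: build >= 6)
print("first bad build index:", first_bad)
```

The asymmetry is deliberate: a flaky bug can hide on a bad build, but it cannot appear on a good one, so only the "good" verdict needs the extra runs. A larger `repeats` lowers the chance of mislabeling a bad build as good, at the cost of a slower bisection.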
(In reply to Jukka Jylänki from comment #19)
> Any chance you have a log around from your run? If one
> starts with the same command line, mozregression should go through same
> builds, until finding one where the runs disagree.

Sorry, I didn't keep a log - I didn't think it would be useful but you have a good point.
WFM: standard dpi, win7, zooming content didn't impact things.
(In reply to Jim Mathies [:jimm] from comment #21)
> WFM: standard dpi, win7, zooming content didn't impact things.

The issue does not occur on standard DPI setting. In the affected systems, the Windows OS DPI setting is set to 150%, and browser page zoom level is at 100%. When Windows OS DPI setting is at 100%, I don't see the issue happening when zooming in content either.
I can't repro on Win10 at 150% dpi and 100% browser zoom, with or without zooming.
I can't work on this without being able to repro it, particularly since it likely has nothing to do with webgl directly. (maybe a layer visibility issue?)
Assignee: jgilbert → nobody
Flags: needinfo?(milan)
Michael, could you check with Jukka about the environment to reproduce this issue?
Flags: needinfo?(cleu)
I cannot get to the website, it just says the server doesn't respond.
Flags: needinfo?(cleu)
Sorry about that Michael, I noticed there was a problem with the server. I uploaded the same page to a backup location:

http://clb.demon.fi/OffscreenCanvas/Cube/Cube.html

Trying out bisection on the code again, I see that the issue is actually fixed now. Running

> mozregression --find-fix --good=2017-10-02 --bad=2017-08-16

points to this range

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=b382ec54d164fde891d5f520fd15654d1d521ba9&tochange=400e455a06dad4d88bf441bfe83087c798555348

which touches composition related changes.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #16)
> In short, it seems like the problem is intermittent and so it's hard to
> get a reliable regression range. We'll probably need to debug this without
> the benefit of a known regressing patch.

While bisecting now, I also found reproduction somewhat unreliable. In particular, sometimes I had to have Firefox maximized to fullscreen for the issue to appear, and at other times I had to reduce Firefox to windowed mode. Double-clicking on Firefox's address bar to toggle between windowed and maximized showed the issue most reliably for me.

In any case, looks like this can be closed as resolved. Attached the find-fix bisection log for reference.
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(milan)
Resolution: --- → FIXED
Assignee: nobody → dvander
Target Milestone: --- → mozilla57
Looks like patches in bug 1396507 fixed this, but we can leave this as a separate bug rather than duplicating it.
See Also: → 1396507