window.scrollY always returns 0 when Restore (Maximize) button is used
Categories
(Core :: Panning and Zooming, defect, P3)
Tracking
()
People
(Reporter: d3_max_t, Assigned: botond)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Attachments
(2 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36 Edg/79.0.309.71
Steps to reproduce:
I have some very simple JavaScript code that detects when the user scrolls away from the top of the page and when they scroll back to the top, simply by checking the value of window.scrollY. When I switch between maximizing and restoring the browser window, scrollY starts to always return 0, but only in the restored state and only when scrolling with the mouse wheel. If I use the vertical scrollbar, everything works as it should. The mouse wheel also works perfectly fine while the browser window is maximized, so I know the onscroll events are firing. In fact, I can confirm they are firing in the restored window state as well, but scrollY never changes from 0 until I scroll to about the middle of the page, at which point scrollY suddenly starts returning a value.
This happens in both the current release (72.0.2) and the latest Developer Edition.
Here is the JS code I have:
function scrollDetect() {
    console.log('Scroll event fired');
    console.log(window.scrollY);
    if (window.scrollY > 0) {
        console.log('Scrolled down');
    } else {
        console.log('At the top');
    }
}
window.onscroll = function () { scrollDetect(); };
Steps to reproduce:
1. Open the console page in the developer tools to monitor the value of window.scrollY.
2. Maximize the browser window and scroll up and down a few times using the mouse wheel. Then scroll all the way to the top and stay at the top of the page.
3. Hit the restore button on the browser (so the window is not maximized) and scroll up and down a few times using the mouse wheel. Then scroll all the way to the top and stay at the top of the page.
4. Repeat step 2. It is important to stay at the top of the page.
5. Hit the restore button again, and gently scroll down using the mouse wheel by one tick. Make sure you don't go too far down, so you don't hit the middle of the page. I would suggest making the body very long, something like 100rem, so you can clearly observe this bug. You will notice that scrollY returns 0, even when scrolled down a little.
6. Either scroll manually using the scrollbar or scroll down past the middle of the page using the mouse wheel, and you will see scrollY starting to return values again.
Actual results:
scrollY gets stuck returning 0 when the mouse wheel is used after the window has been maximized and restored a few times and is left in the restored state, as long as scrolling down doesn't go past the middle of the page.
Expected results:
scrollY should have had a value as soon as the mouse wheel was used to scroll down.
I should have mentioned that this works perfectly fine in Chrome and the new Chromium-based Edge.
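For reference, a minimal self-contained testcase along these lines should be enough to attach to the bug and walk through the steps above. This is a sketch, not the reporter's actual site: the element ids (hdr, val), the 300rem body height, and the styling are illustrative; only the window.scrollY check mirrors the code from the report.

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>window.scrollY repro sketch</title>
  <style>
    /* Long body so a single wheel tick stays well above the middle of the page. */
    body { height: 300rem; margin: 0; }
    /* Fixed header whose color mirrors the scrollY check, like the nav bar described in the report. */
    header { position: fixed; top: 0; left: 0; right: 0; padding: 1rem; background: lightgray; }
    header.scrolled { background: black; color: white; }
  </style>
</head>
<body>
  <header id="hdr">window.scrollY = <span id="val">0</span></header>
  <script>
    // Same detection logic as in the report: log scrollY on every scroll event
    // and flag whether we are at the top of the page.
    window.onscroll = function () {
      var y = window.scrollY;
      console.log('Scroll event fired, scrollY =', y);
      document.getElementById('val').textContent = y;
      document.getElementById('hdr').classList.toggle('scrolled', y > 0);
    };
  </script>
</body>
</html>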
Comment 2•5 years ago
Is Bug 179857 or Bug 189308 back?
Comment 3•5 years ago
> Open the console page in the developer tools to monitor the value of window.scrollY
I checked it with the DOM pane of DevTools, but I cannot reproduce this bug with either Nightly or Firefox 72.
Emilio, you have touched nsGfxScrollFrame.cpp a lot in the last several months; do you have any ideas?
Comment 4•5 years ago
This is pretty hard to debug without a URL or a reproducer. Is there any chance you could attach a page to this bug (using the "attach new file" button), or share a URL where we could reproduce this?
I suspect what's happening is that the main-thread scroll offset is not getting synced properly from the APZ thread, or something related to scroll anchoring. Does the bug reproduce with layout.css.scroll-anchoring.enabled=false?
Comment 5•5 years ago
Sure:
https://hummingmindidp.azurewebsites.net
The site is in early stages of development, but you can observe this happening on that page. It's supposed to change the background color of the navigation bar when scrolled down. Just make sure that when you are in the restored state, the browser window is still big enough to stay within the same media query (let's say over 1200px).
Comment 6•5 years ago
I can reproduce the problem on the URL from comment #5 with Nightly 74.0a1 on Windows 10.
The problem seems to happen with both normal and maximized sizemode.
STR:
1. Make the browser window normal sizemode.
2. Resize the browser to 1920x768px (against a 1920x1080 screen).
3. Maximize.
4. Open the URL https://hummingmindidp.azurewebsites.net and wait for the page to load.
5. Restore the browser sizemode to normal.
6. Scroll down a little with the mouse wheel.
7. Observe the background color of the header.
(if it does not reproduce)
8. Scroll to the top.
9. Maximize.
10. Scroll down a little with the mouse wheel.
11. Observe the background color of the header.
Actual Results:
The header background color does not change to black, and the Web Console shows 0 when evaluating window.scrollY.
Expected Results:
The header background color should be changed to black
Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b62d582090e11edcdd01c2c707e0b07b33ad0b12&tochange=ecafdbe899a7b2e574922250e356f8610788ad75
Comment 7•5 years ago
I can repro on Linux too, with and without webrender... Botond, any idea of what might be going on?
Assignee
Comment 8•5 years ago
It looks like APZ's knowledge of the visual and layout viewports is getting out of sync for a brief period of time, with the layout viewport reflecting the older (maximized) size and the visual viewport reflecting the newer (unmaximized) size. Need to dig a bit further to understand how this comes to be.
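As an editorial aside, a minimal page-side diagnostic sketch can make such a desync visible from content: it logs the layout viewport scroll offset (window.scrollY) next to the visual viewport offset (visualViewport.pageTop), which should normally agree at 1:1 zoom. This is not part of any fix, and it assumes the visualViewport API is available in the build being tested.

function logViewports(label) {
    // Layout viewport scroll offset (what the page's onscroll handler reads).
    var layoutY = window.scrollY;
    // Visual viewport offset from the top of the document, if the API exists.
    var visualY = window.visualViewport ? window.visualViewport.pageTop : NaN;
    console.log(label + ': layout scrollY=' + layoutY + ', visual pageTop=' + visualY);
}

window.addEventListener('scroll', function () { logViewports('window scroll'); });
if (window.visualViewport) {
    window.visualViewport.addEventListener('scroll', function () { logViewports('visualViewport scroll'); });
    window.visualViewport.addEventListener('resize', function () { logViewports('visualViewport resize'); });
}

A persistent mismatch between the two values right after restoring the window would be consistent with the viewport desync described above.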
Assignee
Comment 9•5 years ago
I haven't verified this, but I suspect this code is at fault. I'd really like to get rid of it, but last time I tried it broke other things.
Assignee
Comment 10•5 years ago
(In reply to Botond Ballo [:botond] [spotty responses until Feb 19] from comment #9)
> I'd really like to get rid of it, but last time I tried it broke other things.
Specifically, these are the things that broke when I tried to get rid of this code.
Assignee
Comment 11•5 years ago
At least the macOS crashtest that used to fail without that code is passing now: https://treeherder.mozilla.org/#/jobs?repo=try&tier=1&revision=a628ac932168c607ee63948c752d08c204a7c7f6
We can try removing that code again and see what happens.
Assignee
Comment 12•5 years ago
Previously, we would wait until the following frame (for uncertain reasons
that date back to B2G), but this meant the layout and visual viewports would
be out of sync for a frame, causing APZ to misbehave.
Comment 13•5 years ago
Comment 14•5 years ago
bugherder
Assignee
Comment 15•5 years ago
Comment 16•5 years ago
Backed out changeset 9dc9deb3ba3b (bug 1611660) for Reftest failure in layout/reftests/bugs/605138-1.html == layout/reftests/bugs/605138-1-ref.html
Log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=287021072&repo=autoland&lineNumber=8805
Push with failures:
https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=287021072&revision=9dc9deb3ba3bdfb7875b8259308d31ec46f4d6d1
Backout:
https://hg.mozilla.org/integration/autoland/rev/3c7cdd9f7b803c14af51d263f6fa62d4623c00a8
Comment 17•5 years ago
This is failing on windows platform too: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&fromchange=92d62985486856c0fb0b5a7b940816e19b1faf64&selectedJob=287020174
Assignee
Comment 20•5 years ago
I will try, but how soon we can re-land this fix will depend on how tricky it is to track down the intermittent regression it caused (bug 1572900).
Comment 21•5 years ago
Wontfix for 74 as we are nearing the end of the beta cycle.
Assignee
Comment 22•5 years ago
So the intermittent failure tracked in bug 1572900 has had a low background occurrence rate on a variety of platforms since long before these patches landed, but, looking at the failures from the day these patches landed and were backed out, they seem to have caused a spike on "windows7-32-mingwclang" -- which is a Tier 2 platform -- in particular.
I did before-and-after try pushes and indeed, reftests on windows7-32-mingwclang have a 65% failure rate with the patches applied, compared to 0% without them (out of 20 runs for each task).
Kats, do you think we should block the re-landing of these patches on figuring out what's going on with those mingw failures? On the one hand, they're Tier 2 failures and these patches fix some user-noticeable regressions. On the other hand, the high failure rate with these patches may indicate a potential cause for concern even on a Tier 2 platform.
Comment 23•5 years ago
I don't think a high failure rate on only mingw-clang should block landing these patches, no. If we can disable the test on mingw-clang that would be good to avoid the sheriffs some headache, but regardless we should land these and continue investigation of the failure. If there's a subtle bug in the patch then maybe we'll get some additional user reports which will help us track down the problem (since debugging the test via try pushes will likely be somewhat painful).
Assignee
Comment 24•5 years ago
I don't think the failure points to a single test we could disable. The harness itself seems to be timing out near the end. Shutdown hang perhaps?
Comment 25•5 years ago
Hm, ok. If you want me to investigate the failure let me know. I'm happy to offload stuff from you so you can focus more on desktop zooming.
Assignee
Comment 26•5 years ago
Tom, IIRC you've worked on MinGW support. Do you have any theories for why a patch might be causing a reftest harness error, intermittently but very frequently (65% failure rate), only on the "windows7-32-mingwclang" platform?
Do you have any suggestions for how one might investigate such a failure? Is it straightforward to build this configuration locally on a Windows machine?
Comment 27•5 years ago
No theories. This build is a cross build; it builds on Linux. It's not trivial to build it locally, but doing so won't get you more than downloading the builds from Try and running them on a Windows machine. FWIW I would confirm the platform: is it only x86 and not x64? Is it both debug and opt or only one of them?
Does the failure reproduce outside the harness? If so you can just debug with WinDbg. If it doesn't, then one would need to run the test+harness locally, which is not easy for a cross build. I've heard it's been done, but I've never succeeded myself (and filed Bug 1577101 for it.) Some people have succeeded in debugging it using an interactive task in try, so that's another option.
This seems to be indicative of a general problem, it's just we managed to reproduce it reliably in an annoying place to reproduce it locally. If you want to disable the MinGW test, I'm okay with that (just please file a new bug about investigating it...)
Assignee
Comment 28•5 years ago
(In reply to Tom Ritter [:tjr] (ni for response to sec-[approval|rating|advisories|cve]) from comment #27)
> FWIW I would confirm the platform: is it only x86 and not x64?
It looks to be only x86 and not x64, yes.
> Is it both debug and opt or only one of them?
I've only tested debug. I can't find a way to trigger opt reftests for mingw, not even in mach try fuzzy --full.
> Does the failure reproduce outside the harness?
It's not a failure of a specific test, but a failure in the harness itself. I've previously speculated it might be a shutdown hang, but looking at the logs more closely, it actually appears to be a startup hang, during the startup of one of the many browser invocations performed in the reftest chunk. It's not even the same test directory every time, but a different one each time.
This means there isn't a single test (or even a single directory of tests) that we could disable. We'd have to disable all reftests on mingw, which we presumably don't want to do.
Assignee
Comment 29•5 years ago
(In reply to Max from comment #5)
Max, this link doesn't resolve any more.
Could you provide another link which does (or, preferably, attach a local testcase to the bug)? Thanks!
Assignee
Comment 30•5 years ago
I explored writing an alternative fix to this bug with the hope of avoiding the MinGW test failures. Unfortunately, a Try push shows that the alternative fix triggers the same failures as well.
Assignee
Comment 31•5 years ago
Based on debugging on Try, it looks like for the affected browser invocations, the reftest webextension is not getting installed or initialized, even though at least some other webextensions are. No idea what might be happening there...
Assignee
Comment 32•5 years ago
We don't even get as far as the point where the marionette client tries to install the reftest extension.
Assignee
Comment 33•5 years ago
It looks like the browser process never gets as far as initializing the marionette server, and the client hangs when trying to connect to the server.
Assignee
Comment 34•5 years ago
The browser process does not even get as far as gBrowserInit.onLoad().
Assignee
Comment 35•5 years ago
It gets as far as creating a document viewer for the top-level chrome document and calling LoadStart() on it, but LoadComplete() (which is what would trigger firing the onload event and calling gBrowserInit.onLoad() as its handler) is never called on that document viewer.
It's not immediately clear to me where additional logging could be added to diagnose how far the loading process gets / why LoadComplete() never gets called. I'm open to suggestions for how to investigate this further.
Comment 36•5 years ago
There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:botond, could you have a look please?
For more information, please visit auto_nag documentation.
Assignee
Comment 37•5 years ago
The patches are not in a state where they can land, as they cause high-frequency intermittents for which we do not have a fix yet.
Comment 38•5 years ago
The high-frequency intermittents (at least the ones in bug 1572900) seem to have gone away in the last day or two, probably due to changes in the workers. It's probably a good time to rebase these patches and do a try push and reland.
Assignee
Comment 39•5 years ago
Huh, weird. But good news for this bug, I guess!
Assignee
Comment 40•5 years ago
Unfortunately, I'm still seeing a high volume of the same failure as before: https://treeherder.mozilla.org/#/jobs?repo=try&revision=956bd07738080bedbb2970079313f4aeed73fa8a
Comment 41•5 years ago
Hm, ok I'll continue investigating using the win 7 mingw platform. Hopefully it's the same problem I was chasing down on win10-aarch64.
Comment 42•5 years ago
Thinking about this more - even if I do track down all the different hang locations, it's unlikely that they will be fixed soon since the root cause is not obvious and may be inside mingw.
Tom, can you elaborate on these mingw builds - are they builds that we ship to users, or do we just keep them to ensure a mingw-based build doesn't break? In the latter case, do we need to keep running win 7 reftests on these builds? I feel like an increase in intermittent failures in a tier-2 test suite that we're not using for anything in particular shouldn't prevent an obviously-unrelated patch from landing.
Comment 43•5 years ago
I assumed they were used for Tor browser but I have nothing to back that up.
Comment 44•5 years ago
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #42)
> Tom, can you elaborate on these mingw builds - are they builds that we ship to users,
No.
> or do we just keep them to ensure a mingw-based build doesn't break?
Yes.
> In the latter case, do we need to keep running win 7 reftests on these builds?
No.
> I feel like an increase in intermittent failures in a tier-2 test suite that we're not using for anything in particular shouldn't prevent an obviously-unrelated patch from landing.
You are correct. Feel free to disable the test(s) you need to, file a bug against Bug 1475994 for re-enabling it with details/linking here. You can also make the hang bugs children of the new bug you filed.
Comment 45•4 years ago
Bug 1642720 is on autoland so this should be good to re-land now.
Comment 46•4 years ago
Comment 47•4 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/5cab400444e9
https://hg.mozilla.org/mozilla-central/rev/5a874f1887ff
Comment 48•4 years ago
Seeing as this patch has caused a number of other regressions/spikes, should we consider backing it out until we have a better understanding of what's going on?
Comment 49•4 years ago
== Change summary for alert #26127 (as of Thu, 04 Jun 2020 10:45:24 GMT) ==
Regressions:
39% raptor-tp6-google-firefox-cold replayed windows10-64-shippable opt 812.71 -> 497.75
37% raptor-tp6-google-firefox-cold replayed windows7-32-shippable opt 740.00 -> 466.17
28% raptor-tp6-google-firefox-cold replayed linux64-shippable opt 850.29 -> 614.75
Improvements:
64% raptor-tp6-google-firefox-cold loadtime windows7-32-shippable opt 1,101.83 -> 397.08
58% raptor-tp6-google-firefox-cold loadtime windows10-64-shippable opt 1,112.67 -> 464.50
23% raptor-tp6-google-firefox-cold windows7-32-shippable opt 385.83 -> 298.67
21% raptor-tp6-google-firefox-cold windows10-64-shippable opt 384.43 -> 304.04
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26127
Comment 50•4 years ago
As the above alert shows improvements while also regressing the "replayed" metric, should we investigate this further?
Comment 51•4 years ago
I think yes, because we are seeing a simultaneous improvement in loadtime and a regression in the replayed value - the improvements are likely false positives.