Completely blank window in Nightly starting 2020-04-07
Categories
(Core :: Graphics, defect, P3)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox75 | --- | unaffected |
firefox76 | --- | unaffected |
firefox77 | --- | unaffected |
firefox78 | --- | fixed |
People
(Reporter: pokechu022, Assigned: sotaro)
References
(Regression)
Details
(Keywords: regression)
Attachments
(3 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0
Steps to reproduce:
Start Firefox Nightly without holding shift. Firefox 75 (my normal browser) is not affected.
My machine is very old; it's an HP laptop with CPU AMD A6-3420M APU with Radeon(tm) HD Graphics, 1500 Mhz, 4 Core(s), 4 Logical Processor(s) and GPU AMD Radeon HD 6520G.
Using mozregression, I eventually bisected to this:
8:15.88 INFO: Running autoland build built on 2020-04-07 01:14:27.697000, revision 707b309f
8:20.30 INFO: Launching c:\Users\Pokechu22\AppData\Local\Temp\tmp2gmfkt\firefox\firefox.exe
8:20.30 INFO: Application command: c:\Users\Pokechu22\AppData\Local\Temp\tmp2gmfkt\firefox\firefox.exe --allow-downgrade --wait-for-browser -profile c:\users\pokech~1\appdata\local\temp\tmpdl9yhn.mozrunner
8:20.42 INFO: application_buildid: 20200406233834
8:20.42 INFO: application_changeset: 707b309fb85e305b0dd42cc6701b02aaa3de3273
8:20.43 INFO: application_name: Firefox
8:20.43 INFO: application_repository: https://hg.mozilla.org/integration/autoland
8:20.44 INFO: application_version: 77.0a1
Was this integration build good, bad, or broken? (type 'good', 'bad', 'skip', 'retry', 'back' or 'exit' and press Enter): bad
8:44.49 INFO: Narrowed integration regression window from [bcff183d, 21633ffc] (3 builds) to [bcff183d, 707b309f] (2 builds) (~1 steps left)
8:44.50 INFO: No more integration revisions, bisection finished.
8:44.50 INFO: Last good revision: bcff183d3130db1765d505ba1e75e1a9b986d679
8:44.50 INFO: First bad revision: 707b309fb85e305b0dd42cc6701b02aaa3de3273
8:44.51 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=bcff183d3130db1765d505ba1e75e1a9b986d679&tochange=707b309fb85e305b0dd42cc6701b02aaa3de3273
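For anyone repeating this kind of bisection, the pushlog link can be rebuilt from the last two revision lines of the log. The following is only a minimal sketch: the helper names `parse_bisect_result` and `pushlog_url` are made up for illustration and are not part of mozregression.

```python
import re
from urllib.parse import urlencode

# Base URL taken from the log above (autoland integration repository).
PUSHLOG_BASE = "https://hg.mozilla.org/integration/autoland/pushloghtml"

# Matches the "Last good revision" / "First bad revision" lines of the log.
REV_LINE = re.compile(r"(Last good|First bad) revision: ([0-9a-f]{40})")


def parse_bisect_result(log_text):
    """Return (last_good, first_bad) full revision hashes found in the log text."""
    found = {label: rev for label, rev in REV_LINE.findall(log_text)}
    return found["Last good"], found["First bad"]


def pushlog_url(last_good, first_bad, base=PUSHLOG_BASE):
    """Build the pushlog URL spanning the two revisions."""
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return f"{base}?{query}"


log = (
    "8:44.50 INFO: Last good revision: bcff183d3130db1765d505ba1e75e1a9b986d679\n"
    "8:44.50 INFO: First bad revision: 707b309fb85e305b0dd42cc6701b02aaa3de3273\n"
)
good, bad = parse_bisect_result(log)
print(pushlog_url(good, bad))
```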
Actual results:
The window was completely white/blank. However, I am able to open new tabs (with Control+T), and the cursor changes appropriately when I hover over where text or an input field should be. Dragging a tab does show a small preview of it correctly. Attempting to close Firefox with multiple tabs open brings up a white window with a proper title (see https://i.imgur.com/qcuXPBN.png) which does cause Firefox to exit when the right place is clicked.
Starting Nightly in safe mode by holding shift on startup does prevent the issue from occurring; as such, I was able to access about:support. (Refreshing Nightly didn't help.)
Expected results:
The window should have been visible. It's kinda hard to use a browser when you can't see...
Comment 1•5 years ago
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 2•5 years ago
Thanks for bisecting and reporting this bug!
Could you try this build? https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/f3cGM2agSRqeIZoi0cNVWQ/runs/0/artifacts/public/build/target.zip
(it comes from this try push https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c5cc170b91ef9d556b9c4d2c544d54b8ebdf37d )
Reporter
Comment 3•5 years ago
Same issue with that build; all windows are blank unless safe mode is used (I get https://i.imgur.com/q52mWE6.png if I launch it with firefox.exe -P Nightly and https://i.imgur.com/uxuxfev.png with firefox.exe -P Nightly -safe-mode; adding --allow-downgrade gives a blank main window and --allow-downgrade plus safe mode gives a visible main window).
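The four launch variants described in this comment can be enumerated with a short script. This is a sketch only: the helper names are hypothetical, the flags are the ones quoted in the comment, and the outcome table simply restates what was reported for this one machine.

```python
from itertools import product


def launch_command(exe, profile, safe_mode=False, allow_downgrade=False):
    """Build the argument list for one launch variant."""
    args = [exe, "-P", profile]
    if safe_mode:
        args.append("-safe-mode")
    if allow_downgrade:
        args.append("--allow-downgrade")
    return args


def all_variants(exe="firefox.exe", profile="Nightly"):
    """Return all four (safe_mode, allow_downgrade) combinations."""
    return [
        launch_command(exe, profile, safe, downgrade)
        for safe, downgrade in product([False, True], repeat=2)
    ]


# Outcomes as reported in this comment, keyed by (safe_mode, allow_downgrade).
REPORTED_OUTCOME = {
    (False, False): "blank",
    (True, False): "visible",
    (False, True): "blank",
    (True, True): "visible",
}

for command in all_variants():
    print(" ".join(command))
```

In every reported case the window is visible exactly when safe mode is on, which is why the downgrade flag can be ruled out as a factor.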
Comment 4•5 years ago
Do you use Chrome? Does it have the same problem or does it work fine? The reason I ask is because they use the same flag that caused the issue for you in Firefox.
If Chrome works fine for you, could you download winspy++ ( http://www.catch22.net/software/winspy ) and then in winspy++ uncheck "minimize winspy" and then drag the finder tool from winspy++ over the top part of the Chrome window (the url bar etc) and see if you get Intermediate D3D Window in the winspy window.
Reporter
Comment 5•5 years ago
After installing chrome, it seems to also be affected (producing a white window, though it turns black on resizing while Nightly doesn't). Edge, however, is not affected (but I'm not sure if I'm using the chromium-based edge or not; the version is Microsoft Edge 44.19041.1.0, Microsoft EdgeHTML 18.19041).
Comment 6•5 years ago
Thanks!
Can you try this build?
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/blF3SuUSTIqPwmIa6UzqDQ/runs/2/artifacts/public/build/target.zip
(It's from this try push
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0490b2e39bdd68fda684b0f00471917370bfd732 )
Comment 8•5 years ago
Thanks for testing. I'll, at the very least, give you a pref to flip so you can continue using Firefox. Hopefully we can do better.
Comment 9•5 years ago
On older machines it creates a blank window and we only need the pref to direct manipulation (which hasn't landed yet and will be preffed off by default when it lands).
Comment 10•5 years ago
Comment 11•5 years ago
Comment 12•5 years ago
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&revision=428d3ada2cb2f7d65e30d5c51f4c8633fdd6d38a&selectedJob=299185705
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=299185705&repo=autoland
Backout link: https://hg.mozilla.org/integration/autoland/rev/8b50748f444b41f9ad14c367c098b9cd93344c35
[task 2020-04-24T08:14:41.535Z] 08:14:41 INFO - make[4]: Leaving directory '/builds/worker/workspace/obj-build/layout/inspector'
[task 2020-04-24T08:14:41.536Z] 08:14:41 INFO - make[4]: Entering directory '/builds/worker/workspace/obj-build/layout/inspector'
[task 2020-04-24T08:14:41.536Z] 08:14:41 INFO - layout/inspector/Unified_cpp_layout_inspector0.obj
[task 2020-04-24T08:14:41.536Z] 08:14:41 INFO - make[4]: Leaving directory '/builds/worker/workspace/obj-build/layout/inspector'
[task 2020-04-24T08:14:42.644Z] 08:14:42 INFO - make[4]: Entering directory '/builds/worker/workspace/obj-build/widget/windows'
[task 2020-04-24T08:14:42.644Z] 08:14:42 INFO - /builds/worker/fetches/sccache/sccache /builds/worker/fetches/clang/bin/clang-cl -Xclang -std=c++17 -m32 -FoUnified_cpp_widget_windows1.obj -c -I/builds/worker/workspace/obj-build/dist/stl_wrappers -guard:cf -DNDEBUG=1 -DTRIMMED=1 -DUNICODE -D_UNICODE -D_CRT_RAND_S -DCERT_CHAIN_PARA_HAS_EXTRA_FIELDS -D_SECURE_ATL -DCHROMIUM_BUILD -DU_STATIC_IMPLEMENTATION -DOS_WIN=1 -DWIN32 -D_WIN32 -D_WINDOWS -DWIN32_LEAN_AND_MEAN -DCOMPILER_MSVC -DMOZ_UNICODE -DWINAPI_NO_BUNDLED_LIBRARIES -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DSTATIC_EXPORTABLE_JS_API -I/builds/worker/checkouts/gecko/widget/windows -I/builds/worker/workspace/obj-build/widget/windows -I/builds/worker/workspace/obj-build/ipc/ipdl/_ipdlheaders -I/builds/worker/checkouts/gecko/ipc/chromium/src -I/builds/worker/checkouts/gecko/ipc/glue -I/builds/worker/checkouts/gecko/layout/forms -I/builds/worker/checkouts/gecko/layout/generic -I/builds/worker/checkouts/gecko/layout/xul -I/builds/worker/checkouts/gecko/toolkit/xre -I/builds/worker/checkouts/gecko/widget -I/builds/worker/checkouts/gecko/widget/headless -I/builds/worker/checkouts/gecko/xpcom/base -I/builds/worker/workspace/obj-build/dist/include -I/builds/worker/workspace/obj-build/dist/include/nspr -I/builds/worker/workspace/obj-build/dist/include/nss -MD -FI /builds/worker/workspace/obj-build/mozilla-config.h -DMOZILLA_CLIENT -Qunused-arguments -Qunused-arguments -fcrash-diagnostics-dir=/builds/worker/artifacts -TP -Zc:sizedDealloc- -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -arch:SSE2 -Gw -Wno-inline-new-delete -Wno-invalid-offsetof -Wno-microsoft-enum-value -Wno-microsoft-include -Wno-unknown-pragmas -Wno-ignored-pragmas -Wno-deprecated-declarations -Wno-invalid-noreturn -Wno-inconsistent-missing-override -Wno-implicit-exception-spec-mismatch -Wno-microsoft-exception-spec -Wno-unused-local-typedef -Wno-ignored-attributes -Wno-used-but-marked-unused -D_SILENCE_TR1_NAMESPACE_DEPRECATION_WARNING -GR- -Z7 
-Xclang -load -Xclang /builds/worker/workspace/obj-build/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -O2 -Oy- -Werror -I/builds/worker/workspace/obj-build/dist/include/cairo -Xclang -fexperimental-new-pass-manager -Xclang -MP -Xclang -dependency-file -Xclang .deps/Unified_cpp_widget_windows1.obj.pp -Xclang -MT -Xclang Unified_cpp_widget_windows1.obj Unified_cpp_widget_windows1.cpp
[task 2020-04-24T08:14:42.644Z] 08:14:42 INFO - In file included from Unified_cpp_widget_windows1.cpp:11:
[task 2020-04-24T08:14:42.644Z] 08:14:42 INFO - /builds/worker/checkouts/gecko/widget/windows/WinCompositorWindowThread.cpp(163,27): error: no member named 'apz_windows_force_disable_direct_manipulation' in namespace 'mozilla::StaticPrefs'
[task 2020-04-24T08:14:42.644Z] 08:14:42 INFO - if (!StaticPrefs::apz_windows_force_disable_direct_manipulation()) {
[task 2020-04-24T08:14:42.644Z] 08:14:42 INFO - ~~~~~~~~~~~~~^
[task 2020-04-24T08:14:42.645Z] 08:14:42 INFO - 1 error generated.
[task 2020-04-24T08:14:42.645Z] 08:14:42 INFO - /builds/worker/checkouts/gecko/config/rules.mk:750: recipe for target 'Unified_cpp_widget_windows1.obj' failed
[task 2020-04-24T08:14:42.645Z] 08:14:42 ERROR - make[4]: *** [Unified_cpp_widget_windows1.obj] Error 1
[task 2020-04-24T08:14:42.645Z] 08:14:42 INFO - make[4]: Leaving directory '/builds/worker/workspace/obj-build/widget/windows'
[task 2020-04-24T08:14:42.645Z] 08:14:42 INFO - /builds/worker/checkouts/gecko/config/recurse.mk:74: recipe for target 'widget/windows/target-objects' failed
[task 2020-04-24T08:14:42.646Z] 08:14:42 ERROR - make[3]: *** [widget/windows/target-objects] Error 2
[task 2020-04-24T08:14:42.646Z] 08:14:42 INFO - make[3]: *** Waiting for unfinished jobs....
Comment 13•5 years ago
Comment 14•5 years ago
bugherder
Comment 15•5 years ago
Jeff, do you have thoughts on what to do here?
Direct Manipulation sends DM_POINTERHITTEST to windows to ask them if they want to start a dmanip session. The compositor window seems to get in the way of this event. There seems to be a hittesting fast path in Windows that finds our compositor window unless we use the WS_EX_LAYERED style on it. We know of this workaround because MS gave it to Chrome to work around a different issue, where messages sent to their compositor window delayed events to their main window.
The problem is that this causes a completely blank window for some users on very old hardware, and it seems like Chrome has the same problem on that hardware. We've had two reports, both machines from ~2011 (details below): one Intel/Nvidia with two GPUs, and one ATI with only one GPU. So it seems like the issue might be hard to pin down with a blocklist approach.
Options:
- Don't do anything. We have a force disable dmanip pref so these users can get a working Firefox. But finding the pref seems pretty impossible unless they interact with us via bugzilla and the bug gets seen by us.
- Blocklist specific machines for dmanip as bugs are filed so that we don't use the WS_EX_LAYERED flag for them.
- We only need dmanip when there is a precision touchpad. We can probably (haven't looked into this yet) detect if there is a precision touchpad on the system and only use the WS_EX_LAYERED flag if there is one present. This means precision touchpads that are plugged in while Firefox is running don't get dmanip (yes, such a thing exists, but probably pretty rare).
- Try to come up with a more general rule to target "old" machines that are more likely to hit this bug and less likely to have precision touchpads (they were first introduced with Windows 8.1 in 2013). Old enough driver date? A database mapping cpuid to release date? A database mapping (vendorid,deviceid) to release date?
- Somehow detect the completely blank window and auto-block dmanip? Gfx sanity test maybe?
- Something I haven't thought of?
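As a rough illustration of the precision-touchpad option in the list above, the choice of extended style could be modelled like this. It is a sketch under assumptions, not Gecko code: `compositor_window_ex_style` and the `has_precision_touchpad` input are hypothetical, `force_disable_dmanip` stands for the apz_windows_force_disable_direct_manipulation pref named in the build log, and only the WS_EX_LAYERED value is taken from the Windows headers.

```python
# Value of WS_EX_LAYERED as defined in the Windows SDK (winuser.h).
WS_EX_LAYERED = 0x00080000


def compositor_window_ex_style(base_ex_style, force_disable_dmanip, has_precision_touchpad):
    """Return the extended style to use for the compositor window.

    WS_EX_LAYERED is added only when direct manipulation can actually be
    used: the force-disable pref is off and a precision touchpad is present.
    How the touchpad would be detected is left open, as in the comment.
    """
    if force_disable_dmanip or not has_precision_touchpad:
        return base_ex_style & ~WS_EX_LAYERED
    return base_ex_style | WS_EX_LAYERED


# Arbitrary base style chosen only for the example.
EXAMPLE_BASE_STYLE = 0x00000004

print(hex(compositor_window_ex_style(EXAMPLE_BASE_STYLE, False, True)))
print(hex(compositor_window_ex_style(EXAMPLE_BASE_STYLE, False, False)))
```

With this shape, old machines without a precision touchpad never get the layered style, at the cost noted above for touchpads plugged in while Firefox is running.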
Machines affected so far:
HP laptop with CPU AMD A6-3420M APU with Radeon(tm) HD Graphics, 1500 Mhz, 4 Core(s), 4 Logical Processor(s) and GPU AMD Radeon HD 6520G
GPU #1
Active: Yes
Description: P¥│Lルト ᄁ^₩ᆭzヌ↓ᄀ
Vendor ID: 0x1002
Device ID: 0x9647
Driver Version: 15.200.1012.2
Driver Date: 3-11-2015
Drivers: aticfx64 aticfx64 aticfx64 aticfx32 aticfx32 aticfx32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64 amdxc32 amdxc64
Subsys ID: 358b103c
RAM: 512
GPU #2
Active: No
RAM: 0
Think Pad 520 on Windows 10 with Nvidia Optimus (switchable graphics - see below). When using integrated graphics (Intel(R) HD Graphics 3000) firefox works normally. When discrete one is used (NVIDIA Quadro 2000M)
GPU #1
Active: Yes
Description: Intel(R) HD Graphics 3000
Vendor ID: 0x8086
Device ID: 0x0126
Driver Version: 9.17.10.4459
Driver Date: 5-19-2016
Drivers: igdumd64 igd10umd64 igd10umd64 igdumd32 igd10umd32 igd10umd32
Subsys ID: 21d117aa
RAM: Unknown
GPU #2
Active: No
Description: NVIDIA Quadro 2000M
Vendor ID: 0x10de
Device ID: 0x0dda
Driver Version: 21.21.13.7748
Driver Date: 6-8-2017
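For comparison, the per-machine blocklist option could be keyed on the (Vendor ID, Device ID) pairs listed above. This is a hypothetical sketch, not the real Gecko blocklist mechanism; it also assumes the discrete NVIDIA GPU is the affected one on the second machine, which the truncated report suggests but does not state.

```python
# (vendor_id, device_id) pairs taken from the two reports above.
DMANIP_BLOCKLIST = {
    (0x1002, 0x9647),  # HP laptop, AMD GPU
    (0x10DE, 0x0DDA),  # NVIDIA Quadro 2000M (assumed to be the affected GPU)
}


def parse_hex_id(text):
    """Parse an about:support style ID such as '0x1002' into an int."""
    return int(text, 16)


def dmanip_blocked(vendor_id, device_id):
    """Return True if this GPU should not get WS_EX_LAYERED / dmanip."""
    return (parse_hex_id(vendor_id), parse_hex_id(device_id)) in DMANIP_BLOCKLIST


print(dmanip_blocked("0x1002", "0x9647"))
print(dmanip_blocked("0x8086", "0x0126"))
```

A list like this only grows as bugs are filed, which is the weakness of the approach pointed out in the comment.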
Comment 16•5 years ago
pokechu022, can you file an issue in the Chrome bug tracker: https://crbug.com/wizard and link to it from here?
Reporter
Comment 17•5 years ago
Comment 18•5 years ago
Hey Jessie, would you mind setting a priority on this one?
Comment 19•5 years ago
After talking to Microsoft, they confirmed that DManip is behaving as expected. It does a "speed hittest" to determine which window to start a dmanip session with, and WS_EX_LAYERED is how you get the speed hittest to find our main window instead of the compositor window. This is the expected behaviour from their perspective.
Assignee
Comment 20•5 years ago
I wonder if the parent window might be blocking the compositor window's content from showing, as in Bug 1570879.
Comment 21•5 years ago
Because this bug's Severity has not been changed from the default since it was filed, and its Priority is P3 (Backlog), indicating it has been triaged, the bug's Severity is being updated to S3 (normal).
Assignee
Comment 22•5 years ago
Assignee
Comment 23•5 years ago
(In reply to pokechu022 from comment #17)
Done: https://bugs.chromium.org/p/chromium/issues/detail?id=1074582
From the issue, it seems that IDXGIFactory2::CreateSwapChainForComposition() failed on the PC. Chrome then falls back to disabling hardware acceleration. As a result, child window usage was disabled.
On Firefox, I am not sure if it works. Firefox has more fallback routes. CreateSwapChainForComposition() is used with WebRender. When CreateSwapChainForComposition() fails, current gecko falls back to IDXGIFactory2::CreateSwapChainForHwnd(). In this case, we might need to quit compositor window usage. But there is a remaining problem with WebRender. "WebRender + native compositor" (the current default setting) does not use CreateSwapChainForComposition(). Instead, IDCompositionDesktopDevice::CreateTargetForHwnd() is used. We do not know yet if CreateTargetForHwnd() fails on the PC.
And if CompositorD3D11 or MLGPU is used, the compositor window is also used when gfxVars::UseDoubleBufferingWithCompositor() is true on Win10. In this case, IDXGIFactory2::CreateSwapChainForHwnd() + DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL is used. We also do not know yet if "CreateSwapChainForHwnd() + DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL" fails on the PC. By the way, gfxVars::UseDoubleBufferingWithCompositor() is true only on nightly.
https://phabricator.services.mozilla.com/D76470 was created as a possible patch, though it might not work on the PC.
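The fallback routes described in this comment can be summarised as a small decision table. The sketch below is only one reading of the comment, not the contents of D76470: `Backend`, `PresentationPath`, and `presentation_path` are hypothetical names, and dropping the compositor window after a failed CreateSwapChainForComposition() is the tentative idea from the comment rather than confirmed behaviour.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Backend(Enum):
    WEBRENDER = "WebRender"
    WEBRENDER_NATIVE_COMPOSITOR = "WebRender + native compositor"
    D3D11_OR_MLGPU = "CompositorD3D11 or MLGPU"


class PresentationPath(NamedTuple):
    api: Optional[str]            # API named in the comment, None if not stated
    uses_compositor_window: bool


def presentation_path(backend, composition_swapchain_ok=True, double_buffering=False):
    """Map a configuration to the presentation API described in the comment."""
    if backend is Backend.WEBRENDER_NATIVE_COMPOSITOR:
        return PresentationPath("IDCompositionDesktopDevice::CreateTargetForHwnd", True)
    if backend is Backend.WEBRENDER:
        if composition_swapchain_ok:
            return PresentationPath("IDXGIFactory2::CreateSwapChainForComposition", True)
        # Fallback route: per the comment, compositor window usage might
        # need to be dropped in this case.
        return PresentationPath("IDXGIFactory2::CreateSwapChainForHwnd", False)
    if double_buffering:
        return PresentationPath(
            "IDXGIFactory2::CreateSwapChainForHwnd + DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL", True
        )
    # The comment does not say which API is used without double buffering.
    return PresentationPath(None, False)


print(presentation_path(Backend.WEBRENDER, composition_swapchain_ok=False))
```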
Comment 24•5 years ago
Thanks Sotaro, I'll push that to try so we can get it tested by someone who sees this bug.
Comment 25•5 years ago
Sotaro already pushed it
https://treeherder.mozilla.org/#/jobs?repo=try&revision=38304ada606c995c553dfb329eea452bd407e136
pokechu, could you try this build
Thanks!
Reporter
Comment 26•5 years ago
That build appears to render properly, but nightly 20200522094316 also appears to render properly, so that's odd (even after refreshing nightly, so it's not related to any preference changes, and I don't remember changing preferences either).
After bisecting, it looks like it was fixed on 2020-04-28 in https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c1b1ba2c99b1268ba658158daa200a8c79b1b8e7&tochange=a5bdcb8018458a3bf0795b0d41adf6cc5651f058 -- which is slightly odd since AMD Radeon HD 6520G should be Northern Islands and not Evergreen (if I'm reading Wikipedia right), but makes some sense. It looks like the issue goes away when WebRender is enabled, but if I force MOZ_WEBRENDER=0, the issue comes back on current nightly builds.
But, oddly, I'm not able to retroactively confirm this by setting MOZ_WEBRENDER=1 and starting the 2020-04-07 build. A second bisect with that variable set gives 2020-04-20 and https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=e001cc8bb74f0fc4ffa5c40f72fee6d02f4f8c88&tochange=20aad8c2fc424a4ef056386d3c54e18ca0c741d7 (bug 1631312) -- that's the build that fixed it with WebRender enabled (or made WebRender properly enable in the first place).
I can also confirm that the test build renders properly with MOZ_WEBRENDER=0 forced -- so the test build does actually fix the issue as well.
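To repeat these checks, the MOZ_WEBRENDER variable can be set per launch instead of globally. A minimal sketch, assuming a hypothetical helper `webrender_env`; the executable path and profile name in the commented usage are placeholders.

```python
import os


def webrender_env(enabled, base_env=None):
    """Return a copy of the environment with MOZ_WEBRENDER forced on or off."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_WEBRENDER"] = "1" if enabled else "0"
    return env


# Example usage (not executed here; the path is a placeholder):
#   import subprocess
#   subprocess.Popen(["firefox.exe", "-P", "Nightly"], env=webrender_env(False))

print(webrender_env(False, {"PATH": "/usr/bin"}))
```

Passing a copied environment keeps the override local to the one browser process, so a later launch without the variable still gets the default behaviour.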
Comment 27•5 years ago
(In reply to pokechu022 from comment #26)
I can also confirm that the test build renders properly with MOZ_WEBRENDER=0 forced -- so the test build does actually fix the issue as well.
Hooray! Thanks for all that testing!
Sotaro, do you want to make whatever changes you want to make to your patch and request review and land it?
Assignee
Comment 28•4 years ago
Good! I updated the patch comment and am going to request review.
Comment 29•4 years ago
Comment 30•4 years ago
bugherder
Comment 31•4 years ago
@Pokechu22 Could you please take a look and confirm whether the issue is fixed? We lack the specified hardware needed to verify it.
Reporter
Comment 32•4 years ago
Yes, it is fixed as of Nightly 79.0a1 (2020-06-12), tested starting normally and also with MOZ_WEBRENDER=0 and MOZ_WEBRENDER=1.