Closed Bug 1273946 Opened 8 years ago Closed 2 years ago

[e10s]Browser window goes transparent across all tabs after watching Youtube videos

Categories

(Core :: Web Painting, defect)

Version: 46 Branch
Hardware: x86_64
OS: Windows 10
Type: defect
Priority: Not set
Severity: normal
Rank: 27

Tracking

RESOLVED WORKSFORME
Iteration: 48.3 - Apr 25
Tracking Status
platform-rel --- +
e10s - ---
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- wontfix
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- fix-optional
firefox53 --- fix-optional

People

(Reporter: stefan.dimitrov01, Unassigned)

References

Details

(Keywords: regression, Whiteboard: [gfx-noted] [platform-rel-Youtube])

Attachments

(5 files)

Attached image screenshot
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160518004015

Steps to reproduce:

The issue occurs after watching Youtube videos. It can often be triggered by scrolling through the page and clicking on any non-player button on the page.


Actual results:

The browser window turned transparent across all tabs. The only way of resolving it is to restart Firefox.
Iteration: --- → 48.3 - Apr 25
OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64
Hi Reporter, are you still able to reproduce this in safe mode and with a clean profile? Thanks
safe mode: https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode
clean profile: https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles
Flags: needinfo?(stefan.dimitrov01)
Hello,
I've successfully reproduced it in safe mode and with a clean profile before finding the cause of the issue. 
The issue is caused by the latest AMD graphics driver (version 16.5.2.1). Disabling the GPU acceleration or reverting to an older driver version is a workaround. I'm currently using an older version of the drivers and I don't experience the issue.
Component: Untriaged → Graphics
Product: Firefox → Core
This is an e10s-specific issue. It is reproducible when YouTube is set to 'worldwide' mode and the mouse is moved over several video thumbnails on Windows 10 x64 with the latest AMD graphics driver (version 16.5.2.1) or the beta version (16.5.3).
Here is the screen record: https://goo.gl/ZqYUaY.
Status: UNCONFIRMED → NEW
tracking-e10s: --- → ?
Ever confirmed: true
Summary: Browser window goes transparent across all tabs after watching Youtube videos → [e10s]Browser window goes transparent across all tabs after watching Youtube videos
Flags: needinfo?(stefan.dimitrov01)
Milan, this looks gfx related, see the screencast on comment 3.
Flags: needinfo?(milan)
Abe, once this happens, does resizing the window bring the display back?  If you run about:support in a new tab when this happens - do you see the page, and what's in the graphics section?  Is this on a 64-bit build or 32-bit build, or both?  What version?
Flags: needinfo?(milan) → needinfo?(amasresha)
Flags: needinfo?(amasresha) → needinfo?(ciprian.muresan)
Attached file aboutsupport page.txt
User Agent Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0
Build ID 20160601030219

The issue is reproducible on Windows 10 x64 on the latest Nightly (49.0a1, Build ID 20160601030219) on both 32 and 64 bit builds. 
about:* pages are properly rendered after this bug appears. 
Resizing the window does not make the issue go away.
Flags: needinfo?(ciprian.muresan)
OK, so something is causing a device reset, and then we don't quite recover from it.  I'm guessing the device resets are caused by video (we've already blocked the D3D11 API on this driver, but it looks like even the D3D9 API is causing problems).  Driver resets showing up in the first place is a different bug (perhaps worth filing under Audio/Video, now that we have a nice local reproducible setup), so let's focus on the "recover from reset" here.

The errors, as expected, are D2DERR_RECREATE_TARGET (0x8899000c).  The 0x887a0005 that shows up once is really weird - that's a DXGI_ERROR_DEVICE_REMOVED.
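
For anyone decoding the two values quoted above, here is a minimal Python sketch of how a Windows HRESULT splits into a failure bit, a facility, and a code. The bit layout follows the HRESULT_FACILITY and HRESULT_CODE macros from winerror.h. The lookup table holds only the two codes named in this comment, and the helper names (decode_hresult, describe) are hypothetical, not part of Gecko.

```python
# Only the two HRESULT values quoted in this bug; extend as needed.
KNOWN_HRESULTS = {
    0x8899000C: "D2DERR_RECREATE_TARGET",
    0x887A0005: "DXGI_ERROR_DEVICE_REMOVED",
}

# Facility numbers as defined in winerror.h.
FACILITIES = {0x87A: "DXGI", 0x899: "D2D"}


def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    hr &= 0xFFFFFFFF
    failed = bool(hr >> 31)          # severity bit: set means failure
    facility = (hr >> 16) & 0x1FFF   # same mask as HRESULT_FACILITY
    code = hr & 0xFFFF               # same mask as HRESULT_CODE
    return failed, facility, code


def describe(hr):
    """Return a one-line, human-readable description of an HRESULT."""
    hr &= 0xFFFFFFFF
    failed, facility, code = decode_hresult(hr)
    name = KNOWN_HRESULTS.get(hr, "unknown")
    fac = FACILITIES.get(facility, hex(facility))
    return f"0x{hr:08x} {name} (facility {fac}, code {code}, failed={failed})"
```

Calling describe(0x887a0005) shows that the odd one out comes from the DXGI facility, while the expected error comes from the D2D facility.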
I imagine this is a regression, but it could be pointing to a big change that doesn't help us much.  Still it would be good if we could run a regression range on this, making sure E10S is enabled in the profile used for the testing, so that we don't end up with the "e10s enabled" as the patch in question.

about:pages working sort of makes sense - all the driver reset messages are from the child process.

I don't think the diagnostic patch I put up is really going to give us any new information, but it'd be interesting to know what the last reset reason is.

Chris, is there any information you want from this reproducible test case, that would let you get more information about video failures?  As in, the first part of the comment 7 - why the resets happen?
Flags: needinfo?(cpearce)
Flags: needinfo?(ciprian.muresan)
Comment on attachment 8758776 [details]
MozReview Request: Bug 1273946: Get more information on the cause of the device reset.  Remove unused UpdateRenderModeIfDeviceReset method. r?dvander

https://reviewboard.mozilla.org/r/56936/#review53668
Attachment #8758776 - Flags: review?(dvander) → review+
Comment on attachment 8758776 [details]
MozReview Request: Bug 1273946: Get more information on the cause of the device reset.  Remove unused UpdateRenderModeIfDeviceReset method. r?dvander

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56936/diff/1-2/
Attachment #8758776 - Attachment description: MozReview Request: Bug 1273946: Get more information on the cause of the device reset. r?dvander → MozReview Request: Bug 1273946: Get more information on the cause of the device reset. Remove unused UpdateRenderModeIfDeviceReset method. r?dvander
(In reply to Milan Sreckovic [:milan] from comment #9)
> I imagine this is a regression, but it could be pointing to a big change
> that doesn't help us much.  Still it would be good if we could run a
> regression range on this, making sure E10S is enabled in the profile used
> for the testing, so that we don't end up with the "e10s enabled" as the
> patch in question.
> 
> about:pages working sort of makes sense - all the driver reset messages are
> from the child process.
> 
> I don't think the diagnostic patch I put up is really going to give us any
> new information, but it'd be interesting to know what the last reset reason
> is.
> 
> Chris, is there any information you want from this reproducible test case,
> that would let you get more information about video failures?  As in, the
> first part of the comment 7 - why the resets happen?

According to NSPR_LOG_MODULES=nsMediaElement:5 the YouTube main page doesn't create any video elements, so I don't think this is video related.
Flags: needinfo?(cpearce)
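
For reference, a minimal sketch of how the logging run mentioned above could be set up from a script. NSPR_LOG_MODULES=nsMediaElement:5 is the value from the comment; NSPR_LOG_FILE is the standard NSPR variable for redirecting log output. The function names, the Firefox path, and the log file name are assumptions for illustration only.

```python
import os
import subprocess


def media_logging_env(log_file, base_env=None):
    """Return a copy of the environment with media element logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["NSPR_LOG_MODULES"] = "nsMediaElement:5"  # module:level from the comment
    env["NSPR_LOG_FILE"] = log_file               # write the log here instead of stderr
    return env


def launch_with_media_logging(firefox_path, log_file):
    """Start Firefox with logging enabled (firefox_path is a placeholder)."""
    return subprocess.Popen([firefox_path], env=media_logging_env(log_file))
```

If the resulting log contains no media element entries after loading the YouTube main page, that matches the observation that no video elements are created there.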
Last good revision: 2ec54b38a33da939a8255612999d9e867eb11664
First bad revision: bb5becd378f40a9be14e4e635d7034f2835fc7b5
Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=2ec54b38a33da939a8255612999d9e867eb11664&tochange=bb5becd378f40a9be14e4e635d7034f2835fc7b5

Looks like the following bug has the changes which introduced the regression:
https://bugzilla.mozilla.org/show_bug.cgi?id=1147673
Flags: needinfo?(ciprian.muresan)
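
A small sketch of how the pushlog link above is put together from the two revisions, in case the range needs to be narrowed and re-linked. The base URL and the fromchange/tochange parameters are taken from the link in this comment; the function name is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as it appears in the pushlog link in this bug.
PUSHLOG_BASE = "https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml"


def pushlog_url(last_good, first_bad, base=PUSHLOG_BASE):
    """Build a pushlog URL covering the range between two changesets."""
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return f"{base}?{query}"


url = pushlog_url(
    "2ec54b38a33da939a8255612999d9e867eb11664",
    "bb5becd378f40a9be14e4e635d7034f2835fc7b5",
)
```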
based on the regression range which points to bug 1147673
(In reply to Chris Pearce (:cpearce) from comment #12)
> ...
> According to NSPR_LOG_MODULES=nsMediaElement:5 the YouTube main page doesn't
> create any video elements, so I don't think this is video related.

Thanks for the quick analysis, I wouldn't have been able to do that :)

Markus, I'll land the diagnostic patch above; it could come in handy for other things.
Component: Graphics → Layout
Pushed by msreckovic@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/16b4946069a9
Get more information on the cause of the device reset.  Remove unused UpdateRenderModeIfDeviceReset method. r=dvander
Assignee: nobody → mstange
Blocks: 1147673
Version: 48 Branch → 46 Branch
In the crashes, the reset causes are all over the place, but we'd need the user to reproduce and check which ones show up in this scenario (which is not a crash.)  In crashes, we get everything except DeviceResetReason::INVALID_CALL. 

Ciprian, if you run with the latest nightly and reproduce this problem, what error messages do you see in about:support?  In particular, for the ones that contain "Detected rendering device reset on refresh:", what is the number that follows?  It should be between 1 and 7.
Flags: needinfo?(ciprian.muresan)
I see multiple instances of "Detected rendering device reset on refresh: 4" in the about:support page.
Flags: needinfo?(ciprian.muresan)
That's the "DRIVER_ERROR" one.
(In reply to Milan Sreckovic [:milan] from comment #20)
> That's the "DRIVER_ERROR" one.

Which is DXGI_ERROR_DRIVER_INTERNAL_ERROR, with a not-that-helpful description "The driver encountered a problem and was put into the device removed state."
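
To make checking a saved about:support dump less manual, here is a rough Python sketch that counts the reset reason numbers. It assumes the dump is plain text containing the message string quoted above. Only reason 4 is named in this thread, so the table deliberately maps nothing else; the function names are hypothetical.

```python
import re
from collections import Counter

# Only reason 4 is identified in this thread; other numbers stay unnamed.
RESET_REASONS = {4: "DRIVER_ERROR (DXGI_ERROR_DRIVER_INTERNAL_ERROR)"}

# Message text as quoted in the comments above.
RESET_LINE = re.compile(r"Detected rendering device reset on refresh:\s*(\d+)")


def count_reset_reasons(about_support_text):
    """Count how often each device reset reason number appears."""
    return Counter(int(m.group(1)) for m in RESET_LINE.finditer(about_support_text))


def summarize(about_support_text):
    """Return one 'reason: count' line per reason number, sorted by number."""
    counts = count_reset_reasons(about_support_text)
    return [
        f"{RESET_REASONS.get(reason, 'reason ' + str(reason))}: {n}"
        for reason, n in sorted(counts.items())
    ]
```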
See Also: → 1278973
platform-rel: --- → ?
Whiteboard: [gfx-noted] → [gfx-noted] [platform-rel-Youtube]
platform-rel: ? → +
I'm going to 'wontfix' this for 48 but can we get a re-test now that bug 1284322 is fixed?
Flags: needinfo?(ciprian.muresan)
User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20160718081125

I have retested the issue on the latest Nightly and it is still reproducible. I'm still seeing "Detected rendering device reset on refresh: 4" in the about:support page. Also the issue is with the AMD graphics driver, not an Nvidia one.

Attached a .txt file with the about:support page, FWIW.
Flags: needinfo?(ciprian.muresan)
Peter, your team has successfully dealt with some of these "device removed" error handling issues; can you have somebody take a look at this one?

For example: https://crash-stats.mozilla.com/report/index/8444ed9c-ccf8-436b-aba7-68e892160815 and https://crash-stats.mozilla.com/report/index/7f0da5eb-272c-418d-95d7-58d692160815
Flags: needinfo?(howareyou322)
(In reply to Ciprian Muresan [:cmuresan] [PTO until 08/12/2016] from comment #23)
> Created attachment 8772316 [details]
> aboutsupport from nightly 2016-07-18.txt
> 
> I have retested the issue on the latest Nightly and it is still
> reproducible. I'm still seeing "Detected rendering device reset on refresh:
> 4" in the about:support page. Also the issue is with the AMD graphics
> driver, not an Nvidia one.

4 means DXGI_ERROR_DRIVER_INTERNAL_ERROR, with a not-that-helpful description "The driver encountered a problem and was put into the device removed state."

(In reply to Milan Sreckovic [:milan] from comment #24)
> Peter, your team was successful in dealing with some of these "device
> removed error" handling; can you have somebody take a look at this one?
> 
> For example:
> https://crash-stats.mozilla.com/report/index/8444ed9c-ccf8-436b-aba7-68e892160815 and
> https://crash-stats.mozilla.com/report/index/7f0da5eb-272c-418d-95d7-58d692160815

Based on the crash-stats reports above, they all hit a problem when calling DrawTargetD2D1::PushLayer.
Jerry, please help to take a look.
Flags: needinfo?(howareyou322) → needinfo?(hshih)
I will check this crash.
Status: NEW → ASSIGNED
Flags: needinfo?(hshih)
Assignee: mstange → hshih
Rank: 27

The bug assignee didn't log in to Bugzilla in the last 7 months.
:dholbert, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: bignose1007+bugzilla → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(dholbert)

Reclassifying as Web Painting (which was spun out of Core::Layout for display-listy-things in the time since this bug was filed, I think). It looks like this was in layout due to being a regression from bug 1147673, which nowadays we would categorize under Web Painting.

Given the passage of time and substantial amount of changes since this bug was last active (e.g. WebRender shipping, fission shipping), I'm going to guess that this is no longer reproducible. If it was reproducing, I suspect we would have had more activity / concern here at some point in the past 5-6 years.

cmuresan, it looks like you reproduced this at one point (comment 6) -- could you confirm that this isn't reproducing anymore for you?

Component: Layout → Web Painting
Flags: needinfo?(dholbert) → needinfo?(cmuresan)

Yup, I can confirm that the issue is no longer reproducible with the latest Nightly 100.0a1 BuildID 20220320213921. Youtube also went through a huge UI change since this bug was logged.
I've also tried to reproduce the issue on Nightly 49.0a1, Build ID 20160601030219, where I initially reproduced the issue, but outside of some weird page snapping while scrolling it doesn't look like the issue is reproducible anymore.

@dholbert, I think we can close this out as Worksforme.

Flags: needinfo?(cmuresan) → needinfo?(dholbert)

Thanks!

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(dholbert)
Resolution: --- → WORKSFORME