Closed Bug 1109718 Opened 5 years ago Closed 4 months ago

Figure out why and when RenderTargetView's need to be recreated

Categories

(Core :: Canvas: WebGL, defect)

x86_64
Windows 7
defect
Not set

Tracking

()

RESOLVED INACTIVE
mozilla42
Tracking Status
firefox42 --- fixed

People

(Reporter: vladan, Assigned: jrmuizel, NeedInfo)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(1 file)

Google Maps only blinks when I try to zoom the map or get directions. It's likely broken by the patch that turned on D3D11 on WebGL.
The 3D cube on http://get.webgl.org/ shakes, but doesn't spin.

Tested on Nightly 20141210030207, Windows 7 x64, Optimus Quadro 1000M + Intel HD 3000
*Tested on a non-e10s build
Blocks: 1079398
This works for me on an Intel HD 4000.
Summary: Turning on D3D11 for WebGL broke Google Maps on Windows → Turning on D3D11 for WebGL broke Google Maps on some Windows hardware
Graphics
--------
Adapter Description: Intel(R) HD Graphics 3000
Adapter Description (GPU #2): NVIDIA Quadro 1000M
Adapter Drivers: igdumd64 igd10umd64 igd10umd64 igdumd32 igd10umd32 igd10umd32
Adapter Drivers (GPU #2): nvd3dumx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um
Adapter RAM: Unknown
Adapter RAM (GPU #2): 2048
Device ID: 0x0126
Device ID (GPU #2): 0x0dfa
Direct2D Enabled: true
DirectWrite Enabled: true (6.2.9200.16571)
Driver Date: 3-20-2014
Driver Date (GPU #2): 9-12-2014
Driver Version: 9.17.10.3517
Driver Version (GPU #2): 9.18.13.4084
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 11 (OMTC)
Subsys ID: 21d117aa
Subsys ID (GPU #2): 21d117aa
Vendor ID: 0x8086
Vendor ID (GPU #2): 0x10de
WebGL Renderer: Google Inc. -- ANGLE (NVIDIA Quadro 1000M Direct3D11 vs_5_0 ps_5_0)
windowLayerManagerRemote: true
AzureCanvasBackend: direct2d 1.1
AzureContentBackend: direct2d 1.1
AzureFallbackCanvasBackend: cairo
AzureSkiaAccelerated: 0
This also worked on Sotaro's Nvidia machine.
Shaking here and no spin with nvidia 344.75 driver on Windows 8.1 Pro 64-bit with FX 37 Nightly 64-bit (e10s enabled):

Graphics:
Adapter Description: NVIDIA GeForce GTX 460
Adapter Drivers: nvd3dumx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um
Adapter RAM: 1024
ClearType Parameters: Gamma: 2200 Pixel Structure: R ClearType Level: 0 Enhanced Contrast: 100
Device ID: 0x0e22
Direct2D Enabled: true
DirectWrite Enabled: true (6.3.9600.17415)
Driver Date: 11-12-2014
Driver Version: 9.18.13.4475
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 11 (OMTC)
Subsys ID: 00000000
Vendor ID: 0x10de
WebGL Renderer: Google Inc. -- ANGLE (NVIDIA GeForce GTX 460 Direct3D11 vs_5_0 ps_5_0)
windowLayerManagerRemote: true
AzureCanvasBackend: direct2d 1.1
AzureContentBackend: direct2d 1.1
AzureFallbackCanvasBackend: cairo
AzureSkiaAccelerated: 0
(In reply to Zlip792 from comment #5)
> Shaking here and no spin with nvidia 344.75 driver on Windows 8.1 Pro 64-bit
> with FX 37 Nightly 64-bit (e10s enabled):

Do you have a dual GPU machine?
Flags: needinfo?(zlip.792)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
> (In reply to Zlip792 from comment #5)
> > Shaking here and no spin with nvidia 344.75 driver on Windows 8.1 Pro 64-bit
> > with FX 37 Nightly 64-bit (e10s enabled):
> 
> Do you have a dual GPU machine?

No, it's a desktop with a single GPU.
Flags: needinfo?(zlip.792)
I had Vladan try a version of the build that didn't use KeyedMutexes for synchronization and that fixed the problem for him. At this point it looks like an Nvidia driver bug with keyed mutexes.
Summary: Turning on D3D11 for WebGL broke Google Maps on some Windows hardware → Turning on D3D11 for WebGL broke Google Maps on some Windows (Nvidia?) hardware due to KeyedMutexes
If you run http://people.mozilla.org/~jmuizelaar/canvas2webgl/canvas2webgl.html , we end up repeating the first 4 frames in a loop.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #9)
> If you run
> http://people.mozilla.org/~jmuizelaar/canvas2webgl/canvas2webgl.html we end
> up repeating the first 4 frames in a loop

Left one does not align and "0" keeps on blinking rather than changing.
Zlip792, does the build here fix the problem?

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmuizelaar@mozilla.com-6b4f2f8c1d89/try-win64/
Flags: needinfo?(zlip.792)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #11)
> Zlip792, does the build here fix the problem?
> 
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmuizelaar@mozilla.
> com-6b4f2f8c1d89/try-win64/

Confirmed, this try build fixed both Comment 1 and Comment 9 Testcases.

Comment 1 Testcase:
- No Shake and spinning like Chromium 41.

Comment 9 Testcase:
- Aligned and in sync with right side.

Feel free to ask about anything else.
Flags: needinfo?(zlip.792)
I didn't have much luck debugging this. My current theory is that there might be something wrong with this pattern:

producer1->release();

producer2->acquire();
 consumer1->acquire();
 consumer1->release();
producer2->release();

producer1->acquire();
 consumer2->acquire();
 consumer2->release();


vs

producer1->release();

 consumer1->acquire();
 consumer1->release();
producer2->acquire();
producer2->release();

 consumer2->acquire();
 consumer2->release();
producer1->acquire();
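
For anyone reading along, here is a minimal sketch that replays the two orderings above against a toy per-surface lock. Everything in it (`SurfaceLock`, `Step`, `firstIllegalStep`, the two trace helpers) is hypothetical and is not Gecko or ANGLE code. It assumes each surface simply alternates between a producer turn and a consumer turn, and that the producer starts out holding surface 1. It does not reproduce the bug; it only checks whether either ordering breaks the per-surface handoff rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: each surface alternates between a producer turn and a consumer turn.
enum class Side { Producer, Consumer };

struct SurfaceLock {
    Side next = Side::Producer;  // side that is allowed to acquire next
    bool held = false;           // true between acquire() and release()

    bool acquire(Side who) {
        // A real AcquireSync would block or time out; the model just reports failure.
        if (held || next != who) return false;
        held = true;
        return true;
    }
    bool release(Side who) {
        if (!held || next != who) return false;
        held = false;
        next = (who == Side::Producer) ? Side::Consumer : Side::Producer;
        return true;
    }
};

struct Step {
    int surface;     // 1 or 2, as in producer1/consumer2 above
    Side who;
    bool isAcquire;  // false means release
};

// Replays a trace; returns the index of the first illegal step, or -1 if all are legal.
int firstIllegalStep(const std::vector<Step>& steps) {
    SurfaceLock locks[2];
    locks[0].held = true;  // assumption: the producer starts out holding surface 1
    for (std::size_t i = 0; i < steps.size(); ++i) {
        SurfaceLock& lock = locks[steps[i].surface - 1];
        const bool ok = steps[i].isAcquire ? lock.acquire(steps[i].who)
                                           : lock.release(steps[i].who);
        if (!ok) return static_cast<int>(i);
    }
    return -1;
}

const Side P = Side::Producer, C = Side::Consumer;

// First ordering: the consumer uses surface 1 while the producer holds surface 2.
std::vector<Step> interleavedTrace() {
    return {{1, P, false}, {2, P, true}, {1, C, true}, {1, C, false},
            {2, P, false}, {1, P, true}, {2, C, true}, {2, C, false}};
}

// Second ordering: each acquire/release pair completes before the next one starts.
std::vector<Step> serializedTrace() {
    return {{1, P, false}, {1, C, true}, {1, C, false}, {2, P, true},
            {2, P, false}, {2, C, true}, {2, C, false}, {1, P, true}};
}
```

In this model both traces replay without an illegal step, so if a driver treats them differently, the difference would have to come from something that spans surfaces rather than from the per-surface handoff itself.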
I ran a test where I dumped the contents of the frames on the client side. This showed the same results as we were seeing in the compositor. This suggests that we're not able to successfully write to the framebuffers after a while and just keep cycling through the old content.
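
A rough sketch of how a frame dump like this could be checked for cycling, assuming each dumped frame has already been reduced to a checksum (how the checksums are computed is left out, and `repeatPeriod` is a hypothetical helper, not something in the tree):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the smallest period p such that every frame from index p onward has the
// same checksum as the frame p positions earlier, or 0 if no such period is found.
// Only periods that fit at least twice into the dump are considered, so a single
// coincidental match near the end is not reported as a cycle.
std::size_t repeatPeriod(const std::vector<std::uint64_t>& checksums) {
    const std::size_t n = checksums.size();
    for (std::size_t p = 1; 2 * p <= n; ++p) {
        bool repeats = true;
        for (std::size_t i = p; i < n; ++i) {
            if (checksums[i] != checksums[i - p]) {
                repeats = false;
                break;
            }
        }
        if (repeats) return p;
    }
    return 0;
}
```

A dump that keeps replaying its first four frames would report a period of 4, while a dump where every frame differs would report 0.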
So I've narrowed down the problem some. It looks like when we AcquireSync on a surface for a second time we can no longer write to that surface. I don't yet know why. The other thing I've discovered is that ReleaseSync seems to clear the RenderTarget from the device context. This was surprising to me.
So it looks like the problem is as follows:
ANGLE creates a RenderTargetView during texture creation. On most drivers we seem to be able to reuse this across AcquireSync/ReleaseSync operations; however, on machines with this problem it seems like we cannot.

The solution seems to be to recreate our RenderTargetView every time we AcquireSync. How to convince ANGLE to do this is unclear to me at the moment.
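
To make the proposed workaround concrete, here is a toy model of it. None of these types are real D3D11 or ANGLE classes: `FakeTexture`, `FakeRenderTargetView` and the `quirkyDriver` flag are all made up for illustration. The "quirky" behaviour is an assumption taken from the observations above, namely that a view stops accepting writes once the surface has been acquired a second time after the view was created.

```cpp
#include <cassert>

// Toy stand-ins only; these are not D3D11 or ANGLE types.
struct FakeTexture {
    int acquireCount = 0;  // number of acquireSync() calls so far
    int pixel = -1;        // last value that was successfully written
    bool quirkyDriver;     // true models the affected machines (assumption)
    explicit FakeTexture(bool quirky) : quirkyDriver(quirky) {}
};

struct FakeRenderTargetView {
    FakeTexture* texture;
    int createdAtAcquire;  // texture->acquireCount when the view was created
};

FakeRenderTargetView createView(FakeTexture& t) { return {&t, t.acquireCount}; }

void acquireSync(FakeTexture& t) { ++t.acquireCount; }
void releaseSync(FakeTexture&) {}  // nothing to model on release in this sketch

// Returns true if the write landed. On the quirky driver, a view that has lived
// through more than one acquire silently drops the write.
bool draw(const FakeRenderTargetView& v, int value) {
    FakeTexture& t = *v.texture;
    if (t.quirkyDriver && t.acquireCount - v.createdAtAcquire > 1) return false;
    t.pixel = value;
    return true;
}

// Renders `frames` frames and returns how many writes actually landed.
int framesWritten(bool quirky, bool recreateOnAcquire, int frames) {
    FakeTexture tex(quirky);
    FakeRenderTargetView view = createView(tex);  // created once, at texture creation
    int landed = 0;
    for (int f = 0; f < frames; ++f) {
        acquireSync(tex);
        if (recreateOnAcquire) view = createView(tex);  // the proposed workaround
        if (draw(view, f)) ++landed;
        releaseSync(tex);
    }
    return landed;
}
```

In this model, reusing the original view on a quirky driver lands only the first frame, while recreating the view after each acquire lands all of them, and a non-quirky driver is unaffected either way. The cost of recreating the view on every acquire on real hardware is not something this sketch says anything about.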
Blocks: 1112385
I can reproduce this on an IntelHD-Machine, so it's not nVidia-only.
The try-build of comment 11 works for me, too.
Summary: Turning on D3D11 for WebGL broke Google Maps on some Windows (Nvidia?) hardware due to KeyedMutexes → Turning on D3D11 for WebGL broke Google Maps on some Windows hardware due to KeyedMutexes
I've disabled KeyedMutexes on trunk until we can fix this issue properly.
Depends on: 1112780
Zlip792 and Elbart, can you post your d3d11 dll version?
Flags: needinfo?(zlip.792)
Vladan has version 6.2.9200.16570
(In reply to Jeff Muizelaar [:jrmuizel] from comment #20)
> Vladan has version 6.2.9200.16570
Same for me, on both machines.
(In reply to Elbart from comment #21)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #20)
> > Vladan has version 6.2.9200.16570
> Same for me, on both machines.

Are you able to reproduce on both machines?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #19)
>  Zlip792 and Elbart can you post your d3d11 dll version?

Mine: 6.3.9600.17415

I'm now on Nvidia latest Beta driver.
Flags: needinfo?(zlip.792)
Flags: needinfo?(elbart)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #22)
> (In reply to Elbart from comment #21)
> > (In reply to Jeff Muizelaar [:jrmuizel] from comment #20)
> > > Vladan has version 6.2.9200.16570
> > Same for me, on both machines.
> 
> Are you able to reproduce on both machines?

Yes, with affected Nightlies before today's.
Flags: needinfo?(elbart)
Elbart and Zlip792, can you try running this program and report its output and whether it crashes or not:

http://people.mozilla.org/~jmuizelaar/d3d11-tests/firefox.exe

(I've renamed it to firefox.exe because drivers sometimes change behaviour depending on the executable name)
Flags: needinfo?(zlip.792)
Flags: needinfo?(elbart)
@jrmuizel

This exe file is crashing as soon as I start it with "MS-DOS" like window and "firefox.exe has stopped working".

I tried compatibility mode as well, removed EMET as well if it could be the case. Restarted as well. Any idea how to run it? I am willing to help as much as you want, even willing to share profile or system through Remote connection.
Flags: needinfo?(zlip.792)
(In reply to Zlip792 from comment #26)
> @jrmuizel
> 
> This exe file is crashing as soon as I start it with "MS-DOS" like window
> and "firefox.exe has stopped working".

This is somewhat expected. If you run it from a cmd window it should print something before crashing.
Flags: needinfo?(zlip.792)
Do I need to look into Event Viewer for app crash reason?
Flags: needinfo?(zlip.792)
(In reply to Zlip792 from comment #28)
> Do I need to look into Event Viewer for app crash reason?

No, just run cmd.exe, navigate to the path where you downloaded the executable, and then run it by typing 'firefox'.
It doesn't show any log etc, this does happen:

URL: http://i.imgur.com/py4APWb.png

Willing to help any way you want. Remote session?
(In reply to Zlip792 from comment #30)
> It doesn't show any log etc, this does happen:
> 
> URL: http://i.imgur.com/py4APWb.png
> 
> Willing to help any way you want. Remote session?

I've uploaded a new binary. This one should not crash.
Hashes - http://i.imgur.com/A8Jcg9w.png

It shows same result.

URL: http://i.imgur.com/TbuWuyI.png
(In reply to Zlip792 from comment #32)
> Hashes - http://i.imgur.com/A8Jcg9w.png
> 
> It shows same result.
> 
> URL: http://i.imgur.com/TbuWuyI.png

Try again. The previous builds were using the d3d debug layer which I expect you don't have. This was probably the cause of the crashes.
After trying again, this got printed:

URL: http://i.imgur.com/im9bA7Y.png

Ask me as much as you want for any help, I'm willing to help.
(In reply to Zlip792 from comment #34)
> After trying again, this got printed:
> 
> URL: http://i.imgur.com/im9bA7Y.png
> 
> Ask me as much as you want for any help, I'm willing to help.

Ok. That means what was broken on the machine that I have to test on is not broken for you. Which means you might have a different problem.

Just to confirm, does webgl work properly for you with the current nightly build?
Yes, with current Nightly build, WebGL works fine. Not witnessed any issue.
Blocks: 1122912
Summary: Turning on D3D11 for WebGL broke Google Maps on some Windows hardware due to KeyedMutexes → Figure out why and when RenderTargetView's need to be recreated
I'm looking into this. I don't think it's as widespread as I originally thought, so I'm intentionally breaking it on Nightly to see what reports of brokenness we get.
Attachment #8630177 - Flags: review?(jgilbert)
Comment on attachment 8630177 [details] [diff] [review]
Avoid recreating the rendertargetview to see what hardware is all affected.

Review of attachment 8630177 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think we should break users in the hopes that they'll tell us what hardware they're using.

We certainly should not purposefully break them and leave them stranded. Strongly consider a programmatic solution (particularly since a machine in the office reproduces the issue), or at the very least include a pref to re-enable the current behavior.
Attachment #8630177 - Flags: review?(jgilbert) → review-
https://hg.mozilla.org/mozilla-central/rev/139680a3393b
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1183288
Depends on: 1186002
Status: REOPENED → RESOLVED
Closed: 4 years ago → 4 months ago
Resolution: --- → INACTIVE