Closed Bug 648484 Opened 13 years ago Closed 13 years ago

Cross-process rendering with d2d/d3d10

Categories

(Core :: Graphics, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla8

People

(Reporter: cjones, Assigned: cjones)

References

Details

Attachments

(18 files, 1 obsolete file)

963 bytes, patch
Details | Diff | Splinter Review
6.99 KB, patch
roc
: review+
Details | Diff | Splinter Review
11.04 KB, patch
roc
: review+
Details | Diff | Splinter Review
13.90 KB, patch
Details | Diff | Splinter Review
1.53 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
12.17 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
3.51 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
6.75 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
4.49 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
6.83 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
4.49 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
4.47 KB, patch
joe
: review+
Details | Diff | Splinter Review
5.05 KB, patch
roc
: review+
Details | Diff | Splinter Review
3.76 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
3.76 KB, patch
bas.schouten
: review+
Details | Diff | Splinter Review
1.50 KB, patch
BenWa
: review+
Details | Diff | Splinter Review
16.76 KB, patch
Details | Diff | Splinter Review
4.82 KB, patch
roc
: review+
Details | Diff | Splinter Review
One option we have is to add d3d10 --> d3d10 forwarding.  This doesn't seem particularly difficult, but requires a fair amount of new, dead-end code (to be obsoleted by NGFX).  Another option is to forward basic-layers+d2d, but that has unknown perf impact on, e.g., drawImage demos.

Bas proposes instead handing the content process an iframe-sized texture, like a window back buffer, to which the content process would composite its layers as we do in the current single-process world.  (Details TBD.)  This ought to be considerably simpler than full shadow-layer support.  To ramp up multi-process on win7, this proposal sounds awfully appealing.  Here are some goals on the radar screen that would force us to implement full shadow-layer support
 - async scrolling.  Awaiting feedback from UX/UI folk.
 - DirectVideo.  Bug 598868.
 - compositor-driven layer animation

We might get NGFX before any of the above.  SO, another very attractive feature of the "window buffer" proposal is that in that case, we would only throw away a small amount of code.
Might need to bail, will get as far as I can.
Assignee: nobody → jones.chris.g
Update: this is about ready to go.  I had some issues with keyed-mutex sync, but I think we can land without that and fix in a followup.  Plenty of bugs, I'm sure.
Overview of the upcoming patches
 - random helpers/fixes
 - enable sharing of D3D10 device resources
 - implement sharing of D3D10 textures
 - allow D3D10 layer managers to render directly to a shared "window texture"
 - implement just enough shadow-layer goop to forward window textures content->compositor

I'm leaving the following work for later bugs, which I'll file in a bit
 - used textures' keyed mutexes for synchronization
 - handle device resets/crashes/etc.  This is actually a big chunk of work.
In content processes, we need to be able to specifically request creation of d3d10 layers.  Instead of duplicating the logic in the windows backend, the content process will just ask the compositor what the layer backend is and then request that.  If the backed is d3d10, PuppetWidget will use that, otherwise it'll use basic layers.
Attachment #537402 - Flags: superreview?(roc)
I wrote this long enough ago that I don't remember the details of why we need to do this.  The code is taken from MSDN examples.
Attachment #537404 - Flags: review?(bas.schouten)
The shadow container will only have one child layer, the WindowLayer, so this implementation is very simple.
Attachment #537406 - Flags: review?(bas.schouten)
Rendering to the "window texture" is similarly simple, so on the compositor side, we don't need full thebes-layer support, only what's here.  As noted in the comments, it doesn't particular matter which layer type we use for this, but thebes layers seemed closest semantically.
Attachment #537407 - Flags: review?(bas.schouten)
This is probably the most interesting patch here, but quite simple thankfully.
Attachment #537409 - Flags: review?(bas.schouten)
These classes are just minimal wrappers around the window buffer.  They will die eventually.
Attachment #537410 - Flags: review?(bas.schouten)
This turns on d3d10 sharing when we're using d3d10 in the compositor.
Attachment #537412 - Flags: review?(roc)
Bas: let me know how comfortable you are with reviewing the shadow-layer stuff.  It's pretty straightforward for this bug, but if you don't feel up to it, Joe and Rob have reviewed this kind of stuff in the past.
Comment on attachment 537403 [details] [diff] [review]
part 4: Log layers transactions in the d3d10 backend

Review of attachment 537403 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d10/LayerManagerD3D10.cpp
@@ +298,5 @@
>  void
>  LayerManagerD3D10::BeginTransaction()
>  {
> +  MOZ_LAYERS_LOG(("[----- BeginTransaction"));
> +  Log();

These should be wrapped in MOZ_LAYERS_HAVE_LOG like in other layer managers probably, to prevent calls to Log() in builds that don't do logging.
Attachment #537403 - Flags: review?(bas.schouten) → review+
Comment on attachment 537404 [details] [diff] [review]
part 5: Create our D3D10 device so as to allow cross-process resource sharing

Review of attachment 537404 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxWindowsPlatform.cpp
@@ +433,5 @@
> +                        adapter1 = nsnull;
> +                    }
> +                }
> +            }
> +

We need to do this for the 10.0 device too. The fact we have CreateDXGIFactory1 available does not mean we need a device which supports 10.1 features, there's a lot of devices out there, particularly for NVidia that are just 10.0, and not 10.1, using an adapter from DXGI 1.1 they'll still support the keyed mutex. Additionally we need to make sure this is only called when DXGI 1.1 is available, non-updated Vista machine will not have the CreateDXGIFactory1 function I believe.
Attachment #537404 - Flags: review?(bas.schouten) → review-
That patch only attempts to create the adapter if CreateDXGIFactory1 was successfully resolved.  I'm not sure I completely follow the rest ... are you saying that the D3D10_FEATURE_LEVEL_10_0 device will work OK if we have the right adapter, so we should try creating the factor+adapter first and then pass the adapter to both createD3DDevice() calls?  I've forgotten pretty much all the details of this, would need to browse MSDN for a while to refresh.  If you happen to know off-hand that'd help a lot.

I'm not too worried about what to do if DXGI 1.1 isn't available.  We'll need to handle that of course, but at worst we can just fall back to single-process mode.
(In reply to comment #21)
> That patch only attempts to create the adapter if CreateDXGIFactory1 was
> successfully resolved.  I'm not sure I completely follow the rest ... are
> you saying that the D3D10_FEATURE_LEVEL_10_0 device will work OK if we have
> the right adapter, so we should try creating the factor+adapter first and
> then pass the adapter to both createD3DDevice() calls?  I've forgotten
> pretty much all the details of this, would need to browse MSDN for a while
> to refresh.  If you happen to know off-hand that'd help a lot.
 
I'm saying right now this code would only create a proper adapter on D3D10_FEATURE_LEVEL_10_1 devices. D3D10_FEATURE_LEVEL_10_0 can create a DXGI 1.1 compatible device just fine and still be used.

So right now, if D3D10_FEATURE_LEVEL_10_0 succeeds, but D3D10_FEATURE_LEVEL_10_1 fails (any 10.0 DDI hardware). You'd be stuck with your non-DXGI 1.1 device created by the first createDevice call.. You'll basically want to move this code up in scope above the first createDevice call, and pass the adapter into both createdevice calls, that should be fine. I'd check for ID3D10Device support instead of ID3D10Device1 too.

The reason this worked for you was because you happened to test this on 10.1 hardware.

> I'm not too worried about what to do if DXGI 1.1 isn't available.  We'll
> need to handle that of course, but at worst we can just fall back to
> single-process mode.

If it isn't available D2D isn't available (they're in the same update I believe), so you don't need to worry :). I'm not even sure this code is reached in that case even or if it bails earlier because of not finding DWrite/D2D. It might actually be fine in general, since if the function isn't resolved, as you say it won't call it, and passing NULL for the adapter is a valid option.
(In reply to comment #18)
> Bas: let me know how comfortable you are with reviewing the shadow-layer
> stuff.  It's pretty straightforward for this bug, but if you don't feel up
> to it, Joe and Rob have reviewed this kind of stuff in the past.

I'll give it a shot, I think it'll be fine but should I run into trouble I'll let you know :).
(In reply to comment #22)
> I'm saying right now this code would only create a proper adapter on
> D3D10_FEATURE_LEVEL_10_1 devices. D3D10_FEATURE_LEVEL_10_0 can create a DXGI
> 1.1 compatible device just fine and still be used.

Gotcha.  Easy fix.
Comment on attachment 537400 [details] [diff] [review]
part 1: Fix some warning spam

Review of attachment 537400 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ -502,5 @@
>      clipboard->GetData(trans, whichClipboard);
>      nsCOMPtr<nsISupports> tmp;
>      PRUint32 len;
>      rv  = trans->GetTransferData(kUnicodeMime, getter_AddRefs(tmp), &len);
>      NS_ENSURE_SUCCESS(rv, rv);

Fix bad return of rv while you're here?
Attachment #537400 - Flags: review?(roc) → review+
(In reply to comment #25)
> Comment on attachment 537400 [details] [diff] [review] [review]
> part 1: Fix some warning spam
> 
> Fix bad return of rv while you're here?

Done.

(In reply to comment #19)
> Comment on attachment 537403 [details] [diff] [review] [review]
> part 4: Log layers transactions in the d3d10 backend
> 
> Review of attachment 537403 [details] [diff] [review] [review]:
> These should be wrapped in MOZ_LAYERS_HAVE_LOG like in other layer managers
> probably, to prevent calls to Log() in builds that don't do logging.

Sure.  If only for consistency.
Try using a dxgi1.1 adapter for 10_0 devices too.
Attachment #537404 - Attachment is obsolete: true
Attachment #537579 - Flags: review?(bas.schouten)
Comment on attachment 537401 [details] [diff] [review]
part 2: Add various helpers, refactor ContainerLayer::SetSpecificAttributes

Review of attachment 537401 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #537401 - Flags: review?(roc) → review+
Comment on attachment 537402 [details] [diff] [review]
part 3: Allow passing a "backend hint" to GetLayerManager() to request a non-default layer manager backend

Review of attachment 537402 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #537402 - Flags: superreview?(roc) → superreview+
Comment on attachment 537412 [details] [diff] [review]
part D: Allow PuppetWidgets to create D3D10 layer managers (for the time being)

Review of attachment 537412 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #537412 - Flags: review?(roc) → review+
Comment on attachment 537411 [details] [diff] [review]
part C: Forward a shadow-layer transaction after rendering in the D3D10 backend, if remote

Review of attachment 537411 [details] [diff] [review]:
-----------------------------------------------------------------

Can you explain to me how the "if remote" part is effected?
Attachment #537411 - Flags: review?(joe) → review+
Sure: LayerManagerD3D10 renders to mSwapChain if "local", and mBackBuffer if remote.  So mBackBuffer != null => (remote and mSwapChain == null), and mSwapChain != null => (!remote and mBackBuffer == null).  Does that answer your question?
yes sir! Thanks.
Comment on attachment 537579 [details] [diff] [review]
part 5: Create our D3D10 device so as to allow cross-process resource sharing, v2

Review of attachment 537579 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxWindowsPlatform.cpp
@@ +397,5 @@
>  
> +    HMODULE dxgiModule = LoadLibraryA("dxgi.dll");
> +    CreateDXGIFactory1Func createDXGIFactory1 = (CreateDXGIFactory1Func)
> +        GetProcAddress(dxgiModule, "CreateDXGIFactory1");
> +

Not too important, but I suppose we can move this into if (createD3DDevice)?
Attachment #537579 - Flags: review?(bas.schouten) → review+
To save overhead if d3d isn't available but DXGI 1.1 is?  Can that happen?
Actually, never mind, I see what you mean.  Sure, that change makes sense.
Comment on attachment 537406 [details] [diff] [review]
part 7: Implement a very basic shadow container layer for D3D10, only enough to support the upcoming WindowLayer

Review of attachment 537406 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d10/ContainerLayerD3D10.cpp
@@ +378,5 @@
> +ShadowContainerLayerD3D10::ShadowContainerLayerD3D10(LayerManagerD3D10 *aManager) 
> +    : ShadowContainerLayer(aManager, NULL)
> +    , LayerD3D10(aManager)
> +{
> +    mImplData = static_cast<LayerD3D10*>(this);

Indent 2 spaces here everywhere! :)

::: gfx/layers/d3d10/ContainerLayerD3D10.h
@@ +72,5 @@
>      DefaultComputeEffectiveTransforms(aTransformToSurface);
>    }
>  };
>  
> +class ShadowContainerLayerD3D10 : public ShadowContainerLayer,

We should document that this class will not properly adhere to its transform/cliprect/etc.
Attachment #537406 - Flags: review?(bas.schouten) → review+
Comment on attachment 537407 [details] [diff] [review]
part 8: Implement a very basic shadow thebes layer for D3D10, only enough to support the upcoming WindowLayer

Review of attachment 537407 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d10/ThebesLayerD3D10.cpp
@@ +515,5 @@
>           aYRes != mYResolution;
>  }
>  
> +ShadowThebesLayerD3D10::ShadowThebesLayerD3D10(LayerManagerD3D10* aManager)
> +    : ShadowThebesLayer(aManager, NULL)

Indent in this file to 2!

@@ +555,5 @@
> +    // buffer (yet).
> +    *aReadOnlyFront = null_t();
> +
> +    // XXX what happens to newBackBuffer on exit here?  Do we need to
> +    // keep a phony ref to it?  Should the content process?

Windows should take of that, on exit all open handles something holds are closed. There is no implicit owner, when the last process pointing at a texture dies, only then should it go away. In theory.

@@ +564,5 @@
> +                                      getter_AddRefs(sync));
> +        sync->AcquireSync(1, INFINITE);
> +        sync->ReleaseSync(0);
> +    }
> +*/

If this stays in let's #if 0

@@ +588,5 @@
> +/*
> +  nsRefPtr<IDXGIKeyedMutex> sync;
> +  mTexture->QueryInterface(_uuidof(IDXGIKeyedMutex), getter_AddRefs(sync));
> +  sync->AcquireSync(1, INFINITE);
> +*/

#if 0

@@ +630,5 @@
> +  }
> +
> +/*
> +  sync->ReleaseSync(1);
> +*/

#if 0
Attachment #537407 - Flags: review?(bas.schouten) → review+
Comment on attachment 537405 [details] [diff] [review]
part 6: Add code to share D3D10 textures across processes

Review of attachment 537405 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #537405 - Flags: review?(bas.schouten) → review+
Comment on attachment 543788 [details] [diff] [review]
Unrot bug 648484. To be folded

Review of attachment 543788 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #543788 - Flags: review?(bas.schouten) → review+
Comment on attachment 537408 [details] [diff] [review]
part 9: Make LayerManagerD3D10 a shadow-layer manager and forwarder

Review of attachment 537408 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #537408 - Flags: review?(bas.schouten) → review+
Comment on attachment 537409 [details] [diff] [review]
part A: Allow D3D10 layers to render directly to a share-able texture

Review of attachment 537409 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d10/LayerManagerD3D10.cpp
@@ +596,5 @@
> +    CD3D10_TEXTURE2D_DESC desc(DXGI_FORMAT_B8G8R8A8_UNORM,
> +                               rect.width, rect.height, 1, 1);
> +    desc.BindFlags = D3D10_BIND_RENDER_TARGET | D3D10_BIND_SHADER_RESOURCE;
> +    desc.MiscFlags = D3D10_RESOURCE_MISC_SHARED
> +                     /*D3D10_RESOURCE_MISC_SHARED_KEYEDMUTEX*/;

We should probably document why this is commented out.

@@ +667,5 @@
>  
>    if (mTarget) {
>      PaintToTarget();
> +  } else if (mBackBuffer) {
> +    swap(mBackBuffer, mRemoteFrontBuffer);

We should think about synchronization here.
Attachment #537409 - Flags: review?(bas.schouten) → review+
Comment on attachment 537410 [details] [diff] [review]
part B: Implement shadowable layer goop for D3D10, just enough to allow sending a window buffer to the compositor

Review of attachment 537410 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d10/LayerManagerD3D10.cpp
@@ +734,5 @@
>  
> +WindowLayer::WindowLayer(LayerManagerD3D10* aManager)
> +    : ThebesLayer(aManager, nsnull)
> +{
> + }

Indent is wrong in this file.

::: gfx/layers/d3d10/LayerManagerD3D10.h
@@ +293,5 @@
> + * WindowLayer being implemented as a thebes layer isn't an important
> + * detail; other layer types could have been used.
> + */
> +class WindowLayer : public ThebesLayer, public ShadowableLayer {
> +public:

2 space indent here.
Attachment #537410 - Flags: review?(bas.schouten) → review+
These patches are timing out in test_process_error.xul on d3d9 machines:

WARNING: shadow layers no sprechen D3D backend yet: file e:/builds/moz2_slave/try-w32-dbg/build/layout/ipc/RenderFrameParent.cpp, line 721
Attachment #544243 - Flags: review?(bgirard) → review+
Thanks.

It's not obvious what went wrong here.  I think an unrelated cleanup I did (two confused nsresult/bool return values) might have had unexpected consequences.  Testing that hypothesis.
Nope :/.  Will need to diagnose when I get back to my dev env.
I'm pretty sure that these patches are just tickling bug 662334 really hard, for some random reason.  The failures from these patches and the ones linked from bug 662334 look very similar.  Here's the general pattern

TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_thumbnails.js | Console message: [JavaScript Error: "redeclaration of const window" {file: "chrome://mochitests/content/browser/mobile/chrome/tests/remote_head.js" line: 4}]
TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/mobile/chrome/tests/browser_thumbnails.js | Test timed out
INFO TEST-END | chrome://mochitests/content/browser/mobile/chrome/tests/browser_thumbnails.js | finished in 30060ms

  ^^^ timeout, no progress on the test at all

TEST-INFO | checking window state
TEST-START | chrome://mochitests/content/browser/mobile/chrome/tests/browser_vkb.js
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_vkb.js | Test interactions with a VKB from content
TEST-PASS | chrome://mochitests/content/browser/mobile/chrome/tests/browser_vkb.js | Tab Opened
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_vkb.js | Console message: [JavaScript Error: "redeclaration of const window" {file: "chrome://mochitests/content/browser/mobile/chrome/tests/remote_head.js" line: 4}]

  ^^^ bunch of garbage output from browser_vkb.js, snipped here

TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_vkb.js | Console message: [JavaScript Error: "is is not defined" {file: "chrome://mochitests/content/browser/mobile/chrome/tests/browser_thumbnails.js" line: 121}]

  ^^^ finally this, the Clinton-eque "is is not defined"

I have no idea how to run fennec b-c tests, investigating that.
I can't reproduce the same timeout as on tinderbox, but locally I can reliably repro a timeout in browser_navigation.js.  Bisecting over the patch queue says part 3 is to blame.  In the timeouts, I see this spew

  ===== BAD =====
[snip]
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | Console message: Invalid chrome URI: /
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | Console message: Invalid chrome URI: /c
TEST-PASS | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | The title should be equal to the typed page url title
TEST-PASS | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | The title should be equal to the url of the clicked row

In passes I see

  ===== GOOD =====
[snip]
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | Console message: Invalid chrome URI: /
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | Console message: Invalid chrome URI: /c
###################################### forms.js loaded
###################################### content loaded
loading about:blank, 1
[TabChild] RESIZE to (w,h)= (480d, 800d)
TEST-PASS | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | The title should be equal to the typed page url title
creating 1!
[TabChild] SHOW (w,h)= (0, 0)
###################################### forms.js loaded
###################################### content loaded
loading about:blank, 1
[TabChild] RESIZE to (w,h)= (480d, 800d)
TEST-PASS | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | The title should be equal to the url of the clicked row
TEST-PASS | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | The title should be equal to the clicked page url
TEST-INFO | chrome://mochitests/content/browser/mobile/chrome/tests/browser_navigation.js | Check for appearance of the favicon

so maybe there's a race condition here being tickled by the new timing of the SendGetParentType message?  Digging.
Removing the message makes the test pass reliably again :/.
Ever had one of those days where every line of code you touch seems to crumble away in your hands, and fixing brokenness only reveals more brokenness?  I just did.
What was happening in the test was, in the chrome process a tab was created and destroyed in short succession.  This disconnected the frame loader before we had a chance to create shadow layers for the tab.  This was previously being ignored, which wasn't so great.  But because of bug 671189, the failure to hook up PLayers was causing the content process to hang.  I noticed that the test that was failing locally for me was disabled on android.  This fix might allow us to turn some of those back on.

The fix itself is somewhat gross but not complicated.  It introduces a kind of "half-destroyed" state for PBrowser/PLayers, in which stuff is still alive but neutered.  This patch makes the PLayers constructor synchronous so that the half-destroyed state is immediately communicated to content.  Then the rest of the patch fixes things that were trying to do too much in the half-destroyed state and crashing.

To be folded into part 3.
Attachment #545603 - Flags: superreview?(roc)
Comment on attachment 545603 [details] [diff] [review]
part 3.1: Deal with failure to hook up shadow layers

Review of attachment 545603 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #545603 - Flags: superreview?(roc) → superreview+
Is this bug related to electrolysis, or to Azure ?
Does it have a feature page. I Somewhat got the idea from comments, but still not sure.
Depends on: 671839
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I can't repro the corruption described in bug 671839 in a debug build on 10.6.8.  Trying opt.
Bisect once again blames the infamous part 3.
Now the question is, how to write a test for this.  Any ideas?  Is it possible to snapshot the content of popup windows, somehow?
Attachment #551680 - Flags: review?(roc)
I don't think we have a way to do that.

What we really need is to use NS_OVERRIDE and have something that checks it...
A checker exists, it just doesn't run on tinderbox.

Alright.  Sigh.
Whiteboard: [inbound]
Blocks: 678207
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: