Closed Bug 1015218 Opened 10 years ago Closed 10 years ago

Use Cairo image surfaces and XShmPutImage instead of XRender on GTK/Linux OMTC basic

Categories

(Core :: Graphics, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: fred23, Assigned: fred23)

References

(Blocks 2 open bugs)

Details

Attachments

(8 files, 14 obsolete files)

4.05 KB, text/plain
Details
482.11 KB, text/plain
Details
6.71 KB, patch
karlt
: review+
Details | Diff | Splinter Review
9.32 MB, application/octet-stream
Details
603.18 KB, text/plain
Details
605.47 KB, text/plain
Details
13.54 KB, patch
karlt
: review+
Details | Diff | Splinter Review
13.35 KB, patch
karlt
: review+
Details | Diff | Splinter Review
I will branch all the OMTC basic efforts we had from bug 738937 onto this new bug here for better clarity.
Attachment #8427796 - Flags: review?(nical.bugzilla)
(In reply to Frederic Plourde (:fred23) from comment #1)
> Created attachment 8427796 [details] [diff] [review]
> Remove_Xlib_surfaces_from_CreateOffscreenSurface.patch

Please see the patch above. It's preparatory work needed for upcoming patch for GTK3 OMTC basic using XRender() with Image offscreen surfaces.

This patch simply adds another gfxPrefs (forget about the one I made you review, :nical, in bug 738937), called layers.use-image-offscreen-surfaces. (tell me how you like/don't like the name) and use it to branch/condition in gfxPlatformGtk::CreateOffscreenSurface.

so now, having if UseXRender() and UseImageOffscreenSurfaces() in the condition, we end up having 4 branches for offscreen surface creation. Here's which ones :


1) UseXRender() + !UseImageOffscreenSurfaces
2) !UseXRender() + !UseImageOffscreenSurfaces

Those are the two already present. 1) Makes our layers use gfxXlibSurfaces and use XRender() for compositing and 2) Makes our layers use gfxImageSurfaces and cairo XShm copy for compositing.

3) UseXRender() + UseImageOffscreenSurfaces

New path that will be taken care of by upcoming patches in this bug. In this case, we're forcing layers to use gfxImageSurfaces and we will be maintaining an intermediate cairo-xlib surface on compositing side (at the DataTextureSource level) to buffer image layer content to this cairo-xlib surface using a cairo-copy to be able to keep using XRender() for the "xlib layer->xlib RenderTarget" compositing step.

4) !UseXRender() + UseImageOffscreenSurfaces

New path that will be taken care of by upcoming patches in this bug. In this case, we're forcing layers to use gfxImageSurfaces and, having no support for XRender(), we will simply let our gtkWindow create the usual nsShmImage intermediate target as a ThebesSurface. Layers will get composited to this image target by software copies and the widget will XShmPutImage it onto its xlib drawable once per composited frame.
Depends on: gtk3
See Also: → 738937
Depends on: 1015211
Assignee: nobody → frederic.plourde
No longer depends on: 1015211
This patch fixes compositing for following case :

4) !UseXRender() + UseImageOffscreenSurfaces

When using an OMTC compositor (like BasicCompositor) with forced image offscreen surfaces, our compositor's RenderTarget happens to be an nsShmImage owned by our gtkWidget. In this case, the patch just simply ->Put that ShmImage onto widget's drawable when compositing is done, as it's already done for MTC paths when using image offscreens.
Attachment #8427883 - Flags: review?(nical.bugzilla)
Attachment #8427883 - Flags: review?(bas)
Comment on attachment 8427883 [details] [diff] [review]
Put intermediate nsShmImage RenderTarget to widget's drawable when OMTC + image offscreen surfaces

Review of attachment 8427883 [details] [diff] [review]:
-----------------------------------------------------------------

I am not a peer of Widget and I don't know this code as well as I should to give a proper review. It looks good to me (f+) but i'll let Karl give the actual review.
Attachment #8427883 - Flags: review?(nical.bugzilla)
Attachment #8427883 - Flags: review?(karlt)
Attachment #8427883 - Flags: feedback+
Attachment #8427796 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 8427883 [details] [diff] [review]
Put intermediate nsShmImage RenderTarget to widget's drawable when OMTC + image offscreen surfaces

With OMTC, I assume the composition is uploaded on the compositing thread (or much of the benefit of OMTC would be lost) but this is the main / user event thread.

Perhaps there is also still a synchronous composition in response to expose events?

If so, then an upload is required on this thread, but an upload will still be required on the compositing thread (bug 1015262 perhaps).  I'd like to hold off reviewing this until I understand how/where the compositing thread uploads.

Also, performance measurements in bug 738937 seem to indicate that using nsShmImage does not perform as well as using cairo to upload, so I'm not seeing much benefit in supporting this path with nsShmImage.
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #5)

> With OMTC, I assume the composition is uploaded on the compositing thread
> (or much of the benefit of OMTC would be lost) but this is the main / user
> event thread.

Ah! Thanks for those considerations, Karl... This is getting really interesting !
My comments/explanations below :

> Perhaps there is also still a synchronous composition in response to expose
> events?

Exactly! When using OMTC (no matter if it's 'basic' or 'ogl'...), those lines in nsWindow::OnExposeEvent are all called synchronously :

2121    if (GetLayerManager()->GetBackendType() == LayersBackend::LAYERS_CLIENT) {
2122        listener->PaintWindow(this, region);
2123        listener->DidPaintWindow();
2124        return TRUE;

In the "OMTC basic" case, listener->PaintWindow() will in turn end up in BasicCompositor::Render in a different thread.

> Also, performance measurements in bug 738937 seem to indicate that using
> nsShmImage does not perform as well as using cairo to upload, so I'm not
> seeing much benefit in supporting this path with nsShmImage.

Now, ok you're right, this is suboptimal perf-wise as compared to using XRender with cairo, but we may not have a choice depending on which paths are taken, remember that the proposed patch fixes this path only : 

"4) !UseXRender() + UseImageOffscreenSurfaces for OMTC"

Even if most linux systems do support XRender(), I wouldn't like to see some users without XRender() support fall into this path #4 and get bogus refreshes when Alt-Tabing or moving windows around. Now, speaking of which, you just made me realize that my patch isn't complete, hence the updated one above.

The updated patch just adds a condition in nsWindow::GetThebesSurface (GetThebesSurface provides the compositor Thread with either 1) the gtkWidget cairo-xlib surface to copy to when doing ::EndFrame   or   2) the good-old nsShmImage-wrapped gfxImageSurface intermediate target *if* we're having useShm()) that will AVOID returning the intemediate nsShmImage image surface if we're Supporting XRender(). That new patch now opens up path #3 (detailed in Comment 2)

All that being said, and since we're already forcing synched compositions on Expose events, I'd suggest we review and push that patch.

> If so, then an upload is required on this thread, but an upload will still
> be required on the compositing thread (bug 1015262 perhaps). I'd like to
> hold off reviewing this until I understand how/where the compositing thread
> uploads.

Totally needs an upload on the compositing thread, that's for sure :) and it's coming right up next. I'm finishing the last bits of BasicCompositor TextureSource's ::Update() twists and will post shortly in here. You can hold off reviewing of the patch until then if it makes more sense, I'm fine with that.
Attachment #8427883 - Attachment is obsolete: true
Attachment #8427883 - Flags: review?(karlt)
Attachment #8427883 - Flags: review?(bas)
Attachment #8430162 - Flags: review?(karlt)
This patch is necessary preliminary work (for what's coming up next) that basically just moves DataTextureSourceBasic from its .cpp file into the corresponding header file to allow others to subclass it.
Attachment #8433310 - Flags: review?(karlt)
Attached patch OMTC basic implementation (obsolete) — Splinter Review
This patch is the core implementation of OMTC basic, aligned with comments/suggestions previously made by Karl. It basically buffers Image Layer content to Xlib surfaces on compositor side (X11DataTextureSource::Update) and when time comes to Composite the buffer to our widget's drawable (our ThebesSurface), we're  correctly hitting (I checked by tracing the code) the xlib -> xlib XRender path in cairo.

I haven't gotten any perf. numbers yet, but this is my next task. Stay tuned for more news.

Some considerations about the patch :
====================================

 1) Please Ignore all the gfx/layers/moz.build changes.
    Those are modifs I had to do in order to build because there are blaming files
    in the UNIFIED_SOURCES list that messes with namespaces. This will be fixed properly
    in another bug, thanks.

 2) Concerning the Compositor::SetUserData trick, I know it's a little invasive, but really, since X11 paths/features do not align really well with the DataTextureBufferSource/DataBufferTextureHost mechanisms, I really had to go this way in order to avoid overriding the huge DataBufferTextureHost::Upload function that call mCompositor->CreateDataTextureSource() **without any argument**, whereas I'd need it too pass its xlib surface here. So after all, calling Compositor::SetUserData from my overriden ::MaybeUpload) and then, using that xlibSurface from X11DataTextureSourceBasic did not feel too hackish. Tell me what you think.

 3) One thing I'm not totally happy about is the code duplication between X11MemoryTextureHost and X11ShmemTextureHost  contructors and :MaybeUpload functions.
I tried to come up with a nice polymorphism solution that would put this common X11-related code in, say, X11DataBufferTextureSource, but I realized this would make base classes ambiguous, as in the following ASCII diagram :

 
                         X11MemoryTextureHost      X11ShmemTextureHost
                          /       |                        |
                         /--------+------------------------+
                        /         |                        |
                       /  MemoryTextureHost        ShmemTextureHost
                      /              |                  |
                     /                \                /
           X11BufferTextureHost        \              /
           with xlib-related stuff      \            /
                         \               \          /
                          \------------BufferTextureHost


   So of course, C++ forbids that, as BufferTextureHost becomes an ambiguous base class for X11MemoryTextureHost (via X11BufferTextureHost or via MemoryTextureHost). I thought also about putting the common code in a 'new' helper class, but we're only talking about 30-ish lines... so I kept it that way. I don't feel it's worth the pain, but if you come up with a great idea, lemme know ;-)
Attachment #8433353 - Flags: review?(karlt)
(In reply to Frederic Plourde (:fred23) from comment #2)
> 2) !UseXRender() + !UseImageOffscreenSurfaces
> 
> Those are the two already present.
> [...] 2) Makes our layers use
> gfxImageSurfaces and cairo XShm copy for compositing.

> 4) !UseXRender() + UseImageOffscreenSurfaces
> 
> New path that will be taken care of by upcoming patches in this bug. In this
> case, we're forcing layers to use gfxImageSurfaces and, having no support
> for XRender(), we will simply let our gtkWindow create the usual nsShmImage
> intermediate target as a ThebesSurface. Layers will get composited to this
> image target by software copies and the widget will XShmPutImage it onto its
> xlib drawable once per composited frame.

IIUC the difference between 2 and 4 here is that 2 uses cairo and 4 uses
nsShmImage?  Is that the only difference?

Do you know whether existing code already supports 2?

I'm not sure we need to support both cairo and nsShmImage for uploading layers
for compositing.  I'm OK if you just want to use cairo, but, if you prefer
nsShmImage, I suspect that is going to be more complicated so it would be
nice to do some measurements to indicate that the additional complication is worthwhile.

If we should retain both options, then it may be appropriate to have a
different method on gfxPlatformGtk to indicate whether to use nsShmImage or
cairo.  UseNSShm() maybe more meaningful than UseImageOffscreenSurfaces().
Comment on attachment 8430162 [details] [diff] [review]
Put_nsShmImage_to_xlibDrawable_even_when_on_OMTC_when_using_image_offscreen_surfaces (v2)

>+            mShmImage->Put(mGdkWindow, exposeRegion.mRects);

I won't review this bit until I understand how it is going to work on the
other thread.  This deserves to be in a separate patch to the part below.

>+    // UseXRender() is not the only criteria anymore that may lead to 
>+    // creation of image offscreen surfaces for content in
>+    // gfxPlatformGtk::CreateOffscreenSurface. Even if we're having 
>+    // image surfaces (UseShm = true) here, we will still want to use
>+    // XRender() in the compositor if possible.
>+    if ((nsShmImage::UseShm()) && (!gfxPlatformGtk::GetPlatform()->UseXRender()))

This is good in principal but OnExposeEvent() also tests against UseShm() to
determine the BufferMode, and these tests should be consistent.

Would it make sense to move the UseXRender() call to UseShm()?

When UseXRender() returns false,
ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image
in UseShm() is always true and so can be removed.
Attachment #8430162 - Flags: review?(karlt) → review-
Comment on attachment 8433310 [details] [diff] [review]
Expose DataTextureSourceBasic for Subclassing

This looks fine, but I'll wait and see how it gets used.
Comment on attachment 8433353 [details] [diff] [review]
OMTC basic  implementation

The X11BasicCompositor bits are missing from this patch.

However, we seem to have quite a different view of the purpose of TextureHost.

I see TextureHost as representing a surface and transferring necessary details
between threads or processes.

Here, you are adding uploading to the TextureHost, so that it creates the
uploaded surface, and then passing this surface to the compositor is difficult.

I expect TextureSource is the class to manage uploads and uploaded surfaces.

One of the goals here is unifying the code as much as possible, and part of
that is making platform-specific decisions at consistent places.  It would be
nice if the upload paths with XRender would be similar to those with OpenGL.

(In reply to Frederic Plourde (:fred23) from bug 738937 comment #62)
> (In reply to Karl Tomlinson (needinfo?:karlt) from bug 738937 comment #56)
> > Perhaps this could be hooked up by creating an XRenderCompositor, derived
> > from the BasicCompositor, that overrides CreateDataTextureSource().
> 
> See here, I wouldn't subclass BasicCompositor, because we're still having
> the "basic" behavior... so I'd stick to the "basic" path (which is, instead
> of using, say, OGL to composite, is using cairo... and cairo will call
> XRender() for us to composite). I'm trying to come up with a nice way to
> integrate this into existent classes. 

I read this, but didn't understand the problem with overriding
CreateDataTextureSource().

>+      if (gfxPlatformGtk::GetPlatform()->UseImageOffscreenSurfaces() &&
>+          gfxPlatformGtk::GetPlatform()->UseXRender()) {
>+        compositor = new X11BasicCompositor(mWidget);

I wonder whether the X11BasicCompositor should be used even when
UseImageOffscreenSurfaces() returns false.

Even with UseXRender() && !UseImageOffscreenSurfaces(), there may be some
layers with image surface data that would benefit from an X11 upload path.
Attachment #8433353 - Flags: review?(karlt) → review-
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #10)
> Comment on attachment 8430162 [details] [diff] [review]
> Put_nsShmImage_to_xlibDrawable_even_when_on_OMTC_when_using_image_offscreen_s
> urfaces (v2)
> 
> >+            mShmImage->Put(mGdkWindow, exposeRegion.mRects);
> 
> I won't review this bit until I understand how it is going to work on the
> other thread.  This deserves to be in a separate patch to the part below.

fair enough, just remember that this ->Put is actually needed when doing "synchronous compositions" when, e.g., users clicks window's title bar or alt-tabs the window, etc... In those cases, we're having OnExposeEvents even if we're using a Compositor and in that case (in the UseShm() case), a sync'ed composition is triggered all the way through compositor-side and back so we should be copying final composited frame onto our gtkWidget's xlib surface just as it's being done currently with regular OnExposeEvents for non-Compositor paths.

> >+    // UseXRender() is not the only criteria anymore that may lead to 
> >+    // creation of image offscreen surfaces for content in
> >+    // gfxPlatformGtk::CreateOffscreenSurface. Even if we're having 
> >+    // image surfaces (UseShm = true) here, we will still want to use
> >+    // XRender() in the compositor if possible.
> >+    if ((nsShmImage::UseShm()) && (!gfxPlatformGtk::GetPlatform()->UseXRender()))
> 
> This is good in principal but OnExposeEvent() also tests against UseShm() to
> determine the BufferMode, and these tests should be consistent.

I just put that additional condition there as a means to disable the post-composition nsShmImage step when we're supporting XRender() because not doing so would result in returning an shShmImage-wrapped gfxIMAGEsurface back to compositor and that would completely defeat the purpose of having XRender() support (in fact, if the compositor gets an image surface from here, it's not going to use cairo's XRender at all and we'll fall back to sw)
 
> Would it make sense to move the UseXRender() call to UseShm()?
> 
> When UseXRender() returns false,
> ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image
> in UseShm() is always true and so can be removed.

This would kill path #2 (from Comment 2 above), which is already supported by existing code, but if we really think we can drop this path, then I don't have anything against it.
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #12)
> Comment on attachment 8433353 [details] [diff] [review]
> OMTC basic  implementation
> 
> The X11BasicCompositor bits are missing from this patch.

Duh! Stupid me, I forgot to 'hg add X11BasicCompositor" to my patch, sorry.
Here's the updated patch with those bits in...

But please, keep on reading, because I'm referring to this updated patch for my reply to your below comments about TextureHost. thanks ;-)

> However, we seem to have quite a different view of the purpose of
> TextureHost.
> 
> I see TextureHost as representing a surface and transferring necessary
> details
> between threads or processes.
> 
> Here, you are adding uploading to the TextureHost, so that it creates the
> uploaded surface, and then passing this surface to the compositor is
> difficult.
> I expect TextureSource is the class to manage uploads and uploaded surfaces.

Well, because I initially forgot the X11BasicCompositor bits from my patch, you could not actually see that I indeed implemented the actual upload in X11DataTextureSourceBasic::Update() just as you suggested :).

What you saw in the former patch and probably didn't like is "mXlibSurface" that I stick in both X11MemoryTextureHost and X11ShmemTextureHost. Reason is simple, I initially thought that mFirstSource (within BufferTextureHost::Upload()) was NOT permanent and got overwritten at each Upload() !! But it's not the case (my bad), as it's guarded with if (!mFirstSource) everytime it's used. I will just move mXlibSurface from both TextureHosts and move it into X11DataTextureSourceBasic and that will pretty do the trick ;-)

thanks for catching this ! :)


> One of the goals here is unifying the code as much as possible, and part of
> that is making platform-specific decisions at consistent places.  It would be
> nice if the upload paths with XRender would be similar to those with OpenGL.

yep, with the updated patch coming right up, uploads will be completely done within TextureSources.


> >+      if (gfxPlatformGtk::GetPlatform()->UseImageOffscreenSurfaces() &&
> >+          gfxPlatformGtk::GetPlatform()->UseXRender()) {
> >+        compositor = new X11BasicCompositor(mWidget);


> I wonder whether the X11BasicCompositor should be used even when
> UseImageOffscreenSurfaces() returns false.
> 
> Even with UseXRender() && !UseImageOffscreenSurfaces(), there may be some
> layers with image surface data that would benefit from an X11 upload path.

mhh... I'm not sure I understand what you mean. If UseImageOffscreenSurfaces is false, then we're *not* going to have any layer with image surface data. We will have instead xlib surface layers and this "UseXRender() + !UseImageOffscreenSurfaces()" path is already supported atm with the creation of X11TextureHost from TextureHost::Create().
http://lxr.mozilla.org/mozilla-central/source/gfx/layers/composite/TextureHost.cpp#190

Does that make sense ? Please don't hesitate to add more precision to your question, as I'm not sure I understood correctly. thanks :)
Attached patch OMTC basic implementation (v2) (obsolete) — Splinter Review
(In reply to Frederic Plourde (:fred23) from comment #16)
> (In reply to Karl Tomlinson (needinfo?:karlt) from comment #12)
> > Comment on attachment 8433353 [details] [diff] [review]
> > OMTC basic  implementation
> > 
> > The X11BasicCompositor bits are missing from this patch.
> 
> Duh! Stupid me, I forgot to 'hg add X11BasicCompositor" to my patch, sorry.
> Here's the updated patch with those bits in...

Here's the patch with the X11BasicCompositor bits so everythin that I said above makes sense ;-)
(btw, another updated patch with come right up, with mXlibSource moved to X11DataTextureSourceBasic)
Attachment #8433353 - Attachment is obsolete: true
Attachment #8437660 - Flags: feedback?(karlt)
Attached patch OMTC basic implementation (v2) (obsolete) — Splinter Review
This is the right one, sorry.
Attachment #8437660 - Attachment is obsolete: true
Attachment #8437660 - Flags: feedback?(karlt)
Attachment #8437663 - Flags: feedback?(karlt)
Comment on attachment 8437663 [details] [diff] [review]
OMTC basic  implementation (v2)

Review of attachment 8437663 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/Compositor.h
@@ +503,5 @@
>  
> +  // Some compositor backends need to create specialized TextureSources and will
> +  // need extra UserData to make it happen.
> +  void SetUserData(void* data = nullptr);
> +  void* GetUserData();

I would much prefer to avoid void* style userData. If you need to, add a down-cast method (say, AsX11Compositor() and make the X11 compositor have a SetXlibSurface method (since it seems that's what you are storing somewhere else in the patch).

::: gfx/layers/composite/TextureHost.h
@@ +638,5 @@
>   * Creates a TextureHost that can be used with any of the existing backends
>   * Not all SurfaceDescriptor types are supported
>   */
>  TemporaryRef<TextureHost>
> +CreateDataTextureHost(const SurfaceDescriptor& aDesc,

I think that we should contain the backend stuff in TextureSource and leave this "BackendIndependent"

::: gfx/layers/composite/X11TextureHost.h
@@ +49,5 @@
>    RefPtr<NewTextureSource> mTextureSource;
>    RefPtr<gfxXlibSurface> mSurface;
>  };
>  
> +class X11MemoryTextureHost : public MemoryTextureHost

Would it be possible to move the X11 specific stuff from the TextureHosts To the TextureSources? If so we should do it.
The basic rule is: TextureHost cares about the shared Data (shmem, etc.) and the TextureSource cares about the rendering device stuff (gl texture, etc.). Sometimes they need to be coupled because the shared data and the rendering device data are the same (for example with Gralloc), but in the general case it's best to keep backend stuff into TextureSource.

::: gfx/layers/d3d11/TextureD3D11.cpp
@@ +129,5 @@
>    RefPtr<TextureHost> result;
>    switch (aDesc.type()) {
>      case SurfaceDescriptor::TSurfaceDescriptorShmem:
>      case SurfaceDescriptor::TSurfaceDescriptorMemory: {
> +      result = CreateTextureHost(aDesc, aDeallocator, aFlags);

I think that you meant CreateDataTextureHost here (CreateTextureHost doesn't seem to exist)
(In reply to Frederic Plourde (:fred23) from comment #15)
> > >+    // UseXRender() is not the only criteria anymore that may lead to 
> > >+    // creation of image offscreen surfaces for content in
> > >+    // gfxPlatformGtk::CreateOffscreenSurface. Even if we're having 
> > >+    // image surfaces (UseShm = true) here, we will still want to use
> > >+    // XRender() in the compositor if possible.
> > >+    if ((nsShmImage::UseShm()) && (!gfxPlatformGtk::GetPlatform()->UseXRender()))
> > 
> > This is good in principal but OnExposeEvent() also tests against UseShm() to
> > determine the BufferMode, and these tests should be consistent.
> 
> I just put that additional condition there as a means to disable the
> post-composition nsShmImage step when we're supporting XRender() because not
> doing so would result in returning an shShmImage-wrapped gfxIMAGEsurface
> back to compositor and that would completely defeat the purpose of having
> XRender() support (in fact, if the compositor gets an image surface from
> here, it's not going to use cairo's XRender at all and we'll fall back to sw)

That makes sense.

> > Would it make sense to move the UseXRender() call to UseShm()?
> > 
> > When UseXRender() returns false,
> > ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image
> > in UseShm() is always true and so can be removed.
> 
> This would kill path #2 (from Comment 2 above), which is already supported
> by existing code, but if we really think we can drop this path, then I don't
> have anything against it.

Why would it kill path #2, where UseXRender() returns false?
(In reply to Frederic Plourde (:fred23) from comment #16)
> (In reply to Karl Tomlinson (needinfo?:karlt) from comment #12)
> > >+      if (gfxPlatformGtk::GetPlatform()->UseImageOffscreenSurfaces() &&
> > >+          gfxPlatformGtk::GetPlatform()->UseXRender()) {
> > >+        compositor = new X11BasicCompositor(mWidget);
> 
> > I wonder whether the X11BasicCompositor should be used even when
> > UseImageOffscreenSurfaces() returns false.
> > 
> > Even with UseXRender() && !UseImageOffscreenSurfaces(), there may be some
> > layers with image surface data that would benefit from an X11 upload path.
> 
> mhh... I'm not sure I understand what you mean. If UseImageOffscreenSurfaces
> is false, then we're *not* going to have any layer with image surface data.
> We will have instead xlib surface layers and this "UseXRender() +
> !UseImageOffscreenSurfaces()" path is already supported atm with the
> creation of X11TextureHost from TextureHost::Create().
> http://lxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> TextureHost.cpp#190

If !UseImageOffscreenSurfaces(), we won't have and content/Thebes layer with
Xlib surfaces, but there are other layer types that may have image surfaces:
Canvas, Image (where image layer may have either an image surface or xlib surface).
Comment on attachment 8433310 [details] [diff] [review]
Expose DataTextureSourceBasic for Subclassing

AFAIS X11DataTextureSourceBasic only inherits mSurface, GetFormat(), and
DeallocateDeviceData() from DataTextureSourceBasic, but I'm not clear that
mSurface needs to be retained nor that DeallocateDeviceData() does enough.

I suspect it might be simpler to inherit directly from DataTextureSource and
TextureSourceBasic.
Attachment #8433310 - Flags: review?(karlt)
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #20)
> 
> The description of configuration 2 in comment 2 says "Makes our layers use
> gfxImageSurfaces" and gfxPlatformGtk::CreateOffscreenSurface() returns an
> image (not xlib) surface when !UseXRender().  Where are xlib surfaces
> involved
> in configuration 2 but not configuration 4?

Damn, my bad, you're absolutely right... config #2 gives rise to gfxImageSurface.

> 
> I thought bug 1015262 meant that any nsShmImage path was already broken for
> OMTC and I didn't notice your changes making it worse.
> 
> I don't see how configuration 2 gets Xlib surfaces for content.

Again, my bad, see what I'll do, I'll re-reply to your previous comment asking explanations about the possible paths and I will take care this time ;).. hereunder :

==================


(In reply to Karl Tomlinson (needinfo?:karlt) from comment #9)
> (In reply to Frederic Plourde (:fred23) from comment #2)
> > 2) !UseXRender() + !UseImageOffscreenSurfaces
> > 
> > Those are the two already present.
> > [...] 2) Makes our layers use
> > gfxImageSurfaces and cairo XShm copy for compositing.
> 
> > 4) !UseXRender() + UseImageOffscreenSurfaces
> > 
> > New path that will be taken care of by upcoming patches in this bug. In this
> > case, we're forcing layers to use gfxImageSurfaces and, having no support
> > for XRender(), we will simply let our gtkWindow create the usual nsShmImage
> > intermediate target as a ThebesSurface. Layers will get composited to this
> > image target by software copies and the widget will XShmPutImage it onto its
> > xlib drawable once per composited frame.
> 
> IIUC the difference between 2 and 4 here is that 2 uses cairo and 4 uses
> nsShmImage?  Is that the only difference?

Ok, both #2 and #4 will be using Image surface layers, as you said, because there's no XRender support.
But OMG, I realize there will *not* be any difference between those two paths in the OMTC basic context.
They will both get a nsShmImage ThebesSurface for mRenderTarget so on both cases, they will be using {Shmem,Memory}TextureHosts without any buffuring/uploading to gpu (nothing is accelerated) and BasicCompositor will be rendering image layer content onto -> the nsShmImage-wrapped image surface (using cairo-image-surface sw copy) and then, post-composition, both will have the final frame posted to window drawable using nsShmImage->Put(), which basically just calls XShmPutImage().

Indeed, take a look at the code I wrote for GetThebesSurface, fot the nsShm bit :

if ((nsShmImage::UseShm()) && (!gfxPlatformGtk::GetPlatform()->UseXRender())) 
  then return an nsShmImage-wrapped image surface.


> Do you know whether existing code already supports 2?

Yep, that's still the case. It's supported, just as #4 is.


> I'm not sure we need to support both cairo and nsShmImage for uploading
> layers
> for compositing.  I'm OK if you just want to use cairo, but, if you prefer
> nsShmImage, I suspect that is going to be more complicated so it would be
> nice to do some measurements to indicate that the additional complication is
> worthwhile.

ok, in the new light (explained in previous comments) that the "nsShmImage" you're referring to is only for post-composition posting to the window drawable, I think this comment drops, right ? (since there is no such thing now as "uploading content with nsShmImage" - at the buffer "upload" level. That's an optimization I had implemented for MTC work, but it has been dropped now. When uploading now, it's always done with cairo (and cairo chooses to use accelerated paths if available).

> If we should retain both options, then it may be appropriate to have a
> different method on gfxPlatformGtk to indicate whether to use nsShmImage or
> cairo.  UseNSShm() maybe more meaningful than UseImageOffscreenSurfaces().

Ok, I guess this comment drops too.
But since you made me realize that #2 and #4 will be exactly the same under Basic OMTC, I may reconsider the decision logic that give rise to those paths... I guess it's not really a problem that #2 and #4 are the same, it's just it may indicate I haven't thought this over thoroughly.

thanks for your help !
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #20)
> 
> I thought bug 1015262 meant that any nsShmImage path was already broken for
> OMTC and I didn't notice your changes making it worse.
> 
nope, bug 1015262 is related to the "synch compositions" we talked about previously... The bug title is horrible and I will rephrase/rework the bug now that you draw my attention on it...thanks.

Let's forget 1015262 for now... the post-composition posting with nsShmImage circuitery works pretty well on all paths.
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #21)
> (In reply to Frederic Plourde (:fred23) from comment #15)
> > > >+    // UseXRender() is not the only criteria anymore that may lead to 
> > > >+    // creation of image offscreen surfaces for content in
> > > >+    // gfxPlatformGtk::CreateOffscreenSurface. Even if we're having 
> > > >+    // image surfaces (UseShm = true) here, we will still want to use
> > > >+    // XRender() in the compositor if possible.
> > > >+    if ((nsShmImage::UseShm()) && (!gfxPlatformGtk::GetPlatform()->UseXRender()))
> > > 
> > > This is good in principal but OnExposeEvent() also tests against UseShm() to
> > > determine the BufferMode, and these tests should be consistent.
> > 
> > I just put that additional condition there as a means to disable the
> > post-composition nsShmImage step when we're supporting XRender() because not
> > doing so would result in returning an shShmImage-wrapped gfxIMAGEsurface
> > back to compositor and that would completely defeat the purpose of having
> > XRender() support (in fact, if the compositor gets an image surface from
> > here, it's not going to use cairo's XRender at all and we'll fall back to sw)
> 
> That makes sense.
> 
> > > Would it make sense to move the UseXRender() call to UseShm()?
> > > 
> > > When UseXRender() returns false,
> > > ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image
> > > in UseShm() is always true and so can be removed.
> > 
> > This would kill path #2 (from Comment 2 above), which is already supported
> > by existing code, but if we really think we can drop this path, then I don't
> > have anything against it.
> 
> Why would it kill path #2, where UseXRender() returns false?

again, sorry Karl, that was another bogus reply from me not correctly assuming that #2 was also using gfxImageSurfaces... Now that it's clear, let's see...
let me rewrite the four paths above, with more detail about image surfaces and the provided ThebesSurface for rendering in each case : 


1)  UseXRender() + !UseImageOffscreenSurfaces ==> gfxXlibSurface  layers  //  Xlib ThebesSurface

2) !UseXRender() + !UseImageOffscreenSurfaces ==> gfxImageSurface layers  //  nsShmImage-wrapped Image ThebesSurface

3)  UseXRender() +  UseImageOffscreenSurfaces ==> gfxImageSurface layers  //  Xlib ThebesSurface

4) !UseXRender() +  UseImageOffscreenSurfaces ==> gfxImageSurface layers  //  nsShmImage-wrapped Image ThebesSurface 


I think you're right... By looking at the above, we clearly see that nsShm-wrapped Image ThebesSurface *depends only* on UseXRender() begin FALSE, no matter UseImageOffscreenSurfaces. Now, if you think that conceptually, we could add XRender-related logics to UseShm(), I'm totally fine with changing nsShmImage::UseShm by something like :

bool nsShmImage::UseShm()
{
    return (gfxPlatform::GetPlatform()->ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image && gShmAvailable && !UseXRender());
}


cheers !
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #22)
> (In reply to Frederic Plourde (:fred23) from comment #16)
> > (In reply to Karl Tomlinson (needinfo?:karlt) from comment #12)
> > > >+      if (gfxPlatformGtk::GetPlatform()->UseImageOffscreenSurfaces() &&
> > > >+          gfxPlatformGtk::GetPlatform()->UseXRender()) {
> > > >+        compositor = new X11BasicCompositor(mWidget);
> > 
> > > I wonder whether the X11BasicCompositor should be used even when
> > > UseImageOffscreenSurfaces() returns false.
> > > 
> > > Even with UseXRender() && !UseImageOffscreenSurfaces(), there may be some
> > > layers with image surface data that would benefit from an X11 upload path.
> > 
> > mhh... I'm not sure I understand what you mean. If UseImageOffscreenSurfaces
> > is false, then we're *not* going to have any layer with image surface data.
> > We will have instead xlib surface layers and this "UseXRender() +
> > !UseImageOffscreenSurfaces()" path is already supported atm with the
> > creation of X11TextureHost from TextureHost::Create().
> > http://lxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> > TextureHost.cpp#190
> 
> If !UseImageOffscreenSurfaces(), we won't have and content/Thebes layer with
> Xlib surfaces, but there are other layer types that may have image surfaces:
> Canvas, Image (where image layer may have either an image surface or xlib
> surface).

oh, ok.. I thought all the others layer types *had* to pass through the gtk platform code (CreateOffscreenSurface) to get their surface and then, were consequently falling under the same set of conditions...

If you've got some free cycles to elaborate a little more on this (maybe with lxr urls to point some locations in the code where those layer types get their surfaces), I'd be more that happy to adapt and open X11BasicCompositor to the UseXRender() && !UseImageOffscreenSurfaces() path.
> (In reply to Karl Tomlinson (needinfo?:karlt) from comment #9)
> Ok, both #2 and #4 will be using Image surface layers, as you said, because
> there's no XRender support.
> But OMG, I realize there will *not* be any difference between those two
> paths in the OMTC basic context.
> They will both get a nsShmImage ThebesSurface for mRenderTarget so on both
> cases, they will be using {Shmem,Memory}TextureHosts without any
> buffuring/uploading to gpu (nothing is accelerated) and BasicCompositor will
> be rendering image layer content onto -> the nsShmImage-wrapped image
> surface (using cairo-image-surface sw copy) and then, post-composition, both
> will have the final frame posted to window drawable using nsShmImage->Put(),
> which basically just calls XShmPutImage().

> > Do you know whether existing code already supports 2?
> 
> Yep, that's still the case. It's supported, just as #4 is.

Well it works with MTC, but I don't know where the Put() happens for OMTC (on
the compositor thread), but let's ignore bug 1015262 and the first hunk of
attachment 8430162 [details] [diff] [review], for now at least.

> > I'm not sure we need to support both cairo and nsShmImage for uploading
> > layers for compositing.

> ok, in the new light (explained in previous comments) that the "nsShmImage"
> you're referring to is only for post-composition posting to the window
> drawable, I think this comment drops, right ?

Yes.  Ignore this now.

> > If we should retain both options, then it may be appropriate to have a
> > different method on gfxPlatformGtk to indicate whether to use nsShmImage or
> > cairo.  UseNSShm() maybe more meaningful than UseImageOffscreenSurfaces().
> 
> Ok, I guess this comment drops too.
> But since you made me realize that #2 and #4 will be exactly the same under
> Basic OMTC, I may reconsider the decision logic that give rise to those
> paths... I guess it's not really a problem that #2 and #4 are the same,

Yes, it is not a problem that #2 and #4 are the same.

layers.use-image-offscreen-surfaces is only relevant if
gfx.xrender.enabled is true, and that's fine.
(In reply to Frederic Plourde (:fred23) from comment #26)
> > > > When UseXRender() returns false,
> > > > ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image
> > > > in UseShm() is always true and so can be removed.

> Now, if you think that
> conceptually, we could add XRender-related logics to UseShm(), I'm totally
> fine with changing nsShmImage::UseShm by something like :
> 
> bool nsShmImage::UseShm()
> {
>     return (gfxPlatform::GetPlatform()->ScreenReferenceSurface()->GetType()
> == gfxSurfaceType::Image && gShmAvailable && !UseXRender());
> }

Yes, that means the callers are consistent in their decisions, but there is no need to keep "ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image"
(In reply to Frederic Plourde (:fred23) from comment #27)
> If you've got some free cycles to elaborate a little more on this (maybe
> with lxr urls to point some locations in the code where those layer types
> get their surfaces), I'd be more that happy to adapt and open
> X11BasicCompositor to the UseXRender() && !UseImageOffscreenSurfaces() path.

The process from layer to surface is quite indirect, but
CreateBufferTextureClient() [1] is at the centre of fetching image surfaces
for layers.

Note that when !UseXRender() some layers (notably from windowless plugins) will
have X11/Pixmap surfaces.  Whatever UseXRender() returns, the compositor will
need to be able to handle both surface types, even if one will be more
efficient than the other.

The decision on which compositor is more efficient will depend on the type of
target surface.  I think the only target surface for which we care about
performance is the one from StartRemoteDrawing() and UseXRender() is a
reasonable indicator of what that will be.

[1] http://dxr.mozilla.org/mozilla-central/search?q=%2Bfunction-ref%3A%22mozilla%3A%3Alayers%3A%3ACompositableClient%3A%3ACreateBufferTextureClient%28enum+mozilla%3A%3Agfx%3A%3ASurfaceFormat%2C+enum+mozilla%3A%3Alayers%3A%3ATextureFlags%2C+gfx%3A%3ABackendType%29%22
Attachment #8430162 - Attachment is obsolete: true
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #29)
> > (In reply to Karl Tomlinson (needinfo?:karlt) from comment #9)
> > surface (using cairo-image-surface sw copy) and then, post-composition, both
> > will have the final frame posted to window drawable using nsShmImage->Put(),
> > which basically just calls XShmPutImage().
> 
> > > Do you know whether existing code already supports 2?
> > 
> > Yep, that's still the case. It's supported, just as #4 is.
> 
> Well it works with MTC, but I don't know where the Put() happens for OMTC (on
> the compositor thread), but let's ignore bug 1015262 and the first hunk of
> attachment 8430162 [details] [diff] [review], for now at least.

I think I did not tell you more about this issue after realizing yesterday that #2 and #4 were the same, but I really think you and I are on the same page now... because in fact, you're totally right on this one too (not knowing where the Put() happens for OMTC on comp. thread)... Well, you're right, the "Put()" just does not happen for OMTC, it just can't because it doesn't get an Image ThebesSurfaces from GetThebesSurface but rather, an xlib surface... The bug I had initially created about it (bug 1015262) was based on a false assumption and turned out to be only basically caused by my naive implementation of "GTK3 implementation of StartRemoteDrawing() that basically recycled 'mCr' even outside the scope on OnExposeEvent... So when I was rendering stuff, say, for Ctrl-+ events (when user wants to scale web content), the passed-in mCr cairo context was inexistent for the gtkWindow and the returned ThebesSurface was null... thus inhibiting the compositing. With Chris' implementation of GTK3 support for StartRemoteDrawing/GetThebesSurface() (Bug 1013552), everything's fine, so I closed Bug 1015262 as INVALID. 



(In reply to Karl Tomlinson (needinfo?:karlt) from comment #30)
> (In reply to Frederic Plourde (:fred23) from comment #26)
> > {
> >     return (gfxPlatform::GetPlatform()->ScreenReferenceSurface()->GetType()
> > == gfxSurfaceType::Image && gShmAvailable && !UseXRender());
> > }
> 
> Yes, that means the callers are consistent in their decisions, but there is
> no need to keep "ScreenReferenceSurface()->GetType() ==
> gfxSurfaceType::Image"

You're totally right. the nsShmImage-wrapped image surface returned on useShm() cases now end up being dependant only on !UseXRender().. so I updated it in the patch version 3.
Attached patch OMTC basic implementation (v3) (obsolete) — Splinter Review
Here's version 3 of my xlibSurface-less OMTC basic implementation.

It's completely controlled by gfxPrefs::use-image-offscreen-surfaces and doesn't seem to affect any other path when this pref is off.

I have tested it and it works well, apart from :
  * a weird crash initiated from the js side that doesn't seem related to my work (still to confirm though).

  * crash with plugin not getting the right surface (of course, I will need to patch those cases later on)

Numbers coming up next to compare perf of this solution to the regular case using xlib surface layers.
Below are some replies to Karl and Nical previous comments :


(In reply to Karl Tomlinson (needinfo?:karlt) from comment #23)
> Comment on attachment 8433310 [details] [diff] [review]
> Expose DataTextureSourceBasic for Subclassing
> 
> AFAIS X11DataTextureSourceBasic only inherits mSurface, GetFormat(), and
> DeallocateDeviceData() from DataTextureSourceBasic, but I'm not clear that
> mSurface needs to be retained nor that DeallocateDeviceData() does enough.
> 
> I suspect it might be simpler to inherit directly from DataTextureSource and
> TextureSourceBasic.

Yes, that might be indeed simpler. Let's review v.3 for now, and I'll consider your suggestion and I'll probably implement for v.4.


(In reply to Nicolas Silva [:nical] from comment #19)
> Comment on attachment 8437663 [details] [diff] [review]
> OMTC basic  implementation (v2)
> 
> Review of attachment 8437663 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/Compositor.h
> @@ +503,5 @@
> >  
> > +  // Some compositor backends need to create specialized TextureSources and will
> > +  // need extra UserData to make it happen.
> > +  void SetUserData(void* data = nullptr);
> > +  void* GetUserData();
> 
> I would much prefer to avoid void* style userData. If you need to, add a
> down-cast method (say, AsX11Compositor() and make the X11 compositor have a
> SetXlibSurface method (since it seems that's what you are storing somewhere
> else in the patch).

ok, Done.
I completely removed the SetUserData mechanism and added a Casting helper function in Compositor called AsX11BasicCompositor()


> ::: gfx/layers/composite/TextureHost.h
> @@ +638,5 @@
> >   * Creates a TextureHost that can be used with any of the existing backends
> >   * Not all SurfaceDescriptor types are supported
> >   */
> >  TemporaryRef<TextureHost>
> > +CreateDataTextureHost(const SurfaceDescriptor& aDesc,
> 
> I think that we should contain the backend stuff in TextureSource and leave
> this "BackendIndependent"

Done.

> ::: gfx/layers/composite/X11TextureHost.h
> @@ +49,5 @@
> >    RefPtr<NewTextureSource> mTextureSource;
> >    RefPtr<gfxXlibSurface> mSurface;
> >  };
> >  
> > +class X11MemoryTextureHost : public MemoryTextureHost
> 
> Would it be possible to move the X11 specific stuff from the TextureHosts To
> the TextureSources? If so we should do it.
> The basic rule is: TextureHost cares about the shared Data (shmem, etc.) and
> the TextureSource cares about the rendering device stuff (gl texture, etc.).
> Sometimes they need to be coupled because the shared data and the rendering
> device data are the same (for example with Gralloc), but in the general case
> it's best to keep backend stuff into TextureSource.

Ok, good news, with your comment, joined with another suggestion from Karl previously, I revised my class architectures and successfully removed both X11MemoryTextureHost and X11ShmemTextureHost classes that, anyways, had redundant stuff... Now, We have X11-related stuff well better localized inside X11DataTextureSources,  **but** there's one X11-dependency remaining in TextureHost class, which I couldn't really remove :


+#ifdef MOZ_X11
+#ifdef MOZ_WIDGET_GTK
+  // This TextureHost might have been initiated by an image surface layer
+  // with XRender() support that would like to use it to render buffer layer contents.
+  if (mCompositor) {
+    if (mFlags & TextureFlags::BACKEND_DEPENDANT) {
+      X11BasicCompositor *compositor = mCompositor->AsX11BasicCompositor();
+      if (compositor) {
+        compositor->SetBufferTextureHost(this);
+      }
+    }
+  }
+#endif
+#endif
+


See, Now that I'm using the SAME textureHost as the regular xlib path, I still needed to differenciate my new path's behavior because X11 and Data are intertwined in the proposed solution (I'm basically adding gfxXlibSurface buffering to BufferTextureHosts.... 

So instead of subclassing BufferTextureHost to X11{Memory,Shmem}TextureHost, I added a new TextureFlag called BACKEND_DEPENDANT that'S added at the TextureClient level when some conditions are met and, on compositor side, when this flag is encountered, I can create the appropriate compositor... and it works.

Now, I'd like you to consider that approach and tell me what you think. There might be better options to specialize the BufferTextureHost capabilites with Xlib buffering without subclassing too much (or other side-effects), but I'm out of ideas.

thanks :)

Karl : ^^  You input on the latter of welcome too :)
Attachment #8437663 - Attachment is obsolete: true
Attachment #8437663 - Flags: feedback?(karlt)
Attachment #8439668 - Flags: review?(nical.bugzilla)
Attachment #8439668 - Flags: review?(karlt)
Comment on attachment 8439668 [details] [diff] [review]
OMTC basic implementation (v3)

Review of attachment 8439668 [details] [diff] [review]:
-----------------------------------------------------------------

I need more time to review this, so marking f+ for now.

::: gfx/layers/CompositorTypes.h
@@ +93,5 @@
> +  // For example, X11 Image surface layers with XRender() support will want to
> +  // render layer contents with XRender and will create regular
> +  // BufferTextureHosts, but need this TextureFlags bit to differenciate their
> +  // behavior afterwards
> +  BACKEND_DEPENDANT  = 1 << 18,

I am not found of this flag but I see why you added it and I don't have a better solution to propose, so it will do fine. I like this patch better than the previous version.
Attachment #8439668 - Flags: feedback+
Comment on attachment 8439668 [details] [diff] [review]
OMTC basic implementation (v3)

Review of attachment 8439668 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/basic/X11BasicCompositor.cpp
@@ +97,5 @@
> +  if (mXlibDrawTarget) {
> +    return true;
> +  }
> +
> +  if (!EnsureXlibSurface()) {  

Trailing spaces to fix in version 4

::: gfx/layers/basic/X11BasicCompositor.h
@@ +35,5 @@
> +  bool EnsureXlibSurface();
> +
> +  BasicCompositor* mCompositor;
> +
> +  // *mTextureHost will live certainly longer than 

Trailing spaces to fix in version 4

@@ +44,5 @@
> +  // We are going to buffer layer content on this xlib surface.
> +  // Note: We could have used a cairo_surface_t here, as it's prettier...
> +  // But on the other hand, we are creating a X11TextureSourceBasic
> +  // hereunder that requires an explicit gfxXlibSurface so let's keep that.
> +  nsRefPtr<gfxXlibSurface> mXlibSurface; 

Trailing spaces to fix in version 4
Comment on attachment 8439668 [details] [diff] [review]
OMTC basic implementation (v3)

Review of attachment 8439668 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good except the TextureHost* pointer business in the compositor and the texture source, see comment below. I read that a new version of this patch is coming so I'm cancelling the review until then.

::: gfx/layers/basic/X11BasicCompositor.h
@@ +38,5 @@
> +
> +  // *mTextureHost will live certainly longer than 
> +  // the current X11DataTextureSourceBasic lifetime, so we can
> +  // safely hold a ref on this BufferTextureHost that created us.
> +  BufferTextureHost *mBufferTextureHost;

My understanding is that you are temporarily keeping mBufferTextureHost here because the X11DataTextureSource will need the reference and because CreateDataTextureSource doesn't have a TextureHost* parameter.
It took me some time to figure out why the compositor was depending one one given TextureHost, I think this is misleading. 
If you really need the DataTextureSource to hold a reference to the host, I'd prefer that you instead extend the CreateDataTextureSource method into taking an optional TextureHost* parameter (that can be null) and have the call sites fixed up whenever it makes sense.
But I am actually not sure you need this: The BufferTextureHost* pointer you are keeping in the TextureSource seems to only be used in EnsureXlibSurface to get a raw pointers to the pixels and a SurfaceFormat. Why not have this happen in DataTextureSource::Update? In Update you have a DataSourceSurface which provides you with the pointer and format you need. This way you don't need to keep an unsafe pointer to the TextureHost, and I suppose it would make it possible to use the X11DataTextureSource without a TextureHost, which we need to render some of our debugging stuff like the FPS counter.
Attachment #8439668 - Flags: review?(nical.bugzilla)
Comment on attachment 8439668 [details] [diff] [review]
OMTC basic implementation (v3)

What Nicolas says about using the DataSourceSurface from Update() sounds good.

It would be more conventional if AsSourceBasic() were to return this.
The format of the DataSourceSurface can be recorded if that is easier than getting the format from the Xlib surface.

I assume a new Xlib surface will need to be created if the size or format of the DataSourceSurface in Update() is different, like in TextureImageTextureSourceOGL::Update().
Attachment #8439668 - Flags: review?(karlt) → review-
Attachment #8433310 - Attachment is obsolete: true
hey guys, since I feel we're approaching final review round, here's this patch, exactly the same as before, but with my commiter email + name header included.

Nical : you had this patch reviewed r+ before. You might just wanna r+ it now, as it's exactly the same.

Karl : any comment on it ?  I have marked you as reviewer '?', thanks.
Attachment #8427796 - Attachment is obsolete: true
Attachment #8444182 - Flags: review?(nical.bugzilla)
Attachment #8444182 - Flags: review?(karlt)
Attached patch OMTC basic implementation (v4) (obsolete) — Splinter Review
Hi guys, 
thanks to both of you for your constructive comments, it's really appreciated.
I have included most of your suggestions, 
see comments below :

**************

(In reply to Karl Tomlinson (needinfo?:karlt) from comment #23)
> Comment on attachment 8433310 [details] [diff] [review]
> Expose DataTextureSourceBasic for Subclassing
> 
> AFAIS X11DataTextureSourceBasic only inherits mSurface, GetFormat(), and
> DeallocateDeviceData() from DataTextureSourceBasic,
> I suspect it might be simpler to inherit directly from DataTextureSource and
> TextureSourceBasic.

One comment from you Karl - since long time ago - that I hadn't forgotten.
Subclassing DataTextureSourceBasic was - as you suggested - a little foolish, 
since I was inheriting a *very* tiny portion of the class. So in the updated 
patch v.4, I'm deriving from DataTextureSource + TextureSourceBasic and there's 
no need anymore to make DataTextureSourceBasic, so that's the reason why
I made the first patch 'obsolete'.

> but I'm not clear that
> mSurface needs to be retained nor that DeallocateDeviceData() does enough.

Done. I have implemented ::DeallocatedDeviceData() in a more complete way now.

**************

(In reply to Nicolas Silva [:nical] from comment #36)
> Comment on attachment 8439668 [details] [diff] [review]
> But I am actually not sure you need this: The BufferTextureHost* pointer you
> are keeping in the TextureSource seems to only be used in EnsureXlibSurface
> to get a raw pointers to the pixels and a SurfaceFormat. Why not have this
> happen in DataTextureSource::Update?

Good point, you're right in fact ! :)
I have done exactly as you suggested and it's way simpler like that. No need for SetTextureHost(),
no need for a Compositor::AsX11BasicCompositor() cast function anymore... And more importantly, 
I could even drop the BACKEND_DEPENDENT TextureFlags that I disliked ... The differentiated
X11 functionnality now only comes from the creation of a X11BasicCompositor compositor in 
CompositorParent::InitializeLayerManager() when XRender() is supported. thanks !

**************

(In reply to Karl Tomlinson (needinfo?:karlt) from comment #37)
> Comment on attachment 8439668 [details] [diff] [review]
> OMTC basic implementation (v3)
>
> It would be more conventional if AsSourceBasic() were to return this.

Done. I have made AsSourceBasic() return 'this' and instead, I have
made GetSurface() return "mXlibTextureSource->GetSurface(aTarget)".

> The format of the DataSourceSurface can be recorded if that is easier than
> getting the format from the Xlib surface.

All in all, I didn't have to record the format inside DataSourceSurface.
I could read it directly from the passed-in SourceSurface in ::Update().

> I assume a new Xlib surface will need to be created if the size or format of
> the DataSourceSurface in Update() is different, like in
> TextureImageTextureSourceOGL::Update().

Done. You're right!  I forgot about that. Now, when the size changes, I
recreate the xlib surface with proper size.
Attachment #8439668 - Attachment is obsolete: true
Attachment #8444191 - Flags: review?(nical.bugzilla)
Attachment #8444191 - Flags: review?(karlt)
Attached patch OMTC basic implementation (v4) (obsolete) — Splinter Review
Sorry, review this one guys, with comitter's info. thanks.
Attachment #8444182 - Attachment is obsolete: true
Attachment #8444182 - Flags: review?(nical.bugzilla)
Attachment #8444182 - Flags: review?(karlt)
Attachment #8444192 - Flags: review?(nical.bugzilla)
Attachment #8444192 - Flags: review?(karlt)
Attachment #8444191 - Attachment is obsolete: true
Attachment #8444191 - Flags: review?(nical.bugzilla)
Attachment #8444191 - Flags: review?(karlt)
Attachment #8444182 - Attachment is obsolete: false
Attachment #8444182 - Flags: review?(nical.bugzilla)
Attachment #8444182 - Flags: review?(karlt)
Ah yes !... I forgot to mention.
I did test the patch on my local GTK3 build.. I couldn't notice any glitch. everything seemed to be working just fine (appart from plugin rendering of course, that I'll tackle next).

Nical : I would ask you of course to push the patch to the try server...but we don't have that option right now, as the buildservers do not support GTK3... I heard it's coming up soon, though.
Nical is away for a couple of weeks

Bas, who can review this instead?

Fred - see bug 1016641.
Flags: needinfo?(bas)
Comment on attachment 8444182 [details] [diff] [review]
Remove_Xlib_surfaces_from_CreateOffscreenSurface.patch

Nical already OKed most of this, so I think I can review.

>+    bool UseImageOffscreenSurfaces()
>+    {
>+        // We want to turn on image offscreen surfaces ONLY for GTK3 builds
>+        // since GTK2 theme rendering still requires xlib surfaces per se.
>+    #if (MOZ_WIDGET_GTK == 3)
>+        return gfxPrefs::UseImageOffscreenSurfaces();
>+    #else
>+        return false;
>+    #endif

This file seems to put the opening '{' after the function name for inline
method definitions.

Please move the preprocessor directives to the left hand margin, as they are
elsewhere in this file.

The change to nsShmImage::UseShm() really belongs in this patch, as
layers.use-image-offscreen-surfaces doesn't add much more than
gfx.xrender.enabled without that change.

>+#ifdef MOZ_WIDGET_GTK
>+#include "gfxPlatformGtk.h"
>+#endif

> bool nsShmImage::UseShm()
> {
>-    return gfxPlatform::GetPlatform()->
>-        ScreenReferenceSurface()->GetType() == gfxSurfaceType::Image
>-        && gShmAvailable;
>+    return (gShmAvailable && !gfxPlatformGtk::GetPlatform()->UseXRender());
> }

The gfxPlatformGtk usage should be #ifdef MOZ_WIDGET_GTK.
gfxQtPlatform always uses gfxSurfaceType::Image, so this can just return 
gShmAvailable for #else, and the "gfxPlatform.h" include can be removed.
Attachment #8444182 - Flags: review?(nical.bugzilla)
Attachment #8444182 - Flags: review?(karlt)
Attachment #8444182 - Flags: review-
Comment on attachment 8444192 [details] [diff] [review]
OMTC basic implementation (v4)

I think it would be worth running this through reftests on your development
machine (./mach reftest).  You'll need to run without the patch also to
compare results because some reftests will fail on some systems.

Do you have any performance measurements?

>+    if (aSurface->GetSize() != mUpdateSurface->GetSize()) {
>+      DeallocateDeviceData();

Need to also create a new xlib draw target when the format changes.

>+     // We're uploading to our buffer, so let's just plainly overwrite
>+     // contents (OP_SOURCE)
>+     mXlibDrawTarget->DrawSurface(mUpdateSurface, rect, rect, options,
>+                                 DrawOptions(1.0f, CompositionOp::OP_SOURCE));

CopySurface() is sugar for this kind of DrawSurface(), but would be more
explicit about what is happening and avoid the int/float conversions for
rects.

>+SourceSurface*
>+X11DataTextureSourceBasic::GetSurface(DrawTarget* aTarget)

Return mXlibDrawTarget->Snapshot() (when mXlibDrawTarget is set).
Then mXlibTextureSource, and mXlibSurface, and mCompositor won't be required.

>+  nsRefPtr<mozilla::gfx::SourceSurface> mUpdateSurface;

I don't think it is necessary to keep mUpdateSurface as mXlibDrawTarget has
similar format and size methods.

>diff --git a/gfx/layers/client/TextureClient.cpp b/gfx/layers/client/TextureClient.cpp

>+#ifdef MOZ_WIDGET_GTK
>+#include "gfxPlatformGtk.h"             // For gfxPlatformGtk
>+#endif

>diff --git a/gfx/layers/composite/TextureHost.cpp b/gfx/layers/composite/TextureHost.cpp

>+#include "mozilla/layers/X11BasicCompositor.h"
>+#ifdef MOZ_WIDGET_GTK
>+#include "gfxPlatformGtk.h"                 // for gfxPlatformGtk
>+#endif

>diff --git a/gfx/layers/composite/X11TextureHost.cpp b/gfx/layers/composite/X11TextureHost.cpp

>-#include "mozilla/layers/BasicCompositor.h"
>+#include "mozilla/layers/X11BasicCompositor.h"

These changes are no longer required.

>diff --git a/gfx/layers/ipc/CompositorParent.cpp b/gfx/layers/ipc/CompositorParent.cpp

>+#include "mozilla/layers/X11BasicCompositor.h" // for X11BasicCompositor

Need to ensure this compiles on non-GTK/X11 platforms.

Also, I think this can be "basic/X11BasicCompositor.h" so that the header need
not be exported.

>+#ifdef MOZ_X11
>+#ifdef MOZ_WIDGET_GTK

MOZ_X11 is always defined when MOZ_WIDGET_GTK is defined, so its test is not necessary, but if you wish to keep it please collect these tests into a single
conditional line.

>+      if (gfxPlatformGtk::GetPlatform()->UseImageOffscreenSurfaces() &&
>+          gfxPlatformGtk::GetPlatform()->UseXRender()) {
>+        compositor = new X11BasicCompositor(mWidget);

Remove the UseImageOffscreenSurfaces() test, please. (Comment 31)
That will also mean that this is used and tested to some extent even with GTK2.
Attachment #8444192 - Flags: review?(nical.bugzilla)
Attachment #8444192 - Flags: review?(karlt)
Attachment #8444192 - Flags: review-
KarlT does seem like an excellent candidate!
Flags: needinfo?(bas)
thanks for the nice review, Karl!
I'll update with patch v.5 very shortly !

(In reply to Karl Tomlinson (needinfo?:karlt) from comment #44)
> Comment on attachment 8444192 [details] [diff] [review]
> OMTC basic implementation (v4)
> 
> I think it would be worth running this through reftests on your development
> machine (./mach reftest).  You'll need to run without the patch also to
> compare results because some reftests will fail on some systems.

allright, 

> Do you have any performance measurements?

yep, I struggled a little with talos locally, but I have the nonPatched set of test results already in :
Here's the list of tests I drove :

tcanvasmark:cart:tart:tpaint:tresize:tscrollx:ts:ts_paint

I'll post them, along with the patchV.4 results (to compare) tomorrow.
> (In reply to Karl Tomlinson (needinfo?:karlt) from comment #44)
> > Do you have any performance measurements?

By the way, I'll get level 1 access on the try server very shortly, so I'll be able to post test jobs and we'll get better pictures of what's going on.

but for now, I have driven some local talos tests on my machine, and while some results show increases, the big picture is not so encouraging...

For both tests, I used those default prefs :

pref("layers.use-image-offscreen-surfaces", true);
pref("layers.offmainthreadcomposition.force-enabled", true);
pref("layers.offmainthreadcomposition.force-basic", true);
pref("layers.offmainthreadcomposition.enabled", true);

and no layers.acceleration what so ever.

Look at the result file attached.
The "noPatch" section is the one I got with regular, non-patched code with 'hg parent' = 189697:49b7b4e89e7c
The "patchv4" section is the one I got from running patch v4 (above) over the same parent commit.

Tell me what you think.
The patch is not so complicated... shouldn't we get clear perf gains by using it ?

also, could ->DrawSurface(...) instead of ->CopySurface() explain it ?
I'm implementing patch v.5 now and will use ->CopySurface to buffer layer's contents.

thanks.
(In reply to Frederic Plourde (:fred23) from comment #47)
> Created attachment 8445941 [details]
> Test results for OMTC basic with NoPatch and Patchv4

Are these GTK3 system-cairo builds?

What X11 graphics driver is in use?

> For both tests, I used those default prefs :
> 
> pref("layers.use-image-offscreen-surfaces", true);
> pref("layers.offmainthreadcomposition.force-enabled", true);
> pref("layers.offmainthreadcomposition.force-basic", true);
> pref("layers.offmainthreadcomposition.enabled", true);

> The "noPatch" section is the one I got with regular, non-patched code with
> 'hg parent' = 189697:49b7b4e89e7c

This is using Xlib surfaces from CreateOffscreenSurface for thebes/content layers, I assume?

> The "patchv4" section is the one I got from running patch v4 (above) over
> the same parent commit.

Was attachment 8444182 [details] [diff] [review] applied?

(I assume so because attachment 8444192 [details] [diff] [review] wouldn't compile without attachment 8444182 [details] [diff] [review].)
Comment on attachment 8446099 [details]
NoPatch REFTEST results

> REFTEST TEST-LOAD | file:///home/fredinfinite23/code/mozilla-central/layout/reftests/w3c-css/submitted/css21/pagination/moz-css21-block-page-break-inside-avoid-8.html | 651 / 11040 (5%)
> TEST-INFO | Main app process: killed by out-of-range signal, number 117

117 may be a bug in the test harness, so let's ignore that.

> TEST-UNEXPECTED-FAIL | file:///home/fredinfinite23/code/mozilla-central/layout/reftests/w3c-css/submitted/css21/pagination/moz-css21-block-page-break-inside-avoid-8.html | Exited with code -11 during test run

-11 sounds like SEGV.  You have 3 minutes available to find the process, attach the debugger and get a stack trace.
Comment on attachment 8445941 [details]
Test results for OMTC basic with NoPatch vs Patchv4+UseImageOffscreenSurfaces()

I don't know exactly which values tbpl pulls out of these data sets (and the main goal is to get before and after measurements, so that doesn't matter so much assuming we can guess whether higher or lower is better), but even some of the noPatch numbers look quite large (e.g. 680.03 for tpaint), compared with what tbpl reports (190.40).  Was this from an optimized build?
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #49)
> (In reply to Frederic Plourde (:fred23) from comment #47)
> > Created attachment 8445941 [details]
> > Test results for OMTC basic with NoPatch and Patchv4
> 
> Are these GTK3 system-cairo builds?

Yep, here's the .mozconfig that I used :

****
. $topsrcdir/browser/config/mozconfig

mk_add_options MOZ_MAKE_FLAGS="-j4"

# Disable optimization (for cleaner gdb tracing)
ac_add_options --disable-optimize 

# we want the gtk3 build
ac_add_options --enable-default-toolkit=cairo-gtk3
ac_add_options --enable-system-cairo
ac_add_options --enable-system-pixman

# crashreporter is causing me trouble with GTK3
ac_add_options --disable-crashreporter
*****
 
> What X11 graphics driver is in use?

Intel_drv (xserver-xorg-video-intel 2:2.21.6-0ubuntu4), running on :
Intel Corporation 3rd Gen Core processor Graphics Controller IvyBridge
with 'i915' kernel module
 
> > For both tests, I used those default prefs :
> > 
> > pref("layers.use-image-offscreen-surfaces", true);
> > pref("layers.offmainthreadcomposition.force-enabled", true);
> > pref("layers.offmainthreadcomposition.force-basic", true);
> > pref("layers.offmainthreadcomposition.enabled", true);
> 
> > The "noPatch" section is the one I got with regular, non-patched code with
> > 'hg parent' = 189697:49b7b4e89e7c
> 
> This is using Xlib surfaces from CreateOffscreenSurface for thebes/content
> layers, I assume?

yes, as in:
http://lxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxPlatformGtk.cpp#103

> > The "patchv4" section is the one I got from running patch v4 (above) over
> > the same parent commit.
> 
> Was attachment 8444182 [details] [diff] [review] applied?

yes, sorry for misunderstanding.
The Patchv4 results came from applying 
8444182 + 8444192.
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #50)
> Comment on attachment 8446099 [details]
> NoPatch REFTEST results
> 
> > REFTEST TEST-LOAD | file:///home/fredinfinite23/code/mozilla-central/layout/reftests/w3c-css/submitted/css21/pagination/moz-css21-block-page-break-inside-avoid-8.html | 651 / 11040 (5%)
> > TEST-INFO | Main app process: killed by out-of-range signal, number 117
> 
> 117 may be a bug in the test harness, so let's ignore that.

I'm glad you figured why I attached this result output alone without much explanations...
I got this bug here and the one below, which made the reftests stop at test #654 / 11040 !
And the funny thing is that, with my patches applied, all the 11040 tests go through (some failed, but all the tests did run). I'll update the comparative reftest results (noPatch + patchv4) as soon as I get all reftests to run with the noPatch code.

> > TEST-UNEXPECTED-FAIL | file:///home/fredinfinite23/code/mozilla-central/layout/reftests/w3c-css/submitted/css21/pagination/moz-css21-block-page-break-inside-avoid-8.html | Exited with code -11 during test run
> 
> -11 sounds like SEGV.  You have 3 minutes available to find the process,
> attach the debugger and get a stack trace.

ah! thanks for the tip, I'll trace that.
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #51)
> Comment on attachment 8445941 [details]
> Test results for OMTC basic with NoPatch and Patchv4
> 
> I don't know exactly which values tbpl pulls out of these data sets (and the
> main goal is to get before and after measurements, so that doesn't matter so
> much assuming we can guess whether higher or lower is better), but even some
> of the noPatch numbers look quite large (e.g. 680.03 for tpaint), compared
> with what tbpl reports (190.40).  Was this from an optimized build?

as you can see in my posted .mozconfig file (Comment 52), this is from an 
non-optimized build.

since what mattered to me, as you said, was before/after readings, I didn't care so much about optimization at this level., So no worries ;-)

Anyways, within a few hours (I hope) / days, I'll have access to the try servers.
Depends on: 1010235
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #50)
> Comment on attachment 8446099 [details]
> NoPatch REFTEST results
> 
> > REFTEST TEST-LOAD | file:///home/fredinfinite23/code/mozilla-central/layout/reftests/w3c-css/submitted/css21/pagination/moz-css21-block-page-break-inside-avoid-8.html | 651 / 11040 (5%)
> > TEST-INFO | Main app process: killed by out-of-range signal, number 117
> 
> 117 may be a bug in the test harness, so let's ignore that.
> 
> > TEST-UNEXPECTED-FAIL | file:///home/fredinfinite23/code/mozilla-central/layout/reftests/w3c-css/submitted/css21/pagination/moz-css21-block-page-break-inside-avoid-8.html | Exited with code -11 during test run
> 
> -11 sounds like SEGV.  You have 3 minutes available to find the process,
> attach the debugger and get a stack trace.

argh, I have lost a significant amount of time on that today...
Can't seem to be able to attach to anything after the crash, seems this bug kills all the test processes right away.
Also, even weirder, for reasons/usecases/conditions I can't yet understand/reproduce, I could make those crash disappear (but I got other crashes a little later) sometimes...

What's interesting is that I've tried to run the reftests on the noPatch code *without* any 'pref' modifs, that is, withOUT 

offmainthreadcomposition.enabled = true
offmainthreadcomposition.force-basic = true

...(so BasicCompositor was not kicking in without the offmaintheadcomposition.force-basic pref) and it *passed*. This mean we're having something weird happenning when using regular BasicCompositor with OMTC

I'll investigate some more.
Hey Karl, Got the trace, for what it's worth :

#0  0x00002aaab4ee9e3c in PL_DHashTableFinish (table=0x2aaadaba0170)
    at /home/fredinfinite23/code/mozilla-central/xpcom/glue/pldhash.cpp:326
#1  0x00002aaab5d515bf in ~nsBaseHashtable (this=0x2aaadaba0170, __in_chrg=<optimized out>)
    at ../../dist/include/nsBaseHashtable.h:52
#2  ~nsRefPtrHashtable (this=0x2aaadaba0170, __in_chrg=<optimized out>)
    at ../../dist/include/nsRefPtrHashtable.h:22
#3  nsGlobalWindow::~nsGlobalWindow (this=0x2aaadaba0000, __in_chrg=<optimized out>)
    at /home/fredinfinite23/code/mozilla-central/dom/base/nsGlobalWindow.cpp:1244
#4  0x00002aaab5d51662 in nsGlobalWindow::~nsGlobalWindow (this=0x2aaadaba0000, 
    __in_chrg=<optimized out>)
    at /home/fredinfinite23/code/mozilla-central/dom/base/nsGlobalWindow.cpp:1344
#5  0x00002aaab4f160ee in SnowWhiteKiller::~SnowWhiteKiller (this=0x7fffffffba78, 
    __in_chrg=<optimized out>)
    at /home/fredinfinite23/code/mozilla-central/xpcom/base/nsCycleCollector.cpp:2611
#6  0x00002aaab4f0d456 in nsCycleCollector::FreeSnowWhite (this=0x2aaabb0f5000, 
    aUntilNoSWInPurpleBuffer=aUntilNoSWInPurpleBuffer@entry=false)
    at /home/fredinfinite23/code/mozilla-central/xpcom/base/nsCycleCollector.cpp:2777
#7  0x00002aaab4f0d504 in nsCycleCollector_doDeferredDeletion ()
    at /home/fredinfinite23/code/mozilla-central/xpcom/base/nsCycleCollector.cpp:4108
#8  0x00002aaab5c8be93 in AsyncFreeSnowWhite::Run (this=0x2aaabcadcfc0)
    at /home/fredinfinite23/code/mozilla-central/js/xpconnect/src/XPCJSRuntime.cpp:214
#9  0x00002aaab4f53ef2 in ProcessNextEvent (aResult=0x7fffffffbb9f, aMayWait=false, this=0x2aaaabd710c0)
    at /home/fredinfinite23/code/mozilla-central/xpcom/threads/nsThread.cpp:766
#10 nsThread::ProcessNextEvent (this=0x2aaaabd710c0, aMayWait=<optimized out>, aResult=0x7fffffffbb9f)
    at /home/fredinfinite23/code/mozilla-central/xpcom/threads/nsThread.cpp:685
#11 0x00002aaab4eeaf45 in NS_ProcessNextEvent (thread=<optimized out>, mayWait=<optimized out>)
    at /home/fredinfinite23/code/mozilla-central/xpcom/glue/nsThreadUtils.cpp:263
#12 0x00002aaab51e6ccc in mozilla::ipc::MessagePump::Run (this=0x2aaabb0ab280, aDelegate=0x2aaaabd756e0)
    at /home/fredinfinite23/code/mozilla-central/ipc/glue/MessagePump.cpp:95
#13 0x00002aaab51c544a in MessageLoop::RunInternal (this=this@entry=0x2aaaabd756e0)
    at /home/fredinfinite23/code/mozilla-central/ipc/chromium/src/base/message_loop.cc:229
#14 0x00002aaab51c54be in RunHandler (this=0x2aaaabd756e0)
    at /home/fredinfinite23/code/mozilla-central/ipc/chromium/src/base/message_loop.cc:222
#15 MessageLoop::Run (this=0x2aaaabd756e0)
    at /home/fredinfinite23/code/mozilla-central/ipc/chromium/src/base/message_loop.cc:196
#16 0x00002aaab5bfe981 in nsBaseAppShell::Run (this=0x2aaabe8b1350)
    at /home/fredinfinite23/code/mozilla-central/widget/xpwidgets/nsBaseAppShell.cpp:164
#17 0x00002aaab692f935 in nsAppStartup::Run (this=0x2aaabeb6ff10)
    at /home/fredinfinite23/code/mozilla-central/toolkit/components/startup/nsAppStartup.cpp:278
#18 0x00002aaab68b99ec in XREMain::XRE_mainRun (this=this@entry=0x7fffffffbe60)
    at /home/fredinfinite23/code/mozilla-central/toolkit/xre/nsAppRunner.cpp:4011
#19 0x00002aaab68bc313 in XREMain::XRE_main (this=this@entry=0x7fffffffbe60, argc=argc@entry=6, 
    argv=argv@entry=0x7fffffffd358, aAppData=aAppData@entry=0x7fffffffc050)
    at /home/fredinfinite23/code/mozilla-central/toolkit/xre/nsAppRunner.cpp:4082
#20 0x00002aaab68bc56a in XRE_main (argc=6, argv=0x7fffffffd358, aAppData=0x7fffffffc050, 
    aFlags=<optimized out>) at /home/fredinfinite23/code/mozilla-central/toolkit/xre/nsAppRunner.cpp:4296
#21 0x0000000000404354 in do_main (argc=argc@entry=6, argv=argv@entry=0x7fffffffd358, xreDirectory=
    0x2aaaabd526c0) at /home/fredinfinite23/code/mozilla-central/browser/app/nsBrowserApp.cpp:282
#22 0x0000000000404859 in main (argc=6, argv=0x7fffffffd358)
    at /home/fredinfinite23/code/mozilla-central/browser/app/nsBrowserApp.cpp:643


any hint ?
(In reply to Frederic Plourde (:fred23) from comment #56)
That's going to be a difficult one to track down.  I suggest moving on to something else.  And the try server will enable you to run tests in parallel.
(In reply to Frederic Plourde (:fred23) from comment #47)
> Created attachment 8445941 [details]
> Test results for OMTC basic with NoPatch and Patchv4

> The patch is not so complicated... shouldn't we get clear perf gains by
> using it ?

I suspect most of the differences in the performance are from switching content surfaces from xlib to image.  Big differences in performance are expected when changing from server/gpu rendering to client/cpu.

X11DataTextureSourceBasic is about trying to optimize (cache) uploads from client to server, which is not usually necessary with server-side rendering (on xlib surfaces).
Attachment #8445941 - Attachment description: Test results for OMTC basic with NoPatch and Patchv4 → Test results for OMTC basic with NoPatch vs Patchv4+UseImageOffscreenSurfaces()
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #43)
> Comment on attachment 8444182 [details] [diff] [review]
> Remove_Xlib_surfaces_from_CreateOffscreenSurface.patch
>
> Please move the preprocessor directives to the left hand margin, as they are
> elsewhere in this file.

done.

> The change to nsShmImage::UseShm() really belongs in this patch, as
> layers.use-image-offscreen-surfaces doesn't add much more than
> gfx.xrender.enabled without that change.
> The gfxPlatformGtk usage should be #ifdef MOZ_WIDGET_GTK.
> gfxQtPlatform always uses gfxSurfaceType::Image, so this can just return 
> gShmAvailable for #else, and the "gfxPlatform.h" include can be removed.

done.
Attachment #8444182 - Attachment is obsolete: true
Attachment #8446909 - Flags: review?(karlt)
Sorry, review this one instead, previous one had some trailing spaces.
thanks.
Attachment #8446909 - Attachment is obsolete: true
Attachment #8446909 - Flags: review?(karlt)
Attachment #8446913 - Flags: review?(karlt)
Attachment #8446913 - Flags: review?(karlt) → review+
Attached patch OMTC basic Layer Buffering (v5) (obsolete) — Splinter Review
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #44)
> Comment on attachment 8444192 [details] [diff] [review]
> OMTC basic implementation (v4)
> >+    if (aSurface->GetSize() != mUpdateSurface->GetSize()) {
> >+      DeallocateDeviceData();
> 
> Need to also create a new xlib draw target when the format changes.

I honestly didn't notice a layer could change format, good catch.
done.

> 
> >+     // We're uploading to our buffer, so let's just plainly overwrite
> >+     // contents (OP_SOURCE)
> >+     mXlibDrawTarget->DrawSurface(mUpdateSurface, rect, rect, options,
> >+                                 DrawOptions(1.0f, CompositionOp::OP_SOURCE));
> 
> CopySurface() is sugar for this kind of DrawSurface(), but would be more
> explicit about what is happening and avoid the int/float conversions for
> rects.

done.

 
> >+SourceSurface*
> >+X11DataTextureSourceBasic::GetSurface(DrawTarget* aTarget)
> 
> Return mXlibDrawTarget->Snapshot() (when mXlibDrawTarget is set).
> Then mXlibTextureSource, and mXlibSurface, and mCompositor won't be required.

oh yeeaah !  Good catch.
done... it's way cleaner.

> >+  nsRefPtr<mozilla::gfx::SourceSurface> mUpdateSurface;
> 
> I don't think it is necessary to keep mUpdateSurface as mXlibDrawTarget has
> similar format and size methods.

this is gone too. tnx.
 
> >diff --git a/gfx/layers/client/TextureClient.cpp b/gfx/layers/client/TextureClient.cpp
> 
> >+#ifdef MOZ_WIDGET_GTK
> >+#include "gfxPlatformGtk.h"             // For gfxPlatformGtk
> >+#endif
> 
> >diff --git a/gfx/layers/composite/TextureHost.cpp b/gfx/layers/composite/TextureHost.cpp
> 
> >+#include "mozilla/layers/X11BasicCompositor.h"
> >+#ifdef MOZ_WIDGET_GTK
> >+#include "gfxPlatformGtk.h"                 // for gfxPlatformGtk
> >+#endif
> 
> >diff --git a/gfx/layers/composite/X11TextureHost.cpp b/gfx/layers/composite/X11TextureHost.cpp
> 
> >-#include "mozilla/layers/BasicCompositor.h"
> >+#include "mozilla/layers/X11BasicCompositor.h"
> 
> These changes are no longer required.
> 
> >diff --git a/gfx/layers/ipc/CompositorParent.cpp b/gfx/layers/ipc/CompositorParent.cpp
> 
> >+#include "mozilla/layers/X11BasicCompositor.h" // for X11BasicCompositor
> 
> Need to ensure this compiles on non-GTK/X11 platforms.
> Also, I think this can be "basic/X11BasicCompositor.h" so that the header
> need
> not be exported.

have guarded the include with #ifdef MOZ_WIDGET_GTK
and removed the export.
done.

> >+#ifdef MOZ_X11
> >+#ifdef MOZ_WIDGET_GTK
> 
> MOZ_X11 is always defined when MOZ_WIDGET_GTK is defined, so its test is not
> necessary, but if you wish to keep it please collect these tests into a
> single
> conditional line.

done. I simply removed that extra X11 condition

> 
> >+      if (gfxPlatformGtk::GetPlatform()->UseImageOffscreenSurfaces() &&
> >+          gfxPlatformGtk::GetPlatform()->UseXRender()) {
> >+        compositor = new X11BasicCompositor(mWidget);
> 
> Remove the UseImageOffscreenSurfaces() test, please. (Comment 31)
> That will also mean that this is used and tested to some extent even with
> GTK2.

done.
Attachment #8444192 - Attachment is obsolete: true
Attachment #8447180 - Flags: review?(karlt)
Hey Karl, 

  I got try server access, and I drove a series of trys, take a look : 

GTK3 linux build with BasicCompositor with Xlib surfaces :
https://tbpl.mozilla.org/?tree=Try&rev=ba4801f0617a

GTK3 linux build with BasicCompositor with Image surfaces :
(with patch 8446913 - Remove Xlib Surfaces v.2)
https://tbpl.mozilla.org/?tree=Try&rev=feec18d8b468

GTK3 linux build with BasicCompositor with Images Surfaces *buffered with X11TextureSourceBasic*
(with patch 8446913 - Remove Xlib Surfaces)
(with patch 8447180 - OMTC Basic Layer Buffering v.5)
https://tbpl.mozilla.org/?tree=Try&rev=91f9804d818c


I'd like you thoughts on the following :

reftests : We see that many fail for all all 3 try jobs : I'm quite sure that's because it's a GTK3 build, right ?. Good thing is that my patches do not seem like they trigger additional fails in reftests... I will take a look at bc3, though.

talos : Some perf drops are huge... Did you guys already know that image surfaces were that slower to draw? I can write more specific and local tests to see where exactly is that slowness coming from... drawing, compositing, etc...
(In reply to Frederic Plourde (:fred23) from comment #62)
> talos : Some perf drops are huge... Did you guys already know that image
> surfaces were that slower to draw?

Hmm, if I see things correctly, the Xlib->image change itself doesn't have the really huge drops (though the shutdown paint regression is somewhat large, but it is mitigated by the other change in the end):
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=ba4801f0617a&newRev=feec18d8b468&submit=true

The OMTC buffering change has the really large regressions:
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=feec18d8b468&newRev=91f9804d818c&submit=true

The overall sum of them surely doesn't look good either, that's right:
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=ba4801f0617a&newRev=91f9804d818c&submit=true

That said, are the first two already using OMTC or not? We know that OMTC on/off changes a number of perf numbers in both directions, as we have also seen on Windows, I wonder if that plays into this as well.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #63)
>
> That said, are the first two already using OMTC or not? We know that OMTC
> on/off changes a number of perf numbers in both directions, as we have also
> seen on Windows, I wonder if that plays into this as well.

They are. In fact, all three of them are going the OMTC basic path.

> (In reply to Frederic Plourde (:fred23) from comment #62)
> > talos : Some perf drops are huge... Did you guys already know that image
> > surfaces were that slower to draw?
> 
> Hmm, if I see things correctly, the Xlib->image change itself doesn't have
> the really huge drops (though the shutdown paint regression is somewhat
> large, but it is mitigated by the other change in the end):
> http://perf.snarkfest.net/compare-talos/index.
> html?oldRevs=ba4801f0617a&newRev=feec18d8b468&submit=true

right, but I'm wondering ultimately if those perf drops aren't mainly due to the fact that my patches mainly add *buffering* capability to OMTC basic layers whereas the tests that show worst perf drops may be using/rendering those layers only once per test, hence, we would be hitting the "buffering overhead" only, not showing the gain we might have by recycling the layer and reusing the buffered data over time.

what do you think ?
Depends on: 1034064
No longer depends on: 1034064
(In reply to Frederic Plourde (:fred23) from comment #62)
> GTK3 linux build with BasicCompositor with Xlib surfaces :
> https://tbpl.mozilla.org/?tree=Try&rev=ba4801f0617a

This branch includes
pref("layers.use-image-offscreen-surfaces", true);
and "Remove Xlib offscreen surfaces on GTK3 platform."
https://hg.mozilla.org/try/rev/094d6d782cc9
so this is using image surfaces IIUC.

> GTK3 linux build with BasicCompositor with Image surfaces :
> (with patch 8446913 - Remove Xlib Surfaces v.2)
> https://tbpl.mozilla.org/?tree=Try&rev=feec18d8b468

So this is the same as above but based off a different trunk revision.

> GTK3 linux build with BasicCompositor with Images Surfaces *buffered with
> X11TextureSourceBasic*
> (with patch 8446913 - Remove Xlib Surfaces)
> (with patch 8447180 - OMTC Basic Layer Buffering v.5)
> https://tbpl.mozilla.org/?tree=Try&rev=91f9804d818c

This is based on the same trunk revision as ba4801f0617a but differs from that
in that it has attachment 8447180 [details] [diff] [review] IIUC.

That would mean that 

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #63)
> The overall sum of them surely doesn't look good either, that's right:
> http://perf.snarkfest.net/compare-talos/index.
> html?oldRevs=ba4801f0617a&newRev=91f9804d818c&submit=true

would be measuring the effects of the buffering from attachment 8447180 [details] [diff] [review].

tscrollx is 20% better, but other things are significantly worse.

If things are working as expected, then the main drawback from the buffering
would be that it is creating extra Pixmaps for the upload.  Without the
patch, cairo can do a direct upload to the window or its backing surface in
common simple cases.

The advantage of the buffering is that only the damaged areas of layers need
to be uploaded.  The parts that merely move don't need to be uploaded again,
which benefits scrolling.

The number that is most concerning is the 80-fold increase in X resource
usage.  That is much more than expected from the extra buffering (unless the
layer depth is about 80).  There was a time when we were getting funny results
on some of our systems.  That may no longer be an issue, but I suggest running
tests on the 64-bit systems to verify these results.  If there really is an
80-fold increase, then the cause of that may point to the cause of the
performance problem.

Unfortunately, we don't have any runs with full debug builds completing, so we
don't have info on leaks.  Perhaps try a GTK2 run and see if there are
perf/leak issues there.

(In reply to Frederic Plourde (:fred23) from comment #62)
> I'd like you thoughts on the following :
> 
> reftests : We see that many fail for all all 3 try jobs : I'm quite sure
> that's because it's a GTK3 build, right ?. Good thing is that my patches do
> not seem like they trigger additional fails in reftests... I will take a
> look at bc3, though.

The mochitests including bc3 are likely to be GTK3 issues (and there may be
some existing intermittent bugs).  Even though the bc3 failure did not happen
on ba4801f0617a, it did happen on feec18d8b468, which has the same patches on
a different base revision.  Compare results with xlib surfaces (no patches
from this bug) anyway.

reftests are the results most likely related to graphics.
Some bad paths for performance to check with break points would be 
XGetImage(), XShmGetImage(), and 
http://hg.mozilla.org/mozilla-central/annotate/06e9a27a6fcc/gfx/2d/DrawTargetCairo.cpp#l1354
(In reply to Karl Tomlinson (needinfo?:karlt) from comment #65)
> (In reply to Frederic Plourde (:fred23) from comment #62)
> > GTK3 linux build with BasicCompositor with Xlib surfaces :
> > https://tbpl.mozilla.org/?tree=Try&rev=ba4801f0617a
> 
> This branch includes
> pref("layers.use-image-offscreen-surfaces", true);
> and "Remove Xlib offscreen surfaces on GTK3 platform."
> https://hg.mozilla.org/try/rev/094d6d782cc9
> so this is using image surfaces IIUC.

argh, you're right !
I was mistaken by the fact that tbpl was *not* showing the "Remove Xlib offscreen..." change when checking https://tbpl.mozilla.org/?tree=Try&rev=ba4801f0617a

My intention was to in incrementally test all three cases :

non-patched with xlib
patched with images
patches with images + buffering.

I will resend the jobs and try to get a more meaningful result as to what's the impact of the "Remove_Xlib_offscreen_surfaces" patch alone, thanks.


(In reply to Karl Tomlinson (needinfo?:karlt) from comment #65)
> That would mean that 
> 
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #63)
> > The overall sum of them surely doesn't look good either, that's right:
> > http://perf.snarkfest.net/compare-talos/index.
> > html?oldRevs=ba4801f0617a&newRev=91f9804d818c&submit=true
> 
> would be measuring the effects of the buffering from attachment 8447180 [details] [diff] [review]
> [details] [diff] [review].

Yes, even if I screwed up try jobs for the first patch, you can certainly say that this compare-talos is effectively showing that patch's impact.

> tscrollx is 20% better, but other things are significantly worse.
> If things are working as expected

I have tested and retested both patches together with the required prefs, and everything seems to be working just fine (including plugins and canvas 2D examples).
  
> The number that is most concerning is the 80-fold increase in X resource
> usage. That may no longer be an issue, but I suggest running
> tests on the 64-bit systems to verify these results.

wasn't my try jobs already run on 64-bit system ?
If I click the job's "B" field, I can see : 

"using slave: b-linux64-hp-0014"

maybe this slave was cross-compiling for 32bit, I don't know.
But what I know is that Mike's patch to enable GTK3 try runs is supposed to good for linux64 only :
https://bugzilla.mozilla.org/show_bug.cgi?id=1016641#c4


> Unfortunately, we don't have any runs with full debug builds completing, so
> we don't have info on leaks.  Perhaps try a GTK2 run and see if there are
> perf/leak issues there.

Do you have custom in-house tools to spot/investigate leaks ?
(In reply to Frederic Plourde (:fred23) from comment #67)
> > The number that is most concerning is the 80-fold increase in X resource
> > usage. That may no longer be an issue, but I suggest running
> > tests on the 64-bit systems to verify these results.
> 
> wasn't my try jobs already run on 64-bit system ?
> If I click the job's "B" field, I can see : 
> 
> "using slave: b-linux64-hp-0014"
> 
> maybe this slave was cross-compiling for 32bit, I don't know.

Yes.  "-p linux" selects 32-bit builds.  Use linux64 for 64-bit builds.
http://trychooser.pub.build.mozilla.org/

But I suspect now that the X resource usage just seems high compared to the image-only layers because the image-only layers uses very few xlib surfaces.
5 MB is actually still less than the 8 MB used by mozilla-central builds.

> But what I know is that Mike's patch to enable GTK3 try runs is supposed to
> good for linux64 only :
> https://bugzilla.mozilla.org/show_bug.cgi?id=1016641#c4

I think that was an early version.

> > Unfortunately, we don't have any runs with full debug builds completing, so
> > we don't have info on leaks.  Perhaps try a GTK2 run and see if there are
> > perf/leak issues there.
> 
> Do you have custom in-house tools to spot/investigate leaks ?

Yes, but I don't recall the details of how to run them.  GTK2 debug tests will run with leak checking.  Though, understanding the XRes numbers better now, leaks don't seem so likely to be the problem here.
Comment on attachment 8447180 [details] [diff] [review]
OMTC basic Layer Buffering (v5)

This looks close, but the performance issues need to be understood.

>+  nsIntRegionRectIterator *iter;
>+  if (aDestRegion) {
>+    iter = new nsIntRegionRectIterator(*aDestRegion);
>+  } else {
>+    IntSize size = aSurface->GetSize();
>+    iter = new nsIntRegionRectIterator(nsIntRect(0, 0, size.width, size.height));
>+  }

I don't see an nsIntRegionRectIterator(const nsIntRect &) constructor, so I
suspect this is creating a temporary nsIntRegion from the nsIntRegion(const
nsIntRect&) constructor (which should be explicit, but is not).

nsIntRegionRectIterator holds a pointer to the contents of the region provided
to the constructor, so the region must not be a temporary, or it may be
overwritten by other local variables.

This is main issue I see here.  Fix this before any further performance tests
with this patch.

If the performance is still poor then I'm surprised that adding a little work
to the compositor thread slows down the test results because I would have
expected the tests to be limited by the main-thread cpu requirements.  Unless
perhaps there is only one cpu thread on the test machines, or if the main
thread is blocking on the compositor thread for some reason.

The other comments below are smaller issues, and not relevant unless we can
understand the performance results.

Also, best to avoid memory allocations where practical, so rearrange things to
make |iter| a stack variable.

>+#include "mozilla/layers/ImageDataSerializer.h"  // for ImageDataSerializer

No longer needed.

>+  if ( (!mXlibDrawTarget) ||
>+     ( (mXlibDrawTarget) &&
>+     ( (aSurface->GetSize() != mXlibDrawTarget->GetSize()) ||
>+       (aSurface->GetFormat() != mXlibDrawTarget->GetFormat())) ) ) {

Remove (mXlibDrawTarget) and unnecessary parentheses.

>+    DeallocateDeviceData();

I think this can be removed now.

>+    if (!xrenderFormat) {
>+      return false;
>+    }
>+
>+    surf = gfxXlibSurface::Create(screen, xrenderFormat,
>+                                  ThebesIntSize(aSurface->GetSize()));
>+
>+    if (!surf) {
>+      return false;
>+    }

Perhaps this should fall back to an image surface if the format or xlib
surface is not available.  The fall back case doesn't need to perform well,
but could print a warning.  An image surface would mean something is still
drawn.

>+  // Note : Incremental update with a source offset is only used
>+  // on Mac, so let's ignore aSrcOffset

Can you add an assertion that aSrcOffset is null, please.

>+  while ((iterRect = iter->Next())) {

Don't need so many parentheses here.

>--- a/gfx/layers/composite/TextureHost.cpp
>+++ b/gfx/layers/composite/TextureHost.cpp
>@@ -36,20 +36,24 @@
> #ifdef MOZ_WIDGET_GONK
> #include "../opengl/GrallocTextureClient.h"
> #include "../opengl/GrallocTextureHost.h"
> #include "SharedSurfaceGralloc.h"
> #endif
> 
> #ifdef MOZ_X11
> #include "mozilla/layers/X11TextureHost.h"
> #endif
> 
>+#ifdef MOZ_WIDGET_GTK
>+#include "basic/X11BasicCompositor.h"
>+#endif
>+

Not required.
Attachment #8447180 - Flags: review?(karlt) → review-
(In reply to Karl Tomlinson (back July 17 :karlt) from comment #69)
> This is main issue I see here.  Fix this before any further performance tests
> with this patch.
> 
> If the performance is still poor then I'm surprised that adding a little work
> to the compositor thread slows down the test results because I would have
> expected the tests to be limited by the main-thread cpu requirements.  Unless
> perhaps there is only one cpu thread on the test machines, or if the main
> thread is blocking on the compositor thread for some reason.
> Also, best to avoid memory allocations where practical, so rearrange things
> to
> make |iter| a stack variable.

Just considering your comments about nsIntRegionRectIterator... is patch v.6 above is aligned with what you suggested ? I'm not sure I fully got what you said. I understood that the "if (aDestRegion)" branch was ok, while the if (!aDestRegion) one did call nsIntRegionRectIterator with an nsIntRect argument (instead of nsIntRegion). So I changed that to plainly use on-stack Iter that would, in the latter case, just *not* use nsIntRegionRectIterator at all.

If that's basically aligned with what you suggested, then I'm afraid we're out of luck, since this change doesn't seem to add an extra value if we compare it to patch v.5... here :

patch v.5 try run : 
https://tbpl.mozilla.org/?tree=Try&rev=33ad5bf67423

patch v.6 try run :
https://tbpl.mozilla.org/?tree=Try&rev=65a6f5ab8a1c

comparison between the 2:
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=33ad5bf67423&newRev=65a6f5ab8a1c&submit=true

We see that we're making only less-than-1-percent gains on almost every tests (which is far from the 12%, 18% and so on for tart, cart and others that we had)... except for tp5o-shutdown_paint. But this one has got such a huge variance, that I bet this gain will flatten out if I retrigger 'tp' talos jobs couple of times.

Still trying to understand what's going on.
In the meantime, feel free to comment, cheer up, etc... ;-)
thanks.
The patch I'm referring to in Comment 70.
Attachment #8447180 - Attachment is obsolete: true
Attachment #8452597 - Flags: feedback?(karlt)
Complete operf opreport results comparing CPU profiles for GTK3 OMTC basic ImageSurf VS. GTK3 OMTC basic ImageSurf+Xlib_buffering builds.
Here are the results of my operf CPU profiling sessions.
The profiles have been gathered comparing those two configs :

1) GTK3 build with OMTC basic with gfxImageSurface surfaces for content
   See this attachment for results : OPERF_SYMBOLS_GTK3_OMTC_basic_ImageSurf__tsvgx.opreport

2) GTK3 build with OMTC basic with gfxImageSurface surfaces + gfxXlibSurface for content
   See this attachment for results : OPERF_SYMBOLS_GTK3_OMTC_basic_ImageSurf+XlibBuffering__tsvgx.opreport

The intention is to help us understand the root cause of performance degradation brought by this patch :
OMTC-Basic_Buffer-Image-Layers-to-Xlib-Surfaces_v6.patch

Those CPU profiling session were driven while running those talos tests :
 cart
 tart
 tcanvasmark
 tsvgx

as those tests were the main ones incurring the largest performance drops seen on initial talos tests on try runs. 

(for the complete result set, see this other attached file : operf_complete_results.tar.gz
This file includes also all the Callgraphs obtained with "opreport -c -l" )


By looking at those results, we can see that for each one of them, there's a recurring set of symbols brought up in topmost lines :

32866     4.2924  libpixman-1.so.0.28.2    firefox                  sse2_blt
21920     2.8628  intel_drv.so             Xorg                     memcpy_to_tiled_x

In my case (on intel driver), they seems to replace this : 

4054      0.5858  libpixman-1.so.0.28.2    firefox                  sse2_composite_src_x888_8888
Nice!
I finally found out what's happening!
After having run all those CPU profiles, I figured that we seemed to copy stuff around too much considering we're supposed to have partial updates for damaged layer parts only ! (main hits were localized in memcpy- related functions)

So I checked if the "if (aDestRegion)" path was taken (even though I'm sure I had already checked that previously) and the answer was "no"... with the patches applied we were *always* doing full surface updates !

I went up couple of steps higher and discovered that ContentHostDoubleBuffered is NOT driving partial updates by default because it supposes that it's exclusively using image texture updates with "render-to-texture" and thus, has no "real" updates to do...

See : http://lxr.mozilla.org/mozilla-central/source/gfx/layers/composite/ContentHost.cpp#357

Now, those assumptions become FALSE with my patches applied (Xlib buffering).


So I tried 2 things :
---------------------

1) Quick hack to ContentHostDoubleBuffered that adds partialDamage circuitery 

2) Force creation of a ContentHostSingleBuffered content host instead (which uses partial updates by default).


Good news !  Both solutions made us gain in performance (when comparing to OMTC basic + Image surfaces for content). And what surprised me a little is that solution 2) with singleBuffered host was faster than double buffered.

I'll post the results in a little moment (and links to proper try runs). Just wanted to share that as soon as I could.
Hi Karl, the more I read about DoubleBuffered/SingleBuffered contentHosts, the more I think that ContentHostDoubleBuffered/ContentClient might not the the most suitable contentClient/Host pair for our OMTC basic Image+XlibBuffering patch...

See those comments :
http://lxr.mozilla.org/mozilla-central/source/gfx/layers/composite/ContentHost.h#198
http://lxr.mozilla.org/mozilla-central/source/gfx/layers/client/ContentClient.h#307
and
http://lxr.mozilla.org/mozilla-central/source/gfx/layers/client/ContentClient.h#372

It's clearly indicated both in ContentClientXXX and ContentHostXXX that the "DoubleBuffered" flavor is not meant for direct uploads in the server... which made me hack solution 2) (explained in Comment 76) which basically just forces creation of ContentClientSingleBuffered/ContentHostSingleBuffered pair when having use-image-offscreen-surfaces prefs and UseXRender().

it works and at last, we get bonus points for performance this time (over the gfxImageSurface for content solution)

Now, I'd like to get your thoughts about that as soon as you've got free cycles, because I'd like to wrap up patch version 7 and this will require that we agree on the SingleBuffer/DoubleBuffer question.
thanks !
Tag needinfo for Karl when he comes back.  Jeff, anybody else we can ask in the meantime?
Flags: needinfo?(karlt)
Flags: needinfo?(jmuizelaar)
Ah much better. I finally got try server talos test results for the two solution above. Here's the results :

0) Regular GTK3 build with Image-backed layers : https://tbpl.mozilla.org/?tree=Try&rev=a0ed04882602
1) DoubleBuffered patch : https://tbpl.mozilla.org/?tree=Try&rev=e5ad08c40318
2) force SingleBuffered patch : https://tbpl.mozilla.org/?tree=Try&rev=3f29599afdb2

and the comparison (Vs.(0), the image-only GTK3 build) are :

(1) VS. (0) : http://perf.snarkfest.net/compare-talos/index.html?oldRevs=a0ed04882602&newRev=e5ad08c40318&submit=true

(2) VS. (0) : http://perf.snarkfest.net/compare-talos/index.html?oldRevs=a0ed04882602&newRev=3f29599afdb2&submit=true


Some observations : 
**********************

 We see that the "SingleBuffered" solution seems better perf-wise
 
 For SingleBuffered solution, we see that we're gaining much better than before.
 All the negative perf. are close to 0%, except for those :

   - tcanvasmark
   - tpaint_64bit
   - webrtc (this test was not previously tried with former talos runs)
  Some afterthoughts :

In the last couple of comments above, forcing the use ContentHostSingleBuffered works ok, but it's not ideal perf-wise because we have to *synchronously* upload before returning control the main thread (ContentClientSingleBuffered will wait for reply from compositor before it writes to the buffer again).

So really, what we'd like to have here is a DoubleBufferred scheme.
I already proposed a quick hack uphere, but it's not working as expected, and I discovered why.
The actual visual result when applying this patch  :https://hg.mozilla.org/try/rev/9b6980c4faee
is that we seem to swap from a correctly updated frame to another one, mostly black, and that's exactly what's happening in fact. Here :

We're having this now:

               ContentClientRemoteBuffer
                           ^
                           |
               ContentClientDoubleBuffered----------+----------------+
                    |                               |                |
                    |                          mFrontClient    mTextureClient
client              |                               |                |
--------------------+-------------------------------------------------------------
host                |                               |                |
                    |                       mFrontTextureHost  mBackTextureHost
               ContentHostDoubleBuffered            |            ^   |
                            |                       |            |   |
                            +--UseTextureHost()------------------+   |
                                                    |                |
                                                    |                |
                                  X11DataTextureSourceBasic    X11DataTextureSourceBasic
                                              |                             |
                                           Upload                        Upload
                                        CopySurface()                 CopySurface()
                                              |                             |
                                       ---------------               ---------------
                                       |             |               |             |
                                       | Xlib/pixmap |               | Xlib/pixmap |
                                       |    buffer   |               |   buffer    |
                                       ---------------               ---------------
                                                   |                    |
                                                    \ COMPOSITE LAYERS /
                                                              |
                                                              .
                                                        mRenderTarget
                                                  BasicCompositor::EndFrame()
                                                              |
                                                        CopySurface()
                                                              |            
******************************************************************************************
GtkWidget                                                     |
                         ---------------               -------v-------
                         | FRONT buffer|               | BACK BUFFER |
                         |  Drawable   |<---glxSwap--->|  Drawable   |
                         |             |               |             |
                         ---------------               ---------------


See, so in short, we :
1) Update one TextureHost with new content
2) Upload updated content to this TextureHost's **own** X11DataTextureSourceBasic's XlibSurface
3) Composite final result with that new content and copying that to our gtkWidget Backbuffer.
4) Swap the window buffers (back buffer becomes all black)
then... we:
5) Update the OTHER TextureHost with new content (updated content only)
6) Upload to the *other* XlibSurface only this new content
7) Composite final image with those incomplete/partial bits and copy to window.
8) Swap

when that second swap occurs, we end up with partial content on screen.

Now, we should fix that by :

  ** Allocating only ONE XlibSurface (or one X11DataTextureSourceBasic) per mFrontClient/mTextureClient pair. So the above diagram would become : 


               ContentClientRemoteBuffer
                           ^
                           |
               ContentClientDoubleBuffered----------+----------------+
                    |                               |                |
                    |                          mFrontClient    mTextureClient
client              |                               |                |
--------------------+-------------------------------------------------------------
host                |                               |                |
                    |                       mFrontTextureHost  mBackTextureHost
               ContentHostDoubleBuffered            |            ^   |
                            |                       |            |   |
                            +--UseTextureHost()------------------+   |
                                                    |                |
                                                    |                |
                                  X11DataTextureSourceBasic    X11DataTextureSourceBasic
                                              |                             |
                                           Upload                        Upload
                                        CopySurface()                 CopySurface()
                                                    \                 /
                                                     \               /
                                                      v-------------v
                                                      |             |
                                                      | Xlib/pixmap |
                                                      |    buffer   |
                                                      ---------------
                                                              |
                                                      COMPOSITE LAYERS
                                                              |
                                                              .
                                                        mRenderTarget
                                                  BasicCompositor::EndFrame()
                                                              |
                                                        CopySurface()
                                                              |            
******************************************************************************************
GtkWidget                                                     |
                         ---------------               -------v-------
                         | FRONT buffer|               | BACK BUFFER |
                         |  Drawable   |<---glxSwap--->|  Drawable   |
                         |             |               |             |
                         ---------------               ---------------




... but quite frankly, I'm not sure we can crossReference the same XlibSurface from on TextureHost to the other ! I'm not sure the current architecture allows that (So would it be better finally to revert to SingleBuffering ??)

One way I could see that working is if the *ContentHostDoubleBuffered* was to allocate and maintain the gfxXlibSurface to buffer content and, in turn, it could pass it to one textureHost or the other. But that would require some API change at the TextureHost level.

ok, enough for now.
cheers,
Comment on attachment 8452597 [details] [diff] [review]
OMTC-Basic_Buffer-Image-Layers-to-Xlib-Surfaces_v6.patch

Yes, this avoids the heap allocation and referencing a deleted temporary,
thanks.

Keep the DeallocateDeviceData() method override, though.
In my comment before, I meant that there was no need to reset mXlibDrawTarget
in Update() because that is where it is set.

>+    const nsIntRect *iterRect;
>+    while ((iterRect = iter.Next())) {

I see the extra parentheses here are the compiler's recommendation for stating
"yes, I really mean assignment".  Better though is to change it to
initialization, which makes it clearer that iterRect is only used in the loop:

    while (const nsIntRect* iterRect = iter.Next()) {
Attachment #8452597 - Flags: feedback?(karlt) → feedback+
(In reply to Frederic Plourde (:fred23) from comment #76)
> I went up couple of steps higher and discovered that
> ContentHostDoubleBuffered is NOT driving partial updates by default because
> it supposes that it's exclusively using image texture updates with
> "render-to-texture" and thus, has no "real" updates to do...

Ah, thanks.  That explains things.

(In reply to Frederic Plourde (:fred23) from comment #77)
> It's clearly indicated both in ContentClientXXX and ContentHostXXX that the
> "DoubleBuffered" flavor is not meant for direct uploads in the server...

Yes, sounds like this is not designed for this purpose.
And I suspect it would take considerable work to make it suitable.

(In reply to Frederic Plourde (:fred23) from comment #80)
>   ** Allocating only ONE XlibSurface (or one X11DataTextureSourceBasic) per
> mFrontClient/mTextureClient pair. So the above diagram would become : 

Sounds sensible.  We only need one server-side buffer, but ...

> ... but quite frankly, I'm not sure we can crossReference the same
> XlibSurface from on TextureHost to the other ! I'm not sure the current
> architecture allows that

Exactly.
I think ContentClientIncremental is most similar to your idea.

The options available I think are ContentClientIncremental or
ContentClientSingleBuffered.  "Incremental" allocates and copies more, while
"single" blocks on the compositor thread.  I don't know which would be better.
I would guess that the single buffer would not need to block often, but I
don't know whether or not the current implementation provides minimal locking
interference.

Different platforms have chosen different solutions and it might be
interesting to know the reasons for the different choices.
Or maybe just try each and see which gives better results.

If you are short on time, ContentClientSingleBuffered seems sensible to me.
Any further improvements can be added later.
Flags: needinfo?(karlt)
(In reply to Karl Tomlinson (back July 17 :karlt) from comment #82)
> Yes, sounds like this is not designed for this purpose.
> And I suspect it would take considerable work to make it suitable.
> 
Indeed!, I took an hour or so yesterday to hack something that would mainly make ContentHostDoubleBuffered own the XlibDrawTarget and pass it along to the right TextureHost (and then to its X11DataTextureSourceBasic)... The quick hack was ugly, and it required * a lot* of API additions to pass stuff around, or new X11 overrides of ContentHosts, etc... so I really don't think we could make that "clean" without much pain.


> The options available I think are ContentClientIncremental or
> ContentClientSingleBuffered.  "Incremental" allocates and copies more, while
> "single" blocks on the compositor thread.  I don't know which would be
> better.
> I would guess that the single buffer would not need to block often, but I
> don't know whether or not the current implementation provides minimal locking
> interference.
> Different platforms have chosen different solutions and it might be
> interesting to know the reasons for the different choices.
> Or maybe just try each and see which gives better results.
> 
> If you are short on time, ContentClientSingleBuffered seems sensible to me.
> Any further improvements can be added later.

Cool! I have 2 days left, it's enough time to try both solutions (Single + Incremental) and get talos results before making a final decision. I'll do that thanks !
(In reply to Frederic Plourde (:fred23) from comment #83)
> Cool! I have 2 days left, it's enough time to try both solutions (Single +
> Incremental) and get talos results before making a final decision. I'll do
> that thanks !

Ok, I tried Content{Client,Host}Incremental on linux GTK3 and it segfaults at some point. So this path really seems polish only for macosx at the moment. I'll go for Content{Client,Host}SingleBuffered, which I know works ok.
Ok, here's the big shiny post.
See patch version 7 herein attached for review.
I included every reviewer's suggestions/amendments :

(In reply to Karl Tomlinson (back July 17 :karlt) from comment #69)
> The other comments below are smaller issues, and not relevant unless we can
> understand the performance results.
> 
> Also, best to avoid memory allocations where practical, so rearrange things
> to
> make |iter| a stack variable.
> 
> >+#include "mozilla/layers/ImageDataSerializer.h"  // for ImageDataSerializer
> No longer needed.

Gone.

> >+  if ( (!mXlibDrawTarget) ||
> >+     ( (mXlibDrawTarget) &&
> >+     ( (aSurface->GetSize() != mXlibDrawTarget->GetSize()) ||
> >+       (aSurface->GetFormat() != mXlibDrawTarget->GetFormat())) ) ) {
> 
> Remove (mXlibDrawTarget) and unnecessary parentheses.

Done.
 
> >+    DeallocateDeviceData();
> 
> I think this can be removed now.


I replaced the override and got rid of the call in ::Update


> >+    if (!xrenderFormat) {
> >+      return false;
> >+    }
> >+
> >+    surf = gfxXlibSurface::Create(screen, xrenderFormat,
> >+                                  ThebesIntSize(aSurface->GetSize()));
> >+
> >+    if (!surf) {
> >+      return false;
> >+    }
> 
> Perhaps this should fall back to an image surface if the format or xlib
> surface is not available.  The fall back case doesn't need to perform well,
> but could print a warning.  An image surface would mean something is still
> drawn.

Done.

> >+  // Note : Incremental update with a source offset is only used
> >+  // on Mac, so let's ignore aSrcOffset
> 
> Can you add an assertion that aSrcOffset is null, please.

Done.

> >+  while ((iterRect = iter->Next())) {
> 
> Don't need so many parentheses here.

Done. And added the 'const' declaration within the parentheses.


> >--- a/gfx/layers/composite/TextureHost.cpp
> >+++ b/gfx/layers/composite/TextureHost.cpp
> >@@ -36,20 +36,24 @@
> > #ifdef MOZ_WIDGET_GONK
> > #include "../opengl/GrallocTextureClient.h"
> > #include "../opengl/GrallocTextureHost.h"
> > #include "SharedSurfaceGralloc.h"
> > #endif
> > 
> > #ifdef MOZ_X11
> > #include "mozilla/layers/X11TextureHost.h"
> > #endif
> > 
> >+#ifdef MOZ_WIDGET_GTK
> >+#include "basic/X11BasicCompositor.h"
> >+#endif
> >+
> 
> Not required.

Removed.



(In reply to Karl Tomlinson (back July 17 :karlt) from comment #69)
> Comment on attachment 8447180 [details] [diff] [review]
> OMTC basic Layer Buffering (v5)
> 
> This looks close, but the performance issues need to be understood.


Now, concerning performance.
I couldn't get new talos results for *this* particular patch #7 because  hg.m.o is literraly choking with requests today (2014/07/17) as per Bug 1040308. But fortunately enough, I had already gathered full talos results for the previous version, patch v.6, which should have the same perf because v.7 only got minor fixes, really.

So let's take a look at
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=a0ed04882602&newRev=3f29599afdb2&submit=true

This shows performance results of the current patch VS the "Remove_Xlib_surfaces_from_CreateOffscreenSurface (v2)" patch
So it's showing the impact of buffering layer content into Xlib surfaces on compositor side.

Generally speaking, performance is *way* better than what it was !
Here are some examples : 

Tp5 Optimized MozAfterPaint (tp5o_shutdown_paint)
Ubuntu 32 (details)	846.33	±263.92	212.33	±337.34	-634	-74.91%
Ubuntu 64 (details)	932	±0	240	±0	-692	-74.25%

Customization Animation Tests (cart)
Ubuntu 32 (details)	51.89	±0.32	50.03	±1.2	-1.86	-3.58%
Ubuntu 64 (details)	47.28	±0	44.35	±0	-2.93	-6.2%

TResize (tresize)
Ubuntu 32 (details)	26.73	±0.12	25.65	±2.2	-1.08	-4.04%
Ubuntu 64 (details)	26.76	±0	23.07	±0	-3.69	-13.79%

tscroll-ASAP (tscrollx)
Ubuntu 32 (details)	13.58	±9.48	7.6	±0.04	-5.98	-44.04%
Ubuntu 64 (details)	9.88	±0	6.56	±0	-3.32	-33.6%

.
There are still some performance drops, though, and the following will be taking a look at every single one of them carefully, because I think that we may be able to check the patches in anyway, since most of the regressions can be understood.

Ok first, as you can see, most of the regressions are within the 3% limit.
There are still some of them with higher percentages, here :



Tp5 Optimized (XRes) (tp5o_xres_paint)
Ubuntu 32 (details)	14191.58	±24053.38	3138863.33	±37480.5	3124671.75	22017.79%
Ubuntu 64 (details)	32000.2	±0	3152950	±0	3120949.8	9752.91%

This one, we mostly understood it. We're comparing this one against an "Image-only" implementation (patch #8446913), so we're getting fooled here. If we compare against regular builds on m-c using xlib surfaces, we're actually GAINING memory. So all good.



Session Restore no Auto Restore Test (sessionrestore_no_auto_restore)
Ubuntu 32 (details)	725.11	±16.58	728.19	±19.24	3.08	0.42%
Ubuntu 64 (details)	668.33	±0	692.89	±0	24.56	3.67%

I'm tempted to disregard this one, because it's very close to the 3% and got 1.34% on another run with very close config)



Paint (tpaint)
Ubuntu 32 (details)	187.93	±7.76	186.13	±1.18	-1.8	-0.96%
Ubuntu 64 (details)	170.39	±0	178.75	±0	8.36	4.91%

This test only paints the window once and exits. So my guess is that the test is actually measuring our "Xlib buffer creation" overhead here and I'm pretty sure that in a regular/real-life browsing session, we would not even see/notice it.



WebRTC Media Performance Tests (media_tests)
Ubuntu 64 (details)	4.43	±0	4.74	±0	0.31	7%

This test appeared not long ago in the linux/linux64 talos test series and quite frankly, I don't know why it didn't run for linux32. I know nothing about this test, as I haven't seen it run on my local machines (is it part of talos ?) Would it be possible to revert to Xlib content surfaces in the /WebRTC code ?



CanvasMark (tcanvasmark_paint)
Ubuntu 32 (details)	8174.33	±213.62	9167	±336.78	992.67	12.14%
Ubuntu 64 (details)	8938.5	±0	9627	±0	688.5	7.7%

That's the only one that annoys me really. At first, couldn't figure out why we had such a hit on canvasMark, because, by the way, I had extensively tested the patch with numerous canvas2D/canvas3D examples and demos out there and everything seemed just fine. But I found out why we're having trouble here. There's one test in the canvasMark talos test suite that just "freezes" for, like, 14 seconds !... it's the canvas3D test, which renders many successive numbered boxes in a spiral shape. Well, this test starts just well but then, after like 20 boxes, it freezes, and that's what's impacting the test results.

If it seems sensible to you, Karl, and since the patches are not triggered by default (but by preferences), and if the above performance analysis seems valid to your opinion, I suggest we check the patches in after review and open a new bug about this specific canvas3D example that freezes.
Attachment #8452597 - Attachment is obsolete: true
Attachment #8458274 - Flags: review?(karlt)
Flags: needinfo?(jmuizelaar)
Comment on attachment 8458274 [details] [diff] [review]
OMTC-Basic_Buffer-Image-Layers-to-Xlib-Surfaces (v.7)

>+    XRenderPictFormat *xrenderFormat =
>+      gfxXlibSurface::FindRenderFormat(display,
>+      SurfaceFormatToImageFormat(aSurface->GetFormat()));
>+
>+    if (xrenderFormat) {
>+      surf = gfxXlibSurface::Create(screen, xrenderFormat,
>+                                    ThebesIntSize(aSurface->GetSize()));
>+      if (!surf) {
>+        NS_WARNING("Couldn't create native surface, fallback to image surface");
>+        surf = new gfxImageSurface(ThebesIntSize(aSurface->GetSize()),
>+                            SurfaceFormatToImageFormat(aSurface->GetFormat()));
>+      }
>+    } else {
>+      NS_WARNING("Couldn't find valid xrenderFormat, fallback to image surface");
>+      surf = new gfxImageSurface(ThebesIntSize(aSurface->GetSize()),
>+                            SurfaceFormatToImageFormat(aSurface->GetFormat()));
>+    }

Please save the result of SurfaceFormatToImageFormat(aSurface->GetFormat()))
in a local variable, so that there is only one call, and move the "if (!surf)"
{} block to after the "if (xrenderFormat)" {} block, so that only one block of
code is required for the gfxImageSurface.  (Simpler code wins over one less
branch in the !xrenderFormat situation.)
Attachment #8458274 - Flags: review?(karlt) → review+
In this bug we have:

Part (a) a patch that adds a GTK3-only option, disabled by default, to use
         image surfaces for drawing while still compositing with Xrender.

         This provides a configuration option similar in general concept to
         D3D9 and GL platforms.

Performance wasn't really the goal here, but the goal was to make Linux
graphics more similar to other platforms, perhaps hoping that improvements on
other platforms would make up for losses, and perhaps Skia might one day be the silver bullet.  Still, some kind of sanity in performance is desirable.

We don't have all performance results for the differences here from Xlib
surfaces, due to talos crashes in the base GTK3 OMTC configuration.
(Using image surfaces seems to work around these crashes, even though the
crashes are not obviously related to Xlib code.)

https://tbpl.mozilla.org/?tree=Try&rev=77f8cec08771

We do have some performance comparisons:
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=77f8cec08771&newRev=a0ed04882602&submit=true

One perf drop is in SVG, which may be kind of expected from switching from GPU to
CPU drawing.

SVG-ASAP (tsvgx)
Ubuntu 32 (details)	491.75	±1.6	531.25	±10.78	39.5	8.03%
SVG, Opacity Row Major (tsvgr_opacity)
Ubuntu 32 (details)	317.25	±0.7	380.1	±7.4	62.85	19.81%

Given the goals, I don't think that should prevent this patch from landing.
This patch provides a configuration to further develop.

However, the big cost in performance is the drop in scrolling speed.
That is expected, but very large.  Part (b) is about addressing this.

tscroll-ASAP (tscrollx)
Ubuntu 32 (details)	7.63	±0.08	12.49	±7.34	4.86	63.7%

FWIW here are some comparisons with the corresponding Elm branch MTC build,
but differences include both OMTC and image surface drawing, so it's not
possible to draw much in the way of conclusions.
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=e43e762afb47&newRev=a0ed04882602&submit=true

AFAIK we also don't have performance comparisons with images surfaces with
software compositing (if that works).

Part (b) a patch that caches uploads, similar to D3D9 and GL platforms, and
         switches to single buffered OMTC client-host communication, similar
         to D3D9 platforms.

I triggered a few additional runs for perf measurements here, which helps
perf.snarkfest.net realize that most regressions were not statistically
significant.

http://perf.snarkfest.net/compare-talos/index.html?oldRevs=a0ed04882602&newRev=3f29599afdb2&submit=true

The big win was tscrollx, making up for the loss in part (a).

tscroll-ASAP (tscrollx)
Ubuntu 32 (details)	12.49	±7.34	7.59	±0.04	-4.9	-39.23%
Ubuntu 64 (details)	9.99	±0.2	6.54	±0.1	-3.45	-34.53%

The only clearly significant regression was

CanvasMark (tcanvasmark_paint)
Ubuntu 32 (details)	8125.3	±225.84	9121.8	±277.56	996.5	12.26%
Ubuntu 64 (details)	9048.33	±229.56	9906.33	±506.54	858	9.48%

Considering this together with the scrollx result is still a net win for
part (b) IMO.

(In reply to Frederic Plourde (:fred23) from comment #85)

> There's one test in the canvasMark talos test suite that just
> "freezes" for, like, 14 seconds !... it's the canvas3D test, which renders
> many successive numbered boxes in a spiral shape. Well, this test starts
> just well but then, after like 20 boxes, it freezes, and that's what's
> impacting the test results.

Obviously there is a problem here, but one that can be addressed separately, I
guess.

> If it seems sensible to you, Karl, and since the patches are not triggered
> by default (but by preferences), and if the above performance analysis seems
> valid to your opinion, I suggest we check the patches in after review and
> open a new bug about this specific canvas3D example that freezes.

I'm happy to consider this a step forward for the image surface rendering
approach, available as a new configuration option.

However, I ran some GTK2 OMTC tests today and there are two small regressions
there, from the second patch I assume because the first patch should have no
effect with GTK2.  These compare results with and without the patches here,
both with OMTC:

http://perf.snarkfest.net/compare-talos/index.html?oldRevs=d908fdd983ed&newRev=c88525908c0b&submit=true

CanvasMark (tcanvasmark_paint)
Ubuntu 32 (details)	5086.08	±76.6	5179.83	±260.92	93.75	1.84%
Ubuntu 64 (details)	5317.33	±92.26	5440.44	±130.92	123.11	2.32%

This is a much smaller regression than the GTK3 regression.  I notice that
OMTC layers use image surfaces for canvas while MTC layers use Xlib surfaces.
The speed here is also surprisingly better than GTK3 numbers.

I don't claim to understand TextureFlags::IMMEDIATE_UPLOAD well but wonder
whether that might be causing trouble here.

I'd be happy with the small GTK2 OMTC CanvasMark regression in the name of
making things more similar to other platforms, and hope to benefit from OMTC
future update optimizations implemented for other platforms.

However there is one other GTK2 regression:

TResize (tresize)
Ubuntu 32 (details)	21.76	±0.24	22.69	±2.86	0.93	4.27%
Ubuntu 64 (details)	19.3	±0.6	19.94	±2.44	0.64	3.32%

I don't know what this test does, but the results with the patch look quite
bi-modal.  Looking at mozilla-central (MTC) results they are also quite
bi-modal.

http://graphs.mozilla.org/graph.html#tests=[[254,131,35]]&sel=none&displayrange=7&datatype=running
http://graphs.mozilla.org/graph.html#tests=[[254,131,33]]&sel=none&displayrange=7&datatype=running

Somehow the unpatched OMTC (GTK2) results hold the better mode, but I don't
know enough about this test to know whether we should make decisions on it.

A safe approach might be to re-introduce the UseImageOffscreenSurfaces() test
for choosing X11BasicCompositor that I asked you to remove in comment 44.
That way X11BasicCompositor would only be used for GTK3, where it has shown
significant benefit.
Hey Karl, project ends today, but it doesn't mean that I'll stop following-up those patches suddenly ;-) However, I'm starting a 1-week vacation next monday (July 21st), so I'll be back on the 28th if there's anything to change/add to those, no problem, I'll do it gladly.

In the meantime though, I'm attaching patch v.8 here (without obsoleting v.7) with the very latest modifs that you suggested after having investigated performance. So feel free to r+ that one instead and push if it suits our needs better, even if I'm still on vacation, no worries. And if not, we'll check it out together when I come back.

thanks for your sustained cooperation, much appreciated !
cheers, 


Below are the latest modifs included in v.8
-------------------------------------------
(In reply to Karl Tomlinson (back July 17 :karlt) from comment #87)
> (In reply to Frederic Plourde (:fred23) from comment #85)
> 
> > There's one test in the canvasMark talos test suite that just
> > "freezes" for, like, 14 seconds !... it's the canvas3D test, which renders
> > many successive numbered boxes in a spiral shape. Well, this test starts
> > just well but then, after like 20 boxes, it freezes, and that's what's
> > impacting the test results.
> 
> Obviously there is a problem here, but one that can be addressed separately,
> I guess.

Yes, I believe it's rather isolated issue trackable separately in a new bug.
I'll file it today and CC you on it. You can afterwards CC the relevant people on it and assign.

> > If it seems sensible to you, Karl, and since the patches are not triggered
> > by default (but by preferences), and if the above performance analysis seems
> > valid to your opinion, I suggest we check the patches in after review and
> > open a new bug about this specific canvas3D example that freezes.
> 
> I'm happy to consider this a step forward for the image surface rendering
> approach, available as a new configuration option.

nice!

> A safe approach might be to re-introduce the UseImageOffscreenSurfaces() test
> for choosing X11BasicCompositor that I asked you to remove in comment 44.
> That way X11BasicCompositor would only be used for GTK3, where it has shown
> significant benefit.

Ok, in the light of the numbers you brought, I agree.
See patch v.8, it's got the UseImageOffscreenSurfaces() test back to exclude this implementation from GTK2 builds.

> Please save the result of SurfaceFormatToImageFormat(aSurface->GetFormat()))
> in a local variable, so that there is only one call, and move the "if
> (!surf)"
> {} block to after the "if (xrenderFormat)" {} block, so that only one block
> of
> code is required for the gfxImageSurface.  (Simpler code wins over one less
> branch in the !xrenderFormat situation.)

Done.
Attachment #8458780 - Flags: review?(karlt)
See Also: → 1040917
Comment on attachment 8458780 [details] [diff] [review]
OMTC-Basic_Buffer-Image-Layers-to-Xlib-Surfaces_v8.patch

Thanks, Frederic!

>+    nsRefPtr<gfxASurface> surf = nullptr;

Just one little nit: The nsRefPtr constructor already initializes its pointer to nullptr, so we don't usually initialize explicitly.

I'll see if I can push these sometime this week.
Attachment #8458780 - Flags: review?(karlt) → review+
(In reply to Karl Tomlinson (back July 17 :karlt) from comment #87)
> The only clearly significant regression was
> 
> CanvasMark (tcanvasmark_paint)
> Ubuntu 32 (details)	8125.3	±225.84	9121.8	±277.56	996.5	12.26%
> Ubuntu 64 (details)	9048.33	±229.56	9906.33	±506.54	858	9.48%

Well, actually this is a gain, sorry :)
perf.snarkfest.net doesn't know that higher is better for CanvasMark [1].

> However, I ran some GTK2 OMTC tests today and there are two small regressions
> there, from the second patch I assume because the first patch should have no
> effect with GTK2.  These compare results with and without the patches here,
> both with OMTC:
> 
> http://perf.snarkfest.net/compare-talos/index.
> html?oldRevs=d908fdd983ed&newRev=c88525908c0b&submit=true
> 
> CanvasMark (tcanvasmark_paint)
> Ubuntu 32 (details)	5086.08	±76.6	5179.83	±260.92	93.75	1.84%
> Ubuntu 64 (details)	5317.33	±92.26	5440.44	±130.92	123.11	2.32%

So this was a gain too, but doesn't make much sense because the only
overridden BasicCompositor method, CreateDataTextureSource() is not called
while running CanvasMark with GTK2.

> TResize (tresize)
> Ubuntu 32 (details)	21.76	±0.24	22.69	±2.86	0.93	4.27%
> Ubuntu 64 (details)	19.3	±0.6	19.94	±2.44	0.64	3.32%

I ran these tests again with X11BasicCompositor enabled even when
!UseImageOffscreenSurfaces() and got quite different results [2]

TResize (tresize)
Ubuntu 32 (details)	23.27	±3.54	21.86	±1.34	-1.41	-6.06%
Ubuntu 64 (details)	19.36	±1.52	19.64	±2.56	0.28	1.45%

I think we're just seeing the variability in the test results here.

So I'll check in the variation with X11BasicCompositor enabled even when
!UseImageOffscreenSurfaces().

[1] https://wiki.mozilla.org/Buildbot/Talos/Tests#CanvasMark
[2] http://perf.snarkfest.net/compare-talos/index.html?oldRevs=adc1ba232865&newRev=02ea79bac9f5&submit=true
https://hg.mozilla.org/mozilla-central/rev/6e532c9826e7
https://hg.mozilla.org/mozilla-central/rev/a500c62330d4
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: