Closed Bug 528551 Opened 12 years ago Closed 12 years ago

Faster plugin drawing in Fennec.

Categories

(Core :: Plug-ins, defect)

ARM
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta4-fixed
fennec 1.0+ ---

People

(Reporter: dougt, Assigned: dougt)

References

Details

Attachments

(4 files, 4 obsolete files)

Vlad and I spent some time looking at ways we could make flash as fast as possible.  The newly added NPImageExpose event helps, but we are stuck at only a few fps.  I would like to propose a new way handling plugins (and possibly video) for Fennec:

A bit of background.  The way it currently works is that:

1) the object frame get a paint request

2) we pass a memory buffer (gfxImageSurface) to flash and ask it to draw what was invalidated.

3) we draw this memory buffer to the xlib surface that is passed into our paint call.  (this xlib surface basically represents one of the "tile" canvas elements in fennec).

For most YouTube videos, we are seeing two paints (the video spans two tiles).  We are using 24 bit surfaces, and the screen is at 16 bit -- so someone down stream of all of this drawing (X11) is doing a conversion.

I added XShm so that we can avoid copying so much data between Fennec and X11.  This made some improvement, but we are no where close to what the default browser (MicroB) is doing in terms of performance.  Vlad and I tried many different approaches -- non yield anything close to double digit fps.  To debug further, we tried a simple Xul app that loaded YouTube.  As expected, FPS were comparable to MicroB and running flash directly in fullscreen mode.

In this simple application we avoiding all unnecessary copying and conversions.  The screen is at 16bpp, and the flash plugin is optimized for 16bpp (as told to us by Nokia).  Anything less than this will most likely half our FPS.

Given this, I think that having plugins draw directly to the screen bypassing the fennec's tile manager may be the only short term solution to having fast video.  

The approach we are thinking about was having Fennec tell each of the ObjectFrames where they should be drawn.  The frame painting code would honor the scaling, position, and size that the front end set.  In this way, we could have plugins do the right thing during pans and when content is below the Fennec sidebars/urlbar.
Flags: blocking1.9.2?
Attached patch patch v.1 (obsolete) — Splinter Review
Does not support clipping, or zooming.  wip.
Assignee: nobody → doug.turner
Attachment #412276 - Flags: review?(mozbugz)
Comment on attachment 412276 [details] [diff] [review]
patch v.1

karl, looking for a review.  it is a wip, but getting feedback early is appreciated.

please ignore:

-  mInstance->GetValue(nsPluginInstanceVariable_WindowlessLocalBool,
+  mInstance->GetValue(nsPluginInstanceVariable_WindowlessBool,
Attachment #412276 - Flags: review?(mozbugz)
Comment on attachment 412276 [details] [diff] [review]
patch v.1

This looks worth a shot.  Some things I noticed:

>+  void setAbsoluteScreenPosition(in nsIDOMElement element,
>+                                 in nsIDOMClientRect position,
>+                                 in nsIDOMClientRect clip);

You may want to think about making these integer rects.
Otherwise rounding issues could get complicated.

>+    shmctl(mSharedSegmentInfo.shmid, IPC_RMID, 0);

IIUC this should be called as early as possible, in case the application exits
abnormally, leaking the segment.

GDK calls this after XShmAttach and XSync in _gdk_image_new_for_depth.  I guess
it's not possible to attach to destroyed segments, so it can't be called before
XShmAttach has completed.

>+  XSetErrorHandler(handleXError);

gdk_error_trap_push() / gdk_error_trap_pop() may be convenient here.

>+  if (!XShmAttach(gdk_x11_get_default_xdisplay(), &mSharedSegmentInfo)) {
>+    XDestroyImage(mSharedXImage);
>+    mSharedXImage = nsnull;
>+    return PR_FALSE;
>+  }

There are some issues with cleaning up mSharedSegmentInfo here and elsewhere,
I think.
Blocks: 529098
Comment on attachment 412276 [details] [diff] [review]
patch v.1


>+  //memset (mSharedXImage->data, 'a', mPluginSize.width  * mPluginSize.height * 2);
>+  XShmPutImage(gdk_x11_get_default_xdisplay(),
>+               mBlitWindow,
>+               mXlibSurfGC,


What is the point of this implementation?
If you give to flash plugin normal X11 16bpp surface (normal windowless), flash will paint internal buffer picture with XShmPutImage to X11 surface...

In this case you just need to paint flash with normal WindowLess API to XSurface, and then blit that XSurface to the target XWindow...  (X11 Canvas surface or final widget drawable)

Btw: Windowless local rendering API make sense only in case if your rendering pipeline not interacting with X-surfaces...
I am not sure if you are asking about the approach in general, or if you are asking why I am calling XShmPutImage.  I will answer the general question first, and if I missed a specific question, please let me know.

In fennec, we have a tile manager (a set of 24bpp canvas elements each backed with a xlib surface).  The plugin currently draws to one more more of these canvas elements per frame using x surfaces.  E.g. a video might overlap as many as 4 or 5 tiles.  Each draw is a separate memcpy and conversion.

We are seeing no more than 5 fps using this approach.  As I mentioned in comment 1, we are looking for ways to get a 4-5x improvement and the fast path is to have a direct gfx path from flash to the screen.
Current MicroB pipeline can be represented like:
*Shared Buffer surface between engine and UI. (16bpp)
**Shared Buffer surface on UI - offscreen surface. (16bpp)

1) Flash paint to SHMBuffer*
2) Picture blitted from SHMBuffer* to SHMBuffer**
3) Fast painting from SHMBuffer** to target drawable with XShmPutImage.

In fennec case it should looks like:
1) Flash paint to Fennec Tile or Fennec canvas (XSurface) with XShmPutImage
3) Fast blit from Tile(Canvas) to target drawable (screen)
romaxa is pointing out that the mBlitWindow could be passed to the plugin to draw with normal windowless mode.  If there are no issues with scaling and flash handles the visual you request, then the only problem I can think of with that is that it is not possible to clip to anything other than a rectangle.  Complex clips could be managed by doing multiple draws (if we can get the clip as rectangles), or just falling back to the slow path.

In wmode=opaque does Flash skip the copy from Xlib surface to image?
> In wmode=opaque does Flash skip the copy from Xlib surface to image?
yes.
Actually, NPAPI says that the Drawable is a Pixmap.  With Xlib drawing a Window would probably be fine, but I assume Flash is using GDK.
I have reason to believe that GDK will get very confused if Flash passes a Window from a GdkWindow to gdk_pixmap_foreign_new_for_display().
http://bugzilla.gnome.org/show_bug.cgi?id=590690
a couple of thoughts ---

We arent using 16bpp, so going through gecko painting will cause us to do a conversion.  Vlad+Stuart can articulate better than I why we didn't accept the 16bpp patch.

The tile manager is composed of many tiles whereas the microb arch. just has one big tile.  For example, if you load a youtube video, in fennec, you might have 4 tiles that get dirty for every flash frame.  This causes 4x the work that needs to be done (eg each tile will separately be be drawn).  Even if we optimized the crap out of this pipeline, having to redraw each separate tile may be a large bottleneck for us.

the point of this bug, more or less, is to avoid as much as the tile manager, until we have layers support and the pepper plugin api.
> In fennec, we have a tile manager (a set of 24bpp canvas elements each backed
on N900 it is most probably 16bpp xlib surfaces...

> with a xlib surface).  The plugin currently draws to one more more of these
> canvas elements per frame using x surfaces.  E.g. a video might overlap as many
> as 4 or 5 tiles.  Each draw is a separate memcpy and conversion.
> 

then you should paint flash to X surface, and blit with clip to fennec tiles....

> is to have a direct gfx path from flash to the screen.

I would suggest to add normal support for XSHM to cairo, also add support for 16bpp image surface. and keep mozilla code without XSHM related hacks.
> on N900 it is most probably 16bpp xlib surfaces...

nope.  http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/src/gfxPlatformGtk.cpp#169

> then you should paint flash to X surface, and blit with clip to fennec
tiles....

each tile is a separate canvas and has a separate backend.  you cant really do that.
well, you can do what you suggested, but you still have multiple copies from your x surface to each canvas.
(In reply to comment #13)
> well, you can do what you suggested, but you still have multiple copies from
> your x surface to each canvas.

Yes, this is the problem...
We have experimental tiles implementation for MicroB, and I need to check is tiles dimension is fixed or dynamic there....

I think this problem possible to solve with dynamic tiles dimension...
ill give a go at removing the local rendering to see if has a perf impact.
i made use of the NativeDraw and basically did a bit of setup, and passed mBlitWindow as the drawable.  When we use the default wmode of youtube, i would see video play very slowly and it wasnt clipped.  when the wmode was forced to opaque, i did not see any video, just a black box where the video should be.

I am also worried about what karl was talking about -- does the drawable that we pass is assumed to be associated with a GdkPixmap and events will not be received.

My 1am thinking is that... although, what we currently have might not be ideal, it is working, very fast, and doesnt have the unknowns that karl is worried about.  And I hate to bring it up, but we are late in the Fennec cycle. Switching to anything that has more unknowns than the unknowns we know about is not what I think we should do.
Status: NEW → ASSIGNED
Stuart: does this block Fennec? I don't think it blocks us otherwise.
tracking-fennec: --- → 1.0+
Attached patch patch v.2 (obsolete) — Splinter Review
Attachment #412276 - Attachment is obsolete: true
Attachment #413409 - Flags: review?
Attachment #413409 - Flags: review?(mozbugz)
Attachment #413409 - Flags: review?(jst)
Attachment #413409 - Flags: review?
Comment on attachment 413409 [details] [diff] [review]
patch v.2

Looks good, but I didn't really read the X specific changes here. A couple of minor things caught my eye...

- In layout/generic/nsObjectFrame.cpp:

+  void SetAbsoluteScreenPosition(nsIDOMElement* element,
+                                 nsIDOMClientRect* position,
+                                 nsIDOMClientRect* clip)
+  {

I don't see why this would need to be inline.

- In nsPluginInstanceOwner::Paint():

+#ifdef MOZ_PLATFORM_HILDON
+  PRBool simpleImageRender = PR_FALSE;
+  mInstance->GetValue(nsPluginInstanceVariable_WindowlessLocalBool,
+                       (void *)&simpleImageRender);

Fix second line argument indentation.

+  if (simpleImageRender) {
+    gfxMatrix matrix = aContext->CurrentMatrix();
+    if (!matrix.HasNonAxisAlignedTransform() &&
+        NS_SUCCEEDED(NativeImageDraw()))
+        return;

"return" is over indented here.
Attachment #413409 - Flags: review?(jst) → review+
Comment on attachment 413409 [details] [diff] [review]
patch v.2

I'm still working through this, but here's some things thus far

>+  nsIObjectFrame* frame = GetExistingFrame(eFlushLayout);
>+  nsIObjectFrame *objFrame = do_QueryFrame(frame);

GetExistingFrame() has already done the do_QueryFrame for you.

>+  void SetAbsoluteScreenPosition(nsIDOMElement* element, nsIDOMClientRect* position, nsIDOMClientRect* clip);

80 columns please

>+#include <X11/Xlib.h>

Already have this

>+#include <gdk/gdkx.h>

no need for this as it only makes sense with MOZ_WIDGET_GTK2 anyway.

>+  void SetAbsoluteScreenPosition(nsIDOMElement* element,
>+                                 nsIDOMClientRect* position,
>+                                 nsIDOMClientRect* clip)
>+  {
>+#ifdef MOZ_PLATFORM_HILDON

I think it would be tidier to have the method definition outside the class
definition.

The declaration and definition can be completely within
"#ifdef MOZ_PLATFORM_HILDON"

>+    NS_ASSERTION(mBlitParentElement && (mBlitParentElement != element), "Can not change parent element.");

The logic doesn't look right here.
NS_ASSERTION asserts that the expression is true,
and warns/traps/aborts if false.

Should this be returning an nsresult != NS_OK for unexpected arguments?

>+    shmctl(mSharedSegmentInfo.shmid, IPC_RMID, 0);

Call XSync(Display*, False) after XShmAttach() and then do the IPC_RMID there.
This saves leaking the segment on ungraceful shutdown.

>+  PRBool simpleImageRender = PR_FALSE;
>+  mInstance->GetValue(nsPluginInstanceVariable_WindowlessLocalBool,
>+                       (void *)&simpleImageRender);

No need for the (void *) cast.

> void nsPluginInstanceOwner::Paint(gfxContext* aContext,
>                                   const gfxRect& aFrameRect,
>                                   const gfxRect& aDirtyRect)
> {
>   if (!mInstance || !mOwner)
>     return;
> 
>-  // Align to device pixels where sensible
>+#ifdef MOZ_PLATFORM_HILDON
>+  PRBool simpleImageRender = PR_FALSE;
>+  mInstance->GetValue(nsPluginInstanceVariable_WindowlessLocalBool,
>+                       (void *)&simpleImageRender);
>+  if (simpleImageRender) {
>+    gfxMatrix matrix = aContext->CurrentMatrix();
>+    if (!matrix.HasNonAxisAlignedTransform() &&
>+        NS_SUCCEEDED(NativeImageDraw()))
>+        return;
>+  }
>+#endif

Are you sure you don't want to fall back to the slow path when
SetAbsoluteScreenPosition() has not been called?

> void
>+nsObjectFrame::SetAbsoluteScreenPosition(nsIDOMElement* element,
>+                                         nsIDOMClientRect* position,
>+                                         nsIDOMClientRect* clip)
>+{
>+  NS_ASSERTION(mInstanceOwner, "failed to get mInstanceOwner");
>+  mInstanceOwner->SetAbsoluteScreenPosition(element, position, clip);
>+}

I would only do this when MOZ_PLATFORM_HILDON and return an error nsresult for
other platforms.

>+  // The hildon plugin needs to have it visibility poked
>+  UpdateVisibility();

Doing this in NativeImageDraw() means that the plugin's own invalidations will
mark it visible.

In nsPluginInstanceOwner::Paint(), before the NativeImageDraw() call, should
be a good place to UpdateVisibility().
thanks for the review so far.  I agree with all of your comments, and will post a new patch when you complete your review.

>Are you sure you don't want to fall back to the slow path when SetAbsoluteScreenPosition() has not been called?

We discussed this back and forth a bit.  The goal is to get the best video in front of the user as fast as possible.  The current thinking is that going down the slow path will delay the user from getting to the fast path as fast as possible.  So, we are ignoring the slow path.  Maybe we will reconsider this after more experimentation (or even pref it).
If you don't draw in nsPluginInstanceOwner::Paint(), then consider drawing in SetAbsoluteScreenPosition() because the plugin may not necessarily invalidate.
Unless you can be sure that SetAbsoluteScreenPosition() is called before nsPluginInstanceOwner::Paint()?
Comment on attachment 413409 [details] [diff] [review]
patch v.2

>+    mXlibSurfGC = nsnull;

= None;

>+  if (mSharedXImage) {
>+    XShmDetach(gdk_x11_get_default_xdisplay(), &mSharedSegmentInfo);
>+    XDestroyImage(mSharedXImage);
>+    shmdt(mSharedSegmentInfo.shmaddr);
>+    shmctl(mSharedSegmentInfo.shmid, IPC_RMID, 0);
>+    mSharedXImage = nsnull;
>+  }

mXlibSurfGC needs to be released here.

~nsPluginInstanceOwner gets it right.
I suggest a ReleaseXShm() method on nsPluginInstanceOwner that is called here and
in ~nsPluginInstanceOwner.

By initializing mSharedSegmentInfo.shmaddr = NULL and mXlibSurfGC = None, and
checking before releasing, the other cleanup code in SetupXShm() can also use
ReleaseXShm().

(Getting there.)
Comment on attachment 413409 [details] [diff] [review]
patch v.2

>+    XFreeGC(gdk_x11_get_default_xdisplay(), mXlibSurfGC);

The owner of mBlitWindow will need to make sure not to destroy the window (or
any of its ancestors before this is called.  That might be hard to guarantee,
so I think it would be best to only Create and Free the GC when needed in
NativeImageDraw().

>+    // The flash plugin on Maemo n900 requires the width/height to be
>+    // even.
>+    mAbsolutePosition = nsIntRect(NSToIntFloor(left),
>+                                  NSToIntFloor(top),
>+                                  NSToIntCeil(width) / 2 * 2,
>+                                  NSToIntCeil(height) / 2 * 2);

Think about only doing the rounding for rects passed to the plugin, but still
drawing the requested size on mBlitWindow.  That might produce fewer edge
effects.

>-  newClipRect.left = clipRect.x;
>-  newClipRect.top = clipRect.y;
>-  newClipRect.right = clipRect.XMost();
>-  newClipRect.bottom = clipRect.YMost();
>+  newClipRect.left = 0;
>+  newClipRect.top = 0;
>+  newClipRect.right = window->width;
>+  newClipRect.bottom = window->height;

Please include a comment to indicate that this is to work around a Flash bug.
The painting issues with non-origin clip offsets are probably bug 526797.

I can't see why you still need to call SetWindow() every time even when the
window members stay the same, but I'll let you decide what to do.

>-  exposeEvent.x = mDirtyRect.x;
>-  exposeEvent.y = mDirtyRect.y;
>-  exposeEvent.width = mDirtyRect.width;
>-  exposeEvent.height = mDirtyRect.height;

>-  imageExpose.x = mDirtyRect.x;
>-  imageExpose.y = mDirtyRect.y;
>-  imageExpose.width = mDirtyRect.width;
>-  imageExpose.height = mDirtyRect.height;

>+  exposeEvent.x = 0;
>+  exposeEvent.y = 0;
>+  exposeEvent.width  = window->width;
>+  exposeEvent.height = window->height;
>+
>+  imageExpose.x = 0;
>+  imageExpose.y = 0;
>+  imageExpose.width  = window->width;
>+  imageExpose.height = window->height;

You should still be able to get a dirty rect from InvalidateRect() and
nsPluginInstanceOwner::Paint() and intersect with the clip rect for this, but
file a follow-up for that if you prefer.  The same rect can be used for
XShmPutImage().

(Done.)
Attachment #413409 - Flags: review?(mozbugz) → review-
Flags: blocking1.9.2? → blocking1.9.2+
Attached patch patch v.3 (obsolete) — Splinter Review
Attachment #413409 - Attachment is obsolete: true
Attachment #413511 - Flags: review?(mozbugz)
Comment on attachment 413511 [details] [diff] [review]
patch v.3

We're way past the point where you can change nsIObjectLoadingContent on branch.  The branch version will need to use a separate interface.
Sorry, switching this back to a ?

Stuart: can you say *why* this blocks Fennec? Seems like a pretty late and big change to be coming in before RC for either of the products ...
Flags: blocking1.9.2+ → blocking1.9.2?
A few other quick comments:

The !doc thing in GetClosestWindow makes no sense.  If domDocument is non-null it'll QI to doc.  |element| will never QI to doc.  domDocument might be null if the element's owner document has been garbage collected.  How likely is that to happen?

Why is the GetInterface change needed, exactly?
Stuart explained that this is a shipping requirement for Fennec 1.0; bz and dougt are working for a branch-safe patch.
Flags: blocking1.9.2? → blocking1.9.2+
Blocks: 529996
Comment on attachment 413511 [details] [diff] [review]
patch v.3

+  frame->SetAbsoluteScreenPosition(element, position, clip);
+  return NS_OK;

return frame->SetAbsoluteScreenPosition(element, position, clip);

In nsIObjectFrame.h:

+  virtual void SetAbsoluteScreenPosition(class nsIDOMElement* element,

I think this should return an error nsresult when not implemented,
including when !MOZ_PLATFORM_HILDON and

>+    if ((mBlitParentElement && (mBlitParentElement != element)) ||
>+        !position || !clip)
>+      return;

when changing parent elements is not supported.
Attached patch patch v.4Splinter Review
incorporates karlt and bz comments.  This does not have the new interface yet
Attachment #413523 - Flags: review?
Comment on attachment 413523 [details] [diff] [review]
patch v.4

I want nsPluginInstanceOwner::SetAbsoluteScreenPosition() to only exist for MOZ_PLATFORM_HILDON.
nsObjectFrame::SetAbsoluteScreenPosition() can return the error for other platforms.

UpdateVisibility() can be private now and its declaration moved to the private MOZ_PLATFORM_HILDON block.

>+  mSharedSegmentInfo.shmaddr = mSharedXImage->data = static_cast<char*>(shmat(mSharedSegmentInfo.shmid, 0, 0));

Start a newline or two here.
Attachment #413523 - Flags: review? → review+
Attached patch trunk patch (obsolete) — Splinter Review
rolls up karlt comments.
applies to trunk.
carrying forward r=.
Attachment #413511 - Attachment is obsolete: true
Attachment #413539 - Flags: review+
Attachment #413511 - Flags: review?(mozbugz)
Attached patch 192 patchSplinter Review
same as trunk patch, but uses a new interface to ensure API stability on the 192 branch.
build pushed to try.
Attached patch trunk patchSplinter Review
Attachment #413539 - Attachment is obsolete: true
Comment on attachment 413542 [details] [diff] [review]
trunk patch

i missed a few pieces on the last trunk patch i posted.
the 192 patch will _not_ change the iid.  locally fixed.
http://hg.mozilla.org/mozilla-central/rev/f32dd283535f
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/8407edbf57a7
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
these lines should not have been removed when i landed.
Depends on: 536036
You need to log in before you can comment on or make changes to this bug.