[Mac] Flash performance heavily degraded in Firefox 4 Beta 4

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: notforyourmail, Assigned: mattwoodrow)

Tracking

({regression})

Trunk
All
macOS
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking2.0 betaN+)

Details

(Whiteboard: [hardblocker][has patch], )

Attachments

(6 attachments, 25 obsolete attachments)

7.56 KB, patch
Details | Diff | Splinter Review
13.58 KB, patch
Details | Diff | Splinter Review
11.37 KB, patch
Details | Diff | Splinter Review
9.73 KB, patch
Details | Diff | Splinter Review
6.09 KB, patch
Details | Diff | Splinter Review
1.97 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b4) Gecko/20100818 Firefox/4.0b4
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b4) Gecko/20100818 Firefox/4.0b4

On the same computer, Adobe Flash renders the same video with much lower framerates, and with frequent dropped frames,in Firefox 4 Beta 4 than in Firefox 3.6.  In 3.6, video is typically smooth.  In 4, there are often dropped frames.  The affected system is a MacBook Pro Core Duo 1.83 GHz with a ATI RadeonX1600.  The performance difference is less noticeable on newer systems. Additionally, the amount of resources that can be taken by other processes while playing a video must be lower in 4.0 B4 in order to maintain smooth video.

Reproducible: Always

Steps to Reproduce:
1. Play a 380p video on YouTube
2. Frames drop.  80% of CPU is taken by the Flash process, 60-70% of a second core is taken by Firefox.
3. Frames drop.
Actual Results:  
Choppy video

Expected Results:  
Smooth video
Which Firefox 3.6 ? 3.6.8 ?
Any.  The video performance was the same on all of the 3.6 series on that machine.
I asked that because Firefox 3.6.4 added support for running plugins like Flash in a separate process, which changed the behavior somewhat. Your figures don't work for an older release, as there was no Flash process (you mean plugin-container.exe).
I'm running the Mac version.  I do mean plugin container.  I don't think the mac version had flash is a separate process in 3.6.x, but I could be wrong.  I can definitely say that I was running the latest 3.6 immediately before 4.0 b4, and I had much smoother video.
I can also see that the flash video playback performance of Firefox 4.0 is very disappointing on OSX 10.6.
For example: http://www.youtube.com/watch?v=H8jP4xI9gOk (measured with 360p mode)

Firefox 3.6.9
firefox-bin takes 60-70% of a CPU core

Firefox 4.0b5 with OOPP
firefox-bin takes 50-60% of a CPU core
plugin-container takes 40-50% of a CPU core

Firefox 4.0b5 without OOPP
firefox-bin takes 80-90% of of a CPU core, and the memory consumption continues to increase while playing (Bug 567589)

on OSX 10.6.4, Core 2 Duo 2.2GHz, GeForce 8600M GT
I should note that on the same machine where playing 360p video on youtube is problematic in Firefox 4, I can play 480p smoothly with upscaling to 1440x900 in Chrome 6.  Same machine and video.
Component: General → Plug-ins
Product: Firefox → Core
QA Contact: general → plugins
Version: unspecified → Trunk
> I asked that because Firefox 3.6.4 added support for running plugins like Flash
> in a separate process

Not on Mac.  Out of process plugins on Mac only happened with Gecko 2.0.
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Keywords: regression
Assignee: nobody → joshmoz
Blocking on this regression.
blocking2.0: ? → betaN+
Tested using an OK Go YouTube video at 720p and scaled (enlarged) to ~930x532. I was using a MacBook Pro with a 3.0GHz Core 2 Duo, 4GB RAM and Mac OS X 10.6.4. I'll list total percentages here, not broken down by process.

Safari (out-of-process): ~58%
Chrome (out-of-process): ~58%
Firefox 3.6 (in-process): ~65%
Firefox 4.0b7pre (out-of-process): 105%

These numbers were recorded after the network load for the video had completed.

This is pretty high, we should definitely invest some time in getting that number down. We're expecting some extra overhead due to IPC, but right now we're adding 40% to Firefox 3.6's CPU usage and that is too much. We should try to at least half the regression before we ship.

One thing we should look at is the drawing and event models being negotiated by the different browsers. Firefox 3.6 is not negotiating the same models as Firefox 4.0 (the latter added support for Cocoa and Core Animation NPAPI) and that could have a significant impact.
I should note that the video performance is now so bad that even on a mid-2009 Mac Pro (2.66 GHz, 12 GB RAM, 8 cores), I'm getting poor video frame rates.  (It has gotten worse since I originally reported the bug.)  On a hulu video, I recently had to reduce the size of the video significantly in order to get smooth video.  In Firefox 3, I get smooth video, even when zoomed to 2560x1600.  In Chrome on my 1.83 GHz Core Duo Macbook Pro, I get smooth full screen flash video.  It's not just CPU usage.  Performance is very severely degraded.
I've mention before how horrible the read back we are doing to draw Core Animation back into Core Graphics is for performance. I was hoping we would have Retained Layers to fix this performance regression. Are/When will OpenGL retained layers be on by default?

My suspicion is that this read back is by far the largest cause of the performance regression. It would be worth getting the performance numbers with the read back disabled (painting black instead). I can make a patch for that quickly. This would give us an idea how much of that CPU time is spent there. We can then know for sure what the impact is.

Also is Flash using Core Animation in those Safari/Chrome test? We need to know if Flash got faster/slower by rendering internally using Core Animation.
After running a trunk build through Instruments, the top CPU usage is:

(~10.0%) nsIOSurfaceLib::IOSurfaceLock (which causes an IOSurface memory flush)
 - called from nsIOSurface::Lock
 - called from CGContextDrawImage
 - called from nsCARenderer::DrawSurfaceToCGContext(CGContext*, nsIOSurface*, CGColorSpace*, int, int, int, int)
 - called from mozilla::plugins::PluginInstanceParent::NPP_HandleEvent(void*)

(~5.5%) gfxImageSurface::gfxImageSurface(gfxIntSize const&, gfxASurface::gfxImageFormat) (which zeroes the surface)
Note: those numbers are based only on the parent process, which has ~65% CPU usage
BenWa, retained layers will be implemented when you or Josh or someone takes bug 598425. I have it implemented for Windows in bug 596451 which should land after b7.
That's the read back I discussed in comment 12.
> (~10.0%) nsIOSurfaceLib::IOSurfaceLock (which causes an IOSurface memory flush)
>  - called from nsIOSurface::Lock
>  - called from CGContextDrawImage
>  - called from nsCARenderer::DrawSurfaceToCGContext(CGContext*, nsIOSurface*,
> CGColorSpace*, int, int, int, int)
>  - called from mozilla::plugins::PluginInstanceParent::NPP_HandleEvent(void*)

I'm not 100% sure what that is but my guess is that it's clearing the background on every paint? Looks very expensive.
> (~5.5%) gfxImageSurface::gfxImageSurface(gfxIntSize const&,
> gfxASurface::gfxImageFormat) (which zeroes the surface)
Depends on: 598425
The same effect occurs on Linux as well. (bug 609189)

there is no regression on windows XP.

this is with firefox 4.0b6 on windowz and firefox 4.0b8/9 (latest trunk) on linux.
The performance is now bad enough in Minefield that even on a Mac Pro, many videos are not watchable in flash at all (at least, not smoothly).  I actually have to switch to Safari or Chrome to view Hulu videos at anything > about 1.5 or 2x the original size (360p).
Performance is improved somewhat in the new Flash 10.2 beta that was just released tonight.  Full screen 1080p plays smoothly on my Mac Pro.  However, windowed 1080p drops frames very frequently.  (There is supposed to be a performance improvement in the latest flash beta version.)  For comparison, in Safari, Flash 10.2 plays smoothly in 1080p in both a youtube window and full screen.
Let me add to that that the frame rate is terrible in flash 10.2 beta for video at *any* resolution, not just 1080p.  240p also has a frame rate of only a few frames per second.
I don't expect full screen performance to be relevant to this issue since we are not doing the composition in that case.  Bug 598425 is really the missing piece causing us this performance regression.
The performance is still slow in chrome, due to drivers, but is 2x what minefield does and a little faster than the stable version (3.6)
Running through instruments again we now have:

(~10%) memcpy
from mozilla::layers::BindAndDrawQuadWithTextureRect
(~5%)  IOSurfaceRoot::flushMemory from nsCARenderer::DrawSurfaceToCGContext
(since there is no edit to change the last comment, here it is finished)

Running through instruments again we now have:

(~10%) We will probably get this back from retained layers (bug 598425)
(~5%)  Seems to be a problem with how we use IOSurface. From the headers:
   "Locking and unlocking a IOSurface is not a particularly cheap operation,
    so care should be taken to avoid the calls whenever possible."
We should be using the seed to determine if there are changes in the IOSurface and caching the base pointer/properties for reuse.
I don't remember exactly if I had a strong reason to lock the IOSurface or if I just did it to be safe hoping to revisit that decision. I think it would be worth while trying it without the lock call since I don't think the code currently read/writes at the same time.
And see bug 614623 comment #9, we should make sure that Flash is using NPAPI:InvalidatingCoreAnimation and not NPAPI:CoreAnimation in 32-bits as the this will reduce the number of drawing calls.
Unfortunately removing the lock just results in a white box where there should be a video.
Duplicate of this bug: 615648
Are we only referring to only Minefield process in this bug?

With Flash 10.2 beta, for 360p video both Minefield process and Minefield Plugin Process use around 40-50% CPU each (80-100% total). Safari/Webkit process uses 20% in total when running the same video.

This isn't just with Youtube, but all flash content.
IOSurfaceLock problems seem to be machine-dependent. On my Macbook Pro it is ~5% but testing on a Mac Pro is was ~30% and on a Mac Mini it was ~0.5%.
I'm seeing significant improvement in the flash frame rate on the 12/31 build using the 10.2 flash beta on a Mac Pro with an ATI HD 4870 using a low definition Hulu source.  Video is (mostly) smooth, whether in a window or full screen.  However, about once every 10 seconds, the video drops to about 3 frames per second, then resumes smooth play.
Another note: video performance is severely degraded when the browser has another window being actively updated. For example:

http://www.doonesbury.com/strip has a scrolling text area, apparently also generated by the Flash plug-in/  If that tab is active, the frame rate drops to only a few frames per second.  If that tab is not active, the frame rate jumps up again.  (Right now, I'm playing a Hulu video in a separate window, and I have multiple tabs open in my main browser window, one of which contains the Doonesbury link http://www.doonesbury.com/strip .
Flash performance seems worse again on the same machine as comment 31.  Lower frame rates, plus flashes of blocky artifacts and horizontal lines in video.  Unfortunately, other browsers are still required to see smooth video performance in flash.
Whiteboard: [hardblocker]
An update - as of the Jan 8th build, flash video is unusable (about 3 frames per second embedded, maybe 6 in full screen) on Minefield on a MacBook Pro Core Duo.  Video plays smoothly on the same hardware in Chrome and Safari.  The hardware graphics specs are:

ATI Radeon X1600:

  Chipset Model:	ATY,RadeonX1600
  Type:	GPU
  Bus:	PCIe
  PCIe Lane Width:	x16
  VRAM (Total):	128 MB
  Vendor:	ATI (0x1002)
  Device ID:	0x71c5
  Revision ID:	0x0000
  EFI Driver Version:	01.00.068
  Displays:
Color LCD:
  Resolution:	1440 x 900
  Pixel Depth:	32-Bit Color (ARGB8888)
  Main Display:	Yes
  Mirror:	Off
  Online:	Yes
  Built-In:	Yes
Display Connector:
  Status:	No Display Connected
Duplicate of this bug: 625100
Duplicate of this bug: 625388
Hardware: All → x86
Summary: Flash performance heavily degraded in Firefox 4 Beta 4 → Flash performance heavily degraded in Firefox 4 Beta 4 (mac)
Matt, can you help here? We should *not* be reading back from GL when GL layers are enabled.

Unfortunately I believe GL layers are blacklisted with Michael's hardware...
Can we simply tell Quicktime and Flash not to use the Core Animation drawing model when we're not using GL layers?
(In reply to comment #38)
> Can we simply tell Quicktime and Flash not to use the Core Animation drawing
> model when we're not using GL layers?

I think that's the best thing to do for performance but when we had discussed this with Adobe in early 2010 we were told Flash might negotiate QuickDraw if Core Animation was not available so we couldn't go with that solution. Things likely have changed now so perhaps it would be worth looking into again. Perhaps Josh can confirm I am remembering this properly since it's been almost a year.
Robert: Would GL layers be blacklisted with *all* of my hardware?  I see the same performance problems on all three of my Macs:

-A Mac Pro with an ATI HD4870 (surely capable of rendering anything you are likely to throw at it quickly)

-A Macbook Pro with a Radeon X1600

-Another Macbook Pro with a Nvidia GeForce 8600M GT
Robert: Would GL layers be blacklisted with *all* of my hardware?  I see the same performance problems on all three of my Macs:

-A Mac Pro with an ATI HD4870 (surely capable of rendering anything you are likely to throw at it quickly)

-A Macbook Pro with a Radeon X1600

-Another Macbook Pro with a Nvidia GeForce 8600M GT
Michael: You can check the GL status in the Graphics section of about:support.

That said, the problem exists both with and without hardware acceleration enabled. The fix is different for each configuration though.
Oops.  Sorry about the duplicate.  Anyway, if it helps, here is some more info on the Mac Pro's hardware.  I experience the same flash performance problems on it, and this is high end hardware.  All OS X frameworks should be properly supported on this machine...

ATI Radeon HD 4870:

  Chipset Model:	ATI Radeon HD 4870
  Type:	GPU
  Bus:	PCIe
  Slot:	Slot-1
  PCIe Lane Width:	x16
  VRAM (Total):	512 MB
  Vendor:	ATI (0x1002)
  Device ID:	0x9440
  Revision ID:	0x0000
  ROM Revision:	113-B7710F-176
  EFI Driver Version:	01.00.318
  Displays:
Cinema HD:
  Resolution:	2560 x 1600
  Pixel Depth:	32-Bit Color (ARGB8888)
  Main Display:	Yes
  Mirror:	Off
  Online:	Yes
  Rotation:	Supported


 Model Name:	Mac Pro
  Model Identifier:	MacPro4,1
  Processor Name:	Quad-Core Intel Xeon
  Processor Speed:	2.66 GHz
  Number Of Processors:	2
  Total Number Of Cores:	8
'
OK.  Here's what I get from About:Support:

Mac Pro:
Adapter Description0x21a00,0x20400Vendor ID0000Device ID0000Adapter RAMAdapter DriversDriver VersionDriver DateDirect2D EnabledfalseDirectWrite EnabledfalseWebGL RendererATI Technologies Inc. -- ATI Radeon HD 4870 OpenGL Engine -- 2.1 ATI-1.6.26GPU Accelerated Windows2/2 OpenGL

Macbook Pro Core Duo:
Adapter Description0x21900,0x20400Vendor ID0000Device ID0000Adapter RAMAdapter DriversDriver VersionDriver DateDirect2D EnabledfalseDirectWrite EnabledfalseWebGL RendererATI Technologies Inc. -- ATI Radeon X1600 OpenGL Engine -- 2.0 ATI-1.6.26GPU Accelerated Windows0/1

Macbook Pro Core 2 Duo:
Adapter Description0x22600,0x20400Vendor ID0000Device ID0000Adapter RAMAdapter DriversDriver VersionDriver DateDirect2D EnabledfalseDirectWrite EnabledfalseWebGL RendererNVIDIA Corporation -- NVIDIA GeForce 8600M GT OpenGL Engine -- 2.1 NVIDIA-1.6.26GPU Accelerated Windows1/1 OpenGL
Open question to anyone that understands plugins on mac:

http://pastebin.mozilla.org/961536

Why are these drawing models different, and how can I find the PluginInstanceParent's drawing model (CoreGraphics) from within nsObjectFrame::GetLayerState.
(In reply to comment #45)
> Why are these drawing models different

This is where the translation happens:
http://mxr.mozilla.org/mozilla-central/source/dom/plugins/PluginInstanceParent.cpp#373

The reason they differ is that we don't know from nsObjectFrame if the plugin is being run out of process or not (this may no longer be the case, but it was previously). The OOP mechanism uses an IOSurface to pass across process so PluginInstanceParent does not have a Core Animation layer object. The IOSurface needs to ultimately draw itself to the CoreGraphics context so it requests the CoreGraphics model from nsObjectFrame rather then somehow pass an IOSurface back to it making the code messier. Of course with GL Layers this is no longer true.
Alright that makes sense, thanks BenWa.

Is there any way to detect what drawing model the plugin itself is using (and thus if an IOSurface will be available)?

I ask, because its been suggested that we only create a separate layer for the plugin if we can do the direct GL->GL copy and stick with the old render path otherwise. For this I need to know if it's going to be possible from within nsObjectFrame::GetLayerState.

If not, I can alternatively create a layer always and make it a canvas layer backed by a thebes surface for the non-IOSurface cases.
I couldn't find one, so here goes part 1, v1.

Any idea who I should ask to review this?
An outline of the plan I'm trying to implement.

If we detect that we can avoid the readback (Mac, using OpenGL layers and the plugin is rendering with CoreAnimation to an IOSurface), we change nsDisplayPlugin::GetLayerState to return LAYER_ACTIVE and create a separate layer for the plugin.

It looks like IOSurface allows sharing across processes/GLContext's without issue, so I'm adding a new Mac-specific layer type that can bind an IOSurface to a texture.

I think this should be all we need to improve performance in this case.

For users where our OpenGL layers have been disabled, we might want an extra patch to force them back to CoreGraphics to avoid the readback in that case - Is anyone else interested in taking this?
(In reply to comment #48)
> Created attachment 505527 [details] [diff] [review]
> Part 1: Interface to retrieve the plugin's drawing model from nsObjectFrame

My 2 cents: I'm not a fan of introducing a new internal NPAPI value, esp. 1004 which is likely to be used in the future. Future plug-in that implement the next NPAPI feature could break with this change.  Using a value in a different range might be safer.

Maybe we could take a approach similar to bug 556487?
https://bugzilla.mozilla.org/attachment.cgi?id=473610&action=diff

IOSurfaceLayerOGL looks good. Small note, I was told we shouldn't include anything that is 10.6 only so I don't think we can include CGLIOSurface.h.
OS: Mac OS X → Windows 7
And now the pièce de résistance (read: hacky mess), using IOSurfaces for Core Animations plugins.

Still a WIP, lightly tested on youtube, main video works fine but I'm getting some invalidation issues on the recommendations panel.

(In reply to comment #51)
> (In reply to comment #48)
> > Created attachment 505527 [details] [diff] [review]
> > Part 1: Interface to retrieve the plugin's drawing model from nsObjectFrame
> 
> My 2 cents: I'm not a fan of introducing a new internal NPAPI value, esp. 1004
> which is likely to be used in the future. Future plug-in that implement the
> next NPAPI feature could break with this change.  Using a value in a different
> range might be safer.
> 
> Maybe we could take a approach similar to bug 556487?
> https://bugzilla.mozilla.org/attachment.cgi?id=473610&action=diff

Thanks, I'll take a look at this.

> 
> IOSurfaceLayerOGL looks good. Small note, I was told we shouldn't include
> anything that is 10.6 only so I don't think we can include CGLIOSurface.h.

Right. nsCoreAnimationSupport.mm already handles this, so I should just be able to expose the wrapper from there and use that.
Forgot to qref my latest fixes.
Attachment #505575 - Attachment is obsolete: true
We should try get these new layers hooked into the empty transaction API to reduce CPU usage even further.
The only concern I have with this patch is of object lifetimes, I'm not sure what happens if we delete an IOSurface while it's still bound to a texture.

Under the old code we only ever accessed the IOSurface within painting, now it stays bound to a texture until the following paint. It shouldn't ever be used though.

The complete lack of docs for IOSurface isn't helping here, I'm hoping that CGLTexImageIOSurface2D adds a reference to the IOSurface and releases it when the texture goes away.

We don't seem to be hitting any issues (never using the texture should leave us safe), and hopefully try server will catch any possible problems. That said, I'd much rather be provably correct.
Attachment #505549 - Attachment is obsolete: true
Attachment #505732 - Flags: review?(joe)
Comment on attachment 505732 [details] [diff] [review]
Part 2: Add a Mac-only layer for handling IOSurfaces v2

Copyright year on new files should be 2011 :)

Looks fine from my point of view, but Benoit has more experience with IOSurfaces than I do.
Attachment #505732 - Flags: review?(joe)
Attachment #505732 - Flags: review?(b56girard)
Attachment #505732 - Flags: review+
(In reply to comment #55)
> Created attachment 505732 [details] [diff] [review]
> Part 2: Add a Mac-only layer for handling IOSurfaces v2
> 
> The only concern I have with this patch is of object lifetimes, I'm not sure
> what happens if we delete an IOSurface while it's still bound to a texture.
> 
> Under the old code we only ever accessed the IOSurface within painting, now it
> stays bound to a texture until the following paint. It shouldn't ever be used
> though.

That sounds like a risky assumption. What you can do is copy the nsIOSurface object, this will create another wrapper object around the underlying IOSurface increasing its retain count:
See IOSurface::GetIOSurfaceID && LookupSurface. It ment to be used in across processes but it should do nicely to solve this.

I noticed a mistake in my code when looking at the release rules, if you could include this as well:
http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/nsCoreAnimationSupport.h#322
should release surfaceRef instead of ioSurface.

The rest of the patch looks excellent. I can't wait to see what performance improvement this will give us!
Attachment #505732 - Flags: review?(b56girard) → review+
Depends on: 626962
Not sure about naming of the new functions, suggestions appreciated.
Attachment #505527 - Attachment is obsolete: true
Attachment #506084 - Flags: review?(joshmoz)
Fixed review comments.

Carrying forward r=joe,BenWa

Anyone interested in taking review on part 3?
Attachment #505732 - Attachment is obsolete: true
Attachment #506086 - Flags: review+
Comment on attachment 506086 [details] [diff] [review]
Part 2: Add a Mac-only layer for handling IOSurfaces v3

+  nsIOSurface* mIOSurface;

nsAutoPtr<nsIOSurface>. Otherwise looks good.
I'm unlikely to have time to get to these again before the 30th, any help getting them to pass tests in the meantime would be appreciated.
Part 3 looks like the wrong patch was uploaded.
It looks correct to me?
Hmm yeah. I don't know what I was thinking.
+      void *surface;

Unfortunately I don't think we can just add to npapi.h in this way. We need to find another way to get the nsIOSurface out. I think we might need to add new API to nsIPluginInstance alongside GetSurface, or extend GetSurface...

+  ioLayer->Init(surf->GetIOSurface(), nsIntSize(window->width, window->height));

We call Init over and over again on the same retained layer. Even if we need to re-init, we should probably call the method something else.

I wonder if it would be simpler to use more of the existing plugin-layer drawing code, i.e. create a new Image type that wraps an IOSurface, and create an ImageLayer on Mac like the other platforms.

We could then deal with the API problem above by changing GetSurface to Image* CreateImage(ImageContainer*). CreateImage would create the image directly instead of returning a cairo surface.
This patch looks promising but we need to find a solution for the case where gl layers will not be enabled.

I looked into attaching the core animation layer to the view when the plugin frame is not obstructed. However the layer overlaps the browser chrome. I believe previously when I had tried this approach in early 2010 it only covered the scroll bars. Is anyone familiar with the cocoa widget have any idea how we can fix this?

http://imgur.com/zKsoF

Otherwise we need to find a solution for the case when gl layers is not enabled to resolve this bug.
Is it possible to clip the CALayer to a rectangle or union of rectangles? If so, nsObjectFrame::ComputeWidgetGeometry can be altered to obtain the correct rectangle(s) for you.
It is possible to clip contents of a layer (and I believe any sublayers) to the bounds of that layer by setting masksToBounds property to YES.

More information can be found at: http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/CoreAnimation_guide/Articles/LayerVisProps.html
Yes, this was my first approach and it works well. However by using the mask property we can clip to union of rectangles. This will allow us to only read back the covered area. 

The problem I'm facing now is that the CALayer can't be used by both the NSView layer hierarchy and the CARenderer at the same time. I'm working on seeing if it's feasible to move the layer back and forth between the NSView widget and the CARenderer. I can share my mq if someone is interested in taking a look.
Depends on: 629602
What work remains here, exactly? Are we at a point where this can land if the patches in this bug get reviewed, or is there more pending work that must happen before this is done?
I am working on continuing with Matt Woodrow's work. The patch with review requested right now are not going to work. I don't know for sure how long it'll take me to finish them but it will be at least 5 days.
I'm back again now and can spend time fixing these if needed.

Josh: Please let me know if you want me to take this back, and what changes you are proposing.
Attachment #506084 - Attachment is obsolete: true
Attachment #506084 - Flags: review?(joshmoz)
Assignee: joshmoz → matt.woodrow+bugzilla
Summary: Flash performance heavily degraded in Firefox 4 Beta 4 (mac) → [Mac] Flash performance heavily degraded in Firefox 4 Beta 4
Attachment #505578 - Attachment is obsolete: true
Attachment #505578 - Flags: review?(roc)
Attachment #506086 - Attachment is obsolete: true
Attachment #506086 - Flags: review+
Fixed infinite loop bug in certain situations.
Attachment #508306 - Flags: review?
Attachment #508306 - Flags: review? → review?(joshmoz)
Mainly a refactoring of the old part 2
Attachment #508307 - Flags: review?(roc)
Attachment #508307 - Flags: review?(b56girard)
This is mainly just a refactoring of the old code to create the Image in plugin code instead of nsObjectFrame.
Attachment #508308 - Flags: review?(joshmoz)
Taking random stabs at reviewers here, feel free to fight over them :)
Attachment #508309 - Flags: review?(roc)
Whiteboard: [hardblocker] → [hardblocker][has patch]
Attachment #508307 - Flags: review?(b56girard) → review+
Fixed bug checking mOwner instead of mPlugin.
Attachment #508306 - Attachment is obsolete: true
Attachment #508531 - Flags: review?(joshmoz)
Attachment #508306 - Flags: review?(joshmoz)
Fixed build on non-mac platforms.
Attachment #508309 - Attachment is obsolete: true
Attachment #508532 - Flags: review?(roc)
Attachment #508309 - Flags: review?(roc)
Is this ready to land? It would be great to get this in for Beta11 if possible, since there are a bunch of complaints around Flash problems in general and it would be great to get some Beta feedback.
Theres still one reftest failure on non-mac platforms, it looks like I've broken MozPaintWaitFinished with async plugins somehow. Looking into this now.

All the patches still need review as well.
Comment on attachment 508307 [details] [diff] [review]
Part 2: Add a Mac-only Image type for IOSurfaces

+#ifdef XP_MACOSX
+    /**
+     * The MAC_IO_SURFACE format...
+     */
+    , MAC_IO_SURFACE
+#endif

This doesn't need to be #ifdefed.

+#ifdef XP_MACOSX
+# include "nsCoreAnimationSupport.h"
+#endif

Nor does this (2 places).

I think MacIOSurfaceImageOGL should have its own header file MacIOSurfaceImageOGL.h and the source file should be called MacIOSurfaceImageOGL.mm.
Attachment #508307 - Flags: review?(roc) → review+
+    if (mIOSurface)
+        format = Image::MAC_IO_SURFACE;

{}

+#ifndef XP_MACOSX
     mInstanceOwner->NotifyPaintWaiter(aBuilder);
+#endif

Why are you #ifndefing this?

+#ifdef XP_MACOSX
+  PaintMacPlugin(nsnull, aDirtyRect, aPluginRect);
+#endif

Why do we need to call this at all in BuildLayer?
(In reply to comment #82)
> +    if (mIOSurface)
> +        format = Image::MAC_IO_SURFACE;
> 
> {}

This was to stay in style with the surrounding code, can change it if you want.
> 
> +#ifndef XP_MACOSX
>      mInstanceOwner->NotifyPaintWaiter(aBuilder);
> +#endif
> 
> Why are you #ifndefing this?

It appeared to only be relevant to async plugins since the ::Paint path never called it. Seems like it would be a change in behaviour to start calling this.

> 
> +#ifdef XP_MACOSX
> +  PaintMacPlugin(nsnull, aDirtyRect, aPluginRect);
> +#endif
> 
> Why do we need to call this at all in BuildLayer?

When else should it be called? Once GetLayerState has returned LAYER_ACTIVE, ::Paint won't get called and nothing actually gets drawn to the IOSurface (yes, I tried it thanks to poor rebasing skills).
(In reply to comment #83)
> (In reply to comment #82)
> > +    if (mIOSurface)
> > +        format = Image::MAC_IO_SURFACE;
> > 
> > {}
> 
> This was to stay in style with the surrounding code, can change it if you want.

OK doesn't matter.

> > +#ifndef XP_MACOSX
> >      mInstanceOwner->NotifyPaintWaiter(aBuilder);
> > +#endif
> > 
> > Why are you #ifndefing this?
> 
> It appeared to only be relevant to async plugins since the ::Paint path never
> called it. Seems like it would be a change in behaviour to start calling this.

True. Can you rename UseLayers to UseAsyncRendering, and wrap a check for that around the NotifyPaintWaiter call?

> > +#ifdef XP_MACOSX
> > +  PaintMacPlugin(nsnull, aDirtyRect, aPluginRect);
> > +#endif
> > 
> > Why do we need to call this at all in BuildLayer?
> 
> When else should it be called? Once GetLayerState has returned LAYER_ACTIVE,
> ::Paint won't get called and nothing actually gets drawn to the IOSurface (yes,
> I tried it thanks to poor rebasing skills).

OK but for Core Animation you're not using the CGContext that PaintMacPlugin sets up, so almost all the code in PaintMacPlugin is irrelevant. Perhaps we only need the call to mInstanceOwner->RenderCoreAnimation(). (It's really confusing to have "Paint" code in BuildLayer since the whole point of layers is that we don't explicitly paint anything here.) So what I'd like to see is the relevant code copied out of PaintMacPlugin into another function, say nsPluginInstanceOwner::SetupCoreAnimation() or something, and have us call that. (Hopefully SetupCoreAnimation and RenderCoreAnimation can share code.)
Attachment #508308 - Flags: review?(joshmoz) → review?(benjamin)
Attachment #508531 - Flags: review?(joshmoz) → review?(benjamin)
Fixed review suggestions.

Carrying forward r=roc,BenWa
Attachment #508307 - Attachment is obsolete: true
Attachment #508995 - Flags: review+
Fixed a bug in IsUpToDate(), still waiting on try server results to see if this fixes the reftest failures.
Attachment #508308 - Attachment is obsolete: true
Attachment #508996 - Flags: review?
Attachment #508308 - Flags: review?(benjamin)
Fixed review suggestions.
Attachment #508532 - Attachment is obsolete: true
Attachment #508998 - Flags: review?(roc)
Attachment #508532 - Flags: review?(roc)
Attachment #508996 - Flags: review? → review?(benjamin)
+    if (mInstanceOwner->UseAsyncRendering())
+      mInstanceOwner->NotifyPaintWaiter(aBuilder);

{}

Maybe I'm wrong but it seems to me that for CALayer rendering, it doesn't really matter what we pass for the 'draw' rect in the paint event. We might as well pass the whole plugin window rect. That would let us get rid of a bunch of the copied code here. And we shouldn't need BeginCGPaint/EndCGPaint either.
Attachment #508998 - Attachment is obsolete: true
Attachment #509018 - Flags: review?(roc)
Attachment #508998 - Flags: review?(roc)
It seems we don't need StartDrawPlugin and EndDrawPluginEither.

I think you should set width/height to the plugin width/height. Using zero seems a bit wrong.
Comment on attachment 508996 [details] [diff] [review]
Part 3: Change GetSurface to GetImage v2

Is this going to work when we have D3D surfaces for plugins? I thought those weren't going to use ImageContainers. But then again, that's not Fx4, so this can land now, I guess.
Attachment #508996 - Flags: review?(benjamin) → review+
Comment on attachment 508531 [details] [diff] [review]
Part 1: Interface to retrieve the plugin's drawing model from nsObjectFrame v4

Need to change the IID on nsIPluginInstance.

Did we decide that in-process plugins will always use CG instead of CA? That's what PluginPRLibrary::GetRemoteDrawingModel implies.

nsNPAPIPluginInstance already has a mDrawingModel member. Why can't we use that, instead of forwarding this through PluginLibrary and storing it again on PluginInstanceParent? I also don't see where mDrawingModel is declared on PluginInstanceParent...
Comment on attachment 508531 [details] [diff] [review]
Part 1: Interface to retrieve the plugin's drawing model from nsObjectFrame v4

I think this is r- unless I'm missing something re-request review if appropriate.
Attachment #508531 - Flags: review?(benjamin) → review-
Note that bug 626602 is adding nsIPluginInstance_MOZILLA_2_0_BRANCH and this change should be added to that interface so that we don't have the IID change after API freeze.
> 
> Did we decide that in-process plugins will always use CG instead of CA? That's
> what PluginPRLibrary::GetRemoteDrawingModel implies.

I believe in-process plugins can use either. This function is currently used more as a 'IsRemoteDrawingModelCoreAnimation' which is why the default return values haven't mattered. Maybe it should be renamed to prevent confusion, since I'm very much unsure on how else to handle this.

> 
> nsNPAPIPluginInstance already has a mDrawingModel member. Why can't we use
> that, instead of forwarding this through PluginLibrary and storing it again on
> PluginInstanceParent? I also don't see where mDrawingModel is declared on
> PluginInstanceParent...

PluginInstanceParent already has the mDrawingModel member defined. In the case we are interested in, the NSNPAPIPluginInstance has mDrawingModel set to CoreGraphics, while the PluginInstanceParent has it set to CoreAnimation.


I don't entirely understand the idl suggestions, will take a look at bug 626602.
Hopefully this looks better, I'm not a big fan of the name though.

Do I need to do something similar for the s/GetSurface/GetImage patch too?
Attachment #508531 - Attachment is obsolete: true
Attachment #509641 - Flags: review?
Attachment #509641 - Flags: review? → review?(benjamin)
Changed function to DoCocoaEventDrawRect() to avoid code duplication.
Attachment #509018 - Attachment is obsolete: true
Attachment #509684 - Flags: review?(roc)
Attachment #509018 - Flags: review?(roc)
Comment on attachment 509641 [details] [diff] [review]
Part 1: Interface to retrieve the plugin's drawing model from nsObjectFrame v5

You still need to rev the IID on nsIPluginInstance_MOZILLA_2_0_BRANCH.
Attachment #509641 - Flags: review?(benjamin) → review+
Added updated UUID. Carrying forward r=benjamin
Attachment #509641 - Attachment is obsolete: true
Attachment #509903 - Flags: review+
Rebased against latest changes.

Carrying forward r=roc,BenWa
Attachment #508995 - Attachment is obsolete: true
Attachment #509905 - Flags: review+
Moved changes onto branch interface and bumped the UUID.

Carrying forward r=benjamin
Attachment #508996 - Attachment is obsolete: true
Attachment #509906 - Flags: review+
Fixed review suggestion.

Josh do you mind reviewing this since roc is away?

I've pushed the whole patch queue to try again in case I broke anything during the refactoring.

http://tbpl.mozilla.org/?tree=MozillaTry&rev=47eb37d0c702
Attachment #509684 - Attachment is obsolete: true
Attachment #509907 - Flags: review?(joshmoz)
Attachment #509684 - Flags: review?(roc)
And again with more than just a linux debug build:

http://tbpl.mozilla.org/?tree=MozillaTry&rev=ca980fac29ed
Yeah, I can review.
Comment on attachment 509907 [details] [diff] [review]
Part 4: Create MacIOSurfaceImages for CoreAnimation plugins v6

+    if (mIOSurface)
+        format = Image::MAC_IO_SURFACE;

{}
Attachment #509907 - Flags: review+
(In reply to comment #105)
> Comment on attachment 509907 [details] [diff] [review]
> Part 4: Create MacIOSurfaceImages for CoreAnimation plugins v6
> 
> +    if (mIOSurface)
> +        format = Image::MAC_IO_SURFACE;
> 
> {}

See comment 84 :)
Whiteboard: [hardblocker][has patch] → [hardblocker][has patch][needs landing]
Well, at least I'm consistent :-)
Comment on attachment 509905 [details] [diff] [review]
Part 2: Add a Mac-only Image type for IOSurfaces v3

>+  if (mActiveImage->GetFormat() == Image::MAC_IO_SURFACE) {
>+    MacIOSurfaceImageOGL *ioImage =
>+      static_cast<MacIOSurfaceImageOGL*>(mActiveImage.get());
>+      return ioImage->mSize;
>+  }

The indentation is off for the return statement.
Comment on attachment 509907 [details] [diff] [review]
Part 4: Create MacIOSurfaceImages for CoreAnimation plugins v6

>--- a/layout/generic/nsObjectFrame.h
>+++ b/layout/generic/nsObjectFrame.h
>@@ -57,16 +57,17 @@ class nsIAccessible;
> #endif
> 
> class nsPluginInstanceOwner;
> class nsIPluginHost;
> class nsIPluginInstance;
> class nsPresContext;
> class nsDisplayPlugin;
> class nsIDOMElement;
>+class nsIOSurface;

Do you need to add this?
Attachment #509907 - Flags: review?(joshmoz) → review+
flash 10.2 final has been released today.. any improvements?
I've been following this for a while. From this thread

http://forums.mozillazine.org/viewtopic.php?f=23&t=2096621&p=10404077

on this site

http://www.youtube.com/adblitz

a user of Win and Firefox 4.0b12 20110207

reports plugin-container: 2-4% CPU Usage Flash on adblitz

I see ~120% for the Plugin Process using OSX 10.6.6 on a 2.66 GHz Mac Pro Quad core with 8G with flash 10.2 final.

Hope this data point is helpful.
Forgot to mention I'm running FF4b11
Just had a quick test myself, I have only observed usage using Activity Monitor is Mac OS X 10.6.6.

Playing a 480p YouTube video. MacBook Pro with nVidia 8600M GT, latest Minefield uses ~60% CPU and plugin process 40%, so 100% overall. Same video in WebKit uses 20% in total. Running final version of Flash Player 10.2.

Therefore, no improvements. 

Interesting to note. Link given to YouTube in comment 111. Minefield uses 20% CPU, and plugin process 100%, 120% overall. Minefield process on this page uses less CPU usage than a normal, simple YouTube page. WebKit uses 180% CPU, which is higher than Minefield, I'll leave that bug to WebKit bugzilla.

I'll try my newer MacBook Pro with YouTube HD videos tomorrow that uses H.264 hardware acceleration in Flash 10.2.
I have just tried a YouTube HD video on my newer MacBook Pro. Minefield processes uses ~50%, plugin process uses ~40%, so 90%.

Safari process uses 1-2%, and its plugin process 3-4%, 6% in total.

The figures seem to me that Flash 10.2 in Minefield may not be using hardware acceleration? They are very similar to my older MacBook Pro which doesn't support H.264 acceleration.
For me Flash 10.2 offers significantly worse performance than 10.1, videos on YouTube play really choppy in 10.2 compared to 10.1, almost unwatchable in some cases. 

Is any tryserver build still available with this patch?
Just tried the http://www.youtube.com/adblitz video in the latest version of Chrome. Process uses 3-5%

OSX 10.6.6 on a 2.66 GHz Mac Pro Quad
core with 8G with flash 10.2 final.
OK.  I've installed the new flash 10.2 release (10,2,152,26).  Testing a 1080p Avatar trailer on my Mac Pro in full screen (2560x1600), I get about 46% CPU usage in Safari.  In the latest daily build of Minefield, I get 47% by the FF plugin itself and another 24% by Firefox.

In Safari, the video is perfectly smooth.  In FF, I'm seeing jitters and frame drops (including occasional random pauses of the video for a fraction of a second), and the video doesn't appear to be running at 30 fps.  

Also, about:support now reports my graphics hardware as:

Adapter Description0x21a00,0x20400Vendor ID0000Device ID0000Adapter RAMAdapter DriversDriver VersionDriver DateDirect2D EnabledfalseDirectWrite EnabledfalseWebGL RendererATI Technologies Inc. -- ATI Radeon HD 4870 OpenGL Engine -- 2.1 ATI-1.6.26GPU Accelerated Windows1/1 OpenGL

This is different frmo before.  It previously reported Acceleration Windows 2/2 OpenGL.
On the same trailer, on a current generation Macbook Air (2.13 GHz, NVidia 320m), I get 14.4% CPU usage and 20% kernel usage.  Kernel usage is approx 1.5% on the Mac Pro when playing this video.  (I'm guessing the difference is because the kernel is handling a portion of the hardware rendering that isn't taking place on the Mac Pro.)  I'm playing it at 1080p on the Air, but the Air's display is 1440x900, so it isn't *displaying* at that resolution.  Video on the Air is smooth.  No dropped frames.  Much better than on the Mac Pro.  The hardware for this one is reported by Minefield as:

Adapter Description0x22600,0x20400Vendor ID0000Device ID0000Adapter RAMAdapter DriversDriver VersionDriver DateDirect2D EnabledfalseDirectWrite EnabledfalseWebGL RendererNVIDIA Corporation -- NVIDIA GeForce 320M OpenGL Engine -- 2.1 NVIDIA-1.6.26GPU Accelerated Windows1/1 OpenGL

Note: these tests are using:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b12pre) Gecko/20110207 Firefox/4.0b12pre and the 2/9 daily build on the Mac Pro.
Michael, this bug is not fixed, we know there are issues. Please refrain from commenting for the time being.
Duplicate of this bug: 633035
Whiteboard: [hardblocker][has patch][needs landing] → [hardblocker][has patch]
Attachment #511965 - Attachment is patch: true
Attachment #511965 - Attachment mime type: application/octet-stream → text/plain
Attachment #509906 - Attachment is obsolete: true
Thanks for doing this Matt!

Since the summary of this bug is about the perf regression and not specifically what Matt implemented I'm going to leave this bug open until we verify that we've got performance to a shippable level.
I think the work done by Matt is bug 598425, we should close that one instead.
This might have caused bug 633826.
I'm also seeing dropped frames, rendering glitches, and more frequent crashes (when running the current nightly in i386 mode to avoid bug 633826). I also don't see significant performance wins. I think we should leave the code in the tree but disable async layer painting on Mac OS X until we sort the issues out.
Blocks: 633826
I submitted a patch to disable plugin layers in bug 633826 because of the regression it caused with 64-bit builds.

I tried debugging and it wasn't clear to me why we were still trying to paint to a ThebesLayer if we have an IOSurface plugin layer and how Invalidating (updating the IOSurface) was meant to work with this patch.
> I also
> don't see significant performance wins. I think we should leave the code in the
> tree but disable async layer painting on Mac OS X until we sort the issues out.

Did anyone else notice performance differences while this was enabled? Performance appeared to be improved for me, but I don't have any tangible measurements of this.

If this isn't going to help much, then it probably shouldn't be a hardblocker.

I'll work on reproducing and fixing the problems in the meantime.
Matt: I'm not sure which builds it was enabled in, but there were major performance differences in the last two daily builds.

In the 2/13 build, video was unplayable.  I was getting a new frame once every 5-10 seconds.  That was on a Mac Pro 8 core with an HD4870.  In the 2/14 build, I get performance similar to what I had with some of the earlier builds - playable (mostly), but with dropped frames and not particularly smooth playback.
Matt - never mind my comment about not seeing a perf win, it wasn't working properly so it wasn't a useful build to measure with.
Matt Woodrow said, "it probably shouldn't be a hardblocker"

I'm not exactly sure what is causing the symptom but for this video

http://www.youtube.com/adblitz

On OSX 10.6.6 Pro Mac quad core

FF4b11 uses 90% CPU

Chrome uses 2% 

Safari uses 5%

To me this symptom is a BIG ISSUE for FF4 on OSX.

Sorry if I came off emotional.
Sorry if I wasn't clear, I was questioning this particular approach to fixing the performance problem, rather than the actual high level problem.
(In reply to comment #130)
> I tried debugging and it wasn't clear to me why we were still trying to paint
> to a ThebesLayer if we have an IOSurface plugin layer and how Invalidating
> (updating the IOSurface) was meant to work with this patch.

We shouldn't be!!! That would explain why you wouldn't see any performance improvement!
Just to clarify in case there is any confusion - Flash doesn't always negotiate Core Animation drawing all of the time. So even when this patch works properly it'll only improve perf for the videos in which Core Animation is negotiated. This is the majority of videos on YouTube and Hulu in my experience, but there are ads on YouTube, for example, that still negotiate Core Graphics.

I'm going to land the patch on bug 633942 today which will make seeing the models negotiated by OOP plugins easier.
We need someone to drive this hard over the next small handful of days, because it's a betaN hardblocker. Matt, do you intend that to be you? If you're busy, that's ok!

BenWa, do you have time to own this bug (perhaps only if Matt's not available)?
In midterms right now, but I have time starting friday (reading week is starting). If that's not too late I would need you to lend me a machine that support GL Layers :(.
Thanks Benoit - we want this bug done in about 24-48 hours, so it's down to either Matt or me.
Posted patch First fix (obsolete) — Splinter Review
This appears to be a regression from empty transactions landing about the same time as this bug.

This patch fixes the problem for me with all the original performance benefits, but we still don't get to to advantage of empty transaction (an even bigger performance gain on youtube).

I'm going to work on a horrible hack to make these work since I'm under the impression that async plugin rendering (the actual fix) is too much work.
Attachment #512630 - Flags: review?(roc)
+#ifndef XP_MACOSX
+      || aDisplayItemKey == nsDisplayItem::TYPE_PLUGIN
+#endif

Why do we need to do this?
(In reply to comment #142)
> +#ifndef XP_MACOSX
> +      || aDisplayItemKey == nsDisplayItem::TYPE_PLUGIN
> +#endif
> 
> Why do we need to do this?

Since Mac plugins aren't asynchronous, we can't mark them as valid for empty transactions. Otherwise we never call BuildLayer, and never actually render the plugin content.

The second patch that I have partially working instead calls the DrawRect callback from inside ImageLayerOGL::RenderLayer and gets drawing happening with empty transaction. But this is fairly horrible hack and I don't overly like it.
Posted patch Alternate fixSplinter Review
This leaves empty transactions enabled and instead lets us ask the plugin to draw during compositing.

This is pretty much horrible in every way, but the performance gain is oh so nice. Personally I think this approach (or possibly a similar, less ugly one if anyone has suggestions) is worthwhile as long as async rendering gets high priority post-4.0.

Still needs a lot of cleanup, and is causing flickers for me. Just interested in feedback on the concept.
Attachment #512689 - Flags: feedback?(roc)
(In reply to comment #143)
> Since Mac plugins aren't asynchronous, we can't mark them as valid for empty
> transactions. Otherwise we never call BuildLayer, and never actually render the
> plugin content.

It's OK to call InvalidateLayer and do a null transaction as long as the *only* thing that's changed is the ImageContainer's current image. It doesn't matter whether async rendering is being used or not. We need to make sure that a proper Invalidate is done whenever the plugin's layer or any of its properties have changed. If we do that, null transactions should work.
OK, so first let's get this working without support for empty transactions. Instead of hacking nsIFrame::InvalidateLayer, hack nsObjectFrame to not call InvalidateLayer but Invalidate when we don't support an empty transaction.

Then in followup, preferably in another bug, we can add empty transaction support. Basically, nsPluginInstanceOwner::InvalidateRect will need to call InvalidateLayer and also take responsibility for asking the plugin to paint and updating the ImageContainer's image.
Posted patch First fix v2 (obsolete) — Splinter Review
Attachment #512630 - Attachment is obsolete: true
Attachment #512699 - Flags: review?(roc)
Attachment #512630 - Flags: review?(roc)
Posted patch First fix v3Splinter Review
Fixed the case where we don't have a layer for a plugin.

Carrying forward r=roc
Attachment #512699 - Attachment is obsolete: true
Attachment #512705 - Flags: review+
Whiteboard: [hardblocker][has patch] → [hardblocker][has patch][needs landing]
Filed bug 634521 for handling empty transactions.
Last fix landed on mozilla-central:

http://hg.mozilla.org/mozilla-central/rev/1fd068c483d8
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
With the patch, I'm seeing smooth flash video.  :-)  However, CPU usage is still very high.  On a Macbook Air (320m graphics, hardware accelerated video), I'm getting 24.8% from Firefox-bin, 20% from kernel task, and 12% from the flash plugin while windowed.  When full screen, I get 25.5% in kernel task, 12% from the flash plugin, and 5.5% from Firefox-bin.

On a Mac Pro 8 core (software rendering), I'm getting about 40% on the flash plugin and 8% from firefox-bin in full screen, 47% and 16% respectively when windowed.

In Safari, the same video http://www.youtube.com/watch?v=_i2RCBa3l-g (1080p playback) on the Mac Pro takes 41.3% in the flash plugin and 1.7% for the Safari process when windowed or running full screen.  

On the Macbook Air, in Safari, the I get about 45% on the same video when windowed for Safari, and 11% for the flash plugin.  In full screen, that drops to 14% for the plugin, another 14% for the kernel task, and < 1% for Safari.
I'm looking at overall idle CPU % in top and am seeing about 60% idle in today's FF nightly, and about 80% idle in Safari, and about 50% idle in Chromium (recent build) -- using the 1080p Avatar video linked by Michael on a 2.4Ghz MBP Core i5 8GB ram. Oh, and when no video is playing I'm seeing about 90% idle system.

The video does seem to be playing smoother, but I'd say based on these numbers there's still some room for improvement. However, great job guys on what you've done so far! :-D
(In reply to comment #151)
> I'm getting 24.8% from Firefox-bin, 20% from kernel task, and 12% from the
> flash plugin while windowed.  When full screen, I get 25.5% in kernel task, 12%
> from the flash plugin, and 5.5% from Firefox-bin.
> On the Macbook Air, in Safari, the I get about 45% on the same video when
> windowed for Safari, and 11% for the flash plugin.  In full screen, that drops
> to 14% for the plugin, another 14% for the kernel task, and < 1% for Safari.

Sounds like it's the same as safari. 56.8% total in Firefox, 56% in Safari. I don't think you'll get much better than that until Adobe makes Flash less CPU hungry on the Mac.

Needless to say this is definitely a huge improvement over what it was. On my MacBook Pro I'm now seeing 40-50% CPU usage compared to the 130-140% that was typical before on 360p videos. And only roughly 90% CPU usage on 1080p videos.
Comment 151, On a Mac Pro 8 core (software rendering), I'm getting about 40% on the flash plugin and 8% from firefox-bin in full screen, 47% and 16% respectively when windowed.

Strange. USING FF4b11 that "Avatar FULL 1080p trailer" on a Mac Pro Quad Core I'm seeing about 19% from the FF bin and 11% on the process plug-in windowed. Based on those numbers the "fix" is worse than beta 11.

Note: in Chrome I see

4% for the main process and 2% for the renderer process.
Note that on the url I provided, YouTube will default to playing at < 1080p until you enter full screen.  You must change it to 1080p manually in windowed mode to ensure that you are playing back the video at a consistent resolution each time.
When you guys are comparing Chrome vs. Safari / Firefox, have you told Chrome to use the system Flash instead of its own version?
Michael. Thanks for the info. In 1080p results for FF4b11 on the trailer are

45% bin
45% plug-in

I am hoping the fixed version runs < 8% for both.
OK, noting jag's comment, I double-checked and yes, Chromium and FF are using the same flash version (10.2.152.26, 32-bit everything).

My previous numbers were not fair to FF because i have a bunch of tabs open and I didn't take that into account (the base idle% should be subtracted for FF because when it has a lot of tabs open, it uses up a decent amount of idle cpu just sitting there, a bug in its own right).

So, my new notes are as follows (based on Activity Monitor CPU usage numbers this time):

subtract 25% cpu idle from ff
89-25=64% idle minefield
66% idle plugin process
while video loading
Total: 130%

chromium: 40%
chromium renderer: 30.4%
flash plugin: 70%
Total: 140%

safari: 16%
flash plugin: 19%
Total: 35%
Definitely offloading everything to GPU properly
(In reply to comment #158)
> subtract 25% cpu idle from ff
> 89-25=64% idle minefield
> 66% idle plugin process
> while video loading
> Total: 130%

Whoops, those should read 64% cpu and 66% cpu (_not_ idle cpu).
I should probably also note the chromium build number as listed in the About Chromium box: 11.0.664.0 (74130)

(sorry for all the emails!)
Now I'm really confused! For

Avatar trailer at 1080p - windowed

FF4b11

bin ~55%
plugin ~55%

Chrome latest version (9.0.597.102)

bin ~45%
plugin ~45%

For the Superbowl vid's http://www.youtube.com/adblitz selecting the Bud commercial

FF4b11

bin ~45%
plugin ~100%

Chrome latest version (9.0.597.102)

bin ~3%
plugin ~4%

All measurements made using Activity Monitor. OSX 10.6.6 with Mac Pro Quad-Core Intel Xeon 5500 “Nehalem” circa mid 2009

When will the "fix" hit the nightly's?

Don't mean to be a PITA.
> When will the "fix" hit the nightly's?

2 days ago.... ;)
OK BORIS I'M EMBARRASSED.

Avatar trailer at 1080p - windowed

Minefield 4.0b12pre latest

bin ~35%
plugin ~47%

For the Superbowl vid's http://www.youtube.com/adblitz selecting the Bud
commercial

bin ~20%
plugin ~95%

Improvement in CPU is modest. NOTE: I never did see any artifacts in the vid's.

Last PITA question. Any clue how Chrome manages bin ~3% plugin ~4% on the Superbowl vids. Also the second post in this thread

http://forums.mozillazine.org/viewtopic.php?f=23&t=2096621&start=0

of a WIN USER using FF4b12pre on 07 Feb 2011 says

No problem here.

plugin-container: 2-4% CPU Usage

for the Superbowl vids.

Maybe it's not this bug but since the plugin for the Superbowl uses

FF4 on OSX ~95%
Chrome on OSX ~4%
FF4 on Win ~4%

About 20x more CPU usage for FF4 on OSX on that particular site seems like a big issue.
Robert, we should get a separate bug filed on the Superbowl ad.  On Mac there's weirdness where the flash plugin will negotiate different drawing methods for different movies; maybe something is going awry with that.  Please file and cc me and "joshmoz"?
Please wait for bug 634521 to be fixed before commenting further. Also, please check that about:support shows all windows being accelerated; the changes in this bug will only affect that case.
I hope it's OK to make this comment, as the reason I'm making it is not to report CPU usage, but rather to state that upon watching this (may I say, very funny) trailer, I noticed the skipping again (i.e. video would pause periodically during playpack, about once every 30 or 40 seconds). This is with the 16th's Minefield build:

http://www.cracked.com/video_18156_a-trailer-every-academy-award-winning-movie-ever.html
Bug 634521 has landed so everyone can comment again once they've grabbed today's nightly :-).
Using an hourly with patch for bug 634521 I get cpu usage of 20% on bin and 110% on plugin-container on the adblitz page. Chrome gets 3-5%.

Also this site is smooth in both Safari and Chrome but not in Firefox: 
http://www.bredbandskollen.se/ click on "Starta Mätningen" and the Flash object will resize itself, it should be smooth but it's not.
Tested with this video: http://www.youtube.com/watch?v=4a4MR8oI_B8 on 720p. With Flash Player 10.2.152.26

I've noticed while loading the video the Minefield process uses 100+% CPU, when the video is finished downloading completely this drops to 8-10%. Not really a big issue since it goes away once it's done loading but might be good to look at eventually in another bug.

With Flash Player 10.2.152.26:
Nightly:
Program - 8-10% (100% while loading)
Plugin - 20-26%

4.0b11:
Program - 20-25% (100% while loading)
Plugin - 20-22%

Safari:
Program - 3-4% (50-55% while loading)
Plugin - 20-22% (60-63% while loading)

I tested the same video on my laptop and got similar results though the CPU usage was higher:

Nightly:
Program - 15-20% (60% while loading)
Plugin - 40-50%

4.0b11:
Program - 35-45% (80% while loading)
Plugin - 40-50%

Safari:
Program - 3-4% (50-55% while loading)
Plugin - 40-45% (60-63% while loading)

So although for some reason my laptop uses twice the CPU as my desktop, it looks like there's about a 20% decrease in the program CPU going from 4.0b11 to the nightly.
Good, so it looks from comment #169 that bug 634521 helped, as expected.

I'm not sure what to make of comment #168, it's a bit ambiguous. Is the plugin process using more CPU with Firefox than with Chrome? If so, that doesn't make much sense to me.
But it looks like we still need to reprofile on Mac with GL to see what else we can squeeze out. Maybe not for FF4 though, at least comment #169 suggests that things are acceptable, if not optimal.
I found a video that's great for capturing the "stuttering" that occasionally (still) occurs when watching videos in FF. The latest nightlies have been a great improvement but it's still there.

There's a lot of fast-cuts in this trailer and FF chokes on them:

http://www.youtube.com/watch?v=LIVmsIJyj3A

Watch that same video (in HD) in Safari and you should definitely notice a difference in the smoothness of playback. Speaking out of complete ignorance as I haven't seen the code, this seems like some sort of buffering issue where memory isn't being optimally moved around.
I should add, once the video has actually been loaded, it seems like it plays back much smoother in FF... Sorry for the qualitative and not quantitative analysis..
(In reply to comment #169)
> I've noticed while loading the video the Minefield process uses 100+% CPU, when
> the video is finished downloading completely this drops to 8-10%. Not really a
> big issue since it goes away once it's done loading but might be good to look
> at eventually in another bug.
Is playback smoothness affected while this is going on? Comment 173 suggests that it is.
On YouTube, right click on the video and click 'show video info'. This will show how many frames are being dropped.

I'm noticing I'm dropping many frames every second (2-6 at least) for a 360p video. Safari I'm dropping from 2-5 a minute as a guestimate.
Nice! Thanks to comment 175, we can get quantitative on this "stuttering" effect. While loading the 4HB trailer, Minefield (today's build) dropped ~220 frames, while Safari dropped only 4, and Chromium dropped only 8 frames.
I should note that as Minefield is my work browser, it has many tabs open (and two windows)... don't know if that should affect these tests though. Safari/Chromium are tested after launching them with only one window open and one or two tabs. So, I guess what I'm saying is, feel free to reproduce these findings! ;-)
I got 13 dropped frames running the video from start to end (in 720p HD) mode and both processes under 30% CPU at all times.
I think at this point we should file a new bug for the dropped-frames issue.
Also, that new bug should be specifically about frame dropping when GL acceleration is enabled. When GL acceleration is not enabled, we have a different set of problems.
> don't know if that should affect these tests though.

It can affect the dropped-frame thing, for sure, depending on what scripts are running in those other tabs.

Like roc says, let's get a separate bug filed on the stuttering; I bet the js folks will want to be in on that.  ;)
Not sure if it is the same issue as the last two posts or not, but I've noticed the following happening:

1) When switching between a windowed and full screen video, the dropped frame count goes up quite a bit.  Afterwards, few or no frames or dropped.

2) Displaying a window on top of the video (Minefield -> About Minefield, for example) causes the video to freeze or jitter for a second or two.
(In reply to comment #174)
> (In reply to comment #169)
> > I've noticed while loading the video the Minefield process uses 100+% CPU, when
> > the video is finished downloading completely this drops to 8-10%. Not really a
> > big issue since it goes away once it's done loading but might be good to look
> > at eventually in another bug.
> Is playback smoothness affected while this is going on? Comment 173 suggests
> that it is.

I have noticed that the playback sometimes doesn't seem as smooth while loading, but I've always considered this more of a Flash issue than a Firefox issue since I've seen it happen in all browsers.
I'm now seeing lines of blocks that appear within the video (I did see this in some earlier versions too).  Picture blocks of about 100x100 pixels, in a line, that appear as artifacts.  The video within them is roughly correct, but it's really visible.
To demonstrate this, please view:
http://www.hulu.com/stand_alone/fd0a1e86d9572e066fc74680e5b90779?continuous_play_mode=2&continuous_play=on&continuous_play_sort=

(In case that doesn't play, this is on Hulu.  It's Carrier, Episode 10.  I'm seeing the artifacts all over the place here.  But I've seen them in other videos as well, especially when there are rapid changes in portions of the image.
Dropped frames issue, filed bug 635188.
(In reply to comment #186)
> Dropped frames issue, filed bug 635188.

Thanks for filing that!
Add to the video corruption dropped frames as well.  Not sure if this is the same as bug 635188 or a different issue...
Depends on: 635188
Depends on: 637184
(In reply to comment #188)
> Add to the video corruption dropped frames as well.  Not sure if this is the
> same as bug 635188 or a different issue...


I played the hula video for about ten minutes using FF4b12 and did not notice any corruption of any kind.
Is this bug fixed in the 4.0 RC?! Cause I can't see much difference.

Bug 625100 was marked as a duplicate but the results with 4.0 RC are nearly equal and still worse than 3.6.

Playing this video:

http://www.youtube.com/watch_popup?v=Ou6_MkIvKOo&vq=medium#t=24

on Mac OS X 10.6.6 with Flash 10.2 installed I get the following CPU usage:

firefox-4.0b10pre: 38 %
 - itself 30%
 - plugin process 18%

firefox-4.0 RC: 34 %
 - itself 15%
 - plugin process 19%

where is the great difference?

in comparision to:
 
 Safari 5.0.4: 20%
 - itself 4%
 - plugin process 16%

firefox 3.6.13: 25% in total

And I still can hear the difference. With 3.6 the internal fan isn't working at all. With 4.0 RC just opening a flash page once in a while and the fan is running non-stop.

What do I do wrong?
(In reply to comment #190)
> firefox-4.0b10pre: 38 %
>  - itself 30%
>  - plugin process 18%

Looks like you made a math error here :-)
Also the diagnosis greatly depends on whether you are getting GL acceleration or not (about:support).
Not sure if I should start a new bug report on this one, but watching a flash video (say, one on hulu) in one window while trying to load a web page in another window causes the video to stutter and drop some or all frames while the page is loading.  This doesn't occur in other browsers.  If I want to browse in Minefield while simultaneously watching a video, I have to watch the video in Safari.  Note that this has occurred for quite some time in the FF4 builds.  I think I didn't think to mention it because of the other video performance issues.
Also, the video issue that I mentioned in comment 185 is still present in the 3/13 daily build.
(In reply to comment #192)
> Also the diagnosis greatly depends on whether you are getting GL acceleration
> or not (about:support).

Same here with a MacBookPro from early 2009 and OpenGL accelerated windows. As Markus has mentioned the fan is always rotating at a higher speed.
(In reply to comment #191)
> (In reply to comment #190)
> > firefox-4.0b10pre: 38 %
> >  - itself 30%
> >  - plugin process 18%
> 
> Looks like you made a math error here :-)

Apart from the poor math above:

firefox-4.0 RC: 34% > firefox 3.6.13: 25% > Safari 5.0.4: 20%

=> Resulting in a non-stop working fan.


about:support states:

Direct2D false
DirectWrite false
WebGL-Renderer NVIDIA Corporation -- NVIDIA GeForce 9400M OpenGL Engine -- 2.1 NVIDIA-1.6.26
GPU-beschleunigte Fenster1/1 OpenGL


Is there a way to turn of the seperate plugin process to work around the performance regression in FF 4.0?
bz/mattwoodrow/BenWa: can one of you guys reprofile and file bugs on what else needs to be done for the CoreAnimation-drawing-model + GL-layers configuration here?
I can try to, though I'm not very good at telling which things "look slow" in drawing land yet.

Are there two configurations I can compare where one is slow and one is fast or something?  That would help a lot!
Not really.

When a Flash video is playing with OOPP and GL enabled, I would expect the firefox-bin profile to be the usual Appkit gunk plus a small amount of time under LayerManagerOGL::EndTransaction, and nothing significant anywhere else. Any deviation from that would be interesting.

One thing I wonder about is where Core Animation does its compositing. Presumably that happens in either Safari or the windowserver. If Core Animation compositing is a lot faster than our LayerManagerOGL::EndTransaction compositing, that would also be interesting.
(One possibility is that painting through Appkit draw events is just slower than using Core Animation. In that case, painting through the refresh driver should help.)
OK, so the I did in fact recently profile that configuration, on http://office.smedbergs.us/embed-youtube.html

On that video, I see Flash itself use about 45% of my CPU with both Safari and m-c.  Safari's main process uses about 3%.  Our main process uses about 12%.  When safari is running, windowserver uses 3% or so; when we're running it uses nothing.

So it sounds likely that windowserver is doing some of Safari's work, yes.

For that configuration, my time breakdown in the main browser process looks like this:

   10% is getting IPC messages, all of this is landing under
       nsPluginInstanceOwner::InvalidateRect.  Half is frame
       invalidation, while half is
       nsPluginInstanceOwner::SetCurrentImage.
   20% seems to be CoreFoundation and AppKit overhead under
       __CFRunLoopRun and so forth.
    8% is AppKit overhead under _sendViewWillDrawInRect
    8% is NSView locking and unlocking focus
   30% is under PresShell::Paint, mostly under LayerManagerOGL::Render. 

and there's a bunch of long-tail stuff, I think.

Is that useful?  Do you want more of a breakdown under Paint/Render?
Sounds like that 30% of 12% is about the same as what windowserver is using for compositing. So that's not likely to be an issue.

It sounds like the biggest overhead is Appkit-related :-(. I'm not sure how much of that we can get rid of without using Core Animation.
I'm still getting strange artifacts in the latest daily build.  See 
http://www.hulu.com/stand_alone/2c1b7308f2c152b07626dc47e60cd87b?continuous_play_mode=2&continuous_play=on&continuous_play_sort=

for a demonstration, particularly when the camera pans up.  I get the same horizontal lines and blockiness that I reported before.
Whiteboard: [hardblocker][has patch][needs landing] → [hardblocker][has patch]
I don't think that the original bug is fixed.  Using the latest nightly build as of June 27th or 28th, I'm still getting a lot of dropped frames.  (Or perhaps the bug is back?)  I still have to use Safari to watch flash videos, and that's on an 8 core Mac Pro with an HD 4870.
(In reply to comment #205)
> I don't think that the original bug is fixed.  Using the latest nightly
> build as of June 27th or 28th, I'm still getting a lot of dropped frames. 
> (Or perhaps the bug is back?)  I still have to use Safari to watch flash
> videos, and that's on an 8 core Mac Pro with an HD 4870.

Have you tried FF 5.0? If you're only seeing this on the nightly then it's a regression from bug 598425 that is being fixed at the moment.
You need to log in before you can comment on or make changes to this bug.