Closed Bug 581797 Opened 14 years ago Closed 14 years ago

scaled HTML5 video in firefox really slow with radeon

Categories

(Core :: Audio/Video, defect)

1.9.2 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
blocking2.0 --- final+

People

(Reporter: mcepl, Assigned: karlt)

References

Details

(Keywords: regression)

Attachments

(1 file)

Description of problem:
HTML5 video is unplayable since I upgraded from F11 to F13.

Version-Release number of selected component (if applicable):
firefox-3.6.4-1.fc13.x86_64 (mine is firefox-3.6.8-1.el6.x86_64 and I could reproduce with the upstream binary of Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:2.0b1) Gecko/20100630 Firefox/4.0b1)
xorg-x11-drv-ati-6.13.0-1.fc13.x86_64
xorg-x11-server-Xorg-1.8.2-1.fc13.x86_64

How reproducible:
Always


Steps to Reproduce:
1. Open http://www.cendio.com/voo/
2. Download the file http://www.cendio.com/voo/stream.ogv to the local drive and play it with Firefox, mplayer, and totem (using GStreamer) 

Actual results:
Watch the machine come to a crawl with Xorg eating 100% CPU.
(I haven't observed 100% CPU on Xorg, but Firefox was at 150% CPU when playing a local file; it's a dual-core machine, mind you.)
Playback is skippy and non-smooth both from the website and locally.
mplayer and totem play the same file without a hitch both from URL and locally, even in full-screen.

Expected results:
Video plays smoothly.

Additional info:
Reporter's spec is a dual-core 2.8 GHz Core 2 machine with a Radeon HD 4550 and Fedora 13.

Mine is Lenovo Thinkpad T400, dual core 2.4 GHz with Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller, and RHEL-6.
------------------
From the further discussion on the downstream bug:
(from reporter)
Note, I have intel driver here.    
I tested other HTML5 videos, and they worked as horribly as that one. And as I
wrote in the original report, the exact same videos work fine when the graphics
hardware isn't radeon.
What if you go fullscreen?

If it works fine on F11, sounds like we're tickling some driver bug.

Can you get a profile for us? At least break it down by "top".
(In reply to comment #1)
> What if you go fullscreen?
> 
> If it works fine on F11, sounds like we're tickling some driver bug.
> 
> Can you get a profile for us? At least break it down by "top".

I don't dismiss the idea of an X11 bug at all (that is my other area of bug triage at Red Hat), but this has been reproduced on both radeon and intel (though the downstream reporter claims it is not reproducible on nouveau?).

Also when I try here on intel with full screen video and F11, the playback is as choppy as when run in the browser (both when playing local file and from the Internet).

Concerning CPU consumption ... it is somewhat weird. I cannot reproduce high CPU consumption reliably -- the choppiness of playback is reliable though.
Another possibility is the sound subsystem. If you disable ALSA completely does the problem get better?
Matej, can you try a newer Fedora too? It works fine for me on almost the same HW (Lenovo T60/Intel 945) & Fedora 14.
Hi,

We have the same issue with a xulrunner app (i.e. local video files). The key here is the resizing of the video. In the link provided above, the video is being scaled up to 600px in the video tag. The video itself is actually a little smaller. If you remove the width="600px" using Firebug, the performance improves dramatically.

This only happens with a radeon card, and only happens with KMS enabled.

It's reproducible even on videos without sound.

Top breakdown is around

95% Xorg + 5% firefox

Normal breakdown (with the width="600px" removed) is:

10% Xorg + 20% firefox.

ATI Technologies Inc RV710 [Radeon HD 4350]

xorg-x11-drv-ati-6.13.0-1.fc13.i686
kernel-2.6.33.6-147.2.4.fc13.i686
firefox-3.6.7-1.fc13.i686
Sounds like XRender scaling is incredibly slow with that driver. Bug 577843 will almost certainly help, since video scaling will be performed client-side concurrently with YUV conversion. Of course, slow XRender scaling would still bite us in other ways.
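To illustrate what "scaling performed client-side concurrently with YUV conversion" could look like, here is a minimal sketch in Python. It is not the code from bug 577843. It assumes unsubsampled (4:4:4) planes, BT.601 limited-range coefficients, and nearest-neighbour sampling, and the names yuv_to_rgb and scale_and_convert are hypothetical.

```python
def yuv_to_rgb(y, u, v):
    """Convert one YUV sample to (r, g, b), assuming BT.601 limited range."""
    c, d, e = y - 16, u - 128, v - 128

    def clamp(value):
        return max(0, min(255, value))

    r = clamp((298 * c + 409 * e + 128) >> 8)
    g = clamp((298 * c - 100 * d - 208 * e + 128) >> 8)
    b = clamp((298 * c + 516 * d + 128) >> 8)
    return r, g, b


def scale_and_convert(y_plane, u_plane, v_plane, src_w, src_h, dst_w, dst_h):
    """Scale and convert in a single pass over the destination pixels.

    Each destination pixel picks its nearest source sample and converts it
    immediately, so no unscaled RGB copy is ever built and nothing is left
    for the X server to scale.
    """
    out = []
    for dy in range(dst_h):
        sy = dy * src_h // dst_h          # nearest source row
        row = []
        for dx in range(dst_w):
            sx = dx * src_w // dst_w      # nearest source column
            i = sy * src_w + sx
            row.append(yuv_to_rgb(y_plane[i], u_plane[i], v_plane[i]))
        out.append(row)
    return out
```

The cost moves from the X server to the client, so the scaling no longer depends on whether the driver accelerates it.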
Tom, please make sure to file a bug with your distribution about your problem, which will hopefully make it upstream...  Fundamentally, your video driver is just broken...
(In reply to comment #7)
> Tom, please make sure to file a bug with your distribution about your problem,
> which will hopefully make it upstream...  Fundamentally, your video driver is
> just broken...

https://bugzilla.redhat.com/show_bug.cgi?id=612492 has been reopened and our developers will take a look.
The key to the problem is that software fallback happens because implementing extend-none (extending the source with alpha=0) on surfaces
without alpha channels is non-trivial in hardware.
https://bugs.freedesktop.org/show_bug.cgi?id=28670#c13
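As a rough illustration of why extend-none is awkward on a surface without an alpha channel, here is a small sketch that models the two extend modes on a single row of pixels. It is plain Python, not driver or cairo code. The (a, r, g, b) tuple layout and the names fetch and sample_linear are assumptions made for the example.

```python
import math


def fetch(row, x, extend):
    """Fetch pixel x from a row of (a, r, g, b) tuples under an extend mode."""
    if 0 <= x < len(row):
        return row[x]
    if extend == "pad":
        # Clamp to the nearest edge pixel; alpha stays whatever the edge has.
        return row[max(0, min(len(row) - 1, x))]
    if extend == "none":
        # Outside the source everything is transparent black, so the sampler
        # must produce alpha=0 even if the surface stores no alpha at all.
        return (0, 0, 0, 0)
    raise ValueError("unknown extend mode: %r" % (extend,))


def sample_linear(row, pos, extend):
    """Linearly interpolate between the two pixels surrounding pos."""
    x0 = math.floor(pos)
    t = pos - x0
    p0 = fetch(row, x0, extend)
    p1 = fetch(row, x0 + 1, extend)
    return tuple((1 - t) * a + t * b for a, b in zip(p0, p1))
```

With extend-none, a filtered sample at the edge of an opaque surface ends up partially transparent, so the hardware needs an alpha value that an xRGB surface does not store. With extend-pad the edge sample stays fully opaque.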
(In reply to comment #6)
> Bug 577843
> will almost certainly help, since video scaling will be performed client-side
> concurrently with YUV conversion.

That doesn't sound as good as hardware accelerated scaling.
Builds before bug 487305 landed had hardware accelerated scaling because ARGB surfaces can easily be extended with an alpha=0 border.
Blocks: 487305
blocking2.0: --- → ?
Keywords: regression
There is also a regression from user mode setting to kernel mode setting.
The software fallback is not half as noticeable with ums.  With kms I'm seeing
2 seconds per frame and x server barely responsive.

Software fallback happens both with ums and kms, but it seems that the
difference between ums and kms is that with ums the pixmap would be migrated
to system ram for software fallback
https://bugs.freedesktop.org/show_bug.cgi?id=27139#c14

but with kms the video ram is mmapped.
http://cgit.freedesktop.org/xorg/driver/xf86-video-ati/tree/src/radeon_exa.c?id=b5bfdbd70d9671250957ccd41dfc8818850d257e#n305
http://cgit.freedesktop.org/mesa/drm/tree/radeon/radeon_bo_gem.c?id=966c9907c040b4fe4b288b4a9d82598797aee743#n181

Most of the time is spent in pixman's bits_image_fetch_bilinear_no_repeat_8888
https://bugs.freedesktop.org/attachment.cgi?id=35403

I could imagine fetch performance being devastating if pixman_image_composite
is alternating between reading and writing to mmapped vram (though I haven't
confirmed that is happening).
I guess the reason for using extend-none here was fear that cairo would fall
back to image compositing, which might require readback.

Extend-none might avoid cairo image-surface fallback but software fallback is
probably still going to happen on the server.

The source is typically an image surface here so that doesn't need to be read
back.
And with RGB24 surfaces and extend pad, if we made the clips integer, then
cairo shouldn't need to read back the destination for fallback compositing.
(If cairo doesn't know this, then I guess we could even use op-source to help
it out.)
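A minimal sketch of what "making the clips integer" could mean, in Python for brevity. It snaps each edge of a device-space clip rectangle to the nearest pixel boundary. The rounding rule and the name snap_clip are assumptions for illustration, not what the patch does.

```python
import math


def snap_clip(x, y, width, height):
    """Snap a device-space clip rect to integer pixel edges.

    Each edge is rounded to the nearest pixel boundary (halves round up),
    and the size is recomputed from the snapped edges so the rectangle
    stays consistent.
    """
    x0 = math.floor(x + 0.5)
    y0 = math.floor(y + 0.5)
    x1 = math.floor(x + width + 0.5)
    y1 = math.floor(y + height + 0.5)
    return x0, y0, x1 - x0, y1 - y0
```

For example, snap_clip(10.25, 20.75, 599.5, 337.5) gives (10, 21, 600, 337). Snapping the edges rather than the origin and size separately avoids the rectangle growing or shrinking by a pixel depending on where it sits.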
Using EXTEND_PAD seems to be what we want here.  I don't know whether fixing
the clips to be integer is still necessary to avoid cairo-fallback on old
servers.  What do you think about trying just this, and we see if the clipping
causes problems?

(Operator-source could also be a workaround when rendering to RGB24 surfaces,
but is incorrect.  ARGB surfaces are also a possible workaround but extend pad
seems to be what we want.)

Also, by not applying the clip with EXTEND_NONE and operator-over, it actually
gives the correct (in the sense that it is the same as extend-pad with the clip) output on Mac when there are no other clips interfering,
assuming the source surface has the size indicated.
Assignee: nobody → karlt
Attachment #465558 - Flags: review?(roc)
So is there a way we can detect in our code these sorts of "likely to be slow" (EXTEND_NONE on RGB24 surfaces, whatever) operations?  Are there legitimate reasons we'd want to perform them?
Comment on attachment 465558 [details] [diff] [review]
use EXTEND_PAD on X11 to enable hardware video scaling

I think we also need to make sure that coordinates are snapped in the presence of scaling.
Attachment #465558 - Flags: review?(roc) → review+
(In reply to comment #15)
> So is there a way we can detect in our code these sorts of "likely to be slow"
> (EXTEND_NONE on RGB24 surfaces, whatever) operations?  Are there legitimate
> reasons we'd want to perform them?

The problem is that "likely to be slow" depends on drivers. On these radeon drivers, EXTEND_NONE is slow. On other drivers, using EXTEND_PAD is slow (because cairo takes a fallback path under some conditions, to work around driver bugs).
So are we fairly sure that this patch is not going to regress performance in other cases?  :(
I can pixel align the clips and check that buggy_pad_reflect does not cause cairo to fetch the destination pixels from the server.

Beyond that, the info I have is that extend-pad is more likely to be (correctly) supported by hardware than extend-none.  If extend-none is already done in software, then I wouldn't expect extend-pad in software to be much different.

I don't know what other bugs are out there, but extend-pad is the behavior that we want - extend-none was to work around fallback in cairo.

As one other data point, with nvidia drivers, kinetik observed a very small improvement with extend-pad over extend-none, though cpu usage seemed to indicate this was still not hardware accelerated.
Summary: HTML5 video in firefox really slow with radeon → scaled HTML5 video in firefox really slow with radeon
If any drivers are accelerating scaling with extend-none, then perhaps we should only pass extend-pad to cairo for X server versions >=1.7, because cairo will only pass the scaling with extend-pad onto the server with those servers.
http://cgit.freedesktop.org/cairo/commit/?id=a1d0a06b6275cac3974be84919993e187394fe43
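The version check suggested here might look roughly like the following sketch, written in Python rather than in Gecko's C++. The 1.7 threshold comes from the comment above. The function names, and the assumption that the server version is available as a dotted string, are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as "1.8.2" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def choose_extend(server_version, min_pad_version=(1, 7)):
    """Pick the extend mode to hand to cairo for a given X server version.

    Servers at or above min_pad_version get "pad"; older servers keep
    "none", on the assumption that cairo would not pass scaling with
    extend-pad through to them.
    """
    if tuple(server_version) >= tuple(min_pad_version):
        return "pad"
    return "none"
```

With the reporter's xorg-x11-server-Xorg-1.8.2, choose_extend(parse_version("1.8.2")) returns "pad", while a 1.6.x server would stay on "none".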
nouveau has similar software fallback with xRGB, over, and RepeatNone.
I haven't the hardware to see whether the result is quite as catastrophic.
What's happening with this bug? It's now a blocking bug... Karl, is the patch ready to land?
No.  The patch should not land as is, as it will cause performance regressions for people with older servers.

IIUC, bug 577843 turned off gpu accelerated scaling, so that may have changed the importance of this bug.
Attachment #465558 - Flags: review-
(In reply to comment #25)
> IIUC, bug 577843 turned off gpu accelerated scaling, so that may have changed
> the importance of this bug.

No, that bug only changed things for people who don't have GPU-accelerated scaling.
(In reply to comment #26)
> (In reply to comment #25)
> > IIUC, bug 577843 turned off gpu accelerated scaling, so that may have changed
> > the importance of this bug.
> 
> No, that bug only changed things for people who don't have GPU-accelerated
> scaling.

Er, sorry. You're right, it did stop us from using XRender scaling. So this bug is probably fixed as originally filed.
I also fixed the originally reported bug for xf86-video-ati-6.13.2, so let's resolve this bug.  Other issues, such as using EXTEND_PAD where appropriate and restoring gpu acceleration, are better dealt with in different bugs.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WORKSFORME