Closed Bug 1254874 Opened 9 years ago Closed 2 years ago

Investigate hardware video decoding with the basic compositor

Categories

(Core :: Graphics, defect, P3)

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jrmuizel, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [gfx-noted])

Does this work? Should this work?
Whiteboard: [gfx-noted]
Adding cpearce since I believe he tested this during the initial implementation. Readback made it slower than software iirc, but it's possible that there are ways to do better.
(In reply to Matt Woodrow (:mattwoodrow) from comment #1)
> Adding cpearce since I believe he tested this during the initial
> implementation.
>
> Readback made it slower than software iirc, but it's possible that there are
> ways to do better.

Yes, I tested this using the naive approach (just locking and memcpying IIRC, it was a few years ago...). Intel released a paper on how to do this faster:
https://software.intel.com/en-us/articles/copying-accelerated-video-decode-frame-buffers

jya also implemented this in a personal project IIRC, so he may have input here.
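For reference, the "lock and memcpy" readback mentioned above usually amounts to a pitched plane copy. The following is only a minimal sketch: `CopyPlane` and its parameters are hypothetical names, and the source pointer stands in for whatever the surface lock call returns, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies one image plane out of a locked surface.
// `src` rows are `srcPitch` bytes apart, `dst` rows are `dstPitch` bytes
// apart, and only the first `widthBytes` of each row carry pixel data.
void CopyPlane(std::uint8_t* dst, std::size_t dstPitch,
               const std::uint8_t* src, std::size_t srcPitch,
               std::size_t widthBytes, std::size_t height) {
  assert(widthBytes <= srcPitch && widthBytes <= dstPitch);
  if (srcPitch == widthBytes && dstPitch == widthBytes) {
    // Rows are contiguous on both sides: one large copy is enough.
    std::memcpy(dst, src, widthBytes * height);
    return;
  }
  // Otherwise copy row by row and skip the padding at the end of each row.
  for (std::size_t y = 0; y < height; ++y) {
    std::memcpy(dst + y * dstPitch, src + y * srcPitch, widthBytes);
  }
}
```

When the source is write-combined video memory, every `memcpy` read here is an uncached load, which would explain why this approach ended up slower than software decoding.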
Flags: needinfo?(jyavenard)
Yes, the USWC copy using Intel's method is significantly faster: over 10x faster in my experience. This is what VLC uses, by the way: it decodes in hardware and then performs a readback into a YV12 buffer. Performance appears okay there. I don't know how well that will work with discrete graphics cards, though; it is likely still an improvement. I'd be happy to work on that; I've written all the bricks already.
Flags: needinfo?(jyavenard)
The comparison is between memcpy (which is already SSE2-accelerated) and MOVDQA.
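A rough sketch of the general shape of a USWC-aware copy, as I understand the approach: pull the data out of write-combined memory with SSE4.1 streaming loads into a small cached bounce buffer, then copy from that buffer to the destination. This is not the code from the Intel paper, VLC, or MythTV. `CopyFromUswc` and `kChunkBytes` are hypothetical names, the 4 KiB chunk size is an assumption, and the function falls back to plain `memcpy` when SSE4.1 is not enabled at compile time or the source is not 16-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE4_1__)
#include <smmintrin.h>  // _mm_stream_load_si128 (SSE4.1 streaming load)
#endif

// Assumed bounce-buffer size; small enough to stay resident in L1 cache.
constexpr std::size_t kChunkBytes = 4096;

// Hypothetical helper: copies `size` bytes from (possibly write-combined)
// memory at `src` into ordinary cached memory at `dst`.
void CopyFromUswc(std::uint8_t* dst, const std::uint8_t* src,
                  std::size_t size) {
  assert(dst != nullptr && src != nullptr);
#if defined(__SSE4_1__)
  // Streaming loads require a 16-byte aligned source address.
  if (reinterpret_cast<std::uintptr_t>(src) % 16 == 0) {
    alignas(64) std::uint8_t bounce[kChunkBytes];
    while (size >= 16) {
      // Largest multiple of 16 bytes that fits in the bounce buffer.
      const std::size_t chunk =
          (size < kChunkBytes ? size : kChunkBytes) & ~std::size_t(15);
      // Stage 1: streaming loads from the source into the cached buffer.
      for (std::size_t i = 0; i < chunk; i += 16) {
        const __m128i v = _mm_stream_load_si128(
            reinterpret_cast<__m128i*>(const_cast<std::uint8_t*>(src + i)));
        _mm_store_si128(reinterpret_cast<__m128i*>(bounce + i), v);
      }
      // Stage 2: ordinary cached copy from the bounce buffer to `dst`.
      std::memcpy(dst, bounce, chunk);
      src += chunk;
      dst += chunk;
      size -= chunk;
    }
  }
#endif
  // Remaining tail bytes, or the whole buffer on the fallback path.
  std::memcpy(dst, src, size);
}
```

On ordinary cached memory the streaming load behaves like a normal load, so the sketch stays correct there, but the speedup reported above would only be expected when the source really is USWC memory.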
:jya, is there any progress?
Flags: needinfo?(jyavenard)
I haven't done anything; I wasn't asked to. Sorry :(
Flags: needinfo?(jyavenard)
Ok, no problem.
Can you point us to the "bricks" from comment 3? :)
Flags: needinfo?(jyavenard)
Here is the code I wrote for another project: https://github.com/MythTV/mythtv/blob/master/mythtv/libs/libmythtv/mythframe.cpp The Intel white paper also has a code sample that can be used. There are two copy algorithms in there: one is a plain SSE copy, the other is optimised for USWC memory. The first time the method of the class is used, it actually times the copy. With the Intel decoder all returned frames are USWC-based, but that's not always the case under all circumstances.
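The "time the copy on first use" idea can be sketched as below. This is not the MythTV implementation. `FrameCopier`, `CopyFn`, `PlainCopy` and `LoopCopy` are hypothetical names, and the two routines are simple stand-ins for the plain SSE copy and the USWC-optimised copy.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>

using CopyFn = void (*)(std::uint8_t*, const std::uint8_t*, std::size_t);

// Stand-in for the plain copy routine.
void PlainCopy(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
  std::memcpy(dst, src, n);
}

// Stand-in for the alternative (e.g. USWC-optimised) copy routine.
void LoopCopy(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
}

class FrameCopier {
 public:
  FrameCopier(CopyFn a, CopyFn b) : mA(a), mB(b) { assert(a && b); }

  void Copy(std::uint8_t* dst, const std::uint8_t* src, std::size_t size) {
    if (!mChosen) {
      // First call: run both candidates on the real frame and keep the
      // faster one. Either run leaves a complete copy in `dst`.
      const auto timeA = TimeOnce(mA, dst, src, size);
      const auto timeB = TimeOnce(mB, dst, src, size);
      mChosen = (timeB < timeA) ? mB : mA;
      return;
    }
    mChosen(dst, src, size);
  }

  // Returns the selected routine, or nullptr before the first Copy().
  CopyFn Chosen() const { return mChosen; }

 private:
  static std::chrono::steady_clock::duration TimeOnce(
      CopyFn fn, std::uint8_t* dst, const std::uint8_t* src,
      std::size_t size) {
    const auto start = std::chrono::steady_clock::now();
    fn(dst, src, size);
    return std::chrono::steady_clock::now() - start;
  }

  CopyFn mA;
  CopyFn mB;
  CopyFn mChosen = nullptr;
};
```

A single measurement is noisy, and since not every frame is USWC-backed, a one-time choice may not suit every later frame. A sturdier variant might repeat the measurement or re-time when the frame source changes.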
Flags: needinfo?(jyavenard)
See Also: → 1275441
Severity: normal → S3

Basic Compositing was removed in bug 1727876.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX