Closed
Bug 412753
Opened 17 years ago
Closed 17 years ago
Speed up JPEG decoding by another 10%
Categories
(Core :: Graphics: ImageLib, defect, P1)
Core
Graphics: ImageLib
Tracking
()
VERIFIED
FIXED
mozilla1.9beta3
People
(Reporter: alfredkayser, Assigned: alfredkayser)
Details
(Keywords: perf)
Attachments
(2 files, 3 obsolete files)
23.34 KB,
patch
|
pavlov
:
review+
|
Details | Diff | Splinter Review |
2.71 KB,
patch
|
pavlov
:
review+
beltzner
:
approval1.9b3+
|
Details | Diff | Splinter Review |
Currently the JPEB library convert the source color space to RGB (3bytes/pixel), after which the JPEG decoder then converts this to the packed Cairo format (4bytes/pixel). Much more efficient would be to generate directly into the Cairo format.
This could be done in jpeglib, but for some systems, the 'system jpeglib' is used, and that won't work. Easier is to have the JPEG decoder after the jpeglib as initialized itself, override the color_convert function with one that can convert from the source color space directly into Cairo. As the most common (99.99% of the JPEG's out there) color space is YCbCr, we only need to provide a conversion routine for that into packed RGB.
This only requires a small change in the JPEG decoder.
Net effect is that in a simple 'jpegtest.html' test, about 10% speeddifference is noted (from avg. 66ms to 56ms).
Also, maybe a SSE2 guru can optimize this routine (based on the material done on the YCbCr->RGB and RGB to PackedRGB)?
Note, many thanks to mmoy for the static tables, which I have shamefully copied into this patch.
Attachment #297517 -
Flags: review?(pavlov)
Comment 1•17 years ago
|
||
This seems like a pretty neat solution.
tmp0 = _mm_unpacklo_epi8(row_R, row_G); /* RGRG RGRG RGRG RGRG */
tmp1 = _mm_unpacklo_epi8(row_B, zeroes); /* B0B0 B0B0 B0B0 B0B0 */
tmp2 = _mm_unpacklo_epi16(tmp0, tmp1); /* RGB0 RGB0 RGB0 RGB0 (low) */
tmp3 = _mm_unpackhi_epi16(tmp0, tmp1); /* RGB0 RGB0 RGB0 RGB0 (high) */
This code takes the output of the math processing which is three color vectors
and merges them back into four-byte pixels. The order can be changed by switching row_R with row_B and using a vector with all bits on instead of all zeroes. Then the next 23 lines of code (almost half of the SIMD routine) which convert four-byte pixels to three-byte pixels can then be tossed. I'll try to do this tonight.
Also, it would be nice to have only one copy of the static tables.
Comment 2•17 years ago
|
||
The new code would look like this. It's like playing with Legos.
tmp0 = _mm_unpacklo_epi8(ones, row_B); /* ABAB ABAB ABAB ABAB */
tmp1 = _mm_unpacklo_epi8(row_G, row_R); /* GRGR GRGR GRGR GRGR */
tmp2 = _mm_unpacklo_epi16(tmp0, tmp1); /* ABGR ABGR ABGR ABGR (low) */
tmp3 = _mm_unpackhi_epi16(tmp0, tmp1); /* ABGR ABGR ABGR ABGR (high) */
Assignee | ||
Comment 3•17 years ago
|
||
Interesting: this could really speed things up.
One set of tables would be nice, but where to put them? libjpeg could be a 'system library' so coming from somewhere without these tables.
Updated•17 years ago
|
Version: unspecified → Trunk
Comment 4•17 years ago
|
||
Moving to the blocking list - we'd love to get this in before Beta3 folks!
Flags: blocking1.9+
Priority: -- → P1
Comment 5•17 years ago
|
||
Perhaps making it global in nsJPEGDecoder? That way, the Mozilla version would have access to it.
It sounds like Mozilla's jpeg code in /jpeg isn't compiled and linked in when libjpeg is used which means that the detection code for SSE2 capability has to be moved for nsJPEGDecoder to use it. Do you think that it's reasonable to put it in nsJPEGDecoder?
Comment 6•17 years ago
|
||
I found a 17% improvement on Windows from Alfred's two patches. This bug and the input buffer bug.
Comment 7•17 years ago
|
||
Comment on attachment 297517 [details] [diff] [review]
Prototype
ok, this patch needs to not have the patch from 411379 in it. lets get 411718 in first, then 411379, then this.
Comment 8•17 years ago
|
||
(In reply to comment #7)
> (From update of attachment 297517 [details] [diff] [review])
> ok, this patch needs to not have the patch from 411379 in it. lets get 411718
> in first, then 411379, then this.
>
Alfred you want to make a new patch without the changes in 411379 since that already landed?
Assignee | ||
Comment 9•17 years ago
|
||
Note:
The 'ARGB' converter is only used in the most common case: in colorspace = JCS_YCbCr, and out colorspace = RGB, and no CMS transformations applied.
Attachment #297517 -
Attachment is obsolete: true
Attachment #299712 -
Flags: review?
Attachment #297517 -
Flags: review?(pavlov)
Assignee | ||
Updated•17 years ago
|
Attachment #299712 -
Flags: review? → review?(pavlov)
Assignee | ||
Comment 10•17 years ago
|
||
Attachment #299712 -
Attachment is obsolete: true
Attachment #299945 -
Flags: review?(pavlov)
Attachment #299712 -
Flags: review?(pavlov)
Comment 11•17 years ago
|
||
Comment on attachment 299945 [details] [diff] [review]
Real cvs diff of V2
+/**************** YCbCr -> Ciaro's RGB24/ARGB32 conversion: most common case **************/
should be just cairo and not Ciaro ;)
Assignee | ||
Comment 12•17 years ago
|
||
Attachment #299945 -
Attachment is obsolete: true
Attachment #300081 -
Flags: review?(pavlov)
Attachment #299945 -
Flags: review?(pavlov)
Updated•17 years ago
|
Attachment #300081 -
Flags: review?(pavlov) → review+
Updated•17 years ago
|
Keywords: checkin-needed
Comment 13•17 years ago
|
||
Checking in modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp;
/cvsroot/mozilla/modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp,v <-- nsJPEGDecoder.cpp
new revision: 1.88; previous revision: 1.87
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9beta3
Comment 14•17 years ago
|
||
This broke --with-system-jpeg for me.
So if --with-system-jpeg is not supported anymore it needs to be removed from configure or if it is with newer jpeg versions the configure check needs to be updated IMHO.
Comment 15•17 years ago
|
||
we need to support building with the system jpeg library. we should probably not do this if we're not building with in-tree.
Assignee | ||
Comment 16•17 years ago
|
||
Can someone tell me how this breaks --with-system-jpeg?
Could it be that
#include "jpegint.h"
doesn't work, as jpegint.h for system-jpeg builds is not available?
The basic idea was that this should also work with system-jpeg enabled.
Comment 17•17 years ago
|
||
Alfred, yes, missing jpegint.h is the build error.
Assignee | ||
Comment 18•17 years ago
|
||
Just copy the definitions we need to override the deconverter.
Note this is a safe thing to do, as libjpeg hasn't changed for years now.
Can this be tested if this helps to build with system jpeg?
Comment 19•17 years ago
|
||
So none of the code from the jpeg library is available with --with-system-jpeg?
Assignee | ||
Comment 20•17 years ago
|
||
with --with-system-jpeg, the library is dynamically linked in from a system location, so the compiled code is available in that sense.
But any extensions and/or changes that are made to the code in the mozilla/jpeg directory are not available for those that build --with-system-jpeg.
Comment 21•17 years ago
|
||
(In reply to comment #18)
> Created an attachment (id=300344) [details]
> Can this be tested if this helps to build with system jpeg?
My clobber build just got past this, --with-system-jpeg
Assignee | ||
Comment 22•17 years ago
|
||
Comment on attachment 300344 [details] [diff] [review]
Replace include "jpegint.h" with some locally copied definitions...
Stuart, can you review this followup patch to get the system-jpeg builds compiled? Or it is better to create a new bug for this?
Attachment #300344 -
Flags: review?(pavlov)
Updated•17 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•17 years ago
|
Attachment #300344 -
Flags: review?(pavlov) → review+
Updated•17 years ago
|
Attachment #300344 -
Flags: approval1.9b3?
Comment 23•17 years ago
|
||
Comment on attachment 300344 [details] [diff] [review]
Replace include "jpegint.h" with some locally copied definitions...
a=beltzner for beta3, safe fix that moves a few defines
Attachment #300344 -
Flags: approval1.9b3? → approval1.9b3+
Comment 24•17 years ago
|
||
Checking in modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp;
/cvsroot/mozilla/modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp,v <-- nsJPEGDecoder.cpp
new revision: 1.90; previous revision: 1.89
done
Status: REOPENED → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•16 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•