Created attachment 334945 [details] [diff] [review] go faster Make pixman faster on ARM. Also adds faster rectilinear nearest-neighbour image scaling.
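For readers unfamiliar with the term, rectilinear (axis-aligned) nearest-neighbour scaling can be sketched roughly as below. This is only an illustration, not the code from the attachment: the function name `scale_nearest_8888`, the 16.16 fixed-point stepping, and the pixel-unit strides are all assumptions made for the example. It also assumes positive destination dimensions and source dimensions below 65536.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the patch): scale a 32-bit-per-pixel
 * image using axis-aligned nearest-neighbour sampling.
 * Strides are given in pixels, not bytes.
 * Assumes dst_w > 0, dst_h > 0 and src_w, src_h < 65536. */
static void scale_nearest_8888(const uint32_t *src, int src_w, int src_h,
                               int src_stride,
                               uint32_t *dst, int dst_w, int dst_h,
                               int dst_stride)
{
    /* 16.16 fixed-point distance in source pixels per destination pixel. */
    uint32_t step_x = (uint32_t)(((uint64_t)src_w << 16) / (uint32_t)dst_w);
    uint32_t step_y = (uint32_t)(((uint64_t)src_h << 16) / (uint32_t)dst_h);

    /* Start half a step in so that we sample at destination pixel centres. */
    uint32_t fy = step_y / 2;

    for (int y = 0; y < dst_h; y++, fy += step_y) {
        int sy = (int)(fy >> 16);
        if (sy >= src_h)
            sy = src_h - 1;

        const uint32_t *src_row = src + (size_t)sy * (size_t)src_stride;
        uint32_t *dst_row = dst + (size_t)y * (size_t)dst_stride;

        uint32_t fx = step_x / 2;
        for (int x = 0; x < dst_w; x++, fx += step_x) {
            int sx = (int)(fx >> 16);
            if (sx >= src_w)
                sx = src_w - 1;
            dst_row[x] = src_row[sx];   /* copy the nearest source pixel */
        }
    }
}
```

Because there is no rotation or shear, the source row is chosen once per destination row and the inner loop is a single fixed-point increment plus a load and store, which is why this case lends itself to a dedicated fast path.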
Created attachment 334966 [details] [diff] [review] go faster w/ better configury Detect whether the compiler supports ARM SIMD instructions.
Attachment #334945 - Attachment is obsolete: true
Assignee: jmuizelaar → nobody
Status: ASSIGNED → NEW
Component: General → GFX: Thebes
Keywords: mobile, perf
Product: Fennec → Core
QA Contact: general → thebes
+ * Copyright © 208 Mozilla Corporation
That was a long time ago.
Flags: wanted1.9.1? → wanted1.9.1+
Priority: -- → P2
Created attachment 336679 [details] [diff] [review] arm pixman patch update Update pixman patch to match what was submitted for upstream inclusion
Attachment #334966 - Attachment is obsolete: true
Created attachment 336912 [details] [diff] [review] arm pixman v2 Fix the configury by adding HAVE_ARM_SIMD to autoconf.mk.in
Attachment #336679 - Attachment is obsolete: true
Created attachment 336920 [details] [diff] [review] arm pixman v3 Use one fewer register in fbCompositeSolidMask_nx8x8888arm, which allows compiling with worse compilers.
Attachment #336912 - Attachment is obsolete: true
Created attachment 336971 [details] [diff] [review] arm pixman v4 The last patch was broken and wrong; this patch fixes it.
Attachment #336920 - Attachment is obsolete: true
Created attachment 337251 [details] [diff] [review] additional patch Additional patch -- not ARM-specific, but related to our ARM work. Just sticking it here so that it's in Bugzilla; both of these should hopefully end up in pixman upstream shortly. I will probably commit them both to our own repo before that happens, though, since I think we're waiting on 0.12.0 to be released before putting in the ARM code.
Hi,

A colleague of mine had a brief look at Jeff's patch and has a few comments:

1. In the first while() loop of fbCompositeSrcAdd_8000x8000arm you can use UQADD8, it doesn't matter that you just want the bottom lane. Similarly in the third loop.
2. There are many AND operations with %[component_mask] to extract bytes. These can be replaced by UXTB16, saving the mask register.
3. There are mask operations with %[component_mask] followed by accumulates. These can be replaced by the combined mask and accumulate UXTAB16.

Hope that helps.
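To make the three suggestions above easier to follow, here is a rough portable C model of what the ARMv6 instructions compute. These helpers are illustrative only: the names `uqadd8`, `uxtb16`, and `uxtab16` are chosen here to mirror the mnemonics, they model the instructions without the optional rotation, and they are not the inline assembly from the patch.

```c
#include <stdint.h>

/* Model of UQADD8: four independent unsigned byte additions, each
 * saturating at 0xff instead of wrapping. */
static uint32_t uqadd8(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t sum = ((a >> (8 * lane)) & 0xff) + ((b >> (8 * lane)) & 0xff);
        if (sum > 0xff)
            sum = 0xff;
        result |= sum << (8 * lane);
    }
    return result;
}

/* Model of UXTB16 (no rotation): take bytes 0 and 2 and zero-extend each
 * into its own 16-bit halfword. This is equivalent to ANDing with
 * 0x00ff00ff, so no register is needed to hold the mask. */
static uint32_t uxtb16(uint32_t x)
{
    return x & 0x00ff00ffu;
}

/* Model of UXTAB16 (no rotation): extract bytes 0 and 2 of x as in UXTB16,
 * then add each one to the matching halfword of acc. Each halfword wraps
 * modulo 2^16; there is no carry between the halves. */
static uint32_t uxtab16(uint32_t acc, uint32_t x)
{
    uint32_t lo = ((acc & 0xffff) + (x & 0xff)) & 0xffff;
    uint32_t hi = (((acc >> 16) & 0xffff) + ((x >> 16) & 0xff)) & 0xffff;
    return (hi << 16) | lo;
}
```

For point 1, a single-byte saturating add can simply use the bottom lane of the result, e.g. `(uint8_t)uqadd8(d, s)`; the upper three lanes are computed but ignored.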
Created attachment 338653 [details] [diff] [review] arm pixman v5 This version addresses Guillaume's comments (thanks!) and squeezes out some more performance; e.g. over8888x8888 is about 10-11% faster.
Attachment #336971 - Attachment is obsolete: true
Pulled this in along with a few more pixman updates to 0.12.0.
19351 5d807b616378 2008-09-17 14:15 -0700 vladimir b=451621; push new pixman with arm fast-paths; r=me
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED