628324 - Implement bilinear scaling with NEON

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Description

14 years ago

I haven't profiled the difference myself, but Siarhei says in bug 598736 comment 5:

(In reply to comment #4)
> Does GOOD filter already works anywhere fast enough?

Currently it's approximately 10-30 times slower than NEAREST scaling for pixman-0.19.4 on ARM Cortex-A8:

[snip]

Even with full NEON optimizations added, I expect that BILINEAR is still going to be about 2-4x slower than NEAREST. But indeed, NEON optimizations for bilinear scaling would be very nice to have.

2-4x slower may or may not be fast enough for pinch-zoom, but I think we'd use that for content rendering regardless.

Siarhei, do you happen to have patches for this floating around? If not, I can look at this this week.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 1

14 years ago

Blocks a blocker.

tracking-fennec: --- → ?

Mark Finkle (:mfinkle) (use needinfo?)

Updated

14 years ago

tracking-fennec: ? → 2.0+

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Reporter

Comment 2

14 years ago

Although this will be somewhat of a RISC-y project, if we put our Cortexes together, we should be able to StrongARM it through.

Joe Drew (not getting mail)

Comment 3

14 years ago

(In reply to comment #2)
> Although this will be somewhat of a RISC-y project, if we put our Cortexes
> together, we should be able to StrongARM it through.

r-.

Siarhei Siamashka

Assignee

Comment 4

14 years ago

(In reply to comment #0)
> Siarhei, do you happen to have patches for this floating around?

Not yet, but I will provide more details a little bit later.

> If not, I can look at this this week.

That's the spirit! Thanks.

Mark Finkle (:mfinkle) (use needinfo?)

Comment 5

14 years ago

Can someone take ownership of this bug?

Brad Lassey [:blassey] (use needinfo?)

Updated

14 years ago

Assignee: nobody → jmuizelaar

Jeff Muizelaar [:jrmuizel]

Updated

14 years ago

Assignee: jmuizelaar → siarhei.siamashka

Stuart Parmenter

Comment 6

14 years ago

Would love to get this in if it is fast enough to use, but not blocking on it at this point

tracking-fennec: 2.0+ → 2.0next+

Siarhei Siamashka

Assignee

Comment 7

14 years ago

NEON optimizations for bilinear scaling are coming through upstream pixman, so eventually they should also reach Mozilla.

Patrick Walton (:pcwalton)

Comment 8

14 years ago

(In reply to comment #7)
> NEON optimizations for bilinear scaling are coming through upstream pixman, so
> eventually they should also reach Mozilla.

Cool! Do you know the commits offhand?

Siarhei Siamashka

Assignee

Comment 9

14 years ago

(In reply to comment #8)
> Cool! Do you know the commits offhand?

There are no commits yet. I'm working on a proper patchset right now and expect to finish it in a few days (so that it's fast enough and passes all the tests). At least I think getting maximum performance for the SRC operator and PAD repeat should be the bare minimum.

There is also some interest in fast bilinear scaling from the WebKit side on the cairo mailing list: http://lists.cairographics.org/archives/cairo/2011-February/021645.html

Siarhei Siamashka

Assignee

Comment 10

14 years ago

Sent pixman patches with bilinear scaling optimizations here: http://lists.freedesktop.org/archives/pixman/2011-February/001053.html

Even though NONE repeat is a major PITA to implement, it is also partially supported after all. An additional patch for scaling r5g6b5 images with the help of ARM NEON will be available in a few days. Maybe some other variants of scaling operations can be optimized too.

Thanks a lot for making the decision that SIMD optimizations for bilinear scaling could actually have some use in Mozilla. This allowed me to get some time allocated for working on this task. These patches should actually have been ready by the beginning of the previous week, but I just dropped out and could not do much productive work lately due to certain circumstances.
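For readers unfamiliar with the repeat modes mentioned above, here is a minimal C sketch of how PAD and NONE differ when a bilinear filter reads a neighbouring pixel that lies outside the source row. This is not pixman's code: `fetch_pad` and `fetch_none` are hypothetical helper names, and the 1-D row layout is an assumption made for illustration. One plausible reason NONE is harder to optimize is that the edge pixels blend with transparent black, so they cannot reuse the same inner loop as the interior.

```c
#include <stdint.h>

/* Hypothetical helper: PAD repeat clamps the coordinate into the row,
 * so out-of-range reads return the nearest edge pixel. */
uint32_t fetch_pad(const uint32_t *row, int width, int x)
{
    if (x < 0)
        x = 0;
    if (x >= width)
        x = width - 1;
    return row[x];
}

/* Hypothetical helper: NONE repeat treats everything outside the row
 * as transparent black (all channels zero). */
uint32_t fetch_none(const uint32_t *row, int width, int x)
{
    if (x < 0 || x >= width)
        return 0;
    return row[x];
}
```

With PAD, the two neighbours fetched at the last column are the same edge pixel, so the blend degenerates to a plain copy. With NONE, the second neighbour is zero, so the edge fades out and needs separate handling.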

Siarhei Siamashka

Assignee

Comment 11

14 years ago

With the following patchset ready, everything that was originally planned is now implemented: http://lists.freedesktop.org/archives/pixman/2011-March/001119.html

The current performance numbers on 1GHz ARM Cortex-A8 are more like:

nearest scaling a8r8g8b8: 163.12 MPix/s
nearest scaling r5g6b5: 267.50 MPix/s
bilinear scaling a8r8g8b8: 74.36 MPix/s
bilinear scaling r5g6b5: 41.35 MPix/s

Nearest scaling was also optimized recently. So in the end, bilinear scaling is roughly 2x slower than nearest for the 32bpp format and more than 6x slower than nearest for 16bpp. Some additional optimizations for bilinear scaling are still possible though (in the ballpark of a few tens of percent).

Compared to the old C code in pixman, NEON bilinear scaling got approximately 10x faster on ARM. There is also an SSE2 bilinear scaling optimization (mostly a proof of concept), but it's not really highly optimized and only provides a ~2x speedup over the C implementation. If anybody wants to invest some effort in SSE2/SSSE3 bilinear scaling optimizations for pixman, there is some really good potential there.

Hopefully all these optimizations will be included in the pixman 0.21.8 release. There is still some more bilinear work to do in pixman, mostly to get NONE repeat fully optimized (EXTEND_NONE in cairo terms). But I really hope that firefox/fennec can switch to using EXTEND_PAD instead whenever it is possible (bug 600390 and bug 630114). Also, more bilinear fast paths can be added on a case-by-case basis (most likely those using the OVER operator). But as I said, this particular bug is basically done.
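As background for the numbers above, here is a minimal scalar sketch in C of bilinear filtering for a single a8r8g8b8 pixel. It is not pixman's implementation: the function names and the 8-bit weight precision are assumptions chosen for illustration. It shows where the extra cost over nearest scaling comes from: each output pixel reads four source pixels and blends all four channels, while nearest scaling copies one source pixel.

```c
#include <stdint.h>

/* Blend one 8-bit channel from its four neighbours.
 * wx and wy are weights in [0, 256): the distance of the sample point
 * from the left column and from the top row (assumed precision). */
static uint32_t lerp_channel(uint32_t tl, uint32_t tr,
                             uint32_t bl, uint32_t br,
                             uint32_t wx, uint32_t wy)
{
    uint32_t top = tl * (256 - wx) + tr * wx;   /* scaled by 256 */
    uint32_t bot = bl * (256 - wx) + br * wx;   /* scaled by 256 */
    /* Vertical blend, then drop the combined 256 * 256 scale factor. */
    return (top * (256 - wy) + bot * wy) >> 16;
}

/* Blend four packed a8r8g8b8 pixels (top-left, top-right, bottom-left,
 * bottom-right) into one output pixel, channel by channel. */
uint32_t bilinear_a8r8g8b8(uint32_t tl, uint32_t tr,
                           uint32_t bl, uint32_t br,
                           uint32_t wx, uint32_t wy)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = lerp_channel((tl >> shift) & 0xff, (tr >> shift) & 0xff,
                                  (bl >> shift) & 0xff, (br >> shift) & 0xff,
                                  wx, wy);
        out |= c << shift;
    }
    return out;
}
```

With both weights at zero the function returns the top-left pixel unchanged, which is what nearest scaling would produce for that sample. The per-channel multiplies are independent of each other, which is what makes this kind of loop a natural fit for SIMD.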

Siarhei Siamashka

Assignee

Comment 12

14 years ago

Fixed via bug 640250

Status: NEW → RESOLVED

Closed: 14 years ago

Depends on: 640250

Resolution: --- → FIXED

Categories

(Core :: Graphics, defect)

People

(Reporter: cjones, Assigned: siarhei.siamashka)

Security

(public)
