Closed Bug 482676 Opened 11 years ago Closed 11 years ago

add pixman NEON optimizations

Categories

(Core :: Graphics, defect)

ARM
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla1.9.2a1
Tracking Status
status1.9.2 --- beta1-fixed
fennec 1.0+ ---

People

(Reporter: vlad, Assigned: vlad)

Details

(Keywords: mobile)

Attachments

(2 files, 2 obsolete files)

Attached patch add pixman NEON stuff (obsolete) — Splinter Review
We got a good chunk of ARM NEON optimizations for some of the core pixman compositing routines.  Here's the patch that incorporates this.

This also changes the detection code to explicitly test on both MSVC and on Linux (using setjmp/sigaction).

Note that in this patch the neon piece always returns FALSE (heh), because having it return TRUE causes kernel crashes on the maemo5 crashes (in the PVR driver, even, which is exciting).
Attachment #366774 - Flags: review?(jmuizelaar)
Comment on attachment 366774 [details] [diff] [review]
add pixman NEON stuff

>diff --git a/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm b/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm
>--- a/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm
>+++ b/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm
>@@ -20,10 +20,22 @@ FuncEndName SETS    VBar:CC:"$Name":CC:"
> $PrologName
>     MEND
> 
>-    export  pixman_msvc_try_armv6_op
>+    export  pixman_msvc_try_simd_op
> 
>-    FUNC_HEADER pixman_msvc_try_armv6_op
>-    uqadd8 r0,r0,r1
>+    FUNC_HEADER pixman_msvc_try_simd_op
>+    ;; I don't think the msvc arm asm knows how to do SIMD insns
>+    ;; uqadd8 r3,r3,r3
>+    DCD 0xe6633f93
>+    mov pc,lr
>+    ENTRY_END
>+    endp

If the assembler can't assemble the simd instructions then we won't be able to
build the msvc assembler files either, so I don't see the point of this.

I've also been looking at upstreaming this code and am wondering how much of the 
>+
>+    export  pixman_msvc_try_neon_op
>+
>+    FUNC_HEADER pixman_msvc_try_neon_op
>+    ;; I don't think the msvc arm asm knows how to do NEON insns
>+    ;; veor d0,d0,d0
>+    DCD 0xf3000110
>     mov pc,lr
>     ENTRY_END
>     endp
>diff --git a/gfx/cairo/libpixman/src/pixman-arm-neon.c b/gfx/cairo/libpixman/src/pixman-arm-neon.c
>new file mode 100644
>--- /dev/null
>+++ b/gfx/cairo/libpixman/src/pixman-arm-neon.c
>@@ -0,0 +1,1397 @@
>+/*
>+ * Copyright © 2009 ARM Ltd.
>+ *
>+ * Permission to use, copy, modify, distribute, and sell this software and its
>+ * documentation for any purpose is hereby granted without fee, provided that
>+ * the above copyright notice appear in all copies and that both that
>+ * copyright notice and this permission notice appear in supporting
>+ * documentation, and that the name of Mozilla Corporation not be used in
>+ * advertising or publicity pertaining to distribution of the software without
>+ * specific, written prior permission.  Mozilla Corporation makes no
>+ * representations about the suitability of this software for any purpose.  It
>+ * is provided "as is" without express or implied warranty.
>+ *
>+ * THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS
>+ * SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
>+ * FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY
>+ * SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
>+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
>+ * AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
>+ * OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
>+ * SOFTWARE.
>+ *
>+ * Author:  Ian Rickards (ian.rickards@arm.com) 
>+ *
>+ */
>+

This version of the patch has some bugs. I'll send you a newer one.
Attachment #366774 - Flags: review?(jmuizelaar) → review-
"I've also been looking at upstreaming this code and am wondering how much of
the" should continue as follows:

I've also been looking at upstreaming this code and am wondering how much of
the FUNC_HEADER stuff is actually needed. For example, does the 'include kxarm.h' do anything useful?
The disassembler can do the same thing that I do in the detection code -- just emit the instructions as dword values directly.  It's gross, but that file isn't meant to be hand-edited anyway.

kxarm.h seems to be needed -- all sorts of things break if I don't include it.  I've got an updated patch with better detection, I'll put something up with both Ian's updated patch and the fixed detection
Attached patch add pixman NEON stuff, v2 (obsolete) — Splinter Review
Updated.  Detection code seems to work on both my armv7 (+neon) and armv6 devices.
Assignee: nobody → vladimir
Attachment #366774 - Attachment is obsolete: true
Attachment #367362 - Flags: review?(jmuizelaar)
tracking-fennec: --- → ?
Flags: wanted-fennec1.0?
tracking-fennec: ? → 1.0+
Flags: wanted-fennec1.0? → wanted-fennec1.0+
Comment on attachment 367362 [details] [diff] [review]
add pixman NEON stuff, v2

We should add a check to configure to make sure that we can assemble neon instructions
like we do for arm v6 simd instructions.

>diff --git a/gfx/cairo/libpixman/src/Makefile.in b/gfx/cairo/libpixman/src/Makefile.in
>--- a/gfx/cairo/libpixman/src/Makefile.in
>+++ b/gfx/cairo/libpixman/src/Makefile.in
>@@ -90,7 +90,7 @@ endif
> endif
> ifeq (arm,$(findstring arm,$(OS_TEST)))
> ifdef HAVE_ARM_SIMD
>-USE_ARM_SIMD=1
>+USE_ARM_SIMD_GCC=1
> endif
> endif
> 
>@@ -133,9 +133,10 @@ DEFINES += -DUSE_VMX
> DEFINES += -DUSE_VMX
> endif
> 
>-ifdef USE_ARM_SIMD
>-CSRCS += pixman-arm-simd.c
>-DEFINES += -DUSE_ARM_SIMD
>+ifdef USE_ARM_SIMD_GCC
>+CSRCS += pixman-arm-simd.c pixman-arm-neon.c
>+DEFINES += -DUSE_ARM_SIMD -DUSE_ARM_NEON
>+ARM_NEON_CFLAGS = -mfloat-abi=softfp -mfpu=neon
> endif
> 
> ifdef USE_ARM_SIMD_MSVC
>@@ -156,6 +157,7 @@ include $(topsrcdir)/config/rules.mk
> 
> CFLAGS += -DPACKAGE="mozpixman" -D_USE_MATH_DEFINES
> 
>+
> # special rule for pixman-mmx to get the right cflags
> pixman-mmx.$(OBJ_SUFFIX): pixman-mmx.c Makefile Makefile.in
>     $(REPORT_BUILD)
>@@ -166,3 +168,8 @@ pixman-sse2.$(OBJ_SUFFIX): pixman-sse2.c
>     $(REPORT_BUILD)
>     @$(MAKE_DEPS_AUTO_CC)
>     $(ELOG) $(CC) $(OUTOPTION)$@ -c $(COMPILE_CFLAGS) $(MMX_CFLAGS) $(_VPATH_SRCS)
>+
>+pixman-arm-neon.$(OBJ_SUFFIX): pixman-arm-neon.c Makefile Makefile.in
>+    $(REPORT_BUILD)
>+    @$(MAKE_DEPS_AUTO_CC)
>+    $(ELOG) $(CC) $(OUTOPTION)$@ -c $(COMPILE_CFLAGS) $(ARM_NEON_CFLAGS) $(_VPATH_SRCS)
>diff --git a/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm b/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm
>--- a/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm
>+++ b/gfx/cairo/libpixman/src/pixman-arm-detect-win32.asm
>@@ -20,10 +20,22 @@ FuncEndName SETS    VBar:CC:"$Name":CC:"
> $PrologName
>     MEND
> 
>-    export  pixman_msvc_try_armv6_op
>+    export  pixman_msvc_try_simd_op
> 
>-    FUNC_HEADER pixman_msvc_try_armv6_op
>-    uqadd8 r0,r0,r1
>+    FUNC_HEADER pixman_msvc_try_simd_op

This function should probably still have 'arm' in the name. 'simd' is pretty general.

>+    ;; I don't think the msvc arm asm knows how to do SIMD insns
>+    ;; uqadd8 r3,r3,r3
>+    DCD 0xe6633f93
>+    mov pc,lr
>+    ENTRY_END
>+    endp
>+
>+    export  pixman_msvc_try_neon_op
>+
>+    FUNC_HEADER pixman_msvc_try_neon_op
>+    ;; I don't think the msvc arm asm knows how to do NEON insns
>+    ;; veor d0,d0,d0
>+    DCD 0xf3000110
>     mov pc,lr
>     ENTRY_END
>     endp

[snip]

> 
>-#ifdef USE_ARM_SIMD
>+#if defined(USE_ARM_SIMD) || defined(USE_ARM_NEON)
>+
>+#if defined(_MSC_VER)
>+
> pixman_bool_t
> pixman_have_arm_simd (void)
> {
>-#ifdef _MSC_VER
>     static pixman_bool_t initialized = FALSE;
>     static pixman_bool_t have_arm_simd = FALSE;
> 
>     if (!initialized) {
>         __try {
>-            pixman_msvc_try_armv6_op();
>+            pixman_msvc_try_simd_op();
>             have_arm_simd = TRUE;
>         } __except(GetExceptionCode() == EXCEPTION_ILLEGAL_INSTRUCTION) {
>             have_arm_simd = FALSE;
>@@ -2042,11 +2075,107 @@ pixman_have_arm_simd (void)
>     }
> 
>     return have_arm_simd;
>-#else
>-    return TRUE;
>-#endif
> }
>-#endif
>+
>+pixman_bool_t
>+pixman_have_arm_neon (void)
>+{
>+    static pixman_bool_t initialized = FALSE;
>+    static pixman_bool_t have_arm_neon = FALSE;
>+
>+    if (!initialized) {
>+        __try {
>+            pixman_msvc_try_neon_op();
>+            have_arm_neon = TRUE;
>+        } __except(GetExceptionCode() == EXCEPTION_ILLEGAL_INSTRUCTION) {
>+            have_arm_neon = FALSE;
>+        }
>+    initialized = TRUE;
>+    }
>+
>+    return have_arm_neon;
>+}
>+
>+#else /* linux ELF */
>+
>+#include <stdlib.h>
>+#include <unistd.h>
>+#include <sys/types.h>
>+#include <sys/stat.h>
>+#include <sys/mman.h>
>+#include <fcntl.h>
>+#include <string.h>
>+#include <elf.h>
>+
>+static pixman_bool_t arm_has_v7 = FALSE;
>+static pixman_bool_t arm_has_v6 = FALSE;
>+static pixman_bool_t arm_has_vfp = FALSE;
>+static pixman_bool_t arm_has_neon = FALSE;
>+static pixman_bool_t arm_has_iwmmxt = FALSE;
>+static pixman_bool_t arm_tests_initialized = FALSE;
>+
>+static void
>+pixman_arm_read_auxv() {
>+    int fd;
>+    Elf32_auxv_t aux;
>+
>+    fd = open("/proc/self/auxv", O_RDONLY);
>+    if (fd > 0) {
>+        while (read(fd, &aux, sizeof(Elf32_auxv_t))) {

How about: while (read(fd, &aux, sizeof(Elf32_auxv_t) == sizeof(Elf32_auxv_t)

>+            if (aux.a_type == AT_HWCAP) {
>+        uint32_t hwcap = aux.a_un.a_val;
>+        if (getenv("ARM_FORCE_HWCAP"))
>+            hwcap = strtoul(getenv("ARM_FORCE_HWCAP"), NULL, 0);
>+        // hardcode these values to avoid depending on specific versions
>+        // of the hwcap header, e.g. HWCAP_NEON
>+        arm_has_vfp = (hwcap & 64) != 0;
>+        arm_has_iwmmxt = (hwcap & 512) != 0;
>+        // this flag is only present on kernel 2.6.29
>+        arm_has_neon = (hwcap & 4096) != 0;
>+            } else if (aux.a_type == AT_PLATFORM) {
>+        const char *plat = (const char*) aux.a_un.a_val;
>+        if (getenv("ARM_FORCE_PLATFORM"))
>+            plat = getenv("ARM_FORCE_PLATFORM");
>+        if (strncmp(plat, "v7l", 3) == 0) {
>+            arm_has_v7 = TRUE;
>+            arm_has_v6 = TRUE;
>+        } else if (strncmp(plat, "v6l", 3) == 0) {
>+            arm_has_v6 = TRUE;
>+        }
>+            }
>+        }
>+        close (fd);
>+
>+    // if we don't have 2.6.29, we have to do this hack; set
>+    // the env var to trust HWCAP.
>+    if (!getenv("ARM_TRUST_HWCAP") && arm_has_v7)
>+        arm_has_neon = TRUE;
>+    }
>+
>+    arm_tests_initialized = TRUE;
>+}
>+
>+pixman_bool_t
>+pixman_have_arm_simd (void)
>+{
>+    if (!arm_tests_initialized)
>+    pixman_arm_read_auxv();
>+
>+    return arm_has_v6;
>+}
>+
>+pixman_bool_t
>+pixman_have_arm_neon (void)
>+{
>+    if (!arm_tests_initialized)
>+    pixman_arm_read_auxv();
>+
>+    return arm_has_neon;
>+}
>+
>+#endif /* linux */
>+
>+#endif /* USE_ARM_SIMD || USE_ARM_NEON */
> 
> #ifdef USE_MMX
> /* The CPU detection code needs to be in a file not compiled with
Keywords: mobile
Comment on attachment 367362 [details] [diff] [review]
add pixman NEON stuff, v2

I forgot to set the review flag when I did the review...

Conditional '+' on a response to the comments above.
Attachment #367362 - Flags: review?(jmuizelaar) → review+
Updated with configure.in bits; carrying r+ over.

Jeff, do you want to land this in upstream pixman, or do you want me to?
Attachment #367362 - Attachment is obsolete: true
Attachment #371300 - Flags: review+
argh, forgot to rename the functions
Checked in, with win32 asm function renames:

http://hg.mozilla.org/mozilla-central/rev/4256bc50a5db
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
There is a build error.

c:/cygwin/home/user/hg/mozilla-central/gfx/cairo/libpixman/src/pixman-pict.c(224
3) : warning C4002: too many actual parameters for macro 'pixman_have_arm_neon'
c:/cygwin/home/user/hg/mozilla-central/gfx/cairo/libpixman/src/pixman-pict.c(224
3) : error C2059: syntax error : 'constant'
make[2]: *** [pixman-pict.obj] Error 2


Should be separated pixman_have_arm_neon and pixman_have_arm_simd?
Attachment #371554 - Flags: review?(vladimir)
Comment on attachment 371554 [details] [diff] [review]
Fix build error
[Checkin: Comment 12]

Oh hm, I guess I never tested this on a compiler that doesn't understand NEON.  I don't really like having the macro get defined in the header file, though without that the dead code won't get eliminated.

I guess we have to do it though.. Jeff, what do you think?
Attachment #371554 - Flags: review?(vladimir)
Attachment #371554 - Flags: review?(jmuizelaar)
Attachment #371554 - Flags: review+
I've checked in the additional patch there because I had to disable NEON to see what's up with the mobile tinderbox, and was thus running into this.  Sigh.
So I had to disable this in the makefile, because the build was dying in a very weird way on whatever the toolchain is that we have on the tinderboxes:

../../../../dist/bin/libxul.so: undefined reference to `d7'
../../../../dist/bin/libxul.so: undefined reference to `d1'

both d1 and d7 are simd/neon/vfp register names, so somewhere it's thinking that they're being used as symbols instead of as registers.  The code uses the full complement of registers though, so I don't know what's special about d1 and d7; I went through the code a bit and didn't see anything unusual about their usage.

This builds fine for me here though, so it's something broken with the toolchain.
Works fine for me locally too. REOPEN?
We should probably open a new bug to figure out what's going on on the tinderbox.. the code is checked in, just the build piece is commented out in the pixman makefile.
I have opened a new bug (bug 487729), I am not sure that bug is concerned to this bug.
Comment on attachment 371554 [details] [diff] [review]
Fix build error
[Checkin: Comment 12]

This should no longer be needed.
Attachment #371554 - Flags: review?(jmuizelaar) → review-
Attachment #371300 - Attachment description: add pixman NEON stuff, v3 → add pixman NEON stuff, v3 [Checkin: Comment 9]
Attachment #371554 - Attachment description: Fix build error → Fix build error [Checkin: Comment 12]
Target Milestone: --- → mozilla1.9.2a1
Version: unspecified → Trunk
You need to log in before you can comment on or make changes to this bug.