specify hardware vfp flags for armv7

RESOLVED FIXED

Status

Product: Fennec Graveyard
Component: General
Status: RESOLVED FIXED
Reported: 7 years ago
Modified: 7 years ago

People

(Reporter: vlad, Assigned: vlad)

Tracking

Version: Trunk
Platform: ARM Android

Details

Attachments

(2 attachments, 1 obsolete attachment)

Created attachment 510375
set vfp flags

As dougt noticed, all our Android ARM builds have -msoft-float specified. -march=armv7-a alone is not enough to enable VFP usage. The attached patch gets me a 10% win on SunSpider on a Galaxy Tab (3000 with b4 -> 2700).

I -believe- -mfloat-abi=softfp without -mfpu is enough to get soft float for ARMv6 builds, which is what people can use if they don't have VFP. I don't know of any shipping ARMv7 hardware that we care about that does not have VFP.
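
One way to check what a given flag combination actually produced is to look at the macros the compiler predefines. The sketch below is not part of the patch; it assumes GCC's usual ARM macros (`__arm__`, `__ARM_PCS_VFP`, `__SOFTFP__`), and the function names are made up for illustration. On a non-ARM compiler it just reports that.

```c
#include <stdio.h>

/* Hypothetical helper: describes the float ABI this file was compiled for,
 * based on macros that GCC is assumed to predefine for ARM targets. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not an ARM build";
#elif defined(__ARM_PCS_VFP)
    return "hard (FP values passed in VFP registers)";
#elif defined(__SOFTFP__)
    return "soft (no FPU instructions emitted)";
#else
    return "softfp (FPU instructions, soft-float calling convention)";
#endif
}

/* Prints the result; call this from a small test program built with the
 * same CFLAGS as the real build. */
void print_float_abi(void)
{
    printf("float ABI: %s\n", float_abi_name());
}
```

Building this with the same flags as the tree and calling `print_float_abi()` should show whether -mfloat-abi=softfp without -mfpu really falls back to soft float on a particular toolchain.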

Created attachment 510393
updated

I think this is more correct; my build with it is still going though.  Note that the -Wa flags are unnecessary (and were likely not correct, due to the extra space) since those things get passed down to the assembler by default -- I verified this via -v.
Assignee: nobody → vladimir
Attachment #510375 - Attachment is obsolete: true
Attachment #510393 - Flags: review?(blassey.bugs)
Attachment #510393 - Flags: review?(blassey.bugs) → review+

Updated

7 years ago
tracking-fennec: --- → 2.0+

Comment 2

7 years ago
http://hg.mozilla.org/mozilla-central/rev/6cfd4d2e8932
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Backed out due to Maemo breakage (sigh).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
checking for valid optimization flags... no
configure: error: These compiler flags are invalid: -Os -freorder-blocks -fomit-frame-pointer -finline-limit=50
*** Fix above errors and then restart with               "make -f client.mk build"
make[1]: *** [configure] Error 1
make[1]: Leaving directory `/home/cltbld/build/mobile-trunk-maemo5-gtk/mozilla-central'
make: *** [/home/cltbld/build/mobile-trunk-maemo5-gtk/mozilla-central/objdir/Makefile] Error 2
program finished with exit code 2
Those can't be the actual flags it's complaining about.  Unfortunately, the
useful information is in config.log -- someone will have to look at it in a
local tree.  Perhaps the compiler that's in use here is too old to understand
things like vfpv3-d16?
(In reply to comment #5)
> Those can't be the actual flags it's complaining about.  Unfortunately, the
> useful information is in config.log -- someone will have to look at it in a
> local tree.  Perhaps the compiler that's in use here is too old to understand
> things like vfpv3-d16?

vfpv3-d16 is from gcc 4.4.  I think that Maemo 5 uses gcc 4.3.x.
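
A quick way to see which side of that version boundary a compiler falls on is to ask it through its own version macros. This is only a sketch: `gcc_at_least` and `knows_vfpv3_d16` are hypothetical names, the 4.4 cutoff is taken from the comment above rather than verified, and the check describes the compiler that builds this file, which is not necessarily the cross compiler on the builders.

```c
#include <stdio.h>

/* Nonzero if the compiler reports itself as GCC (or GCC-compatible) at
 * version major.minor or later; zero otherwise. */
int gcc_at_least(int major, int minor)
{
#if defined(__GNUC__) && defined(__GNUC_MINOR__)
    return __GNUC__ > major ||
           (__GNUC__ == major && __GNUC_MINOR__ >= minor);
#else
    (void)major;
    (void)minor;
    return 0;
#endif
}

/* Assumption from the thread: -mfpu=vfpv3-d16 is understood from gcc 4.4 on. */
int knows_vfpv3_d16(void)
{
    return gcc_at_least(4, 4);
}

/* Prints a one-line summary for use in a build log. */
void print_vfpv3_d16_support(void)
{
    printf("vfpv3-d16 expected to be accepted: %s\n",
           knows_vfpv3_d16() ? "yes" : "no");
}
```

Note that some GCC-compatible compilers report an older GCC version through these macros, so a "no" here is a hint to go read config.log, not a definitive answer.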

Created attachment 510523
updated, try to fix maemo

The gcc on our Maemo builders probably doesn't know about vfpv3-d16. However, vfpv3-d16 is too aggressive anyway -- Cortex-A8 only has VFP (non-v3 -- technically v2 [v1 is dead]). v3 introduces a new instruction which is useful, but not useful enough for the hw compat pain. This patch switches it back to -mfpu=vfp. This actually might make it work fine on the Maemo builders as well, but someone else can play that game.
Attachment #510523 - Flags: review?(blassey.bugs)
Attachment #510523 - Flags: review?(blassey.bugs) → review+
vlad pushed http://hg.mozilla.org/mozilla-central/rev/3470891975c7
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Have you guys experimented with -mfpu=neon versus vfp? 

The NEON unit is similar to the MMX and SSE extensions found on x86 processors; it is optimized for Single Instruction Multiple Data (SIMD) operations. The NEON unit has 2 floating point pipelines, an integer pipeline and a 128-bit load/store/permute pipeline. When properly utilized it is a very powerful coprocessor. Unfortunately GCC does a rather poor job of vectorizing code for the NEON unit. To get the best performance you should use either the intrinsics provided in the "arm_neon.h" header or hand-written assembly.

At least in my experience of building Android kernels, using NEON over VFP resulted in some pretty good performance wins.
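
For readers unfamiliar with the intrinsics route mentioned above, here is a minimal sketch of what it looks like. It is an illustration only, not code from the tree: `add_f32` is a hypothetical function, and the NEON path is compiled only when the compiler defines `__ARM_NEON` or `__ARM_NEON__` (for example with -mfpu=neon); otherwise the plain scalar loop is used.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_BUILD 1
#else
#define HAVE_NEON_BUILD 0
#endif

/* dst[i] = a[i] + b[i] for i in [0, n). */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if HAVE_NEON_BUILD
    /* Process four floats per iteration in one 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when NEON is not enabled. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Keeping the scalar fallback in the same function means the code still builds with -mfpu=vfp, which matters if the NEON flag cannot be turned on for every target.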
(In reply to comment #9)
> Have you guys experimented with -mfpu=neon versus vfp? 
> 
> The NEON unit is similar to the MMX and SSE extensions found on X86 processors,
> it is optimized for Single Instruction Multiple Data (SIMD) operations. The
> NEON unit has 2 floating point pipelines, an integer pipeline and a 128bit
> load/store/permute pipeline. When properly utilized it is a very powerful
> coprocessor. Unfortunately GCC does a rather poor job of vectorizing code for
> the NEON unit. To get the best performance you should use either the intrinsics
> provided in the "arm_neon.h" header or hand written assembly. 
> 
> At least in my experience of building Android kernels, using NEON over VFP
> resulted in some pretty good performance wins.

It depends on the CPU. On Cortex-A8, VFP is slow, but on Snapdragon (Qualcomm's Scorpion core) or Cortex-A9, VFP is fast.

Also, for the vectorization issues with NEON, see Bug 583958 and linked bugs.

The main reason not to specify NEON is Tegra-based devices, which don't support it.
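
Because some ARMv7 devices lack NEON, code that wants to use it usually has to check at runtime instead of relying on a build flag. The sketch below shows one common approach on Linux-based systems: scanning the "Features" line of /proc/cpuinfo. The function names are hypothetical, the exact layout of /proc/cpuinfo is an assumption about the kernel in use, and on systems without that file the check simply reports no NEON.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole whitespace-separated word in
 * `line`, so that "vfp" does not match "vfpv3". */
int line_has_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;
    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t' ||
                     p[-1] == ':';
        char end = p[len];
        int ends = end == '\0' || end == ' ' || end == '\t' || end == '\n';
        if (starts && ends)
            return 1;
        p += len;
    }
    return 0;
}

/* Returns 1 if a "Features" line in /proc/cpuinfo lists neon, else 0. */
int cpu_reports_neon(void)
{
    char line[1024];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "Features", 8) == 0 &&
            line_has_flag(line, "neon")) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A caller would pick the NEON code path only when `cpu_reports_neon()` returns 1 and fall back to the VFP build of the same routine otherwise.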

If there is any further work here, like trying out new build options, please file a new bug.
Depends on: 632915