Closed Bug 817042 Opened 11 years ago Closed 4 years ago

AltiVec runtime detection

Categories

(Core :: XPCOM, defect)

PowerPC
All
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1571613

People

(Reporter: tobias.netzel, Assigned: gaston)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 4 obsolete files)

This adds OS dependent runtime detection for VMX (Altivec) capabilities. This is currently implemented for Linux and Mac OS X.
In case there is no runtime detection for the target OS VMX support is assumed based on compiler defines.
Attachment #687174 - Attachment is patch: true
Attachment #687174 - Flags: review?
Blocks: 817033
Attachment #687174 - Flags: review? → review?(benjamin)
Depends on: 817058
Blocks: 817058
No longer depends on: 817058
Blocks: 817075
Blocks: 817089
You should request reviews for your patches from someone specific...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment on attachment 687174 [details] [diff] [review]
runtime VMX detection for Linux and Mac OS X

If I understand this, VMX is another name for altivec extensions to the powerpc instruction set. Is that correct?

Since ppc is not a core target of the Mozilla community, I want to understand the tradeoffs of these patches; they certainly increase the complexity of some code paths. Have you measured the perf impact of the patches you're attaching and whether they are actually going to be worthwhile?

Are you trying to make builds to target particular hardware? If so can you just rely on the extensions being present or not?

I'm not excited about adding micro-optimizations for non-core platforms unless they are a huge win.
First, thanks for considering this at all.
VMX is Altivec, that's correct.
The target machines have been Power Macintoshes until now but the TenFourFox team would like to make its achievements accessible for other platforms as well. That would be POWER based IBM machines or Power Macintoshes running Linux for example.
IBM introduced VMX support in 2007 with their POWER 6 architecture. And there are even "fast" Power Macintosh G3 based machines that lack Altivec support.
Considering the way the GNU/Linux distributions provide their software (Firefox being a core part in many cases) runtime VMX detection would be the only way to have VMX optimizations reach a significant amount of those potential users.

We ran some benchmarks, although nothing very sophisticated. You can expect a performance win between 10% and 40%. The string processing functions are actually somewhat slower when processing a small amount of data of less than approx. 32 characters).

We only vectorized code that was already vectorized for MMX/SSE and our implementations are ports of the SSE counterparts, considering some special requirements like strictly avoiding unaligned stores. So, there's no additional complexity compared to the SSE code path - apart from the fact that runtime VMX detection is not as easily done as for MMX/SSE.

I'd like to make clear that there's no particular NEED to get the VMX patches into the mozilla code base but we just want to follow the spirit of contributing to an open source software project.
Comment on attachment 687174 [details] [diff] [review]
runtime VMX detection for Linux and Mac OS X

I'm going to bump these reviews to glandium to help me decide 1) whether the maintenance cost is worth landing these in the tree and 2) whether there are any issues with the code in particular
Attachment #687174 - Flags: review?(benjamin) → review?(mh+mozilla)
Comment on attachment 687174 [details] [diff] [review]
runtime VMX detection for Linux and Mac OS X

Review of attachment 687174 [details] [diff] [review]:
-----------------------------------------------------------------

As VMX can be different things (like the VM extensions on recent intel cpus), you should not use the VMX name, it's confusing. Use altivec.

Benjamin, don't you think it would make sense to consolidate SSE, arm and this in one place, at some point?

::: xpcom/glue/VMX.cpp
@@ +19,5 @@
> +
> +/**
> + * taken from freevec.org
> + */
> +int have_altivec() {

Please use bool instead of int, and true/false values instead of 1/0.

@@ +29,5 @@
> +    int fd, i;
> +
> +    snprintf(fname, sizeof(fname)-1, "/proc/self/auxv");
> +
> +    fd = open(fname, O_RDONLY);

Just open("/proc/self/auxv") instead of uselessly doing a snprintf.

@@ +32,5 @@
> +
> +    fd = open(fname, O_RDONLY);
> +    if (fd < 0)
> +            goto out;
> + more:

do {

@@ +46,5 @@
> +                    goto out_close;
> +            }
> +    }
> +
> +    if (count == sizeof(buf))

} while (count == sizeof(buf));

@@ +49,5 @@
> +
> +    if (count == sizeof(buf))
> +            goto more;
> + out_close:
> +    close(fd);

Please use mozilla::AutoFDClose instead of the goto out_close pattern (which you can then replace with the proper return)

@@ +86,5 @@
> +} // namespace
> +
> +namespace mozilla {
> +  namespace vmx_private {
> +    bool vmx_enabled = vmx_have_altivec;

You don't need the intermediate vmx_have_altivec variable.
Attachment #687174 - Flags: review?(mh+mozilla) → review-
> Benjamin, don't you think it would make sense to consolidate SSE, arm and
> this in one place, at some point?

If the code is similar enough that sharing makes sense, sure.
Renaming from VMX to Altivec not done yet.
As I don't currently have a linux installation to test with I didn't test the linux specific changes, but apart from the missing defines there weren't any compiler errors.
Attachment #687174 - Attachment is obsolete: true
Attachment #697197 - Flags: review?(mh+mozilla)
Comment on attachment 697197 [details] [diff] [review]
runtime VMX detection for Linux and Mac OS X v2

Review of attachment 697197 [details] [diff] [review]:
-----------------------------------------------------------------

I'd rather review when it's renamed to altivec
Attachment #697197 - Flags: review?(mh+mozilla)
Summary: VMX (Altivec) runtime detection → AltiVec runtime detection
Attachment #697197 - Attachment is obsolete: true
Attachment #698246 - Flags: review?(mh+mozilla)
Comment on attachment 698246 [details] [diff] [review]
runtime AltiVec detection for Linux and Mac OS X v3

Review of attachment 698246 [details] [diff] [review]:
-----------------------------------------------------------------

Gerv, can you take a look at AltiVec.cpp and tell if Tobias needs to add some more explicit copyright/licensing info there, or are these trivial enough?
Attachment #698246 - Flags: review?(mh+mozilla)
Attachment #698246 - Flags: review?(gerv)
Attachment #698246 - Flags: review+
freevec.org is down so I can't see what the license is on code from their. But according to Google's cache, it looks like a personal site... so can I please have a more detailed reference to exactly where the code came from?

Taking stuff from ffmpeg (LGPL/GPL) doesn't work; can you point me at where the same code is on Apple's Altivec pages, as the comment suggests?

Gerv
Concerning the linux runtime AltiVec detection I've already implemented another version that should work on ELF32 and ELF64 platforms while the current one is probably ELF32 only.
I'm going to prepare a revised patch for reviewing.

The Mac OS X AltiVec runtime detection code came from:
https://developer.apple.com/hardwaredrivers/ve/g3_compatibility.html
We might want to pull Landry Breuil in here because this will break *BSD/ppc as written. I *think* the Linux code will work there as well.

I don't believe there are any builders making a current Firefox for AIX anymore, though POWER6 and up do have VMX.
Assignee: nobody → tobias.netzel
Status: NEW → ASSIGNED
Thanks for cc'ing me on this.

#erroring out is not nice to other platforms - i dont know how FreeBSD & NetBSD folks care about their ports to ppc, but i strive to get it in a working state on OpenBSD. #warning + defaulting to false might be nicer...

The linux code wont work (ugly /proc reading) but the macos code almost works : see http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavutil/ppc/cpu.c;h=20837dae115ffe3702f4d1613700694dac7e794d;hb=HEAD

The only difference for us is #include <sys/param.h> , #include <machine/cpu.h> in addition to #include <sys/sysctl.h> int sels[2] = {CTL_MACHDEP,CPU_ALTIVEC} instead of {CTL_HW, HW_VECTORUNIT}.

The following code works here :

#include <sys/param.h>
#include <sys/sysctl.h>
#include <machine/cpu.h>
#include <stdio.h>

int main() {
  int sels[2] = {CTL_MACHDEP,CPU_ALTIVEC};
  int has_vu = 0;
  size_t len = sizeof(has_vu);
  int err;

  err = sysctl(sels, 2, &has_vu, &len, NULL, 0);

  if (err == 0) {
    if (has_vu != 0) {
      printf ("has altivec");
    } else {
      printf ("no altivec");
    }
  } else {
    printf ("error");
  }
}

Other than that i'd be happy to see some optimized working asm code on ppc, using the sysv abi...
Well, then I'll replace the linux version with the ELF64 compatible one and will combine Mac OS X and BSD versions into one.
As our AltiVec functions are implemented using intrinsics ABI should be no problem!
<sigh> The Apple site doesn't have clear docs on the license of code presented on it. Luis: do you have historical experience here?

There seems to be an email address we can ask <sw.license@apple.com> if we can't find what agreement applies.

Gerv
Comment on attachment 698246 [details] [diff] [review]
runtime AltiVec detection for Linux and Mac OS X v3

Removing review request pending patch update.

Gerv
Attachment #698246 - Flags: review?(gerv)
Unfortunately I won't be able to work on this for at least the next two months.
My plan was to take the Linux runtime detection from bug 817045 with the patch from bug 832711 applied. As that implementation is not copied from anywhere there won't be any GPL/LGPL licensing problems any more.

The Mac OS X runtime detection could be implemented by extending the BSD one from comment 14 to also work for OS X, that way avoiding Apple licensing problems.
But in fact there's no need for OS X AltiVec runtime detection in the official mozilla sources since the existing PowerPC OS X ports (TenFourFox and derivates) anyway need many patches.
So implementing this for Linux and BSD only would be fine in my opinion.

Would be great if somebody™ interested in this (Landry Breuil?), please feel free to continue to work on this!
Here's the version that *should* work on NetBSD/OpenBSD using CTL_MACHDEP/CPU_ALTIVEC, and using sysctlbyname("hw.altivec") on FreeBSD/DragonFly (not sure for the latter, jan ?).

Currently build-testing on OpenBSD/ppc only. 

Other sources of ideas : http://hg.libsdl.org/SDL/rev/c20f08c4f437 - note that gfx/cairo/libpixman/src/pixman-cpu.c does it a bit differently in pixman_has_vmx().

I'm not attached to any of the implems, so if it's better licence-wise to use the version in pixman that one can be taken. Similar code is used all around, so i dont know if it's that important to attach a copyright to it, but IANAL.

Another question: what would use it ? So far i see nothing in xpcom/ benefiting from adding that flag, so i suppose it's other patches on top of that for tenfourfox ?

Regarding bug 817045, it seems to me that https://hg.mozilla.org/integration/mozilla-inbound/rev/f2e450529466 inconditionally enables -maltivec, was that intended, and has it been tested on hardware where altivec is not available ?
Assignee: tobias.netzel → landry
Attachment #698246 - Attachment is obsolete: true
Attachment #722750 - Flags: review?(mh+mozilla)
Attachment #722750 - Flags: feedback?(jbeich)
"-maltivec" is only used for the file containing AltiVec code, pixman-vmx.c .
I think that's clear already: you must enable AltiVec ABI for any file containing AltiVec code and preventing that code to be executed on non-AltiVec capable machines by i.e. using some sort of runtime detection.
But beware not to enable any sort of compiler auto-vectorization. Depending on compiler, optimization level and OS that may happen silently when running the build on an AltiVec capable machine without specifiying a target CPU. Hence you should build (in case of gcc) using -i.e. "mcpu=750" (for generic 32-bit PowerPC) or explicitly disable AltiVec using "-mno-altivec" (an "-maltivec" parameter passed after "-mno-altivec" should override the previous setting).
Comment on attachment 722750 [details] [diff] [review]
runtime AltiVec detection for Linux, BSD and Mac OS X v4

Review of attachment 722750 [details] [diff] [review]:
-----------------------------------------------------------------

f- for typos and broken build

::: xpcom/glue/AltiVec.cpp
@@ +48,5 @@
> +  } while (count == sizeof(buf));
> +  return false;
> +}
> +#elif defined(XP_MACOSX) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__DragonFly__)
> +#if defined(__OpenBSD__) || defined (__NetBSD__)

remove space after defined and before (

@@ +52,5 @@
> +#if defined(__OpenBSD__) || defined (__NetBSD__)
> +#include <sys/param.h>
> +#include <machine/cpu.h>
> +#endif
> +#include <sys/sysctl.h>

sys/types.h (or sys/param.h) is required before sys/sysctl.h by every BSD and should be safe to include on OS X

xpcom/glue/AltiVec.cpp:56:
In file included from ../../dist/system_wrappers/sys/sysctl.h:2:
/usr/include/sys/sysctl.h:802:25: error: C++ requires a type specifier for all declarations
int     sysctl(const int *, u_int, void *, size_t *, const void *, size_t);
                            ^~~~~
1 error generated.

@@ +69,5 @@
> +  int has_vu = 0;
> +  size_t len = sizeof(has_vu);
> +  int err;
> +
> +#if defined(__FreeBSD__) || defined (__DragonFly__)

remove space after defined and before (

@@ +70,5 @@
> +  size_t len = sizeof(has_vu);
> +  int err;
> +
> +#if defined(__FreeBSD__) || defined (__DragonFly__)
> +  err = sysctlbyname("hw.altivec", &has_vu, &len, NULL, 0 );

DragonFly currently doesn't support non-x86 archs. It'll probably share name with FreeBSD (like hw.instruction_sse) if PowerPC support ever materialize.

::: xpcom/glue/AltiVec.h
@@ +56,5 @@
> +#if defined(__linux__)
> +  #define MOZILLA_MAY_SUPPORT_ALTIVEC 1
> +  // CPU detection for Linux via /proc/cpuinfo
> +  #define MOZILLA_ALTIVEC_HAVE_CPUID_DETECTION 1
> +#elif defined(XP_MACOSX) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(DragonFly)

missing underscores in __DragonFly__
Attachment #722750 - Flags: feedback?(jbeich) → feedback-
TenFourFox crossbuilds and twiddles the VMX flags with its own patches anyway (when building for G3, its mozconfig turns VMX off). The patch here is mostly to benefit other consumers so that other runtime-detected VMX optimizations might land.
include <sys/param.h> on all bsds* + fix whitespace
Attachment #722750 - Attachment is obsolete: true
Attachment #722750 - Flags: review?(mh+mozilla)
Attachment #722787 - Flags: review?(mh+mozilla)
Comment on attachment 722787 [details] [diff] [review]
runtime AltiVec detection for Linux, BSD and Mac OS X v5

The licensing bit still needs to be cleared
Attachment #722787 - Flags: review?(mh+mozilla) → review+
As for freevec.org i have no idea.
As for the macos/bsd part, since i modified it enough to support the 3 bsds, and it is hardly 10 lines of C everyone would write the same way; i'd be all for dropping the comment and live with it. Gerv ?
Flags: needinfo?(gerv)
Well, we certainly shouldn't check in some code which is commented "ripped off from <GPLed codebase>"...

Tell me exactly which code and which 10 lines you are referring to?

Gerv
Flags: needinfo?(gerv)
The implem of have_altivec() in xpcom/glue/AltiVec.cpp in att 722787. This is hardly copyrightable.... similar constructs can be found everywhere. Do you want someone (not me, i have better dead horse to beat) to find such implems in a gpl-free code ?

That similar code (at least for openbsd) can be found in vlc/mplayer/sdl/pixman, :
http://trac.videolan.org/vlc/ticket/3992
http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2011-April/068027.html
http://lists.libsdl.org/pipermail/commits-libsdl.org/2012-July/005524.html
https://bugs.freedesktop.org/show_bug.cgi?id=29331

http://www.freehackers.org/thomas/2011/05/13/how-to-detect-altivec-availability-on-linuxppc-at-runtime/ also references freevec.org.
Just remove that comment.

Gerv

Retiring this bug for bug 1571613.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.