Closed
Bug 327129
Opened 19 years ago
Closed 19 years ago
Use __builtin_clz or count-leading-zeros for log2 calculations
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: igor, Assigned: igor)
References
Details
Attachments
(1 file, 4 obsolete files)
The current version of JS_CEILING_LOG2 and JS_FLOOR_LOG2 macros from jsbit.h uses binary search to find the highest set bit. This code can be made faster/smaller on GCC which provides __builtin_clz to count leading zeros and define:
floor(log2(n)) = 31 - __builtin_clz(n) when n != 0
ceil(log2(n)) = 32 - __builtin_clz(n - 1) when n > 1
Assignee | ||
Comment 1•19 years ago
|
||
The only complication of the patch is the extra messuers to ensure that __builtin_clz is never called with 0 argument. For JS_FLOOR_LOG2 the remedy is simple:
floor(log2(n)) = 31 - __builtin_clz(n | 1)
which gives 0 for floor(log2(0)) as is the case for the current code.
For JS_CEILING_LOG2 I used an explicit check:
ceil(log2(n)) = (n <= 1 ? 0 : 32 - __builtin_clz(n - 1))
Besides making the code presumably faster the patch shrinks an optimized build of js shell by 188 bytes on i586 with GCC 4.0 as __builtin_clz is translated into single instruction.
Attachment #211843 -
Flags: review?(brendan)
Assignee | ||
Updated•19 years ago
|
Severity: normal → enhancement
Status: NEW → ASSIGNED
Comment 2•19 years ago
|
||
Comment on attachment 211843 [details] [diff] [review]
Implementation
Cool -- thanks for fixing the bogus expansions in the .c file, left over from age-old NSPR1 code.
COMPILER_ASSERT could help us avoid some GCC warnings elsewhere (jsgc.c at least). Patch freely, rs=me.
/be
Attachment #211843 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 3•19 years ago
|
||
I committed the patch.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 4•19 years ago
|
||
(In reply to comment #2)
> COMPILER_ASSERT could help us avoid some GCC warnings elsewhere (jsgc.c at
> least).
See bug 327896.
Comment 5•19 years ago
|
||
This caused some Tinderboxes to go red. Mook says "they seem to be failing because __builtin_clz is only gcc 3.4 and higher".
Comment 6•19 years ago
|
||
../../../mozilla/js/src/jslog2.c: In function `JS_CeilingLog2':
../../../mozilla/js/src/jslog2.c:49: warning: implicit declaration of function `
__builtin_clz'
Comment 7•19 years ago
|
||
#if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
would do the trick, of course, though I wonder if other compilers have similar built-ins and this should be changed to an autoconf test?
Comment 8•19 years ago
|
||
Don't you mean
#if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
?
Assignee | ||
Comment 9•19 years ago
|
||
My fault: I was under impression that GCC had __builtin_clz, but in fact it was introduced only since GCC 3.4:
http://gcc.gnu.org/gcc-3.4/changes.html
I reverted the commit.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 10•19 years ago
|
||
(In reply to comment #8)
> Don't you mean
>
> #if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
>
> ?
>
Erh, yeah, that :-)
Assignee | ||
Comment 11•19 years ago
|
||
The new version of the patch uses:
#if JS_GCC_HAS_BUILTIN_CLZ
where JS_GCC_HAS_BUILTIN_CLZ is defined as:
#if __GNUC__ >= 4 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
# define JS_GCC_HAS_BUILTIN_CLZ 1
#else
# define JS_GCC_HAS_BUILTIN_CLZ 0
#endif
Attachment #211843 -
Attachment is obsolete: true
Attachment #212463 -
Flags: review?(brendan)
Assignee | ||
Comment 12•19 years ago
|
||
(In reply to comment #7)
> #if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
>
> would do the trick, of course, though I wonder if other compilers have similar
> built-ins and this should be changed to an autoconf test?
2 problems with this:
1. JS allows a standalone build via its own Makefile.ref which does not use autoconf.
2. I doubt that other compilers would have exactly the same name/semantic for __builtin_clz so separated ifdefs would have to be provided in any case.
IMO if somebody knows how to make floor(log2(n)) faster with other compilers then GCC >= 3.4, then separated bugs should be filed.
Summary: Using count-leading-zeros for log2 calculations. → Using __builtin_clz or count-leading-zeros for log2 calculations
Comment 13•19 years ago
|
||
(In reply to comment #12)
> 1. JS allows a standalone build via its own Makefile.ref which does not use
> autoconf.
Ah, ok.
> 2. I doubt that other compilers would have exactly the same name/semantic for
> __builtin_clz so separated ifdefs would have to be provided in any case.
Yeah, I was thinking of something like:
#if __GNUC__ >= 4 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
# define JS_BUILTIN_CLZ(_n) __builtin_clz(_n)
#endif
// elsewhere
#ifdef JS_BUILTIN_CLZ
// use JS_BUILTIN_CLZ(...)
#else
// the hard way
#endif
which would allow us to later add additional defines for JS_BUILTIN_CLZ for other compilers.
> IMO if somebody knows how to make floor(log2(n)) faster with other compilers
> then GCC >= 3.4, then separated bugs should be filed.
Fair enough.
Assignee | ||
Comment 14•19 years ago
|
||
Using #ifdef, not #if to get less preprocessor code to define JS_HAS_GCC_BUILTIN_CLZ.
Attachment #212463 -
Attachment is obsolete: true
Attachment #212475 -
Flags: review?(brendan)
Attachment #212463 -
Flags: review?(brendan)
Assignee | ||
Comment 15•19 years ago
|
||
It is better to commit 327896 first to use JS_COMPILER_ASSERT introduced there.
Depends on: 327896
Assignee | ||
Comment 16•19 years ago
|
||
Using JS_COMPILER_ASSERT
Attachment #212475 -
Attachment is obsolete: true
Attachment #212479 -
Flags: review?(brendan)
Attachment #212475 -
Flags: review?(brendan)
Comment 17•19 years ago
|
||
Comment on attachment 212479 [details] [diff] [review]
Implementation v4
Looks good, but that FIXME is either unnecesary (the ?: expression is small enough that a mispredict penalty will be very small, since all instructions will be in the pipe) or a FIXME for GCC, not for us.
/be
Attachment #212479 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 18•19 years ago
|
||
Attachment #212479 -
Attachment is obsolete: true
Assignee | ||
Comment 19•19 years ago
|
||
I committed the last patch
Status: REOPENED → RESOLVED
Closed: 19 years ago → 19 years ago
Resolution: --- → FIXED
Updated•19 years ago
|
Flags: testcase-
Updated•19 years ago
|
Summary: Using __builtin_clz or count-leading-zeros for log2 calculations → Use __builtin_clz or count-leading-zeros for log2 calculations
You need to log in
before you can comment on or make changes to this bug.
Description
•