Closed Bug 327129 Opened 19 years ago Closed 19 years ago

Use __builtin_clz or count-leading-zeros for log2 calculations

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: igor, Assigned: igor)

References

Details

Attachments

(1 file, 4 obsolete files)

The current version of JS_CEILING_LOG2 and JS_FLOOR_LOG2 macros from jsbit.h uses binary search to find the highest set bit. This code can be made faster/smaller on GCC which provides __builtin_clz to count leading zeros and define: floor(log2(n)) = 31 - __builtin_clz(n) when n != 0 ceil(log2(n)) = 32 - __builtin_clz(n - 1) when n > 1
Attached patch Implementation (obsolete) — Splinter Review
The only complication of the patch is the extra messuers to ensure that __builtin_clz is never called with 0 argument. For JS_FLOOR_LOG2 the remedy is simple: floor(log2(n)) = 31 - __builtin_clz(n | 1) which gives 0 for floor(log2(0)) as is the case for the current code. For JS_CEILING_LOG2 I used an explicit check: ceil(log2(n)) = (n <= 1 ? 0 : 32 - __builtin_clz(n - 1)) Besides making the code presumably faster the patch shrinks an optimized build of js shell by 188 bytes on i586 with GCC 4.0 as __builtin_clz is translated into single instruction.
Attachment #211843 - Flags: review?(brendan)
Severity: normal → enhancement
Status: NEW → ASSIGNED
Comment on attachment 211843 [details] [diff] [review] Implementation Cool -- thanks for fixing the bogus expansions in the .c file, left over from age-old NSPR1 code. COMPILER_ASSERT could help us avoid some GCC warnings elsewhere (jsgc.c at least). Patch freely, rs=me. /be
Attachment #211843 - Flags: review?(brendan) → review+
I committed the patch.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
(In reply to comment #2) > COMPILER_ASSERT could help us avoid some GCC warnings elsewhere (jsgc.c at > least). See bug 327896.
This caused some Tinderboxes to go red. Mook says "they seem to be failing because __builtin_clz is only gcc 3.4 and higher".
../../../mozilla/js/src/jslog2.c: In function `JS_CeilingLog2': ../../../mozilla/js/src/jslog2.c:49: warning: implicit declaration of function ` __builtin_clz'
#if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4) would do the trick, of course, though I wonder if other compilers have similar built-ins and this should be changed to an autoconf test?
Don't you mean #if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4) ?
My fault: I was under impression that GCC had __builtin_clz, but in fact it was introduced only since GCC 3.4: http://gcc.gnu.org/gcc-3.4/changes.html I reverted the commit.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to comment #8) > Don't you mean > > #if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4) > > ? > Erh, yeah, that :-)
Attached patch Implementation v2 (obsolete) — Splinter Review
The new version of the patch uses: #if JS_GCC_HAS_BUILTIN_CLZ where JS_GCC_HAS_BUILTIN_CLZ is defined as: #if __GNUC__ >= 4 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4) # define JS_GCC_HAS_BUILTIN_CLZ 1 #else # define JS_GCC_HAS_BUILTIN_CLZ 0 #endif
Attachment #211843 - Attachment is obsolete: true
Attachment #212463 - Flags: review?(brendan)
(In reply to comment #7) > #if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4) > > would do the trick, of course, though I wonder if other compilers have similar > built-ins and this should be changed to an autoconf test? 2 problems with this: 1. JS allows a standalone build via its own Makefile.ref which does not use autoconf. 2. I doubt that other compilers would have exactly the same name/semantic for __builtin_clz so separated ifdefs would have to be provided in any case. IMO if somebody knows how to make floor(log2(n)) faster with other compilers then GCC >= 3.4, then separated bugs should be filed.
Summary: Using count-leading-zeros for log2 calculations. → Using __builtin_clz or count-leading-zeros for log2 calculations
(In reply to comment #12) > 1. JS allows a standalone build via its own Makefile.ref which does not use > autoconf. Ah, ok. > 2. I doubt that other compilers would have exactly the same name/semantic for > __builtin_clz so separated ifdefs would have to be provided in any case. Yeah, I was thinking of something like: #if __GNUC__ >= 4 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4) # define JS_BUILTIN_CLZ(_n) __builtin_clz(_n) #endif // elsewhere #ifdef JS_BUILTIN_CLZ // use JS_BUILTIN_CLZ(...) #else // the hard way #endif which would allow us to later add additional defines for JS_BUILTIN_CLZ for other compilers. > IMO if somebody knows how to make floor(log2(n)) faster with other compilers > then GCC >= 3.4, then separated bugs should be filed. Fair enough.
Attached patch Implementation v3 (obsolete) — Splinter Review
Using #ifdef, not #if to get less preprocessor code to define JS_HAS_GCC_BUILTIN_CLZ.
Attachment #212463 - Attachment is obsolete: true
Attachment #212475 - Flags: review?(brendan)
Attachment #212463 - Flags: review?(brendan)
It is better to commit 327896 first to use JS_COMPILER_ASSERT introduced there.
Depends on: 327896
Attached patch Implementation v4 (obsolete) — Splinter Review
Using JS_COMPILER_ASSERT
Attachment #212475 - Attachment is obsolete: true
Attachment #212479 - Flags: review?(brendan)
Attachment #212475 - Flags: review?(brendan)
Comment on attachment 212479 [details] [diff] [review] Implementation v4 Looks good, but that FIXME is either unnecesary (the ?: expression is small enough that a mispredict penalty will be very small, since all instructions will be in the pipe) or a FIXME for GCC, not for us. /be
Attachment #212479 - Flags: review?(brendan) → review+
I committed the last patch
Status: REOPENED → RESOLVED
Closed: 19 years ago19 years ago
Resolution: --- → FIXED
Flags: testcase-
Summary: Using __builtin_clz or count-leading-zeros for log2 calculations → Use __builtin_clz or count-leading-zeros for log2 calculations
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: