js_strtointeger - use special handling for base 10

RESOLVED WONTFIX

Status

()

Core
JavaScript Engine
RESOLVED WONTFIX
10 years ago
9 years ago

People

(Reporter: Michael Moy, Assigned: Michael Moy)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

10 years ago
jsnum.c has the routine js_strtointeger that has the following snippet of code:

I think that the most common case is base 10 with the number well within the
range of a 32-bit integer. Putting in special code to do base-10 before this
code could:

- Make for a tighter loop by not having to check for alpha lower and upper case
- Speed up the code for value = value * base + digit. This code converts base and digit from integer to double-precision floating point and then carries out the expected operation in double-precision floating point. The optimization would be to do the entire calculation in 32-bit integer arithmetic which is faster and avoids the integer to floating point conversions. As an aside, it might be a good idea to use a local to hold base as a DP FP so that it wouldn't have to be converted each time for compilers that can't factor it out. Smart x86 compilers (post Pentium 4 optimization) could do the multiply by 10 using two LEA instructions, one multiplying by 8 and the other multiplying by 2 and adding it to the first register. On x86 processors, in general, scalar integer math is faster than scalar floating point math.
- If the number turns out to be beyond the range of the integer, the special base 10 code should switch to DP FP math, either by using the regular routine or an optimized copy of the regular routine for base 10.

1025     /*
1026      * Done with the preliminaries; find some prefix of the string that's
1027      * a number in the given base.
1028      */
1029     JS_ASSERT(s1 < send);
1030     start = s1;
1031     value = 0.0;
1032     do {
1033         uintN digit;
1034         jschar c = *s1;
1035         if ('0' <= c && c <= '9')
1036             digit = c - '0';
1037         else if ('a' <= c && c <= 'z')
1038             digit = c - 'a' + 10;
1039         else if ('A' <= c && c <= 'Z')
1040             digit = c - 'A' + 10;
1041         else
1042             break;
1043         if (digit >= (uintN)base)
1044             break;
1045         value = value * base + digit;
1046     } while (++s1 != send);
(Assignee)

Comment 1

10 years ago
Created attachment 307877 [details] [diff] [review]
Special processing for base 10 in string to integer routine

This code process base 10 using integer calculations only switching over to double precision floating point after 9 digits. It also make a small optimization in the handling of other bases by factoring out an integer to double conversion.
Assignee: general → mmoy
Status: NEW → ASSIGNED
(Assignee)

Updated

10 years ago
Attachment #307877 - Flags: review?(brendan)
Comment on attachment 307877 [details] [diff] [review]
Special processing for base 10 in string to integer routine

Thanks, just nits and questions below. Mainly, can you measure a speed win with this patch, say on SunSpider?

>+    jsdouble value, base_double;

Is base_double worth the source cost? Won't quality compilers widen base as appropriate and common it if the widening is costly?

>     if ((negative = (*s1 == '-')) != 0 || *s1 == '+') {

Nit: house style generally avoids nesting assignment in if conditions (loop conditions are ok if canonical while ((c = getchar()) != EOF) and the like).

>+    if (base == 10) {
>+        int i = 0, value_int = 0;

Nit: the _int and _double suffixing is a bit verbose compared to similar code in SpiderMonkey -- maybe ival and dval instead?

>+
>+        do {
>+
>+            /* Ensure c is from 0 to 9 with one compare. */
>+

No need for those two deadwood blank lines around the single-line comment.

>+            int c = *s1 - '0';
>+            if (((unsigned char) c) > 9)

Just (unsigned) would do, and no need to parenthesize cast against >.

>+                break;
>+
>+            value_int = value_int * 10 + c;
>+            if (++i == 9) {
>+                value = (jsdouble) value_int;
>+                while (++s1 != send) {
>+                    int c = *s1 - '0';

Please hoist int c to top level or the oldest common ancestor block.

>+                    if (((unsigned char) c) > 9)

(unsigned) again.

>+                        break;
>+                    value = value * 10.0 + c;
>+                }
>+                goto skip_float_convert;
>+            }
>+        } while (++s1 != send);
>+
>+        value = (jsdouble) value_int;
>+
>+    skip_float_convert:
>+        ;

This label should go after the whole if-else, avoiding the empty statement target.

>+    }
>+    else {

House style cuddles: } else {.

>+        value = 0.0;
>+        base_double = (jsdouble) base;
>+        do {
>+            uintN digit;
>+            jschar c = *s1;

Does c need to be jschar here? Might be better as uintN aka unsigned.

That reminds me: use uintN consistently, or unsigned consistently. No point in mixing. I'm ok with unsigned. The intN and uintN typedefs look better when mixed intentionally with uint32, etc. but they mean int and unsigned.

>+            if ('0' <= c && c <= '9')

So if c is unsigned, the above could be (c - '0' <= (unsigned)('9' - '0')) -- is that faster? Avoids a branch to nearby target, at the cost of a subtract.

>+                digit = c - '0';

And here's a c - '0' common subexpression.

>+            else if ('a' <= c && c <= 'z')
>+                digit = c - 'a' + 10;
>+            else if ('A' <= c && c <= 'Z')
>+                digit = c - 'A' + 10;

Etc.

/be

Comment 3

10 years ago
this patch doesn't show an obvious improvement in testing on macintel.

Comment 4

9 years ago
no demonstrated perf improvement, no activity. closing for now
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → WONTFIX
Attachment #307877 - Flags: review?(brendan)
You need to log in before you can comment on or make changes to this bug.