Closed Bug 414828 Opened 17 years ago Closed 16 years ago

js_strtointeger - use special handling for base 10

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: mmoy, Assigned: mmoy)

Details

Attachments

(1 file)

jsnum.c has the routine js_strtointeger that has the following snippet of code: I think that the most common case is base 10 with the number well within the range of a 32-bit integer. Putting in special code to do base-10 before this code could: - Make for a tighter loop by not having to check for alpha lower and upper case - Speed up the code for value = value * base + digit. This code converts base and digit from integer to double-precision floating point and then carries out the expected operation in double-precision floating point. The optimization would be to do the entire calculation in 32-bit integer arithmetic which is faster and avoids the integer to floating point conversions. As an aside, it might be a good idea to use a local to hold base as a DP FP so that it wouldn't have to be converted each time for compilers that can't factor it out. Smart x86 compilers (post Pentium 4 optimization) could do the multiply by 10 using two LEA instructions, one multiplying by 8 and the other multiplying by 2 and adding it to the first register. On x86 processors, in general, scalar integer math is faster than scalar floating point math. - If the number turns out to be beyond the range of the integer, the special base 10 code should switch to DP FP math, either by using the regular routine or an optimized copy of the regular routine for base 10. 1025 /* 1026 * Done with the preliminaries; find some prefix of the string that's 1027 * a number in the given base. 1028 */ 1029 JS_ASSERT(s1 < send); 1030 start = s1; 1031 value = 0.0; 1032 do { 1033 uintN digit; 1034 jschar c = *s1; 1035 if ('0' <= c && c <= '9') 1036 digit = c - '0'; 1037 else if ('a' <= c && c <= 'z') 1038 digit = c - 'a' + 10; 1039 else if ('A' <= c && c <= 'Z') 1040 digit = c - 'A' + 10; 1041 else 1042 break; 1043 if (digit >= (uintN)base) 1044 break; 1045 value = value * base + digit; 1046 } while (++s1 != send);
This code process base 10 using integer calculations only switching over to double precision floating point after 9 digits. It also make a small optimization in the handling of other bases by factoring out an integer to double conversion.
Assignee: general → mmoy
Status: NEW → ASSIGNED
Attachment #307877 - Flags: review?(brendan)
Comment on attachment 307877 [details] [diff] [review] Special processing for base 10 in string to integer routine Thanks, just nits and questions below. Mainly, can you measure a speed win with this patch, say on SunSpider? >+ jsdouble value, base_double; Is base_double worth the source cost? Won't quality compilers widen base as appropriate and common it if the widening is costly? > if ((negative = (*s1 == '-')) != 0 || *s1 == '+') { Nit: house style generally avoids nesting assignment in if conditions (loop conditions are ok if canonical while ((c = getchar()) != EOF) and the like). >+ if (base == 10) { >+ int i = 0, value_int = 0; Nit: the _int and _double suffixing is a bit verbose compared to similar code in SpiderMonkey -- maybe ival and dval instead? >+ >+ do { >+ >+ /* Ensure c is from 0 to 9 with one compare. */ >+ No need for those two deadwood blank lines around the single-line comment. >+ int c = *s1 - '0'; >+ if (((unsigned char) c) > 9) Just (unsigned) would do, and no need to parenthesize cast against >. >+ break; >+ >+ value_int = value_int * 10 + c; >+ if (++i == 9) { >+ value = (jsdouble) value_int; >+ while (++s1 != send) { >+ int c = *s1 - '0'; Please hoist int c to top level or the oldest common ancestor block. >+ if (((unsigned char) c) > 9) (unsigned) again. >+ break; >+ value = value * 10.0 + c; >+ } >+ goto skip_float_convert; >+ } >+ } while (++s1 != send); >+ >+ value = (jsdouble) value_int; >+ >+ skip_float_convert: >+ ; This label should go after the whole if-else, avoiding the empty statement target. >+ } >+ else { House style cuddles: } else {. >+ value = 0.0; >+ base_double = (jsdouble) base; >+ do { >+ uintN digit; >+ jschar c = *s1; Does c need to be jschar here? Might be better as uintN aka unsigned. That reminds me: use uintN consistently, or unsigned consistently. No point in mixing. I'm ok with unsigned. The intN and uintN typedefs look better when mixed intentionally with uint32, etc. but they mean int and unsigned. >+ if ('0' <= c && c <= '9') So if c is unsigned, the above could be (c - '0' <= (unsigned)('9' - '0')) -- is that faster? Avoids a branch to nearby target, at the cost of a subtract. >+ digit = c - '0'; And here's a c - '0' common subexpression. >+ else if ('a' <= c && c <= 'z') >+ digit = c - 'a' + 10; >+ else if ('A' <= c && c <= 'Z') >+ digit = c - 'A' + 10; Etc. /be
this patch doesn't show an obvious improvement in testing on macintel.
no demonstrated perf improvement, no activity. closing for now
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX
Attachment #307877 - Flags: review?(brendan)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: