Closed Bug 505818 Opened 15 years ago Closed 6 years ago

Avoid string<->integer conversions iterating over array keys

Component: Core :: JavaScript Engine (defect)
Status: RESOLVED FIXED
Reporter: dmandelin; Assignee: nobody

Example code:

  ... // a is an array
  for (var i in a) {
    var e = a[i];
    ... // no further use of i
  }

ECMA-262 requires i to get string values from the iteration, so we have to convert each index to a string, and then back again for the access. It would be better to avoid those conversions when it is safe to do so, e.g., when all uses of the iteration variable are array indexes. 
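
For concreteness, the spec-mandated behavior is observable from script; the keys really are strings, so the engine has to materialize them somewhere:

  var a = [10, 20, 30];
  for (var i in a) {
    if (typeof i !== "string") throw "unreachable"; // ECMA-262: for-in keys are strings
    var e = a[i];   // so this access turns "0", "1", ... back into an index
  }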

This might work particularly well with tracing, since it's a demotion-like problem. PICs and similar dynamic recompilation would also seem to be a way to do it.

See bug 504460 comment 9 and following for some initial discussion.
Blocks: 480297
Assignee: general → brendan
Status: NEW → ASSIGNED
Seems like bug 513530 with its super-duper statically initialized data for string headers and chars, for both unitStrings and intStrings, would go a long way toward addressing this bug.

/be
Depends on: 513530
Throwing this back in the pool. Bug 513530 landed, but the static intStringTable entries only go up to 255, so I'm sure there's more to do here.

/be
Assignee: brendan → general
Conversion from integer to string is normally always safe and cannot fail (barring severe memory exhaustion). So why not avoid the conversion entirely: store the integer without doing the actual conversion and without allocating or locating a shared string buffer, and instead add a flag bit to the value indicating that there is no backing store for the string and that an integer value is stored directly in the object's property value. Converting the string back to an integer would then be immediate as well, with no possible exception (just change the exposed datatype and remove the special flag bit).
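
A minimal sketch of that flag-bit idea, simulated in JavaScript (a real engine would tag machine words in C++; every name here is made up for illustration):

  var INT_FLAG = 1;                                  // low bit set = "no string backing store"
  function boxIndex(n)   { return (n << 1) | INT_FLAG; }  // no allocation, no conversion (n < 2^30)
  function isBoxedInt(v) { return (v & INT_FLAG) === 1; }
  function unboxIndex(v) { return v >> 1; }                // immediate, cannot fail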

There's not even any need for a static string table for small byte values (that table requires additional tests when converting numbers to strings, which can be avoided by directly handling various hidden incarnations of string objects for numbers). Google's V8 engine already does this internally and transparently: when its JIT compiler determines that no conversion is performed in the middle of a loop using the temporary string variable, it can compile the whole thing as an integer loop (without even having to re-instantiate the temporary string on each iteration just to rebind its other properties for reflection).
Blocks: 654190
Assignee: general → nobody
No assignee, updating the status.
Status: ASSIGNED → NEW
The "simplest" way to do that would be to change the underlying storage format for strings that represent desimal integers, so they can use a more compact and more efficient internal representation. The exposed "string" type would still allow getting decimal digits from them without ever needing to create/allocate new string buffers (there can exist a static array of "string" objects for the 10 decimal digits and the minus sign, just like there can exist also a preset static array of 256 "string" objects for any single ASCII or ISO8859 character in U+00000..U+00FF).

Internally, the "string" storage can be a variant: an allocated string buffer in union with a structure that can hold some internal types of binary integers or even floats (as long as their value and precision match the precision of number types). All object types in JavaScript already need a structure with some additional flags, as well as a pointer to an interface dictionary listing their exposed properties and metadata, and there's space there to indicate the concrete implementation of strings.
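
A JavaScript simulation of that variant idea (a real engine would use a C union in the object header; the field names here are hypothetical):

  function IntString(n)  { return { kind: "int",   intValue: n }; }  // no char buffer at all
  function CharString(s) { return { kind: "chars", chars: s }; }     // classic allocated buffer
  function strLength(str) {
    return str.kind === "int" ? String(str.intValue).length : str.chars.length;
  }
  function strCharAt(str, i) {
    // a real engine would index static digit strings here instead of materializing chars
    return (str.kind === "int" ? String(str.intValue) : str.chars)[i];
  }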

If this is done, then most often no conversion will occur at all, except when string methods are used to extract substrings of digits (and even then, creating new string buffers for those substrings can be avoided, notably when extracting isolated digits represented as a single ASCII character). And iterating over arrays using "string" indexes that are actually decimal integers will be able to use the existing binary representation in the "string" object itself, without any additional indirection via another string buffer.

In summary: change the native "string" object to use an internal variant structure directly integrated into the generic object header. This is purely internal to the JavaScript engine and requires no change to the exposed JavaScript API and properties: values can still pretend to be "strings" with an accessible length, and substrings can still be extracted as "new" strings that are a copy of the internal object header and its variant part, even if *sometimes* there will also be a separate part holding a string buffer.

As "string" objects are themselves immutable, they can perfectly share their string buffers. As well you can always avoid using a string buffer if the string contains a single Javascript "character" code unit (an UTF-16 code unit including surrogates, or non-character of the BMP: they are simply 16 bit integers), or possibly even two (if the variant is large enough to hold 32 bits, which will then also allow using the same variant to hold natively any 32 bit integer without ever needing to convert it to decimal in a string buffer, and will also allow using the same variant to hold natively any 32-bit float, and without even needing to create any preallocated static array of strings for those unsigned 8-bit, unsigned 16-bit or signed 32-bit integers, or signed 32-bit floats).

Even though JavaScript exposes only a few "native" datatypes ("string", "number", "null", "boolean", "array") and then allows using all of them as generic "Object" with properties and metadata, the internal implementation can be sped up a lot by using many more internal native datatypes. This also saves a lot of work for the memory allocator and garbage collector if many of these objects can be represented internally as short structures with constant sizes (such as a single 32-bit or 64-bit item, which can be copied directly instead of allocated, and passed in CPU registers or on call stacks without having to be maintained in the heap at all).
Note: ideally, the variant structure should be 64 bits wide, with all the static header flags and type bits in the highest bits, so that the lowest remaining bits (at least 32 of them) can directly hold the value of a 32-bit binary integer, usable in CPU registers just as if it were a normal 64-bit integer. (On overflow during arithmetic operations, the extra bits force the creation of a true "number" object with a separately allocated storage buffer, decomposing the numeric value into that buffer, but only if the overflowed integer is actually returned as a "number" object.)

If the JavaScript engine is compiled for 32-bit systems, I'm quite sure you can still represent the variant structure as a single 32-bit integer containing a few high bits for flags and the native type, plus at least 8 bits for a small integer value. You'll gain less on the memory allocator and garbage collector, but you'll avoid most conversions to decimal numeric strings.
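
A sketch of the 64-bit layout described above, simulated with BigInt since plain JS bitwise operators are 32-bit only (the tag pattern is invented):

  const INT32_TAG = 0x1fff5n;                      // hypothetical type tag in the high bits
  const TAG_SHIFT = 32n;
  function boxInt32(n)   { return (INT32_TAG << TAG_SHIFT) | BigInt(n >>> 0); }
  function unboxInt32(v) { return Number(v & 0xffffffffn) | 0; }   // low 32 bits are the payload
  function isInt32(v)    { return (v >> TAG_SHIFT) === INT32_TAG; }
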
Note that this technique of using internal variants is not new. It has been used since long before JavaScript in other languages, notably Lisp, Forth, and PostScript.

I used it more than 30 years ago to write a PostScript processor/emulator that was used in a data reporting engine to generate printed documents. It ran very fast with modest resources, either on mainframes or on the first 16-bit PCs. The most limiting factor was the cost of the memory allocator and garbage collector rather than pure arithmetic performance, but the cost of handling the few variant bits was extremely minor compared to the huge gain in memory management and, ultimately, in global performance (this allowed the reporting engine to process extremely large volumes of data very fast and still produce great visual reports).

I think the same considerations are still valid today: memory management is still the most limiting factor in the global performance (and responsiveness) of JavaScript, due to its highly dynamic exposed type system (which implies complex structures to preserve its genericity). But the exposed datatypes can be largely decoupled from the underlying native type system used internally by the engine, which can then exploit many more optimization opportunities: notably native CPU registers, or parallelization without lots of synchronization overhead caused by very inefficient conversions (like the implicit conversion of array indexes to strings for indexing objects via a dictionary of properties; the dictionary has a severe cost, and an array can be built to use two separate indexes: one for strings that hold integer key values, another for all other string keys, as sketched below).
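
The two-index idea could look like this in a JS sketch (names hypothetical; this only illustrates the split, not any actual engine code):

  function canonicalIndex(key) {                 // "12" -> 12; "012" and "foo" -> -1
    return /^(0|[1-9][0-9]*)$/.test(key) ? Number(key) : -1;
  }
  function PropertyStore() { return { dense: [], named: new Map() }; }
  function setProp(store, key, value) {
    var i = canonicalIndex(key);
    if (i >= 0) store.dense[i] = value;          // integer-like keys: plain array slot
    else store.named.set(key, value);            // everything else: dictionary lookup
  }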
Strings are already stored as a variant.  See  https://searchfox.org/mozilla-central/rev/c9272ef398954288525e37196eada1e5a93d93bf/js/src/vm/StringType.h#214-274

The obvious issues I see with just implementing the proposal from comment 9 are:

1)  Going from an index to a string still involves allocation of a gcthing (which a string is and an index is not).

2)  Strings don't really have free space inside them to store an integer _and_ a char pointer.  If the integer replaced the char pointer, then all string consumers would have to be able to deal with the lack of a char pointer.  Another alternative would be to "flatten" the string to something with a char buffer if someone asks for it.  Maybe this would work ok.
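
A sketch of that "flatten on demand" alternative (fields hypothetical: the integer lives where the char pointer would, and chars are materialized only when first requested):

  function IndexString(n) { return { index: n, chars: null }; }  // integer instead of a char buffer
  function ensureFlat(s) {
    if (s.chars === null) s.chars = String(s.index);   // allocate chars only on first demand
    return s.chars;
  }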
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #12)
> Strings are already stored as a variant.  See 
> https://searchfox.org/mozilla-central/rev/
> c9272ef398954288525e37196eada1e5a93d93bf/js/src/vm/StringType.h#214-274
> 
> The obvious issues I see with just implementing the proposal from comment 9
> are:
> 
> 1)  Going from an index to a string still involves allocation of a gcthing
> (which a string is and an index is not).
Indexes < 256, at least, are already in the atom table.
> 
> 2)  Strings don't really have free space inside them to store an integer
> _and_ a char pointer.  If the integer replaced the char pointer, then
> all string consumers would have to be able to deal with the lack of a char
> pointer.  Another alternative would be to "flatten" the string to something
> with a char buffer if someone asks for it.  Maybe this would work ok.
I implemented this in bug 654190.
OK.   So what still needs to be done here?
     *   NormalAtom    000010       xxxxx0
     *   PermanentAtom 100010       1xxxx0

The "Atom" type only uses these flags: x00010 (only the 6th bits is variable to distinguish normal and permanent atoms).

But otherwise, any *other* value of the form xxxx10 is unused for now.

Those free values can be used to create other string subtypes that hold an integer value directly in the 32-bit "Length" field (when compiled with JS_BITS_PER_WORD == 32).

Those combinations of flags allow creating subtypes for other native types: xzzz10 where zzz is not 000.

So the flags 000110 could specify a string encoded as a single native 32-bit integer (representing a string whose content is a signed decimal number).
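
In JS-flavored pseudocode, the proposed subtype test would look something like this (mask and tag values follow the table above, but this is speculative, not actual SpiderMonkey code):

  var TYPE_MASK  = 0b000111;   // hypothetical: low bits select the string subtype
  var INT_STRING = 0b000110;   // the unused xzzz10 pattern proposed above
  function isIntString(flags) { return (flags & TYPE_MASK) === INT_STRING; }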
As evilpie said, since bug 654190 strings can contain a 16-bit index value. That value is then used to optimize the string-to-int32 conversion. Combined with the use of static strings for small integers, what we're doing is probably good enough.
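
Roughly sketched, what bug 654190 landed (field names hypothetical): an atom caches its numeric index at creation time, so a later string-to-int32 conversion is a field read rather than a reparse.

  function makeAtom(s) {
    var isIndex = /^(0|[1-9][0-9]*)$/.test(s) && Number(s) <= 0xffff;
    return { chars: s, indexValue: isIndex ? Number(s) : -1 };   // cached 16-bit index, or -1
  }
  function toInt32Fast(atom) { return atom.indexValue; }          // -1 means "take the slow path"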
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Fwiw, I just tried profiling this testcase:

  var a = new Array(1000);
  for (var i = 0; i < 1000; ++i) a[i] = 5;

  var count = 100000;
  var start = new Date;
  for (var i = 0; i < count; ++i) {
    for (var j in a) {
       var e = a[j];
    }
  }
  var stop = new Date;
  alert((stop - start)/count * 1e6);

which is pretty slow in Firefox compared to Chrome (3x slower, but on par with Safari).

The profile is at https://perfht.ml/2OtJYAJ -- there's a bunch of time under Snapshot and allocating JSStrings (gc pressure for those).  Sticking to indices < 250 does seem to help with this, by the way.  See a profile of that at https://perfht.ml/2Oqw426, which has no AllocateString under Int32ToString.  But the actual iterator creation is still costly...
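
(For reference, the "< 250" variant presumably just shrinks the array so every key hits the static int strings:)

  var a = new Array(250);                    // all keys stay within the static string range
  for (var i = 0; i < 250; ++i) a[i] = 5;
  // ... same timing loop as above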

Anyway, may be worth a quick look at those profiles to see whether anything jumps out.
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #17)
> Anyway, may be worth a quick look at those profiles to see whether anything
> jumps out.

I think the main issue is that we don't use the iterator cache and our GetIterator IC when there are dense elements: https://searchfox.org/mozilla-central/rev/3d989d097fa35afe19e814c6d5bc2f2bf10867e1/js/src/vm/Iteration.cpp#840

The simplest fix is probably to do something like this:

* Allow use of the iterator cache when the receiver (not proto) has dense elements *and* there are no sparse indexes anywhere on the object/proto.

* Then when we reuse a cached iterator and getDenseInitializedLength > 0, we prepend all of our own dense indexes to the NativeIterator's property list. This might require a realloc when there are more dense elements than the cached iterator had, but it should still be much faster than what we're doing now. (Sketched below.)
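
Roughly, in JS-flavored pseudocode (all names hypothetical; NativeIterator stands in for the engine-internal structure, not an actual API):

  function getCachedIterator(obj, cache) {
    var cached = cache.lookup(obj.shape);
    if (!cached) return null;                    // slow path: build a fresh NativeIterator
    var props = [];
    for (var i = 0; i < obj.denseInitializedLength; i++)
      props.push(String(i));                     // the receiver's dense indexes first...
    return new NativeIterator(props.concat(cached.properties)); // ...then cached named props
  }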

I'll try something today and if it works out file a follow-up bug.
Flags: needinfo?(jdemooij)
The iterator "for var j in a" does not seem to be the issue: it should just loop through key indexes and reuse the objects found in the existing index without performing any allocation (except when creating the iterator, once per iteration of the outer "for (var i=0;...;++i)" loop, and in the "++i" if it needs to create a new integer object for this counter).
All the cost seems to be in the array access itself ("var e = a[j]" above), where "j" should already be allocated and should just reuse the key it got by reference from "for var j in a".
Something causes unnecessary implicit conversions: the array key obtained from "for var j in a" is converted to an integer and then back to a string to be used as a key while indexing in the inner loop.
Of course, if integers could be stored directly in the variant representation of a "string" without performing any allocation, we would not pay this performance cost (but the previous comment suggests it even occurs when integer values are below 256, where the strings should come from the prebuilt list of atoms; for some reason that cache of atoms is still not used, and a new string is still built).
So something goes wrong in the existing accessors defined to select and access the best possible variant representation of a string; the code seems to take shortcuts (bypassing the normal accessors) and forgets to select the best variants (the integer value is not checked; it is treated like a generic "number", which forces the conversion to a "string").
So the execution path goes through converting the key the iterator read from string to generic number (instead of integer), then back to string while indexing the array: the variant tuning info inside the string representation is lost, and another string object gets allocated using a different variant without the effective value ever changing. Maybe this is caused by some invalid tweaks to internal tuning flags that make the string temporarily mutable (invalidating the other possible variants so the string buffer can be modified character by character) by allocating a new object and its referenced buffer, just before making that object immutable and sharable again. The GC will have to collect these two temporary areas if the other shared immutable strings are used, but that requires a lookup in a global table of strings, which is probably also not happening; and the preallocated strings for integers from 0 to 255 are also not used.
(In reply to verdy_p from comment #19)
> The iterator "for var j in a" does not seem to be the issue: it should just
> loop through key indexes and reuse the objects found in the existing index
> without performing any allocation

The whole issue here is that indexes are treated internally as integers but for-in needs strings so we have to convert from int to string *somewhere*.

> occurs when integer values are below 256, where the strings should come from
> the prebuilt list of atoms; for some reason that cache of atoms is still
> not used, and a new string is still built.

What do you base this on? The cache is used just fine AFAIK.

> So something goes wrong in the existing accessors defined to select and
> access the best possible variant representation of a string [...] the
> preallocated strings for integers from 0 to 255 are also not used.

Sorry but there's a lot of speculation here that doesn't match how the engine works internally.
(In reply to verdy_p from comment #19)
> So the execution path goes through converting the key the iterator read
> from string to generic number (instead of integer), then back to string
> while indexing the array:

That isn't what's happening; indexing into an array never allocates/creates strings. What will happen is that we get a string from the for-in loop iterator, say "123", and worst-case that gets converted to a jsid, a tagged pointer storing 123 as int31, no string allocation/creation/conversion is involved. (ICs actually have code to bypass jsid conversion/lookup by doing string "123" => integer 123 => array.elements[123].)
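
That fast path, in JS-flavored pseudocode (names invented, but the shape matches the description above):

  var SLOW_PATH = {};                            // sentinel: fall back to generic lookup
  function stringToIndex(s) { return /^(0|[1-9][0-9]*)$/.test(s) ? Number(s) : -1; }
  function getElemIC(arr, key) {                 // key is the for-in string, e.g. "123"
    var i = stringToIndex(key);                  // "123" => 123, no jsid, no atom lookup
    if (i >= 0 && i < arr.length) return arr[i]; // direct dense-element read
    return SLOW_PATH;
  }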
See Also: → 1502905
(In reply to Jan de Mooij [:jandem] from comment #18)
> I'll try something today and if it works out file a follow-up bug.

I don't have time to work on this now, but I filed bug 1502905. I also noticed/fixed bug 1500052 when I was looking into that :)
Flags: needinfo?(jdemooij)