Closed Bug 1407314 Opened 7 years ago Closed 7 years ago

Optimise nursery collection for most common cases

Categories

(Core :: JavaScript: GC, enhancement, P3)

55 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox57 --- wontfix
firefox58 --- fixed

People

(Reporter: jonco, Assigned: jonco)

Details

Attachments

(4 files)

We do a bunch of function calls and if-chains for each object we tenure.  We should optimise all this for the common cases.

Here are the counts of the classes of promoted objects when running the octane benchmark:

      Array               792798
      Function             68523
      TypedArray           19969
      UnboxedPlainObject  628434
      UnboxedArrayObject       0
      InlineTypedObject        0
      OutlineTypedObject       0
      NativeObject      15121273
      PlainObject       15068827
Priority: -- → P3
Attached file microbenchmark
Here's a benchmark that creates a lot of objects in the nursery and times how long it takes to promote them.  Note there is some trickery involved in making the nursery grow to its maximum size and making sure we don't pre-tenure.
Attached patch improve-tenuringSplinter Review
Patch to rearrange JSObject::allocKindForTenure() to check the most common classes first, and a special check for plain objects in TenuringTracer::moveToTenured() which calls an inlined PlainObject::allocKindForTenure().

I had to make the typed object prototype a singleton as it was getting nursery allocated and then causing an assert to fail (I forget which on exactly).

This gives a 6% improvement on the benchmark.
Attachment #8918303 - Flags: review?(sphink)
This just moves some method definitions around to make the following patch easier to read.
Attachment #8918305 - Flags: review?(sphink)
This is a bit more controversial.  As well as inlining a bunch of methods, this adds a fast path for tenuring plain objects and inlines it in TenuringTracer::traverse.  Plain objects are by far the most common case, an this means we can skip a bunch of tests when tenuring them.

The benchmark shows a 13% improvement with all patches applied.
Attachment #8918309 - Flags: review?(sphink)
Comment on attachment 8918303 [details] [diff] [review]
improve-tenuring

Review of attachment 8918303 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Marking.cpp
@@ +2844,5 @@
>      MOZ_ASSERT(!src->zone()->usedByHelperThread());
>  
> +    AllocKind dstKind;
> +    if (src->is<PlainObject>())
> +        dstKind = src->as<PlainObject>().allocKindForTenure();

It appears that your stats don't include subclasses, so am I correct in interpreting them as saying that there are about as many PlainObjects as NativeObjects? (As in, just as many non-PlainObject NativeObjects as PlainObjects.) If so, why the specialization for PlainObject but not NativeObject?

Also, can you comment that this is an optimization? Hm... or better yet, make this

  if (src->is<PlainObject>()) {
    dstKind = src->as<...;
    MOZ_ASSERT(dstKind == src->allocKindForTenure(nursery()));
  } else {
    src->allocKindForTenure(nursery());
  }

Then you don't need the comment.
Attachment #8918303 - Flags: review?(sphink) → review+
Attachment #8918305 - Flags: review?(sphink) → review+
Comment on attachment 8918309 [details] [diff] [review]
improve-tenuring-3

Review of attachment 8918309 [details] [diff] [review]:
-----------------------------------------------------------------

A 13% speedup for some code reorganization and specializing for the common case? Heck yes!

::: js/src/gc/Marking.cpp
@@ +2677,5 @@
> +        *objp = moveToTenuredSlow(obj);
> +        return;
> +    }
> +
> +    *objp = movePlainObjectToTenured(&obj->as<PlainObject>());

I think I'd prefer the common case to come first. Makes it easier to infer the justification. But I think it's also worth a comment and a MOZ_LIKELY just to drill it home:

  // Optimize for the common case.
  if (MOZ_LIKELY(obj->is<PlainObject>)) {
     ...
     return;
  }

  *objp = moveToTenuredSlow(obj);

and again, am I reading the stats wrong? Aren't non-PlainObject NativeObjects just as common?

@@ +2998,5 @@
> +
> +inline JSObject*
> +js::TenuringTracer::movePlainObjectToTenured(PlainObject* src)
> +{
> +    // Fast path version of moveToTenuredSlow() for specialized for PlainObject.

Remove the first "for".
Attachment #8918309 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink] [:s:] from comment #5)
> Also, can you comment that this is an optimization? Hm... or better yet,
> make this
> 
>   if (src->is<PlainObject>()) {
>     dstKind = src->as<...;
>     MOZ_ASSERT(dstKind == src->allocKindForTenure(nursery()));
>   } else {
>     src->allocKindForTenure(nursery());
>   }

Er, never mind, this goes away with the third patch anyway, and I like the third patch.
(In reply to Jon Coppeard (:jonco) from comment #1)
> Created attachment 8918300 [details]
> microbenchmark
> 
> Here's a benchmark that creates a lot of objects in the nursery and times
> how long it takes to promote them.  Note there is some trickery involved in
> making the nursery grow to its maximum size and making sure we don't
> pre-tenure.

Uh, really? I thought literals in the source were always pretenured. My quick shell test confirms:

js> o1 = {}
({})
js> dumpObject(o1)
object 7f65e35947a0 from global 7f65e3591060 [global]
class 1ecca60 Object
lazy group
flags:
proto <Object at 7f65e3594040>
properties:

js> gc()
"before 749568, after 749568\n"
js> dumpObject(o1)
object 7f65e35947a0 from global 7f65e3591060 [global]
class 1ecca60 Object
lazy group
flags:
proto <Object at 7f65e3594040>
properties:

(notice how the object address does not change; it *will* change if you use Object.create(null) or Object.create(Object.prototype)).
(In reply to Steve Fink [:sfink] [:s:] from comment #5)
> It appears that your stats don't include subclasses, so am I correct in
> interpreting them as saying that there are about as many PlainObjects as
> NativeObjects? (As in, just as many non-PlainObject NativeObjects as
> PlainObjects.) If so, why the specialization for PlainObject but not
> NativeObject?

Nope, this is just me being sloppy.  The NativeObject count includes PlainObjects, so almost everything is a plain object.  Apart from NativeObject all the other cases are mutually exclusive.
(In reply to Steve Fink [:sfink] [:s:] from comment #8)
> Uh, really? I thought literals in the source were always pretenured.

I don't know exactly what's happening here, but it seems that the loop case is different to the top-level case:

$ cat test-toplevel.js

let o = {};
dumpObject(o);
gc();
dumpObject(o);

$ ./shell test-toplevel.js 2>&1 | grep object

object 105290120 from global 10528d060 [global]
object 105290120 from global 10528d060 [global]

$ cat test-loop.js

let a = [];
for (let i = 0; i < 10; i++) {
  let o = {};
  dumpObject(o);
  a.push(o);
}
gc();
for (let i = 0; i < 10; i++) {
  dumpObject(a[i]);
}

$ ./shell test-loop.js 2>&1 | grep object

object 107a00400 from global 107b8d060 [global]
object 107a00420 from global 107b8d060 [global]
object 107a00440 from global 107b8d060 [global]
object 107a00460 from global 107b8d060 [global]
object 107a00480 from global 107b8d060 [global]
object 107a004a0 from global 107b8d060 [global]
object 107a004c0 from global 107b8d060 [global]
object 107a00560 from global 107b8d060 [global]
object 107a00580 from global 107b8d060 [global]
object 107a005a0 from global 107b8d060 [global]
object 107b901a0 from global 107b8d060 [global]
object 107b901c0 from global 107b8d060 [global]
object 107b901e0 from global 107b8d060 [global]
object 107b90200 from global 107b8d060 [global]
object 107b90220 from global 107b8d060 [global]
object 107b90240 from global 107b8d060 [global]
object 107b90260 from global 107b8d060 [global]
object 107b90280 from global 107b8d060 [global]
object 107b902a0 from global 107b8d060 [global]
object 107b902c0 from global 107b8d060 [global]
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b256bd704792
Improve tenuring performance by rearranging allocKindForTenure() methods and adding fast path for plain objects r=sfink
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8d176280216
Fix rooting hazards by updating annotations for renamed function r=me
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2a53ad654aa
Fix remaning rooting hazards by telling the analysis that the object moved hook can't GC r=me on a CLOSED TREE
(In reply to Jon Coppeard (:jonco) from comment #10)
> (In reply to Steve Fink [:sfink] [:s:] from comment #8)
> > Uh, really? I thought literals in the source were always pretenured.
> 
> I don't know exactly what's happening here, but it seems that the loop case
> is different to the top-level case:

Ah, thanks for checking. I guess it makes sense that it couldn't use the single pretenured object it got from parsing the source, since it needs distinct objects for every loop iteration. Maybe it copies a template-like thing somehow? Oh well, good to know that it isn't an invalid test!
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: