Cache the needsHashtable flag in ScriptObject

Status

VERIFIED WONTFIX (enhancement)
Reported: 9 years ago
Last modified: 8 years ago

People

(Reporter: stejohns, Assigned: stejohns)

Attachments

(1 attachment)

Description (Assignee)

9 years ago
Every time we access a dynamic property in ScriptObject, we have to load this->vtable->traits->needsHashtable to decide how to proceed. We should cache this in ScriptObject to avoid a triple-load.
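
To make the triple-load concrete, a minimal sketch with hypothetical, simplified types (not the actual Tamarin class layouts):

    // Simplified for illustration only.
    struct Traits { bool needsHashtable; /* ... */ };
    struct VTable { Traits* traits;      /* ... */ };

    struct ScriptObject
    {
        VTable* vtable;
        bool    m_needsHashtable;   // proposed cached copy of the flag

        // Today: three dependent loads on every dynamic-property access.
        bool needsHashtableSlow() const { return vtable->traits->needsHashtable; }

        // Proposed: one load from the object itself.
        bool needsHashtableFast() const { return m_needsHashtable; }
    };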

Comment 1 (Assignee)

9 years ago
Created attachment 447419: Patch

Tiny tweak, but it all adds up: the lower three bits of the "delegate" pointer are unused, so we can use them as flag bits. (We have to mask the flags off when accessing delegate, but we do that much less frequently.)

Unsurprisingly, the perf impact is hard to measure -- I've done many perf runs and the differences seem to be within the noise -- but fewer loads is almost always a win.
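
Not the actual patch, just a generic sketch of the low-bit flag trick it relies on, assuming the delegate pointer is at least 8-byte aligned; all names here are illustrative:

    #include <stdint.h>

    // Illustrative only -- not the real ScriptObject fields.
    class TaggedDelegate
    {
        static const uintptr_t kFlagMask       = 0x7;  // low three bits
        static const uintptr_t kNeedsHashtable = 0x1;

        uintptr_t m_bits;   // delegate pointer and flags packed together

    public:
        TaggedDelegate(void* delegate, bool needsHashtable)
            : m_bits(reinterpret_cast<uintptr_t>(delegate) |
                     (needsHashtable ? kNeedsHashtable : 0)) {}

        // Hot path: one load plus a mask test, no pointer chasing.
        bool needsHashtable() const { return (m_bits & kNeedsHashtable) != 0; }

        // Colder path: strip the flag bits before using the pointer.
        void* delegate() const { return reinterpret_cast<void*>(m_bits & ~kFlagMask); }
    };
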
Assignee: nobody → stejohns
Attachment #447419 - Flags: superreview?(edwsmith)
Attachment #447419 - Flags: review?(lhansen)

Updated

9 years ago
Attachment #447419 - Flags: superreview?(edwsmith) → superreview+

Comment 2

9 years ago
Comment on attachment 447419: Patch

Nothing obviously wrong.

nit: the code in the ctor is now causing more eye bleed; if you can encapsulate the write barrier (WB) better, that would be good, but it's not a deal breaker.

If the hottest callers are JIT code going through an inline cache (e.g. get or call), then you can refine the cache for the BKIND_NONE case. At IC call sites the name is constant, so isValidDynamicName should be constant, and the object type is constant (on a cache hit), so needsHashtable() should also be constant. A streamlined handler would skip both checks.
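
As a rough illustration of that refinement (hypothetical names, not the real AVM entry points -- the stubs below stand in for whatever the IC machinery actually calls):

    struct ScriptObject;                                        // opaque here
    bool  isValidDynamicName(const char* name);                 // stub
    bool  needsHashtable(ScriptObject* o);                      // stub
    void* hashtableLookup(ScriptObject* o, const char* name);   // stub
    void* slowPathGet(ScriptObject* o, const char* name);       // stub

    // Generic handler: re-checks everything on every call.
    void* getGeneric(ScriptObject* o, const char* name)
    {
        if (!isValidDynamicName(name) || !needsHashtable(o))
            return slowPathGet(o, name);
        return hashtableLookup(o, name);
    }

    // Handler installed at an IC site once a hit has proven both facts
    // for this constant name and receiver type; neither check is repeated.
    void* getDynamicFast(ScriptObject* o, const char* name)
    {
        return hashtableLookup(o, name);
    }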

Since you're seeing a big speedup on gameoflife.as, I'm curious what its hot access paths are for the hashtable.

Comment 3 (Assignee)

9 years ago
(In reply to comment #2)
> Since you're seeing a big speedup on gameoflife.as, I'm curious what its hot
> access paths are for the hashtable.

The gameoflife speedup is from an unrelated (and much more substantial) patch set that I haven't posted yet.

Comment 4

9 years ago
(In reply to comment #1)
> Unsurprisingly, perf impact is hard to measure -- I've done many perf runs and
> the differences seem to be in the noise range -- but fewer loads is always
> likely to be a win.

I suggest creating a microbenchmark that tries to benefit from the optimization.  If we still can't see the difference it's probably not worth the bother, but if the difference is observable then it comes down to whether the fix makes larger and more "realistic" tests regress.

Updated

9 years ago
Attachment #447419 - Flags: review?(lhansen) → review+

Comment 5 (Assignee)

9 years ago
(In reply to comment #4)
> I suggest creating a microbenchmark 

Good call. I'll do that first.

Comment 6 (Assignee)

9 years ago
After trying several microbenchmarks that hammer on this code, I can't find any measurable difference on my MacTel. Withdrawing.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → WONTFIX

Comment 7

8 years ago
bulk verifying resolved !fixed issues
Status: RESOLVED → VERIFIED