Closed Bug 961434 Opened 11 years ago Closed 9 years ago

Consider self-hosting Array.prototype.push

Categories

(Core :: JavaScript Engine: JIT, enhancement)

x86_64
macOS
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: isk, Assigned: isk)

Details

Attachments

(4 files, 4 obsolete files)

Attached file array_push_benchmark.js (obsolete) —
Shell test case. I'm attaching a shell test case based on http://www.peacekeeper.therichins.net/test14.html. This test case focuses on Array#push.

d8: 135
SM: 8076

SM is much slower than d8. v8's Array.push() is self-hosted (https://code.google.com/p/v8/source/browse/trunk/src/array.js#464).
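For readers without the attachment, here is a minimal sketch of the *shape* of such a microbenchmark; the function name and iteration counts are illustrative, not the actual attachment's contents:

```javascript
// Hypothetical sketch of an Array#push microbenchmark in the spirit of the
// attached array_push_benchmark.js (names and counts are illustrative).
function benchPush(iterations, size) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    var arr = [];               // arrays are re-created on every iteration
    for (var j = 0; j < size; j++)
      arr.push(j);              // the operation under measurement
  }
  return Date.now() - start;    // elapsed milliseconds
}
console.log("push: " + benchPush(1000, 1000) + "ms");
```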
This benchmark doesn't really test Array#push performance: try commenting out all the pushes. For me, that makes the test execute in a couple of ms in both v8 and jsc, whereas our time goes down from a horrific 1750ms to a terrible 1300ms. So, there are interesting things going on here:

- Apparently we do expensive things with these arrays each time the function is called, whereas other engines do nothing with them.
- Now, if I move the creation of all the arrays out of the test function so it happens only once, our time goes down to 410ms. That would be somewhat closer to other engines, if they didn't improve a lot, too: v8 goes from 134ms to 79ms, jsc from 365ms to 117ms.

It seems like other engines detect that the arrays aren't required to be re-initialized during each call, whereas we don't. Actually using them seems to cause more work to happen, but far from the amount we do.

So, it still does look like our push implementation is a lot slower than other engines': the 410ms vs 79ms or 117ms. The bigger fish to fry here is the array creation (or cloning, or ... something) on each call, though.
For the original benchmark, main-thread time is about 25% jitcode, about 35% NewInitArray, about 32% array_push, 6% DoNewArray, and 2% CheckOverRecursed. Overall, the main thread is about 65% of the samples. The other 35% are the free() calls from Arena::finalize on the GCHelperThread.

For the NewInitArray time, there's a bunch of malloc under growElements (11% of the main-thread time), updateMallocCounter (7% of the main-thread time???), and 11% NewObjectCache::newObjectFromHit, mostly under ArenaLists::refillFreeList. Bunch of PR_Lock/PR_Unlock under there. :(

Under array_push, the vast majority of the time is spent calling JSObject::growElements, which does a realloc. I wonder whether our main and GC threads are running into malloc lock contention here...
Just so I can test again and compare in the future, using the original testcase:

Nightly 29.0a1 (2014-01-18) - 72700 op/s
Chrome 32 - 267500 op/s
IE 11 - 174800 op/s
Attached file array_push_bench.js
Sorry for uploading an incomplete benchmark. I've fixed it.
Attachment #8362162 - Attachment is obsolete: true
Attached patch Self-hosted Array#push.WIP (obsolete) — Splinter Review
This patch self-hosts Array#push. The results are as follows:

before patch: 458ms
after patch: 2002ms

This patch is very slow.
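For context, a self-hosted push written in plain JavaScript follows the spec steps roughly like this. This is an illustrative sketch, not the attached patch (the real self-hosted builtins use engine intrinsics this sketch omits):

```javascript
// Illustrative sketch of Array.prototype.push written in plain JavaScript,
// loosely following the ES spec steps (NOT the attached WIP patch).
function ArrayPush(/* ...items */) {
  var O = Object(this);            // spec: ToObject(this value)
  var len = O.length >>> 0;        // rough stand-in for ToLength(O.length)
  var argCount = arguments.length;
  for (var i = 0; i < argCount; i++)
    O[len + i] = arguments[i];     // append each argument in order
  O.length = len + argCount;       // update the length property
  return O.length;                 // the new length is the return value
}
```

Called as `ArrayPush.call([1, 2], 3)`, the sketch appends 3 and returns the new length, 3.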
Self-hosting Array.prototype.push won't help much because Ion inlines push already, and that's a faster path than the self-hosted version. This is Peacekeeper's arrayWeighted test, bug 918746 has all the details. In short, this mainly needs GGC so that we can allocate elements in the nursery and avoid all this malloc slowness.
Also, with GGC, the push calls inlined in Ion could reallocate the elements inline if the capacity check fails, instead of falling back to a VM call. We could do this now by calling malloc from JIT code, but that's probably not worth it with GGC on the horizon. We could also optimize an MNewArray followed by X MArrayPush instructions to allocate space for X elements eagerly. That's a pretty narrow optimization, but it shouldn't be too hard.
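The MNewArray-followed-by-MArrayPush pattern mentioned above corresponds to source code like this (an illustrative example; the function name is made up):

```javascript
// The code shape the proposed MNewArray + MArrayPush optimization targets:
// Ion can see the three pushes at compile time, so it could allocate
// capacity for three elements when creating the array and avoid any
// growElements (realloc) calls during the pushes.
function makeTriple(x, y, z) {
  var coords = [];    // MNewArray: could eagerly allocate capacity 3
  coords.push(x);     // MArrayPush #1
  coords.push(y);     // MArrayPush #2
  coords.push(z);     // MArrayPush #3
  return coords;
}
```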
Attached file func04-pass00-BuildSSA-mir.gv.pdf (obsolete) —
This is IonGraph after this patch.
Comment on attachment 8362466 [details] func04-pass00-BuildSSA-mir.gv.pdf

This attachment contains only the code for ArrayPush, not the code of the function that is calling it. What you should check is whether ArrayPush is inlined into the function that calls it. You can try other IONFLAGS options to see why it might not be inlined. Also, is the caller hot enough to be Ion-compiled?
Attached file func07-pass00-BuildSSA-mir.pdf (obsolete) —
Thank you for your comment, and sorry for uploading an incomplete IonGraph. I ran the following command:

IONFLAGS=logs,scripts,osi,bailouts ./js/src/js --ion-eager --ion-parallel-compile=off

so I don't think IONFLAGS is the problem. The previous IonGraph was from the first call of Array#push, so it was not hot code.
Attachment #8362466 - Attachment is obsolete: true
Attachment #8362471 - Attachment filename: func07-pass00-BuildSSA-mir.gv.pdf → func07-pass00-BuildSSA-mir.pdf
Attachment #8362471 - Attachment description: func07-pass00-BuildSSA-mir.gv.pdf → func07-pass00-BuildSSA-mir.pdf
You should use a for loop in the script that calls push, so that the code becomes hot enough to be both compiled and inlined. Although, you are not running the benchmark, right? At least, this does not look like the run() function used in the benchmark.
I misunderstood. I ran the following code:

var arr = ['foo']
arr.push('bar')

so the IonGraph didn't look like the run() function. I'll try using the benchmark. The resume points in this IonGraph have parent block numbers, so the self-hosted function is inlined.

Yesterday, jandem explained to me why Ion's inlined push is faster than the self-hosted version. Ion's inline cache works like a monomorphic inline cache (MIC), whereas Baseline's inline cache works like a polymorphic inline cache (PIC). A PIC has to handle all cases, so there is a cost to searching the cache. A MIC handles only one case, so there is no search cost and the code can be optimized further.

I have a question. I think the PIC and MIC could collaborate, e.g. first check the MIC and, on a miss, check the PIC. Is that possible in this engine, and would it be worth the cost?
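The monomorphic/polymorphic distinction can be illustrated with call sites like these (a hypothetical sketch; the function names are made up):

```javascript
// Illustrative: a monomorphic site only ever sees one object shape, while a
// polymorphic site sees several. A MIC can guard a single shape cheaply; a
// PIC must search its list of (shape, stub) entries on each attempt.
function getX(o) { return o.x; }   // the property access being cached

function monomorphicSum() {
  var sum = 0;
  for (var i = 0; i < 100; i++)
    sum += getX({ x: i });         // always the same shape: { x }
  return sum;
}

function polymorphicSum() {
  var sum = 0;
  for (var i = 0; i < 100; i++)    // alternates between two shapes:
    sum += getX(i % 2 ? { x: i } : { x: i, y: 0 });  // { x } and { x, y }
  return sum;
}
```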
Attachment #8362471 - Attachment is obsolete: true
Attached file IRC-jandem-isk.txt
jandem explained to me why Ion's inlined push is faster than the self-hosted version.
I've updated the WIP patch. I had misunderstood: IonMonkey already cooperates with the Baseline JIT, so if we can't compile in IonMonkey, we fall back to Baseline. In other words, if we miss in the MIC, we check whether we hit in the PIC. So if Ion's inlined push is faster than the self-hosted version, a self-hosted push should still be faster than what we have now. But the results say otherwise.
Attachment #8362360 - Attachment is obsolete: true
Assignee: nobody → iseki.m.aa
We have generational GC now. Running this benchmark in the browser on my MacBook Pro in Firefox 49 (with hundreds of tabs open, sorry) and Chrome 53, I get these typical numbers:

v8: 65ms
SM: 98ms

SM is still 50% slower on this microbenchmark, but as the discussion above indicated, this is not because we aren't self-hosting push. Since bug 918746, which tracks the original benchmark, has already been resolved, I will close this bug as well.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED