Last Comment Bug 759978 - Inline property assignment after normal property assignment in constructor (much) slower on TI/JM
: Inline property assignment after normal property assignment in constructor (m...
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla16
Assigned To: Shu-yu Guo [:shu]
:
Mentors:
Depends on: 762022 783396
Blocks:
  Show dependency treegraph
 
Reported: 2012-05-30 19:50 PDT by Shu-yu Guo [:shu]
Modified: 2012-11-15 08:15 PST (History)
2 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
test case (378 bytes, application/javascript)
2012-05-30 19:50 PDT, Shu-yu Guo [:shu]
no flags Details
WIP (10.24 KB, patch)
2012-05-31 01:57 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
WIP (6.88 KB, patch)
2012-05-31 02:00 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
patch (7.23 KB, patch)
2012-05-31 15:28 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
patch (7.24 KB, patch)
2012-06-04 13:11 PDT, Shu-yu Guo [:shu]
bhackett1024: review+
Details | Diff | Splinter Review

Description Shu-yu Guo [:shu] 2012-05-30 19:50:45 PDT
Created attachment 628575 [details]
test case

At a glance the definite property analysis fails for |C|, but it's unclear why removing

  this.cval = 0;

from the constructor speeds it up.
Comment 1 Shu-yu Guo [:shu] 2012-05-30 19:59:01 PDT
5 seems to be the magic number of properties that triggers this.
Comment 2 Brian Hackett (:bhackett) 2012-05-30 20:00:13 PDT
This test is just allocating objects in a loop so the main bottleneck is the giant number of page faults we will take.  The objects have 4 properties without the .cval assign, and 5 with it.  That bumps the objects into a higher size class and they will increase from 12 words (on 32 bit, 4 header + 8 fixed slots) to 20 words (4 header + 16 fixed slots) and we will take a lot more faults.

I get about the same times for C() and D().  What engine(s) and running modes are you looking at, and what are the times you get?
Comment 3 Shu-yu Guo [:shu] 2012-05-30 20:02:44 PDT
64bit debug build on Linux:

$ ./js -n -m objwtf.js 
481
12
Comment 4 Shu-yu Guo [:shu] 2012-05-30 20:03:03 PDT
err, 64bit debug JM/TI on m-i tip.
Comment 5 Brian Hackett (:bhackett) 2012-05-30 20:09:34 PDT
Hmm, what do you get with an opt build?  (debug builds can be pretty flaky for doing perf comparions.)  And is this on the version of the test where C() and D() both define just four properties?
Comment 6 Shu-yu Guo [:shu] 2012-05-30 20:28:17 PDT
Sorry I'm being sloppy. I get similar slowdown in both opt and debug builds for the version with *5* properties on both C and D.
Comment 7 Shu-yu Guo [:shu] 2012-05-31 01:57:04 PDT
Created attachment 628625 [details] [diff] [review]
WIP

It is definitely caused in part by |AnalyzeNewScriptProperties| not understanding inline assignments like |this.x = this.y = 0|, as the 'this' values are pushed but not used immediately, which causes the function to bail out and mark nothing as definite.

Callgrind shows most of the time being spent in the SetProp IC stub due to properties not being marked definite for both the 5 and 4-properties versions. The reason for the 5-properties version being much slower than the 4-properties version even when the IC stub keeps getting hit must be the faults that bhackett mentioned.

This WIP patch tries to teach |AnalyzeNewScriptProperties| to understand multiple inline property assignments on 'this'.
Comment 8 Shu-yu Guo [:shu] 2012-05-31 02:00:31 PDT
Created attachment 628626 [details] [diff] [review]
WIP

Reorder functions to make patch smaller.
Comment 9 Shu-yu Guo [:shu] 2012-05-31 15:28:32 PDT
Created attachment 628921 [details] [diff] [review]
patch

new version
Comment 10 Shu-yu Guo [:shu] 2012-05-31 15:31:55 PDT
Brian,

Why do non-SETPROP uses of 'this' bail out the entire |AnalyzeNewScriptProperties| function instead of be ignored? Why should something like:

  this.x = 0;
  this.y = this.x;

not mark this.y as a definite property? I feel like I'm missing some soundness issue.
Comment 11 Brian Hackett (:bhackett) 2012-05-31 17:39:30 PDT
Definite properties need to describe not just properties that an object will eventually have, but those it will have before anything actually tries to access the property.  So AnalyzeNewScriptProperties bails out when it finds a use of 'this' which could escape and cause properties of the 'this' object to be accessed in unknown ways.

This shouldn't conflict with this patch, you just need to make sure that when the constructor accesses this.f that this.f is definitely a property of the object by that point.
Comment 12 Shu-yu Guo [:shu] 2012-06-04 13:11:56 PDT
Created attachment 629899 [details] [diff] [review]
patch

Final-ish version.

This patch refactors |AnalyzeNewScriptProperties| to process popped 'this' values in FIFO (stack) order instead of linear PC order.

The main refactoring is that there is a "producer" loop in |AnalyzeNewScriptProperties| that pushes 'this' uses onto a stack and a "consumer" loop in |AnalyzePoppedThis| that processes them all in FIFO order, which results in appending SETPROPs to |initializerList| in PC order.

Re: complexity concerns, I don't think this adds any more complexity than the previous case, and processes the 'this' in the natural order that it's used in the bytecode anyways.

The ordering constraints that check for 'this' escaping via eval and the like are intact -- the notion of |lastThisPopped| now generalizes to the offset of the bottom of the 'this' stack everytime that stack is processed, which is (assuming non-crazy malformed bytecode) the largest 'this' pop offset that we've processed.

More could be done in general to analyze definite properties, of course, but this patch just fixes this single thing.
Comment 13 Brian Hackett (:bhackett) 2012-06-04 17:23:59 PDT
Comment on attachment 629899 [details] [diff] [review]
patch

Review of attachment 629899 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great!
Comment 15 Paul Wright 2012-06-05 05:57:04 PDT
This should not be marked RESOLVED FIXED, as it has not been merged to M-C yet.  Also, this patch significantly regressed 64-bit Kraken Gaussian Blur on AWFY.
Comment 16 Paul Wright 2012-06-05 05:59:29 PDT
Correction:  The patch actually has been merged to M-C.  It still regresses Kraken, though.
Comment 17 Geoff Lankow (:darktrojan) 2012-06-05 06:24:41 PDT
https://hg.mozilla.org/mozilla-central/rev/f56e2197d9cd
Comment 18 Shu-yu Guo [:shu] 2012-06-05 10:22:19 PDT
I can't reproduce the slowdown locally for 64-bit imaging-gaussian-blur. That benchmark doesn't even call the function that this patch changed.

Any suggestions?
Comment 19 Paul Wright 2012-06-06 06:23:10 PDT
I have no suggestions for you on this one, but the regression remains and is consistent on AWFY.
Comment 20 Brian Hackett (:bhackett) 2012-06-06 06:41:17 PDT
Several kraken tests and at least one SS test are very sensitive to cache behavior and can change in performance dramatically for no good reason.  Since the shell is deterministic that change will persist across runs until some other random commit changes the behavior again.  This stuff generally won't reproduce on other machines or in the browser.
Comment 21 Paul Wright 2012-06-06 06:45:59 PDT
Filed Bug 762022 and marked blocking this one.

Note You need to log in before you can comment on or make changes to this bug.