Closed Bug 516075 Opened 10 years ago Closed 10 years ago

Move shape into JSObjectMap from JSScope, const-ipate and use LIR_ldc* to get to it

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.2
Tracking Status
status1.9.2 --- beta1-fixed

People

(Reporter: brendan, Assigned: brendan)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(3 files, 3 obsolete files)

More from the big patch in bug 504587.

/be
Flags: wanted1.9.2?
No tracer-level shape guard CSE here, just a TR::guardShape helper and LIR_ldc* usage for nanojit-level CSE.

This triggers 75 traces instead of 69, which must be due to CSE enabled via LIR_ldc*, right? Stats with patch:

recorder: started(5), aborted(3), completed(32), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(1), breaks(0), returns(0), unstableInnerCalls(1), blacklisted(0)
monitor: triggered(75), exits(75), type mismatch(0), global mismatch(0)

and without:

recorder: started(5), aborted(3), completed(32), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(1), breaks(0), returns(0), unstableInnerCalls(1), blacklisted(0)
monitor: triggered(69), exits(69), type mismatch(0), global mismatch(0)

Same "leaving trace at ..." lines for the first 69, modulo address differences; in particular, same .js file and line number / bytecode offset. Then these six with the patch to get to 75 traces triggered from 69:

leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7854914
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7806855
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=8912820
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7811465
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7932582
leaving trace at richards.js:190@71, op=ne, lr=0xfb6e50, exitType=LOOP, sp=2, calldepth=0, cycles=7950128

I will dig deeper into what is causing branch exits, but we are not hitting the MAX_BRANCHES limit.

Julian, please feel free to fragment-profile this.

/be
Attachment #400233 - Flags: review?(gal)
Bizarre. These are all the LIR_ldc uses with this bug's patch:

$ grep -n LIR_ldc jstracer.cpp
8145:    ops_ins = addName(lir->insLoad(LIR_ldcp, map_ins, int(offsetof(JSObjectMap, ops))), "ops");
8146:    LIns* n = lir->insLoad(LIR_ldcp, ops_ins, op_offset);
8303:                             addName(lir->insLoad(LIR_ldcp, cx_ins, offsetof(JSContext, runtime)),
8458:             * LIR_ldcp is ok to use here even though Array classword can
8465:                                      lir->insLoad(LIR_ldcp, v_ins, offsetof(JSObject, classword)),
10555:                    LIns* type_ins = lir->insLoad(LIR_ldcb, typep_ins, 0);

Lines 814[56] are in TR::map_is_native, line 8303 is for cx->runtime (pre-exists this patch), 8465 is from this patch with comment about tricksy-ness, and line 10555 pre-exists (typemap byte load).

I went from bottom of file up changing each of the three LIR_ldcp uses added by this bug's patch to LIR_ldp and got 8 extra traces changing the classword load to LIR_ldp, then undoing that and changing either of the map_is_native ones got 5 (not six) extra traces.

These extras all of course were at richards.js:190 and were LOOP exits.

Smells like a latent bug involving CSE and treecalls. The loop at line 190 has interesting structure:

  while (this.currentTcb != null) {
    if (this.currentTcb.isHeldOrSuspended()) {
      this.currentTcb = this.currentTcb.link;
    } else {
      this.currentId = this.currentTcb.id;
      this.currentTcb = this.currentTcb.run();
    }
  }

/be
Er, the cx->runtime LIR_ldcp is from this bug's patch. Anyway, taking out the three other new LIR_ldcp uses gets me two extra traces at richards.js:190 that LOOP exit. Changing the cx->runtime load to a LIR_ldp gets me three!

Going to bed, this is vexing me.

/be
Something between e4944123b960 and 74c704b49e7f made them go away. Need to confirm this and find the relevant cset. Help welcome, I've been on a plane all day.

/be
Attachment #400233 - Attachment is obsolete: true
Attachment #400233 - Flags: review?(gal)
Attachment #400423 - Attachment description: merged up to tm tip -- extra traces gone!? → merged up to tm tip -- extra trace runs gone!?
4 extra traces now that I'm off the plane. Stinks of UMR. Anyone valgrinding?

/be
The cx->runtime ldc change causing extra traces is insane. We should zoom in on that and investigate.
(In reply to comment #5)
> 4 extra traces now that I'm off the plane. Stinks of UMR. Anyone valgrinding?

I compared (tip = 32243:2a3c66f0a36c) against (same, + patch in comment #4).

I didn't see any memory errors running v8-richards, even with the
various paranoia-levers set to max.

I am wondering if CseFilter is 100% reliable.  At the moment we don't
have any way to verify its claims, but we could easily do so, by
formalising the LIR_assert thing that was part of one of the bug
504587 patches, and using that to actually check equality on
expressions claimed to be CSEs.

In any case I plain to start hacking on CSE later this week, to do
alias set stuff, and I could do that as the first thing to do.
(In reply to comment #7)
> 504587 patches, and using that to actually check equality on
> expressions claimed to be CSEs.

Actually .. that doesn't distinguish the cases

  CseFilter is somehow subtly broken

vs

  CseFilter is correct, but the assurances about memory CSEs you're
  giving it via LIR_ldc* turn out not to be so, for whatever reason

Verifying the latter would entail runtime checking for locations so
claimed, in order to verify that they really are read-only for the
lifetime of the trace.  Hmm.
If it's any use, bug 502778 has a patch that extensively overhauls CseFilter.
(In reply to comment #8)
> Verifying the latter would entail runtime checking for locations so
> claimed, in order to verify that they really are read-only for the
> lifetime of the trace.  Hmm.

I hacked up nanojit so as to generate code to pass a hint to Memcheck
every time a LIR_ldc* is executed, saying "this location is now
read-only until further notice".  Then I hacked up Memcheck to take
notice of said hints.  I verified this works by doing a store of the
data immediately after an LIR_ldc and checking that Memcheck
complains.

Unfortunately after all that, when running richards, there is no
complaint.  Well, I guess that removes the "bogus ldc* annotations"
theory as a cause of badness.
Attachment #400423 - Attachment is obsolete: true
Ok, this is very strange. I tested by applying pieces of the patch, showing no regressions, until I got to everything except the guardShape changes and the LIR_ldcp changes. No problems. Adding the LIR_ldcp changes did not cause any regression, although I was over-confidently testing only one trial. I then took out the LIR_ldcp selections and added the guardShape factoring. Still good.

Adding LIR_ldcp regressed things, I saw four extra traces. Then six. Reverted to test without this patch (LIR_ldp instead of LIR_ldcp in those four places where we load const pointers). Got zero extra traces on first run, then six again on second and repeated runs!

I'm using DiffMerge and dvander's suggestion to diff the grep '^leaving trace' lines of the full JIT spew. It shows word diffs on line as well as line diffs. I trust it saw my grep output file changes, as it prompted me to reload based on the mtime changing.

It seems the guardShape changes in addition to LIR_ldcp are needed.

I have no idea why the results would be non-deterministic if valgrind gives a clean bill of health.

/be
Attached patch combined patch (obsolete) — Splinter Review
You've seen this in bug 504587.

Sorry for all the noise, Julian pointed out that v8/run-richards.js is typical v8 hide-the-compiler-costs "run as many times in a given time period as you can" benchmarketing, which means it is non-deterministic. I think that's all that was going on here. Grrr!

/be
Attachment #400507 - Flags: review?(jorendorff)
My results now do look like warm-up effects: more traces on 2nd run. Hate those non-deterministic benchmarks! Is the SunSpider v8-richards.js really free of all ND, including as invoked by the sunspider driver?

/be
Even if we run an ND-free richards, why did Julian see extra fragment runs when profiling? My attempts to reproduce what he profiled led to my confusing TMFLAGS stats differences running the ND version of richards.

Assuming that source of ND is ruled out, either the extra frag runs are gone, or I can't see them. With the combined patch I see 71 traces triggered, from 33 compmleted, same as without the patch.

/be
(In reply to comment #15)
> Assuming that source of ND is ruled out, either the extra frag runs are gone,
> or I can't see them. With the combined patch I see 71 traces triggered, from 33
> completed, same as without the patch.

$ TMFLAGS=stats ./Darwin_DBG.OBJ/js -j !$
TMFLAGS=stats ./Darwin_DBG.OBJ/js -j ~/Hacking/webkit.org/WebKit/SunSpider/tests/v8-richards.js
recorder: started(5), aborted(2), completed(33), different header(0), trees trashed(4), slot promoted(0), unstable loop variable(1), breaks(0), returns(0), unstableInnerCalls(1), blacklisted(0)

/be
Comment on attachment 400507 [details] [diff] [review]
combined patch


> JSScope *
> JSScope::createEmptyScope(JSContext *cx, JSClass *clasp)
> {
>     JS_ASSERT(!emptyScope);
> 
>-    JSScope *scope = (JSScope *) cx->malloc(sizeof(JSScope));
>-    if (!scope)
>+    void *mem = cx->malloc(sizeof(JSScope));
>+    if (!mem)
>         return NULL;
> 
>-    scope->map.ops = map.ops;
>+    JSScope *scope = new (mem) JSScope(ops);
>     scope->object = NULL;

Why not cx->create<JSScope>(ops) here?


>+static inline JSScope *
>+OBJ_SCOPE(JSObject *obj)
>+{
>+    JS_ASSERT(OBJ_IS_NATIVE(obj));
>+    return (JSScope *) obj->map;
>+}
>+
>+static inline uint32
>+OBJ_SHAPE(JSObject *obj)
>+{
>+    JS_ASSERT(obj->map->shape != JSObjectMap::SHAPELESS);
>+    return obj->map->shape;
>+}

`static` isn't necessary on these functions since the header is already C++-only.

In jstracer.h:
>+#include "jsdhash.h"

This one isn't necessary in this patch, right? And when you land the other half you'll presumably use js::HashMap.

Great patch.
Attachment #400507 - Flags: review?(jorendorff) → review+
I renew my call to reduce richards, especially the rate measurement part of it (which might explain the weird ND behavior).
(In reply to comment #18)
> I renew my call to reduce richards, especially the rate measurement part of it
> (which might explain the weird ND behavior).

No need for reduction. I was running the wrong benchmark, which will run extra loops some of the time.

Julian said he'd have fragment profile confirmation by end of today PDT.

/be
Attached patch patch to commitSplinter Review
Just enough change here to want another r+. I booted the LIR_dbreak/insAssert to bug 504587's patch (to be updated).

Got another idea I have been haunted by to try for avoiding shape-based branch exits -- details in bug 504587 with patch.

/be
Attachment #400507 - Attachment is obsolete: true
Attachment #400595 - Flags: review?(jorendorff)
Attachment #400595 - Flags: review?(jorendorff) → review+
Comment on attachment 400595 [details] [diff] [review]
patch to commit

You can avoid the (JSObject *)NULL cast, I think, by making the second parameter default to NULL.
(In reply to comment #13)
>
> Sorry for all the noise, Julian pointed out that v8/run-richards.js is typical
> v8 hide-the-compiler-costs "run as many times in a given time period as you
> can" benchmarketing, which means it is non-deterministic.

The ./sunspider --v8 tests use a consistent number of iterations.
Comparing 32246:7eff6f4aee73  vs  32246:7eff6f4aee73 + comment #20 patch

All looks good to me.


No change in #frag entries -- 12863820


Static code measures are good:

  before:
    Total displayed code bytes = 87087, exit bytes = 110283
    Total displayed static exits = 3678

  after:
    Total displayed code bytes = 85726, exit bytes = 107427
    Total displayed static exits = 3583


Cachegrindage is good:

before:
 I   refs:      3,519,034,997
 I1  misses:       27,635,209
 L2i misses:            5,933
 I1  miss rate:          0.78%
 L2i miss rate:          0.00%

 D   refs:      1,724,507,309  (1,145,431,422 rd   + 579,075,887 wr)
 D1  misses:          186,091  (      158,617 rd   +      27,474 wr)
 L2d misses:           44,502  (       23,156 rd   +      21,346 wr)
 D1  miss rate:           0.0% (          0.0%     +         0.0%  )
 L2d miss rate:           0.0% (          0.0%     +         0.0%  )

 L2 refs:          27,821,300  (   27,793,826 rd   +      27,474 wr)
 L2 misses:            50,435  (       29,089 rd   +      21,346 wr)
 L2 miss rate:            0.0% (          0.0%     +         0.0%  )

 Branches:        543,251,301  (  543,152,745 cond +      98,556 ind)
 Mispredicts:      12,475,007  (   12,447,678 cond +      27,329 ind)
 Mispred rate:            2.2% (          2.2%     +        27.7%   )

after:
 I   refs:      3,460,187,860
 I1  misses:       26,172,205
 L2i misses:            5,944
 I1  miss rate:          0.75%
 L2i miss rate:          0.00%

 D   refs:      1,694,400,527  (1,116,362,352 rd   + 578,038,175 wr)
 D1  misses:          193,432  (      165,750 rd   +      27,682 wr)
 L2d misses:           44,591  (       23,371 rd   +      21,220 wr)
 D1  miss rate:           0.0% (          0.0%     +         0.0%  )
 L2d miss rate:           0.0% (          0.0%     +         0.0%  )

 L2 refs:          26,365,637  (   26,337,955 rd   +      27,682 wr)
 L2 misses:            50,535  (       29,315 rd   +      21,220 wr)
 L2 miss rate:            0.0% (          0.0%     +         0.0%  )

 Branches:        528,853,452  (  528,755,712 cond +      97,740 ind)
 Mispredicts:      11,968,920  (   11,941,676 cond +      27,244 ind)
 Mispred rate:            2.2% (          2.2%     +        27.8%   )
http://hg.mozilla.org/tracemonkey/rev/f723616174c1

Onward to bug 504587! Thanks to all,

/be
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/f723616174c1
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.