560994 - nanojit: introduce conditionally-live guards

Reporter

Description

•

14 years ago

for (...)
  ({})

emits a call to NewObject followed by a guard that the pointer returned is not null. This always keeps the NewObject call alive.

We should add a LIR_xeq guard with the following specification:

y = xeq a, b

if y == b -> side exit
else y = a

The guard is only emitted if y is used.

Andreas Gal :gal

Reporter

Updated

•

14 years ago

Blocks: 560995

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Assignee: general → nnethercote

Status: NEW → ASSIGNED

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

14 years ago

(In reply to comment #0)
> for (...)
>   ({})
> 
> emits a call to NewObject followed by a guard that the pointer returned is not
> null. This always keeps the NewObject call alive.
> 
> We should add a LIR_xeq guard with the following specification:
> 
> y = xeq a, b
> 
> if y == b -> side exit
> else y = a
> 
> The guard is only emitted if y is used.

I think you mean:

  if a == b -> side exit
  else y = a

Another option is this:

  y = xeqi0 a

  if a == 0 -> side exit
  else y = a

(Note that the 'i' in 'xeqi' is for "integer", ie. just the usual type suffix)

That's less general purpose, but easier to understand (IMO) -- in the more general version it's less obvious what the return value is -- and a more compact LIR instruction (unary instead of binary).  I think I'll start by implementing that, if we need the more general-purpose one I can change it.

Andreas Gal :gal

Reporter

Comment 2

•

14 years ago

I mostly need this for pointers, so xeqp, and a word on the other side, so builtin immediate won't work. I already see a bunch of other nice uses, so I think we want a generic xeqi, xeqq.

Nicholas Nethercote [inactive]

Assignee

Comment 3

•

14 years ago

Ok, I'll do the generic version then.

As for the name... "addxovi" has a meaningful reading:  "add but exit on overflow".  This operation is "first but exit if first equals second", or "left but exit if left equals right".  Maybe 'fstxeq{i,q}'?  Or 'lhsxeq{i,q}'?

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Summary: TM: add a way to guard on builtins not failing that doesn't keep the builtin alive [nanojit] → nanojit: add an expression-guard that tests for equality

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Depends on: 561003

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

14 years ago

Hmm, consider this code:

  x = fstxeqi a, b
  ...
  y = add a, c

Is this legitimate?  It doesn't feel so -- if we're checking 'a', it seems like we shouldn't use it again, that we should only its checked synonym 'x'.

If it's not allowed we generate better code because we know that 'a' and 'x' are the same thing (without that knowledge we have to put them in separate registers and copy 'a' into 'x' with a mov).

But if it's not allowed then that is strange, because there's nothing else like this in LIR, ie. synonyms and/or values that can't be reused past a certain point.

No longer depends on: 561003

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Depends on: 561003

Andreas Gal :gal

Reporter

Comment 5

•

14 years ago

Please don't call it fstxeqi. I am not even going to make a technical argument here, just get on my knees and beg. Please :) xeqi/q a, b seems fine. Its a convention that the first argument is the return value. No support for 2nd or snd or whatever.

I am ok with disallowing using a after the guard (for my uses). We can probably assert and enforce that somewhat.

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

14 years ago

Attached patch rough patch (obsolete) — Details — Splinter Review

- It's currently called fstxeq[iq] because that's what I'd started calling it.  It can change later.  Use insGuardXov() to insert it.

- fstxeqq not supported yet.

- The first operand is not allowed to be used again.

- Only tested on some very small lirasm test files.

I'm uneasy about the first-operand-cannot-be-used-again constraint.  Sure you can check for it, but that won't save us in optimized builds... it's just one more thing to get wrong.  My dodgy semantics spider sense is tingling.  Maybe Julian will have something to add.

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

14 years ago

(In reply to comment #6)
> My dodgy semantics spider sense is tingling.

If we currently have this:

  o = calli newObject(...)
  c = eqi o, NULL
  xt c
  ...
  # 'o' may be used here

With fstxeqi it would become:

  o = calli newObject(...)
  o2 = fstxeqi o, NULL
  ...
  # 'o2' may be used here, 'o' must not, hmm

To avoid the dodginess, we'd need this:

  o = callxeqi newObject(...), 0

This properly incorporates the notion that it's the call that's producing the value that may result in an exit.

Andreas, you said you see a bunch of uses for this, can you elaborate?  Are they all function call result checks?

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

14 years ago

Hmm, NewObject is not a pure function, which means that it cannot be eliminated even if the return value is dead.  Quoting from jsbuiltins.cpp:

 * - The isPure flag.  Set to 1 if:
 *   (a) the function's return value is determined solely by its arguments
 *       (ie. no hidden state, no implicit inputs used such as global
 *       variables or the result of an I/O operation); and
 *   (b) the function causes no observable side-effects (ie. no writes to
 *       global variables, no I/O output).
 *   Multiple calls to a pure function can be merged during CSE.

Brendan Eich [:brendan]

Comment 9

•

14 years ago

gal wants this also for an IntToString IIR.

I too see "fst" as "fist" not "first" and beg for a terser name with an implicit handedness.

/be

Nicholas Nethercote [inactive]

Assignee

Comment 10

•

14 years ago

FWIW, I got 'fst' from the Haskell Standard Prelude:

  fst              :: (a,b) -> a
  fst (x,y)        =  x

There's also a curried version which has an even worse name:

  const            :: a -> b -> a
  const x _        =  x

----

But thinking more generally... consider a typical access of an array of
integers:

  20: ldi6 = ldi.o $stack4[4]

  # check: isArray?
  21: ~JSSLOT_CLASS_MASK_BITS = immi -4
  22: andi3 = andi ldi6, ~JSSLOT_CLASS_MASK_BITS
  23: clasp = immi 136533920
  24: guard(class is Array) = eqi andi3, clasp
  25: xf4: xf guard(class is Array) -> pc=0x9eb76cd imacpc=(nil) sp+16 rp+0 (GuardID=002)

  # check: index < length?
  26: ldi3 = ldi.o $stack4[28]
  27: ldi5 = ldi.o $stack4[16]
  28: ltui2 = ltui ldi1, ldi5
  29: xf3: xf ltui2 -> pc=0x9eb76cd imacpc=(nil) sp+16 rp+0 (GuardID=003)

  # check: dslots != NULL?
  30: eqi3 = eqi ldi3, 0
  31: xt2: xt eqi3 -> pc=0x9eb76cd imacpc=(nil) sp+16 rp+0 (GuardID=004)

  # check: index < capacity?
  32: ldi4 = ldi.o ldi3[-4]
  33: ltui1 = ltui ldi1, ldi4
  34: xf2: xf ltui1 -> pc=0x9eb76cd imacpc=(nil) sp+16 rp+0 (GuardID=005)

  # check: element is a number?
  35: JSVAL_DOUBLE = immi 2
  36: lshi1 = lshi ldi1, JSVAL_DOUBLE
  37: addi1 = addi ldi3, lshi1
  38: ldi2 = ldi.o addi1[0]
  39: JSVAL_TAGMASK = immi 7
  40: andi2 = andi ldi2, JSVAL_TAGMASK
  41: eqi2 = eqi andi2, JSVAL_DOUBLE
  42: JSVAL_INT = immi 1
  43: andi1 = andi ldi2, JSVAL_INT
  44: ori1 = ori andi1, eqi2
  45: eqi1 = eqi ori1, 0
  46: xt1: xt eqi1 -> pc=0x9eb76cd imacpc=(nil) sp+16 rp+0 (GuardID=006)

  # unbox
  47: js_UnboxDouble1 = calld. #js_UnboxDouble ( ldi2 )

There are five(!) checks.  If js_UnboxDouble1 is dead, then at least the
last three checks could be removed.  I'm not sure about the first two, I
imagine that they have to stay to preserve JS language semantics.

To make those last three guards removable we need something more than xeqi:

- We could instead augment every statement guard with a list of the values
  keeping it alive (and we'd need a way of saying "this guard always live").

- We'd also need to augment every value with a 'live' bit.  We'd update
  this during codegen at the use-point of every value.  (This is different
  to the 'inUse' bit we currently have because 'inUse' is temporary, it gets
  reset once you step back to before the def-point.)

- When generating code for a guard we'd check its list, and if none of the
  values in it had ever been used we could eliminate the guard.

I imagine a lot of TM's guards could be made conditionally-live like this.

Andreas Gal :gal

Reporter

Comment 11

•

14 years ago

NewObject can't be cse-d, but its side-effect free as far as script is concerned. I am also interested in a couple other easier things like IntToString, which is side-effect free but has a guard on it to guard for OOM (there is a couple other places like that).

Andreas Gal :gal

Reporter

Comment 12

•

14 years ago

And no, please no "fst" "first" "1". Its the first operand always. Period. Thats really not a helpful part of the name.

Nicholas Nethercote [inactive]

Assignee

Comment 13

•

14 years ago

NewObject is an interesting case.  It can't be CSE'd but it can be eliminated if the return value is dead.  We'll need to use finer-grained purity declarations, as pure vs. impure is too coarse.  There's a language called Mercury that I use as my basic model when considering these questions, see http://www.mercury.csse.unimelb.edu.au/information/doc-latest/mercury_ref/Purity-levels.html#Purity-levels.  Mercury's pure/semipure/impure classification doesn't help us with NewObject, but we'll need something like that.

And I've got the message about "fst".  Can we talk about comment 10?  That potentially renders the "fst" discussion moot because it greatly generalizes the basic idea without needing new opcodes.

Nicholas Nethercote [inactive]

Assignee

Comment 14

•

14 years ago

To clarify further, instead of this:

  x = calli foo(...)
  y = xeqi x, 0

We'd have this:

  x = calli foo(...)
  c = eqi x, 0
  xt c [x]

Where the "[x]" means that the guard is only live if 'x' is live.

Nicholas Nethercote [inactive]

Assignee

Comment 15

•

14 years ago

If we go with the more general form, this bug depends on bug 518267, which will add an accurate isLive bit to each LIR instruction.

One question:  should this list of values be attached to all existing statement-guards, or should we add new opcodes?  I think all this conditionally-live stuff is only relevant for LIR_xt and LIR_xf.  In which case we could add LIR_xtc/LIR_xfc.

Pros of new opcodes:
- Doesn't waste space in guards that don't need the value list.
- Conditionally-live guards are a bit tricky, so marking them clearly is good.

Cons:
- Two more opcodes

As for how the list would be stored:  we could have a pointer in the LIns to a NULL-terminated array, and use a NULL pointer to mean "always live" (but if we have separate xtc/xfc that wouldn't be necessary, you'd just use xt/xf).  Alternatively, I suspect that in practice the list of values will always have length 1, in which case we could just store the 1 value directly.  (We wouldn't be able to represent a guard that is conditionally-live on multiple values.)

Depends on: 518267

Nicholas Nethercote [inactive]

Assignee

Comment 16

•

14 years ago

(In reply to comment #14)
> To clarify further, instead of this:
> 
>   x = calli foo(...)
>   y = xeqi x, 0
> 
> We'd have this:
> 
>   x = calli foo(...)
>   c = eqi x, 0
>   xt c [x]
> 
> Where the "[x]" means that the guard is only live if 'x' is live.

More precisely: the "[x]" means that the guard is only live if 'x' is live after the 'xt'.  If it's used in other ways before the 'xt' they wouldn't count in terms of keeping the 'xt' live;  doing so would be bogus, the LIR generator just shouldn't do that, as the whole point of this is to check 'x' and exit quickly if something is wrong.

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Depends on: 563277

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

No longer depends on: 563277

Summary: nanojit: add an expression-guard that tests for equality → nanojit: introduce conditionally-live guards

Nicholas Nethercote [inactive]

Assignee

Updated

•

14 years ago

Depends on: 563277

Nicholas Nethercote [inactive]

Assignee

Comment 17

•

14 years ago

Just to clarify:  NewObject() isn't called on trace, but the following are:
- js_NewEmptyArray()
- js_Object_tn()
- js_NonEmptyObject()
- js_NewInstance()

(plus some others) and they end up calling NewObject().

Nicholas Nethercote [inactive]

Assignee

Comment 18

•

14 years ago

Attached patch rought patch v2 — Details — Splinter Review

When this patch is combined with the patch in bug 563863 -- which allows dead allocating functions to be eliminated -- this program:

  function f()
  {
      for (var i = 0; i < 10000000; i++) ({});
  }
  f();

runs about 10x faster because the js_Object_tn() call is removed.

That's nice, but for the cases I've handled (js_NewInstance, js_NewEmptyArray, js_NewEmptyArrayWithLength, js_NewArrayWithSlots, js_Object_tn) it never helps for realistic code (well, SunSpider).  So I'm still a bit skeptical about the benefits of conditionally-live guards.

Andreas, can you try implementing bug 560995 on top of this patch and see what benefit you get?  If the results are good that'll make me happier about working more on this bug.  Or if you can think of other cases where conditionally-live guards could be used that would help too.

Attachment #440703 - Attachment is obsolete: true

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 19

•

14 years ago

One common pattern that I think could be helped by this (and things built on it) is:

function connectSocket(argsobj)
{
   connect(argsobj.host, argsobj.port, argsobj.options || "");
}

connectSocket({host: "localhost", port: 6379, options: "tls"})

It seems that we should be able to see through the object wrapping there, and propagate the values directly.  If argsobj doesn't otherwise escape, we want to avoid allocating it.

Nicholas Nethercote [inactive]

Assignee

Comment 20

•

14 years ago

(In reply to comment #19)
> One common pattern that I think could be helped by this (and things built on
> it) is:
> 
> function connectSocket(argsobj)
> {
>    connect(argsobj.host, argsobj.port, argsobj.options || "");
> }
> 
> connectSocket({host: "localhost", port: 6379, options: "tls"})
> 
> It seems that we should be able to see through the object wrapping there, and
> propagate the values directly.  If argsobj doesn't otherwise escape, we want to
> avoid allocating it.

I doubt that'll happen.  I bet the new object's address will be stored to the stack right after creation (preserving VM semantics) and then we're hosed.  And even if it's not we'd need store-to-load forwarding in NJ which we don't currently have.

Nicholas Nethercote [inactive]

Assignee

Comment 21

•

14 years ago

GETELEM has improved since comment 10 was written;  it now has three checks instead of five.  But the basic idea still holds.  I tried making the final check -- the "does the element have the expected type?" check -- conditionally live on whether the gotten element is used.  

Unfortunately trace-test/tests/basic/typeofTest.js fails:

function typeofTest()
{
  var values = ["hi", "hi", "hi", null, 5, 5.1, true, undefined, /foo/, typeofTe
st, [], {}], types = [];
  for (var i = 0; i < values.length; i++)
    types[i] = typeof values[i];
  return types.toString();
}
assertEq(typeofTest(), "string,string,string,object,number,number,boolean,undefined,object,function,object,object");

Normally the type test guards against the value being used in inappropriate ways.  In this case the value isn't used, so the guard is removed, but the type test is still needed because of the typeof.  How annoying.

Nicholas Nethercote [inactive]

Assignee

Comment 22

•

14 years ago

Bug 625333 had an example where conditionally-live guards would apply: js_StringTo{Number,Int32} return a value and have a bool outparam that indicates if an OOM happened.  This bool is checked with a subsequent guard, but if the direct result was never used the call and the guard could be removed... though that would require marking the call as pure, which is dodgy since it's actually writing to the outparam.  Sounds a bit like PURE_ALLOCATING rather than PURE (see bug 563863).

Nicholas Nethercote [inactive]

Assignee

Comment 23

•

13 years ago

TM's days are numbered, and TR doesn't appear to need this feature:  WONTFIX.

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → WONTFIX

rough patch 14 years ago Nicholas Nethercote [inactive] 16.53 KB, patch		Details \| Diff \| Splinter Review
rought patch v2 14 years ago Nicholas Nethercote [inactive] 37.32 KB, patch		Details \| Diff \| Splinter Review