nanojit: start adding alias info to loads/stores/calls

RESOLVED FIXED

Status

RESOLVED FIXED
9 years ago
5 years ago

People

(Reporter: njn, Assigned: njn)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)

Attachments

(3 attachments, 2 obsolete attachments)

(Assignee)

Description

9 years ago
LIR currently has some crude alias info on loads (eg. LIR_ld vs LIR_ldc) and calls (the _cse field) -- we can distinguish between "read-only" and "other".

For bug 517910 (and others in the future) we need more precise alias sets.  This can be best done by adding a bitset to loads and stores, and two bitsets to calls (for memory read and written by the function).  To start with I want to add a bitset with only two bits -- ie. 10 represents "read-only" and 01 represents "other".  (Note that the "read-only" bitset is impossible for stores;  if we're storing to a location it cannot be read-only!)

For loads, this will remove the need for LIR_ldc et al -- the read-only/other property will be in the bitset.  For calls, this will remove the need for '_cse' -- again the bitsets will cover it.

Loads/stores currently don't have any spare bits to add this new info.  I propose changing the 32-bit 'disp' field to 16-bits.  I checked and found that neither TM nor TR generates any displacements larger than 16-bits (at least, not in TM trace-tests and TR acceptance tests).  This makes sense, as most displacements are offsets into structs, or offsets into the stack or globals area.

Code generation won't be affected by these changes.
(Assignee)

Updated

9 years ago
Blocks: 517910

Comment 1

9 years ago
+1 random thumbs up, i like where this is headed.
(Assignee)

Comment 2

9 years ago
This should also make LoadFilter obsolete.

Comment 3

9 years ago
how about 30 bits? sounds a bit safer. we do have sometimes web code with tons of properties in objects
(Assignee)

Comment 4

9 years ago
We will need more than 2 bits of alias info eventually.

In the patch I'm working on I just assert in insLoad()/insStore() if the displacement is too big.  It would be better to convert such a case to a LIR_add + a load/store.  But if we do that I'd like to have a JS test that actually triggers the code.  Andreas, can you write such a test?  Or at least point me to the insLoad()/insStore() line in jstracer.cpp that could cause it?

Do property accesses involve constant displacements?
(Assignee)

Updated

9 years ago
Duplicate of this bug: 543400

Comment 6

9 years ago
Sure, can you sketch out the LIR you would like to see and I write a test case for you.

Property accesses and global variable access both use constant displacements. So do array access with a constant offset (foo[5]).

Comment 7

9 years ago
Here is an example:

var x = {}
for (var i = 0; i < 100000; ++i)
    x[i] = 0;
x.y = 5;
for (var i = 0; i < 10; ++i)
    ++x.y;

  50: qeq3 = qeq js_BoxDouble1, JSVAL_ERROR_COOKIE
  51: xt3: xt qeq3 -> pc=0x100414c2f imacpc=0x0 sp+8 rp+0 (GuardID=006)
  52: ldq3 = ldq $global0[56]
  53: stqi ldq3[799976] = js_BoxDouble1
  54: ld2 = ld eos[752]
  55: $global1 = i2f ld2
  56: 1 = int 1
  57: add1 = add ld2, 1

And that's the LIR we generate (see the large offset for instruction #53). This reminds me we should finish the work on integer speculation for load/stores. Always going double is super lame.
(Assignee)

Comment 8

9 years ago
Posted patch NJ patch (obsolete) — Splinter Review
This patch is easier to understand if you read LIR.h first.

- Added the AccSet type and a big description.  Current access regions are
  READ_ONLY, STACK and OTHER.  STACK isn't currently used but I added it now
  because it helped make the explanation clearer.  New regions added in the
  future might be TM/TR-specific, but for the moment I have this within NJ.

- Tweaked CallInfo.  '_cse' was renamed as '_isPure' which is a much better
  description.  The unused '_fold' flag was replaced with '_storeAccSet'
  which will be useful for alias analysis.

- Removed LIR_ldc et al.

- Shrunk the 'disp' field in loads and stores from 32 bits to 16 bits to
  make room for AccSets.  It hardly ever makes a difference.  Also changed
  LirBufWriter to rewrite any displacements larger than 16 bits via a
  separate instruction.  Did the same for CseFilter::insLoad() because it
  relies on LirBufWriter not changing the code;  this is a hack but the
  least worst option I could think of for ensuring this rewriting always
  occurs.

- Added an optional parameter to insLoad(), insStorei() and insStore().  If
  not specified, it defaults to ACC_LOAD_ANY/ACC_STORE_ANY.  Threaded it
  through all the filters.

- Added some AccSet checking to ValidateWriter.

- For debug printing, for the boring ACC_LOAD_ANY/ACC_STORE_ANY cases I just
  print the opcode normally, eg. 'ld', 'sti'.  For other cases, I add a
  suffix indicating which access regions are touched, eg. 'ld.r' is
  READ_ONLY, 'sti.so' is STACK|OTHER.

- lirasm: updated in the obvious ways.

- I tested with TM and SunSpider that no code was changed by any of these
  changes, except for the change of 'ldc' to 'ld.r', and likewise for
  similar opcodes.

Thanks to Julian for lots of helpful discussion about this patch.
Attachment #426638 - Flags: review?(jseward)
(Assignee)

Updated

9 years ago
Attachment #426638 - Flags: review?(rreitmai)
(Assignee)

Comment 9

9 years ago
Posted patch TM patch (obsolete) — Splinter Review
- Updated all the CallInfos:
  - Where we had (cse, fold) == (0, 0) we now have (isPure, accSet) == (0,
    ACC_STORE_ANY).
  - Where we had (cse, fold) == (1, 1) we now have (isPure, accSet) == (1,
    ACC_NONE).

  They were the only combos to worry about because 'fold' was unused and
  always equal to 'cse'.

  I also temporarily put in a static assert to ensure all CallInfos had one
  of these two values, which was a good way to ensure I converted all of
  them.  (I'd missed a few before I did that.)

- Replaced all insLoad(LIR_ldc, ...) calls with insLoad(LIR_ld, ...,
  ACC_READONLY).  Likewise LIR_ldcs-->LIR_ldzs and LIR_ldcb-->LIR_ldzb.

- Removed one use of LIR_ldcp that had a comment explaining
  how the load wasn't constant but it was really ok to pretend it was.
  Yikes... if you lie to the compiler, the compiler will gets its revenge!
  Turns out it had no effect on all of SunSpider anyway, so it's no great
  loss.

- Like the NJ patch, the generated code wasn't changed.  Eg. all
  call annotations are equivalent to the old ones.  Improvements can come
  later.

- Added a trace-test that checks that large displacements are handled
  correctly.
Attachment #426639 - Flags: review?(jseward)
(Assignee)

Comment 10

9 years ago
BTW no TR patch yet, and it's Friday evening here so I won't get to it until next week.

Comment 11

9 years ago
Comment on attachment 426638 [details] [diff] [review]
NJ patch

I want to give some feedback too (patch looks awesome at first glance). Marking myself so I don't forget to do this this weekend.
Attachment #426638 - Flags: review?
(Assignee)

Comment 12

9 years ago
Comment 517910#c19 (https://bugzilla.mozilla.org/show_bug.cgi?id=517910#c19) has some ruminations about the differences between automatic and manual access annotations that are relevant to this bug... at least, they represent things that don't need to be addressed in the bug, but do need to be addressed in the follow-up patch.
(Assignee)

Comment 13

9 years ago
Ping?

Turns out the TM patch will need to be bigger.  It seems that CallInfos are not just in JS, they're all over Firefox, some are even generated by Python scripts.  Sigh.

Comment 14

9 years ago
Any changes coming for the NJ patch?  

Just asking since I haven't had a chance to review yet.
Comment on attachment 426638 [details] [diff] [review]
NJ patch


This mostly seems fine to me.  I think it's correct, but I am still a
concerned that terminology might be a bit confusing.  I particular it
would be nice to be assured that there is no special semantics
associated with READONLY areas.  The only way in which they are
different from arbitrary other areas is that you guarantee no stores
or writes will access a READONLY area.  But that's not a property of
the alias representation framework itself; it's a property guaranteed
by your construction of LIR traces.

Since AccSets are sets of areas, should the full-set names be
ACC_{LOAD,STORE}_ALL instead of ACC_{LOAD,STORE}_ANY ?  That would
seem to fit in better with the "other end of the lattice" being called
ACC_{LOAD,STORE}_NONE.

One other thing I wasn't clear about was why there are two different
symbols to denote a full access set -- ACC_LOAD_ANY and ACC_STORE_ANY.
This doesn't really seem right, and the only reason I can think of
(which is a legit reason) is that they can't be the same because
ACC_STORE_ANY needs to omit the _READONLY region.  Yes?

>+// Other limitations:
>+// - Loads always use accSet==ACC_LOAD_ANY
>+// - Stores always use accSet==ACC_Stores_ANY

you mean ACC_STORE_ANY ?

>+        if (ci->_isPure && ci->_storeAccSet != ACC_NONE)
>+            errorAccSetShould(ci->_name, ci->_storeAccSet, "equal ACC_NONE for pure functions");

seems a bit overkill.  A pure function can legitimately do stores;
what the purity implies it may not do is to read memory.

>+    // - A load that accesses a READONLY region can be safely marked instead
>+    //   as loading from OTHER.  In other words, it's safe to underestimate
>+    //   the size of the READONLY region.  (This also applies to the load set
>+    //   of a function.)

But that's only because you provide elsewhere, a guarantee that no
store or call will write the READONLY region.  In general the only
safe approximation, both for reads and writes, is to overestimate
the actual access set.
Comment on attachment 426638 [details] [diff] [review]
NJ patch

r+; details on Comment 15
Attachment #426638 - Flags: review?(jseward) → review+
Comment on attachment 426639 [details] [diff] [review]
TM patch

Seems fine to me.  No comments beyond those for the NJ patch.
Attachment #426639 - Flags: review?(jseward) → review+
(Assignee)

Comment 18

9 years ago
(In reply to comment #14)
> Any changes coming for the NJ patch?  

No.
(Assignee)

Comment 19

9 years ago
(In reply to comment #15)
>
> I particular it
> would be nice to be assured that there is no special semantics
> associated with READONLY areas.  The only way in which they are
> different from arbitrary other areas is that you guarantee no stores
> or writes will access a READONLY area.

Well, the special semantics are that we promise no stores will occur in READONLY.  So READONLY isn't special except for the special bit.

> But that's not a property of
> the alias representation framework itself; it's a property guaranteed
> by your construction of LIR traces.

Er, ok.

> Since AccSets are sets of areas, should the full-set names be
> ACC_{LOAD,STORE}_ALL instead of ACC_{LOAD,STORE}_ANY ?  That would
> seem to fit in better with the "other end of the lattice" being called
> ACC_{LOAD,STORE}_NONE.

This comment was meant to explain this:

    // A convention that's worth using:  use ACC_LOAD_ANY/ACC_STORE_ANY for
    // cases that you're unsure about or haven't considered carefully.  Use
    // ACC_ALL/ACC_ALL_WRITABLE for cases that you have considered carefully.
    // That way it's easy to tell which ones have been considered and which
    // haven't.

But since I probably won't have default args in the final version (see comment 12), the ANY values will likely disappear in bug 517910.

> One other thing I wasn't clear about was why there are two different
> symbols to denote a full access set -- ACC_LOAD_ANY and ACC_STORE_ANY.
> This doesn't really seem right, and the only reason I can think of
> (which is a legit reason) is that they can't be the same because
> ACC_STORE_ANY needs to omit the _READONLY region.  Yes?

Yes.
 
> A pure function can legitimately do stores;
> what the purity implies it may not do is to read memory.

No.  A store is impure.  Therefore a function that does stores is impure.


> >+    // - A load that accesses a READONLY region can be safely marked instead
> >+    //   as loading from OTHER.  In other words, it's safe to underestimate
> >+    //   the size of the READONLY region.  (This also applies to the load set
> >+    //   of a function.)
> 
> But that's only because you provide elsewhere, a guarantee that no
> store or call will write the READONLY region.  In general the only
> safe approximation, both for reads and writes, is to overestimate
> the actual access set.

Well, yes.  If you remove that guarantee it's not safe.  But nobody's removing the guarantee.  So the point is moot.
(Assignee)

Comment 20

9 years ago
Posted patch TR patchSplinter Review
This all follows on straightforwardly from the NJ changes.
Attachment #427533 - Flags: review?(rreitmai)
(Assignee)

Comment 21

9 years ago
Posted patch TM patch v2Splinter Review
New to this patch:
- Added the static assertion that checked that (cse, fold) pairs of (0, 0)
  were correctly converted to (isPure, storeAccSet) pairs of (0,
  ACC_STORE_ANY).  It won't be valid forever but it's a great check for the
  moment that no JS_DEFINE_CALLINFO/TRCINFO conversions have been missed.

- Renamed the cse/fold params to the JS_TN_INIT_HELPER_* macros -- no
  functional change, but they're now the right names.  Likewise for the
  empty versions of JS_DEFINE_CALLINFO_*.

- Found some TRCINFOs outside of js/src that hadn't been converted -- some
  in CustomQS_WebGL.h, and some generated by qsgen.py.
Attachment #426639 - Attachment is obsolete: true
Attachment #427534 - Flags: review?(jseward)
(Assignee)

Comment 22

9 years ago
Posted patch NJ patch v2Splinter Review
I don't think anything significant has changed, but here's a more recent version just in case.
Attachment #426638 - Attachment is obsolete: true
Attachment #427535 - Flags: review?(rreitmai)
Attachment #426638 - Flags: review?(rreitmai)
Attachment #426638 - Flags: review?
Attachment #427534 - Flags: review?(jseward) → review+

Comment 23

9 years ago
Comment on attachment 427535 [details] [diff] [review]
NJ patch v2

much goodness, gladly +'d
Attachment #427535 - Flags: review?(rreitmai) → review+

Comment 24

9 years ago
Comment on attachment 427533 [details] [diff] [review]
TR patch

r+'d, needs to be rebased somewhat (I've already done so) but otherwise looks good.

Something to watch for (with the conversion of 32b to 16b displacement) is that some true 'base' pointers may no longer be visible on the frame, since they could have been folded by and addp.

This may affect gc dynamics, but as Nick pointed out above he's not seen > 16b displacements being used in practice.
Attachment #427533 - Flags: review?(rreitmai) → review+

Comment 26

9 years ago
http://hg.mozilla.org/tamarin-redux/rev/658539ac197a
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin

Comment 27

9 years ago
http://hg.mozilla.org/mozilla-central/rev/1df8886b7507
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Component: Nanojit → Nanojit
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.