Closed
Bug 545274
Opened 15 years ago
Closed 15 years ago
nanojit: start adding alias info to loads/stores/calls
Categories
(Core Graveyard :: Nanojit, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)
Attachments
(3 files, 2 obsolete files)
35.90 KB,
patch
|
rreitmai
:
review+
|
Details | Diff | Splinter Review |
99.89 KB,
patch
|
jseward
:
review+
|
Details | Diff | Splinter Review |
76.01 KB,
patch
|
rreitmai
:
review+
|
Details | Diff | Splinter Review |
LIR currently has some crude alias info on loads (eg. LIR_ld vs LIR_ldc) and calls (the _cse field) -- we can distinguish between "read-only" and "other".
For bug 517910 (and others in the future) we need more precise alias sets. This can be best done by adding a bitset to loads and stores, and two bitsets to calls (for memory read and written by the function). To start with I want to add a bitset with only two bits -- ie. 10 represents "read-only" and 01 represents "other". (Note that the "read-only" bitset is impossible for stores; if we're storing to a location it cannot be read-only!)
For loads, this will remove the need for LIR_ldc et al -- the read-only/other property will be in the bitset. For calls, this will remove the need for '_cse' -- again the bitsets will cover it.
Loads/stores currently don't have any spare bits to add this new info. I propose changing the 32-bit 'disp' field to 16-bits. I checked and found that neither TM nor TR generates any displacements larger than 16-bits (at least, not in TM trace-tests and TR acceptance tests). This makes sense, as most displacements are offsets into structs, or offsets into the stack or globals area.
Code generation won't be affected by these changes.
Comment 1•15 years ago
|
||
+1 random thumbs up, i like where this is headed.
Assignee | ||
Comment 2•15 years ago
|
||
This should also make LoadFilter obsolete.
Comment 3•15 years ago
|
||
how about 30 bits? sounds a bit safer. we do have sometimes web code with tons of properties in objects
Assignee | ||
Comment 4•15 years ago
|
||
We will need more than 2 bits of alias info eventually.
In the patch I'm working on I just assert in insLoad()/insStore() if the displacement is too big. It would be better to convert such a case to a LIR_add + a load/store. But if we do that I'd like to have a JS test that actually triggers the code. Andreas, can you write such a test? Or at least point me to the insLoad()/insStore() line in jstracer.cpp that could cause it?
Do property accesses involve constant displacements?
Comment 6•15 years ago
|
||
Sure, can you sketch out the LIR you would like to see and I write a test case for you.
Property accesses and global variable access both use constant displacements. So do array access with a constant offset (foo[5]).
Comment 7•15 years ago
|
||
Here is an example:
var x = {}
for (var i = 0; i < 100000; ++i)
x[i] = 0;
x.y = 5;
for (var i = 0; i < 10; ++i)
++x.y;
50: qeq3 = qeq js_BoxDouble1, JSVAL_ERROR_COOKIE
51: xt3: xt qeq3 -> pc=0x100414c2f imacpc=0x0 sp+8 rp+0 (GuardID=006)
52: ldq3 = ldq $global0[56]
53: stqi ldq3[799976] = js_BoxDouble1
54: ld2 = ld eos[752]
55: $global1 = i2f ld2
56: 1 = int 1
57: add1 = add ld2, 1
And that's the LIR we generate (see the large offset for instruction #53). This reminds me we should finish the work on integer speculation for load/stores. Always going double is super lame.
Assignee | ||
Comment 8•15 years ago
|
||
This patch is easier to understand if you read LIR.h first.
- Added the AccSet type and a big description. Current access regions are
READ_ONLY, STACK and OTHER. STACK isn't currently used but I added it now
because it helped make the explanation clearer. New regions added in the
future might be TM/TR-specific, but for the moment I have this within NJ.
- Tweaked CallInfo. '_cse' was renamed as '_isPure' which is a much better
description. The unused '_fold' flag was replaced with '_storeAccSet'
which will be useful for alias analysis.
- Removed LIR_ldc et al.
- Shrunk the 'disp' field in loads and stores from 32 bits to 16 bits to
make room for AccSets. It hardly ever makes a difference. Also changed
LirBufWriter to rewrite any displacements larger than 16 bits via a
separate instruction. Did the same for CseFilter::insLoad() because it
relies on LirBufWriter not changing the code; this is a hack but the
least worst option I could think of for ensuring this rewriting always
occurs.
- Added an optional parameter to insLoad(), insStorei() and insStore(). If
not specified, it defaults to ACC_LOAD_ANY/ACC_STORE_ANY. Threaded it
through all the filters.
- Added some AccSet checking to ValidateWriter.
- For debug printing, for the boring ACC_LOAD_ANY/ACC_STORE_ANY cases I just
print the opcode normally, eg. 'ld', 'sti'. For other cases, I add a
suffix indicating which access regions are touched, eg. 'ld.r' is
READ_ONLY, 'sti.so' is STACK|OTHER.
- lirasm: updated in the obvious ways.
- I tested with TM and SunSpider that no code was changed by any of these
changes, except for the change of 'ldc' to 'ld.r', and likewise for
similar opcodes.
Thanks to Julian for lots of helpful discussion about this patch.
Attachment #426638 -
Flags: review?(jseward)
Assignee | ||
Updated•15 years ago
|
Attachment #426638 -
Flags: review?(rreitmai)
Assignee | ||
Comment 9•15 years ago
|
||
- Updated all the CallInfos:
- Where we had (cse, fold) == (0, 0) we now have (isPure, accSet) == (0,
ACC_STORE_ANY).
- Where we had (cse, fold) == (1, 1) we now have (isPure, accSet) == (1,
ACC_NONE).
They were the only combos to worry about because 'fold' was unused and
always equal to 'cse'.
I also temporarily put in a static assert to ensure all CallInfos had one
of these two values, which was a good way to ensure I converted all of
them. (I'd missed a few before I did that.)
- Replaced all insLoad(LIR_ldc, ...) calls with insLoad(LIR_ld, ...,
ACC_READONLY). Likewise LIR_ldcs-->LIR_ldzs and LIR_ldcb-->LIR_ldzb.
- Removed one use of LIR_ldcp that had a comment explaining
how the load wasn't constant but it was really ok to pretend it was.
Yikes... if you lie to the compiler, the compiler will gets its revenge!
Turns out it had no effect on all of SunSpider anyway, so it's no great
loss.
- Like the NJ patch, the generated code wasn't changed. Eg. all
call annotations are equivalent to the old ones. Improvements can come
later.
- Added a trace-test that checks that large displacements are handled
correctly.
Attachment #426639 -
Flags: review?(jseward)
Assignee | ||
Comment 10•15 years ago
|
||
BTW no TR patch yet, and it's Friday evening here so I won't get to it until next week.
Comment 11•15 years ago
|
||
Comment on attachment 426638 [details] [diff] [review]
NJ patch
I want to give some feedback too (patch looks awesome at first glance). Marking myself so I don't forget to do this this weekend.
Attachment #426638 -
Flags: review?
Assignee | ||
Comment 12•15 years ago
|
||
Comment 517910#c19 (https://bugzilla.mozilla.org/show_bug.cgi?id=517910#c19) has some ruminations about the differences between automatic and manual access annotations that are relevant to this bug... at least, they represent things that don't need to be addressed in the bug, but do need to be addressed in the follow-up patch.
Assignee | ||
Comment 13•15 years ago
|
||
Ping?
Turns out the TM patch will need to be bigger. It seems that CallInfos are not just in JS, they're all over Firefox, some are even generated by Python scripts. Sigh.
Comment 14•15 years ago
|
||
Any changes coming for the NJ patch?
Just asking since I haven't had a chance to review yet.
Comment 15•15 years ago
|
||
Comment on attachment 426638 [details] [diff] [review]
NJ patch
This mostly seems fine to me. I think it's correct, but I am still a
concerned that terminology might be a bit confusing. I particular it
would be nice to be assured that there is no special semantics
associated with READONLY areas. The only way in which they are
different from arbitrary other areas is that you guarantee no stores
or writes will access a READONLY area. But that's not a property of
the alias representation framework itself; it's a property guaranteed
by your construction of LIR traces.
Since AccSets are sets of areas, should the full-set names be
ACC_{LOAD,STORE}_ALL instead of ACC_{LOAD,STORE}_ANY ? That would
seem to fit in better with the "other end of the lattice" being called
ACC_{LOAD,STORE}_NONE.
One other thing I wasn't clear about was why there are two different
symbols to denote a full access set -- ACC_LOAD_ANY and ACC_STORE_ANY.
This doesn't really seem right, and the only reason I can think of
(which is a legit reason) is that they can't be the same because
ACC_STORE_ANY needs to omit the _READONLY region. Yes?
>+// Other limitations:
>+// - Loads always use accSet==ACC_LOAD_ANY
>+// - Stores always use accSet==ACC_Stores_ANY
you mean ACC_STORE_ANY ?
>+ if (ci->_isPure && ci->_storeAccSet != ACC_NONE)
>+ errorAccSetShould(ci->_name, ci->_storeAccSet, "equal ACC_NONE for pure functions");
seems a bit overkill. A pure function can legitimately do stores;
what the purity implies it may not do is to read memory.
>+ // - A load that accesses a READONLY region can be safely marked instead
>+ // as loading from OTHER. In other words, it's safe to underestimate
>+ // the size of the READONLY region. (This also applies to the load set
>+ // of a function.)
But that's only because you provide elsewhere, a guarantee that no
store or call will write the READONLY region. In general the only
safe approximation, both for reads and writes, is to overestimate
the actual access set.
Comment 16•15 years ago
|
||
Attachment #426638 -
Flags: review?(jseward) → review+
Comment 17•15 years ago
|
||
Comment on attachment 426639 [details] [diff] [review]
TM patch
Seems fine to me. No comments beyond those for the NJ patch.
Attachment #426639 -
Flags: review?(jseward) → review+
Assignee | ||
Comment 18•15 years ago
|
||
(In reply to comment #14)
> Any changes coming for the NJ patch?
No.
Assignee | ||
Comment 19•15 years ago
|
||
(In reply to comment #15)
>
> I particular it
> would be nice to be assured that there is no special semantics
> associated with READONLY areas. The only way in which they are
> different from arbitrary other areas is that you guarantee no stores
> or writes will access a READONLY area.
Well, the special semantics are that we promise no stores will occur in READONLY. So READONLY isn't special except for the special bit.
> But that's not a property of
> the alias representation framework itself; it's a property guaranteed
> by your construction of LIR traces.
Er, ok.
> Since AccSets are sets of areas, should the full-set names be
> ACC_{LOAD,STORE}_ALL instead of ACC_{LOAD,STORE}_ANY ? That would
> seem to fit in better with the "other end of the lattice" being called
> ACC_{LOAD,STORE}_NONE.
This comment was meant to explain this:
// A convention that's worth using: use ACC_LOAD_ANY/ACC_STORE_ANY for
// cases that you're unsure about or haven't considered carefully. Use
// ACC_ALL/ACC_ALL_WRITABLE for cases that you have considered carefully.
// That way it's easy to tell which ones have been considered and which
// haven't.
But since I probably won't have default args in the final version (see comment 12), the ANY values will likely disappear in bug 517910.
> One other thing I wasn't clear about was why there are two different
> symbols to denote a full access set -- ACC_LOAD_ANY and ACC_STORE_ANY.
> This doesn't really seem right, and the only reason I can think of
> (which is a legit reason) is that they can't be the same because
> ACC_STORE_ANY needs to omit the _READONLY region. Yes?
Yes.
> A pure function can legitimately do stores;
> what the purity implies it may not do is to read memory.
No. A store is impure. Therefore a function that does stores is impure.
> >+ // - A load that accesses a READONLY region can be safely marked instead
> >+ // as loading from OTHER. In other words, it's safe to underestimate
> >+ // the size of the READONLY region. (This also applies to the load set
> >+ // of a function.)
>
> But that's only because you provide elsewhere, a guarantee that no
> store or call will write the READONLY region. In general the only
> safe approximation, both for reads and writes, is to overestimate
> the actual access set.
Well, yes. If you remove that guarantee it's not safe. But nobody's removing the guarantee. So the point is moot.
Assignee | ||
Comment 20•15 years ago
|
||
This all follows on straightforwardly from the NJ changes.
Attachment #427533 -
Flags: review?(rreitmai)
Assignee | ||
Comment 21•15 years ago
|
||
New to this patch:
- Added the static assertion that checked that (cse, fold) pairs of (0, 0)
were correctly converted to (isPure, storeAccSet) pairs of (0,
ACC_STORE_ANY). It won't be valid forever but it's a great check for the
moment that no JS_DEFINE_CALLINFO/TRCINFO conversions have been missed.
- Renamed the cse/fold params to the JS_TN_INIT_HELPER_* macros -- no
functional change, but they're now the right names. Likewise for the
empty versions of JS_DEFINE_CALLINFO_*.
- Found some TRCINFOs outside of js/src that hadn't been converted -- some
in CustomQS_WebGL.h, and some generated by qsgen.py.
Attachment #426639 -
Attachment is obsolete: true
Attachment #427534 -
Flags: review?(jseward)
Assignee | ||
Comment 22•15 years ago
|
||
I don't think anything significant has changed, but here's a more recent version just in case.
Attachment #426638 -
Attachment is obsolete: true
Attachment #427535 -
Flags: review?(rreitmai)
Attachment #426638 -
Flags: review?(rreitmai)
Attachment #426638 -
Flags: review?
Updated•15 years ago
|
Attachment #427534 -
Flags: review?(jseward) → review+
Comment 23•15 years ago
|
||
Comment on attachment 427535 [details] [diff] [review]
NJ patch v2
much goodness, gladly +'d
Attachment #427535 -
Flags: review?(rreitmai) → review+
Comment 24•15 years ago
|
||
Comment on attachment 427533 [details] [diff] [review]
TR patch
r+'d, needs to be rebased somewhat (I've already done so) but otherwise looks good.
Something to watch for (with the conversion of 32b to 16b displacement) is that some true 'base' pointers may no longer be visible on the frame, since they could have been folded by and addp.
This may affect gc dynamics, but as Nick pointed out above he's not seen > 16b displacements being used in practice.
Attachment #427533 -
Flags: review?(rreitmai) → review+
Assignee | ||
Comment 25•15 years ago
|
||
http://hg.mozilla.org/projects/nanojit-central/rev/4adbf1bbb16c
http://hg.mozilla.org/tracemonkey/rev/1df8886b7507
http://hg.mozilla.org/tracemonkey/rev/ffb1d237ec0f
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey
Comment 26•15 years ago
|
||
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin
Comment 27•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•