Closed Bug 498521 Opened 15 years ago Closed 14 years ago

Rename LIR_ld* to LIR_ld(8|16|32|64|f|d)

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 504506

People

(Reporter: Waldo, Unassigned)

Details

The current suffixes are almost but not quite comprehensible.  What does "cb" or "cs" mean, for example?  "character (byte-sized)" and "character (short-sized)"?  Explicit width numbers would be clearer.  It's also not clear how "c" differs from "cs"; Luke suggested maybe sign extension played a role here, I haven't looked at either enough to even hazard a guess.  If something like that happens to be the case, we could make it ld((u|i)(8|16|32|64)|f|d) to compensate.  In any case -- replace less-readable mnemonics with more-readable ones.

No idea how this plays with nanojit-reunification, probably should be post-that rename, unless we leave #defines in temporarily to allow a short transition period.
The 'c' means 'constant', iirc. 

But I prefer number-qualified clarify as much as the next person.
So I can understand, what does 'constant' mean in the this context?  For example, how does ldc differ from ld and why are there no lds and ldb?
As I gather it, in the context of ldc, ldcb and ldcs, 'constant' means that the loaded value is non-volatile; you don't have to save it in order to reload / rematerialize it.

This is not the same as the LIR_quad or LIR_int opcode, which (while also referred to as 'constant', in the sense of the LIns::isconst() predicate) is probably more like what many readers would consider 'immediate': the value to load is encoded *in* the instruction stream, and can be constant-folded through subsequent arithmetic expressions.

To make matters slightly more complex, I do not think there exist any volatile variants of ldcb or ldcs. That is, 8 and 16-bit loads only exist in non-volatile form.

I could also be quite wrong about this. I'm going from what I've inferred during reading.
I'm probably exposing my lack of assembler upbringing, but does the world end if we use LIR_loadconst instead of LIR_ldc, for example?
Ed seems to favor brief and expressive names. Over the month I have come to like LIR_x, LIR_xt, and LIR_xf for example. I think its better than LIR_SideExit or LIR_AlwaysExitHere. ldc doesn't sound too bad to me (so ldc8, i.e.).
I'm not sure I agree that "LIR_xt" is expressive, but getting rid of quad/short and making them explicitly sized in terms of octets would improve a lot of it for sure.  (ExitAlways, ExitIfTrue, ExitIfFalse would seem to work; LIR_ as prefix seems unnecessary in the presence of nanojit::LIR as a namespace?)

I would like the shed to be green.
Shaver, man up -- it's assembly code! ExplainItAll StudlyCaps is for wimps. :-P

Srsly I agree on current cryptic cybercrud being bad. Real assembly languages were once the most used in practice for many industrial software projects, and there was indeed competition (indirect sometimes, but affecting CPU choice where hardware was being built) to have short but sweet names, and to avoid rainbow colored mystery meat.

/be
Regarding Comment #1 (what does 'c' mean?), I asked precisely
this a couple of weeks back.  See
https://bugzilla.mozilla.org/show_bug.cgi?id=495569
comments #2, #5, #7.

What Ed said is:

> the 'c' variants are meant to mean:
>  - a second load with the same effective address will return the same value
>  - if the result is not used, the load can be eliminated

(which essentially means "it's CSEable").

To which I expressed the view (and still do) that it would be
preferable to record the underlying properties of the load, as an
annotation, rather than one specific transformation that those
properties permit.  I believe the real property that such loads
should be annotated with is "does not alias any store in the
fragment".

This would, for example, allow floating the load
forwards/backwards past arbitrary stores, which has nothing to do
with constancy or CSEability.  Hence the "it doesn't alias"
annotation is really what you want, not the more limited statement
about constancy or CSEabililty.
I'm with Julian on improving the semantics of load instructions, giving them some orthagonal bits, or annotations, about the fields they're touching.  but yeah, the meaning of 'c' is what it is, for now.

the meaning of 'i' is legacy and iirc already gone in TM.  it meant the displacement was immediate in the LIR instructions.  not needed for expressiveness but it *did* significantly reduce LIR size by eliminating lots of consts.

as for the names, the reason for letters (q, b, etc) is a) historical and easily changed.  the reason for letters and not numbers is that the LIR pretty printer makes expression names by taking the mnemonic and appending a number.  it checks for trailing digits, and adds an underscore though.  so LIR_or becomes or1, or2, etc, and LIR_ld8 would get named ld8_1, ld8_2, etc.

as a precedent, LLVM's pretty printer just makes up names starting with 0.  %0, %1, etc.  i like having the mnemonic in there for readability but its pure aesthetics.

why dont we come up with names that are short (man up! it's assembly!) and very clear.  I couldn't parse what Brendan likes, and StudlyLongNames give me hives when reading asm output.  but message about cybercrud is loud and clear.  i'd be okay with "load" and "store" if we can tame the suffixes.

note: the *current* lir instruction only distinguishes data bytes by size, not by type, so for example int64 and double use the same load/store instructions.  (we could change that too, but i'm trying to factor issues).
(In reply to comment #0)

> No idea how this plays with nanojit-reunification, probably should be post-that
> rename, unless we leave #defines in temporarily to allow a short transition
> period.

If we do this, then enums please, not defines.
FWIW, I think:
- short names (ld, st) are fine
- numbers are *much* better than letters for bit sizes
- distinguishing int ops from float ops is good, and require 'f' for float ops and no suffix for int ops
- I'm not sure what the 'd' represents in the original suggestion
f = float, d = double, of course.  :-)
dealing with double and int64 separately seems inevitable, but needs more than just renaming; it would strengthen the LIR type model which currently only cares about bit-size.  (separate bug?  add to semantics-cleanup bug?)

LLVM also distinguishes pointers from i32/i64, but i dont think we're quite ready to swallow that pill yet.

the auto-naming of lir instructions detects when a call ends in a digit and appends an _, but i'm not sure we do it for lir instruction names.  i.e. we want:

   ld32_1 = ld32 disp, ptr

and not

   ld321 = ld32 disp, ptr
FWIW, 64-bit int and 64-bit float ops are now mostly distinguished, eg. fadd, qiadd.  Loads are an exception, which needs to be fixed (bug 520714).
Also FWIW, the ldc* variants are now gone, and loads are now marked with an AccSet (see bug 545274), so that source of confusion is gone.

And the rest of this bug is subsumed by bug 504506, so I'm marking this as a dup.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.