Closed Bug 579140 Opened 10 years ago Closed 9 years ago

fatval review

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: luke, Assigned: luke)

References

Details

Attachments

(9 files, 1 obsolete file)

In bug 549143 comment 52 and earlier we decided to land fatvals without the standard formal review, due to the coordination challenge of reviewing a highly-mechanical 3MB patch, on condition that the interesting parts be reviewed after the fact (and, of course, all the tests were passing).  In this bug I will post the non-mechanical changes for review.  Since it would be a bit chaotic to to discuss all the different parts in a single bug, perhaps the various reviewers could post separate bugs for independent issues and block this one.
Attached patch gc changesSplinter Review
The major changes to the GC are:
1. removing heap double machinery ( \o/ )
2. dealing with the GC-thing/jsval root distinction
3. updating the conservative GC
4. avoiding all the add-hoc ways to mark a value and using a standard set of Mark*() functions
Assignee: general → lw
Status: NEW → ASSIGNED
Attachment #457627 - Flags: review?(anygregor)
This is mostly a syntactic review of jsapi.h and jspubtd.h, which are public-facing.
Attachment #457628 - Flags: review?(brendan)
Tiny changes, but hey, its nsScriptSecurityManager, it deserves a careful look.
Attachment #457629 - Flags: review?(mrbkap)
These mostly involve the casting between jsval and NPIdentifier.
Attachment #457631 - Flags: review?
Attachment #457631 - Flags: review? → review?(jst)
Attached patch tracer changesSplinter Review
Unfortunately, the changes here were a bit more extensive, mostly from two sources:
 - boxing and unboxing on trace
 - making fp->scopeChain and fp->argsobj JSObject* (not jsvals)

The latter means having a separate 'visitFrameObjPtr' overload in all the frame stack slot visitors so that the visitors can treat these slots as unboxed pointers rather than jsvals.
Attachment #457634 - Flags: review?(gal)
This does not include the qsgen.py and quickstubs hacking done by peterv and dmandelin.  It deals with the xptcall last-minute surprise where jsval was being passed as a ptr-sized argument through CallMethod.  The fix is to make the 'jsval' XPIDL parameter type generate 'const jsval &' for inparams and 'jsval *' for outparams.  See bug 578547.
Attachment #457639 - Flags: review?(mrbkap)
Attached patch xdr changes (obsolete) — Splinter Review
Small patch.  I just noticed that my JSXDR_BYTECODE_VERSION bump went away when someone else bumped to the same number.  I will re-bump in a followup patch.
Attachment #457643 - Flags: review?(igor)
Comment on attachment 457643 [details] [diff] [review]
xdr changes


>+enum jsvaltag {
>+    JSVAL_OBJECT  =             0x0,
>+    JSVAL_INT     =             0x1,
>+    JSVAL_DOUBLE  =             0x2,
>+    JSVAL_STRING  =             0x4,
>+    JSVAL_SPECIAL =             0x6,
>+    JSVAL_XDRNULL =             0x8,
>+    JSVAL_XDRVOID =             0xA
>+};

Nit: rename this into enum XDRValueTag and use XDR_OBJECT etc. for tag definitions.

>+
>+static jsvaltag
>+JSVAL_TAG(jsval v)

Nit: rename this into GetXDRTag

> JSBool
> js_XDRAtom(JSXDRState *xdr, JSAtom **atomp)
> {
>+    if (!JS_XDRString(xdr, &str))
>         return JS_FALSE;
>-    if (type == JSVAL_STRING)
>-        return js_XDRStringAtom(xdr, atomp);

The code must call js_XDRStringAtom to avoid allocating already existed atoms during decoding.
Attachment #457643 - Flags: review?(igor) → review-
These changes arise from:
 1. not storing doubles as atoms, but not embedding them in the bytecode either
 2. lookupswitch
Attachment #457653 - Flags: review?(brendan)
Depends on: 579158
(In reply to comment #8)
bug 579158
Attachment #457629 - Flags: review?(mrbkap) → review+
Attachment #457631 - Flags: review?(jst) → review+
Comment on attachment 457639 [details] [diff] [review]
xpconnect / xpcom / xpidl changes

>     case nsXPTType::T_JSVAL :
>-        *((jsval*)d) = s;
>-        break;
>+        {
>+            NS_ASSERTION(useAllocator,"trying to convert a jsval to const jsval & without allocator : this would leak");

Nits: space after the comma, no space before the colon, begin the message with a capital.

>+            // The C++ type is (const jsval &), which here means (jsval *).
>+            jsval *buf = new jsval(s);
>+            if (!buf)

nit: if(!buf) (no space after if).

>+            } else {
>+                if (type_tag == nsXPTType::T_JSVAL) {

Nit:

}
else
{
    if(type_tag == ...)
    {
       ...

to match existing style.

> void
> XPCWrappedNative::HandlePossibleNameCaseError(XPCCallContext& ccx,
>+    // TODO: remove this all more thoroughly.

Is it truly too onerous to just remove this in this patch?


>diff --git a/xpcom/typelib/xpidl/xpidl.h b/xpcom/typelib/xpidl/xpidl.h

I don't know if I'm the most qualified reviewer for these changes, but I read through them and they look good. r=me though I'd feel better if shaver or someone who's actually hacked this code gives these changes a once over.
Attachment #457639 - Flags: review?(mrbkap) → review+
(In reply to comment #11)
Thanks!  I'll include these changes in the next batch.
The first thing I looked for was a big comment explaining the new Value type, ie. how all the different types are represented in it.  Especially the tricky cases, like all the NaN representation stuff (eg. I'm not very familiar with IEE754 representation so how you can squeeze 32-bit integers into the NaN-space is quite unclear to me) and how pointers are represented on 64-bit platforms.

I couldn't find such a comment.  If there is one, can someone point me at it?  If there isn't one, can one be added?  This is, like, the most important data type in the entire JS engine.  Please please please please pretty please?
(In reply to comment #13)
That's entirely reasonable.  The representation has been changed a couple times (much easier to do that now), so I kept holding off.  But yes, I'll add it.
Comment on attachment 457634 [details] [diff] [review]
tracer changes

>+/* Return JSVAL_TYPE_DOUBLE for all numbers (int and double) and the tag otherwise. */
>+static inline JSValueType
>+GetPromotedType(const Value &v)
>+{
>+    if (v.isNumber())
>+        return JSVAL_TYPE_DOUBLE;
>+    if (v.isObject())
>+        return v.toObject().isFunction() ? JSVAL_TYPE_FUNOBJ : JSVAL_TYPE_NONFUNOBJ;
>+    return v.extractNonDoubleObjectTraceType();
>+}

Nit: Comment is out-of-date, doesn't account for FUNOBJ/NONFUNOBJ.  I'd be ok
with just removing the comment.  Also, any reason for the capital 'G' in
'Get'?  getCoercedType doesn't have one.


>+/* Return JSVAL_TYPE_INT32 for all whole numbers that fit into signed 32-bit and the tag otherwise. */
>+static inline JSValueType
>+getCoercedType(const Value &v)
>+{
>+    if (v.isNumber()) {
>+        int32_t _;
>+        return (v.isInt32() || JSDOUBLE_IS_INT32(v.toDouble(), &_))
>+               ? JSVAL_TYPE_INT32
>+               : JSVAL_TYPE_DOUBLE;
>+    }
>+    if (v.isObject())
>+        return v.toObject().isFunction() ? JSVAL_TYPE_FUNOBJ : JSVAL_TYPE_NONFUNOBJ;
>+    return v.extractNonDoubleObjectTraceType();
>+}

Nit: This comment is out-of-date too.


>+#if JS_BITS_PER_WORD == 32 || JS_BITS_PER_WORD == 64

Will this ever be false?


>+void
>+TraceRecorder::set_array_fslot(LIns *obj_ins, unsigned slot, uint32 val)

Nit: This would be better named stobj_set_fslot_uint32() to match
stobj_get_fslot_uint32().


>+    lir->insStore(INS_CONSTU(JSVAL_TAG_UNDEFINED), vaddr_ins, offset + sTagOffset, accSet);
>+    lir->insStore(INS_CONST(0), vaddr_ins, offset + sPayloadOffset, accSet);

Nit: If you feel like it, the following would be a naming scheme that matches
Nanojit better:

  INS_CONST             INS_IMMI
  INS_CONSTU            INS_IMMUI
  INS_CONSTPTR          INS_IMMP
  INS_CONSTWORD         INS_IMMWORD
  INS_CONSTQWORD        INS_IMMQ
  INS_CONSTOBJ          INS_IMMOBJ
  INS_CONSTFUN          INS_IMMFUN
  INS_CONSTSTR          INS_IMMSTR
  INS_CONSTSPROP        INS_IMMSPROP
  INS_CONSTID           INS_IMMID

But I understand if you can't be bothered.

That's a lot of nits and not much real substance.  I'll be looking closely
at the generated LIR over the next days/weeks and will probably find more
substantial things to improve.  But this is already a great step.  Here are
instruction counts for SS on 32-bit via Cachegrind:

------------------------------------------------------------------
instructions executed (nb: "on-trace" may overestimate slightly)
------------------------------------------------------------------
3d-cube:
  total         123,305,743     114,235,804     1.079x better
  on-trace       23,125,236      21,012,419     1.101x better
3d-morph:
  total          63,251,614      56,661,772     1.116x better
  on-trace       22,143,499      21,881,077     1.012x better
3d-raytrace:
  total         140,448,740     144,632,895    *1.030x worse*
  on-trace       25,466,580      21,908,423     1.162x better
access-binary-trees:
  total          78,757,294      78,377,064     1.005x better
  on-trace       18,790,968      17,603,289     1.067x better
access-fannkuch:
  total         179,725,800     176,474,254     1.018x better
  on-trace      105,047,079     101,475,966     1.035x better
access-nbody:
  total          84,276,606      41,551,361     2.028x better
  on-trace       30,747,913      17,631,480     1.744x better
access-nsieve:
  total          58,306,986      60,676,166    *1.041x worse*
  on-trace       26,450,494      26,450,146     -- 
bitops-3bit-bits-in-byte:
  total          15,800,454      15,801,827     -- 
  on-trace        2,989,366       2,989,968     -- 
bitops-bits-in-byte:
  total          44,088,711      44,084,602     -- 
  on-trace       31,086,203      31,086,473     -- 
bitops-bitwise-and:
  total          23,358,092      23,361,123     -- 
  on-trace       10,817,439      10,818,308     -- 
bitops-nsieve-bits:
  total          84,961,141      70,189,160     1.210x better
  on-trace       42,249,900      38,775,383     1.090x better
controlflow-recursive:
  total          43,427,997      43,133,371     1.007x better
  on-trace       27,502,226      27,490,633     -- 
crypto-md5:
  total          48,767,432      46,912,841     1.040x better
  on-trace        5,623,536       5,456,717     1.031x better
crypto-sha1:
  total          33,039,844      31,054,042     1.064x better
  on-trace        7,245,581       6,948,014     1.043x better
date-format-tofte:
  total         134,565,135     135,041,978    *1.004x worse*
  on-trace       11,356,927      10,778,455     1.054x better
date-format-xparb:
  total         104,691,122      96,308,438     1.087x better
  on-trace       11,584,143      10,459,970     1.107x better
math-cordic:
  total          55,420,014      53,259,424     1.041x better
  on-trace       30,338,833      28,238,262     1.074x better
math-partial-sums:
  total          38,545,224      38,455,343     1.002x better
  on-trace        6,288,126       6,282,992     1.001x better
math-spectral-norm:
  total          29,737,061      28,598,748     1.040x better
  on-trace       11,272,834      10,393,910     1.085x better
regexp-dna:
  total         115,062,703     113,308,338     1.015x better
  on-trace       75,040,530      75,040,526     -- 
string-base64:
  total          54,715,023      54,930,403    *1.004x worse*
  on-trace        8,154,524       8,284,686    *1.016x worse*
string-fasta:
  total         128,834,143     123,272,117     1.045x better
  on-trace       26,543,717      24,522,074     1.082x better
string-tagcloud:
  total         271,755,909     273,760,672    *1.007x worse*
  on-trace        6,599,894       6,619,842    *1.003x worse*
string-unpack-code:
  total         272,905,648     270,018,630     1.011x better
  on-trace        5,623,065       5,635,935    *1.002x worse*
string-validate-input:
  total          82,123,948      81,759,623     1.004x better
  on-trace        7,716,212       7,635,908     1.011x better
-------
all:
  total       2,309,872,384   2,215,859,996     1.042x better
  on-trace      579,804,825     545,420,856     1.063x better


And on 64-bit:


------------------------------------------------------------------
instructions executed (nb: "on-trace" may overestimate slightly)
------------------------------------------------------------------
3d-cube:
  total         112,246,454     102,738,890     1.093x better
  on-trace       24,367,383      21,750,083     1.120x better
3d-morph:
  total          87,950,382      82,734,116     1.063x better
  on-trace       19,043,357      18,998,868     1.002x better
3d-raytrace:
  total         131,118,603     139,336,041    *1.063x worse*
  on-trace       26,034,418      23,487,793     1.108x better
access-binary-trees:
  total          68,015,381      68,593,308    *1.008x worse*
  on-trace       19,258,096      18,536,158     1.039x better
access-fannkuch:
  total         179,388,656     165,993,048     1.081x better
  on-trace      108,987,463      94,573,586     1.152x better
access-nbody:
  total          66,556,260      34,907,267     1.907x better
  on-trace       31,822,538      19,053,105     1.670x better
access-nsieve:
  total          56,947,918      55,577,976     1.025x better
  on-trace       29,487,353      27,176,201     1.085x better
bitops-3bit-bits-in-byte:
  total           8,102,558       8,229,053    *1.016x worse*
  on-trace        3,359,354       3,487,346    *1.038x worse*
bitops-bits-in-byte:
  total          39,940,673      40,194,733    *1.006x worse*
  on-trace       35,012,350      35,281,129    *1.008x worse*
bitops-bitwise-and:
  total          15,889,691      15,890,265     -- 
  on-trace       11,405,249      11,405,240     -- 
bitops-nsieve-bits:
  total          75,538,364      64,718,907     1.167x better
  on-trace       43,806,106      38,596,983     1.135x better
controlflow-recursive:
  total          39,354,198      39,494,683    *1.004x worse*
  on-trace       31,355,659      31,888,915    *1.017x worse*
crypto-md5:
  total          41,351,514      40,124,524     1.031x better
  on-trace        5,432,222       5,266,775     1.031x better
crypto-sha1:
  total          24,737,942      23,359,647     1.059x better
  on-trace        7,724,434       7,310,195     1.057x better
date-format-tofte:
  total         117,925,675     117,650,205     1.002x better
  on-trace       10,222,344      10,221,890     -- 
date-format-xparb:
  total          93,631,336      89,140,335     1.050x better
  on-trace       10,503,967      10,197,729     1.030x better
math-cordic:
  total          61,422,125      58,984,537     1.041x better
  on-trace       31,855,474      29,180,498     1.092x better
math-partial-sums:
  total          46,746,600      46,687,755     1.001x better
  on-trace        5,728,498       5,728,481     -- 
math-spectral-norm:
  total          22,907,898      21,180,846     1.082x better
  on-trace       12,198,111      11,199,071     1.089x better
regexp-dna:
  total         123,404,024     121,561,465     1.015x better
  on-trace       79,678,093      79,678,125     -- 
string-base64:
  total          41,480,003      41,643,495    *1.004x worse*
  on-trace        7,624,928       7,330,001     1.040x better
string-fasta:
  total         110,616,962     106,761,209     1.036x better
  on-trace       24,950,301      23,187,183     1.076x better
string-tagcloud:
  total         253,278,904     254,080,480    *1.003x worse*
  on-trace        4,302,116       4,286,346     1.004x better
string-unpack-code:
  total         285,306,910     283,878,443     1.005x better
  on-trace        3,152,869       3,249,849    *1.031x worse*
string-validate-input:
  total          70,718,075      71,243,661    *1.007x worse*
  on-trace        6,988,713       7,052,658    *1.009x worse*
-------
all:
  total       2,174,577,106   2,094,704,889     1.038x better
  on-trace      594,301,396     548,124,208     1.084x better


Nice!  Esp. on access-nbody.  If anyone wants more detail I can give
detailed differential profiles for any of the individual tests.  Looking
briefly, seems like we're mostly saving instructions on trace, and in
boxing/unboxing, and in not having to heap allocate doubles.
Comment on attachment 457627 [details] [diff] [review]
gc changes


> 
> JSBool
> js_InitGC(JSRuntime *rt, uint32 maxbytes)
> {
>+#if defined(XP_WIN) && defined(_M_X64)
>+    if (!InitNtAllocAPIs())
>+        return JS_FALSE;
>+#endif
>+    
>     InitGCArenaLists(rt);
> 
>     if (!rt->gcRootsHash.init(256))
>         return false;
> 

Where does this come from?

One thing:
In the code you are replacing TraceValues with MarkIDRange for example.
MarkIDRange does JS_SET_TRACING_INDEX and calls MarkID which does JS_SET_TRACING_NAME.
It seems that JS_SET_TRACING_NAME overwrites the index.

Otherwise looks great!
Attachment #457627 - Flags: review?(anygregor) → review+
Comment on attachment 457634 [details] [diff] [review]
tracer changes

No good deed goes unpunished -- r?'ing njn.
Attachment #457634 - Flags: review?(gal) → review?(nnethercote)
(In reply to comment #15)
> >+#if JS_BITS_PER_WORD == 32 || JS_BITS_PER_WORD == 64
> 
> Will this ever be false?

Beats me :)  I general, I tried to make sure that if it ever isn't, there would be a compile-time error at every point that needs to be addressed.  However, the primary reason is that (later) we might use different 64-bit schemes depending on whether your OS offers the equivalent of MAP_32BIT so that we can use nunboxing (which is slightly more efficient).  In that case, I'd want to find all the places where the architecture matters.  Crudely, I've tried to make this equivalent do 'grep "^#.*JS_BITS_PER_WORD"'.

> >+void
> >+TraceRecorder::set_array_fslot(LIns *obj_ins, unsigned slot, uint32 val)
> 
> Nit: This would be better named stobj_set_fslot_uint32() to match
> stobj_get_fslot_uint32().

The reason I didn't do that is because I didn't want set_array_fslot to bother writing the type tag, which, in the special case of an array's fslots, isn't necessary since all the array's fslots already have the correct tag.  It seems like the generality implied by the name stobj_set_fslot_uint32 would lead to future bugs.

> Nit: If you feel like it, the following would be a naming scheme that matches
> Nanojit better:

I'd rather not make that change right now, but do feel free like doing it at your leisure.  You don't have to worry about conflicting with me anymore :)

> I'll be looking closely
> at the generated LIR over the next days/weeks and will probably find more
> substantial things to improve.

Thanks!  For anything substantial, please feel free to file a separate bugs blocking this one.
(In reply to comment #16)
> >+#if defined(XP_WIN) && defined(_M_X64)
> >+    if (!InitNtAllocAPIs())
> >+        return JS_FALSE;
> >+#endif
> 
> Where does this come from?

Hah, good catch.  That is from when we were trying to use nunboxing on 64-bit, which we ripped out in lieu of the newer squished-nunboxing strategy.  I'm surprised that isn't a compile-time error...

> One thing:
> In the code you are replacing TraceValues with MarkIDRange for example.
> MarkIDRange does JS_SET_TRACING_INDEX and calls MarkID which does
> JS_SET_TRACING_NAME.
> It seems that JS_SET_TRACING_NAME overwrites the index.

MarkIdRange calls the binary overload of MarkId which does not do any JS_SET_TRACING_NAME.

Thanks for looking it over!
(In reply to comment #18)
> (In reply to comment #15)
> > >+#if JS_BITS_PER_WORD == 32 || JS_BITS_PER_WORD == 64
> > 
> > Will this ever be false?
> 
> Beats me :)

No, and IIRC we assume this elsewhere, and #error-check it too. If jstypes.h could do the #if/#elif/#else\n#error/#endif once for all others, then we could simplify everywhere else.

/be
Attached patch jsiter changesSplinter Review
Oops, forgot one.
Attachment #457999 - Flags: review?(gal)
Attached patch xdr changes v2Splinter Review
Attachment #457643 - Attachment is obsolete: true
Attachment #458000 - Flags: review?(igor)
(In reply to comment #20)
> (In reply to comment #18)
> > (In reply to comment #15)
> > > >+#if JS_BITS_PER_WORD == 32 || JS_BITS_PER_WORD == 64
> > > 
> > > Will this ever be false?
> > 
> > Beats me :)
> 
> No, and IIRC we assume this elsewhere, and #error-check it too. If jstypes.h
> could do the #if/#elif/#else\n#error/#endif once for all others, then we could
> simplify everywhere else.

But Luke's point about grep '^#.*JS_BITS_PER_WORD' is a good one. Could do both without copying ugly #error lines all over (do the #else\n#error ... in jstypes or another always-included place).

/be
Comment on attachment 458000 [details] [diff] [review]
xdr changes v2

>diff --git a/js/src/jsxdrapi.cpp b/js/src/jsxdrapi.cpp

>+enum XDRValueTag {
>+    XDRTAG_OBJECT  =             0x0,
>+    XDRTAG_INT     =             0x1,
>+    XDRTAG_DOUBLE  =             0x2,
>+    XDRTAG_STRING  =             0x4,
>+    XDRTAG_SPECIAL =             0x6,
>+    XDRTAG_XDRNULL =             0x8,
>+    XDRTAG_XDRVOID =             0xA
>+};

The last nit: the numerical values are leftover from the shallow jsval era. So either comment about that or, preferably, change the sequence to 0,1,2...  and bump XDR version.
Attachment #458000 - Flags: review?(igor) → review+
(In reply to comment #24)
Will do.
Comment on attachment 457634 [details] [diff] [review]
tracer changes

You might still like Gal to do a review of jstracer.cpp, I really just looked at the bits that I already understand well;  he knows more about the whole file than I do.
Attachment #457634 - Flags: review?(nnethercote) → review+
Attachment #457999 - Flags: review?(gal) → review+
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment on attachment 457628 [details] [diff] [review]
public API changes

r=me or rs=me (a combination; I've looked, even patched some) at this point!

/be
Attachment #457628 - Flags: review?(brendan) → review+
Comment on attachment 457653 [details] [diff] [review]
parser / emitter / script changes

Ditto last comment.

/be
Attachment #457653 - Flags: review?(brendan) → review+
You need to log in before you can comment on or make changes to this bug.