Closed Bug 496674 Opened 15 years ago Closed 15 years ago

TM: replace FORALL macro with template magic

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: gal, Assigned: graydon)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(4 files, 7 obsolete files)

      No description provided.
Flags: wanted1.9.2?
Attached patch first steps (obsolete) — Splinter Review
Attached patch latest and greatest (obsolete) — Splinter Review
Attachment #381900 - Attachment is obsolete: true
Assignee: gal → graydon
Attached patch Simplify and extend gal's patch (obsolete) — Splinter Review
This still doesn't cover all uses of FORALL_* yet, but it tidies up the approach gal was taking. In particular:

  - No more alloca'ing temporary frame* buffers; uses recursive frame iteration 
    with early return when visitation stops.

  - No more code duplication in NativeStackOffset; uses the visitor pattern but
    iterates exactly as many times as the old code did. Different signature on
    the visitor permits address-range comparisons and arithmetic, early returns.

  - No more SlotKind enum, just two different visitor callbacks. No waste when
    debugging spew is turned off.

My measurements show this more or less identical to trunk on this laptop. I'll carry on replacing macro call sites with new visitors and profiling / shaving on machines with more stable timing tomorrow. But I think this approach might work well.
Attachment #382984 - Attachment is obsolete: true
Attached patch More progress on this (obsolete) — Splinter Review
Further updates. Modify visitors to mFoo style, extend to cover the TraceRecorder::import() cases of FORALL, adjust signatures, split a visitor, tidy up, etc.

Compiled binary is 83 bytes smaller and runs at (as far as I can tell) exactly the same speed. It *should* be producing identical code. Will carry on extending to cover remaining FORALL uses tomorrow.
Attachment #383409 - Attachment is obsolete: true
Attached patch victorySplinter Review
This patch completes the work, deleting the offending macros entirely. 

Passes trace tests and v8. Still hits 82 bytes less size and identical-looking speed. Throwing it into the try servers now. 

Note that while it is a net addition in LOC, it moves several ugly chunks out of the middle of a number of overlong functions with very delicate logic and into independent helpers, so I think it improves the comprehensibility of the code.

(Subject to an enormous grain of salt perhaps.)
Attachment #384033 - Attachment is obsolete: true
Attachment #384202 - Flags: review?(gal)
Comment on attachment 384202 [details] [diff] [review]
victory

>+template <typename Visitor>
>+JS_REQUIRES_STACK static bool
>+visitFrameSlots(Visitor &visitor, unsigned depth, JSStackFrame *fp,
>+                JSStackFrame *up) {
>+

Drive by nits: VisitFrameSlots (capitalize for StudlyCaps), and left brace on that blank line in column 1 of course.

>+    if (depth > 0 &&
>+        !visitFrameSlots(visitor, depth-1, fp->down, fp))
>+        return false;

Overbrace single-line consequent due to multiline condition -- or pull that second line of the condition up if the 100th column is not violated (except by a newline).

>+    if (!visitor.visitStackSlots(StackBase(fp),
>+                                 size_t(fp->regs->sp - StackBase(fp)),
>+                                 fp))
>+        return false;

Ditto on overbracing single-line consequent due to multiline condition or else clause.

etc. for the recurring nits like this.

Nice work, great to get rid of that macro. It cost gal@uci.edu and me (at least; probably others) more gray hair than we care to admit.

/be
Comment on attachment 384202 [details] [diff] [review]
victory

>+    if (fp->callee) {
>+        if (depth == 0) {
>+#ifdef JS_JIT_SPEW
>+            visitor.setStackSlotKind("args");
>+#endif
>+            if (!visitor.visitStackSlots(&fp->argv[-2], argSlots(fp) + 2, fp))
>+                return false;
>+        }
>+#ifdef JS_JIT_SPEW
>+        visitor.setStackSlotKind("var");
>+#endif

These recurrent #ifdef JS_JIT_SPEW\n    something();\n#endif\n lines are ripe for a JIT_SPEW_ONLY(something()); macro.

These kinds of #ifdef reduction macros are winning, even in our modern template era. Unless of course (like debug_only, debug_only_v, etc. -- argh!) they do not expand properly to a warning-free useless expression such as ((void)0) or equivalent in the not-defined case (with debug_only etc. this requires any statement-terminating ; to be placed inside the parens just before the right paren! Do not clone this misfeatures).

/be
Comment on attachment 384202 [details] [diff] [review]
victory

>+template <typename Visitor>
>+JS_REQUIRES_STACK static JS_ALWAYS_INLINE void
>+visitSlots(Visitor& visitor, JSContext* cx, JSObject* globalObj,
>+           unsigned callDepth, unsigned ngslots, uint16* gslots) {
>+    if (visitStackSlots(visitor, cx, callDepth))
>+        visitGlobalSlots(visitor, cx, globalObj, ngslots, gslots);

Nothing seems to use the return value of visitStackSlots (or of visitSlots for that matter) and I can't see why anyone would want visitStackSlots returning false (due to stopAt frame clamping) to prevent global slots from being visited. The void return type reinforces the suspicion that this should just call the visitStackSlots and then visitGlobalSlots, as the old macro did. Hope I'm not missing something here!

/be
(In reply to comment #8)
> These recurrent #ifdef JS_JIT_SPEW\n    something();\n#endif\n lines are ripe
> for a JIT_SPEW_ONLY(something()); macro.

I haven't looked too far down in the patch yet, but the ifdefs can certainly reside inside the implementations of those functions in the visitor objects themselves, as occurs with JSStackFrame::assertValidStackDepth right now, at no loss of efficiency.
(In reply to comment #9)

> Nothing seems to use the return value of visitStackSlots (or of visitSlots for
> that matter) and I can't see why anyone would want visitStackSlots returning
> false (due to stopAt frame clamping) to prevent global slots from being
> visited. The void return type reinforces the suspicion that this should just
> call the visitStackSlots and then visitGlobalSlots, as the old macro did. Hope
> I'm not missing something here!

Existing uses of the macro involve early returns or gotos to skip the global crawl if the stack crawl stops early (the visitor isn't always clamping, sometimes it's attempting to check some fact holds against "all slots in the trace"). So the predicated global crawl in visitSlots is only an attempt to preserve functionality. But you're right that nobody uses the return value of the *outer* template function visitStackSlots. I'll remove that.

(In reply to comment #10)

> I haven't looked too far down in the patch yet, but the ifdefs can certainly
> reside inside the implementations of those functions in the visitor objects
> themselves, as occurs with JSStackFrame::assertValidStackDepth right now, at no
> loss of efficiency.

Hm, perhaps in this case that is so. I'll poke at it.
Fwiw, full build success on builds and only a scattering of warnings that look like someone else (video timeouts and a looks-like busted vm) on unit tests from the try servers.
(in reply to comment #6)

> so I think it improves the comprehensibility of the code.

Not to mention, improves the navigability of the code from the point
of view of debugging tools (the presence of line number info).
Attachment #384202 - Flags: review?(gal) → review+
Comment on attachment 384202 [details] [diff] [review]
victory

mCx? mTypeMap? mPtr? Did you work for Microsoft before? Is this some funky name mangling convention I don't know about? :) I usually just use cx, since cx(cx) is well defined in the constructor init. But up to you.


>+class CaptureTypesVisitor :
>+    public SlotVisitorBase {
>+    JSContext* mCx;
>+    uint8* mTypeMap;
>+    uint8* mPtr;
>+
>+public:
>+    JS_ALWAYS_INLINE CaptureTypesVisitor(JSContext* cx, uint8* typeMap) :
>+        mCx(cx),
>+        mTypeMap(typeMap),
>+        mPtr(typeMap)
>+    {}
>+

The code is a bit redundant here. This is why I merged both into one visitor function, but I am ok with either.

>+    JS_ALWAYS_INLINE void visitGlobalSlot(jsval *vp, unsigned n,
>+                                          unsigned slot) {
>+            uint8 type = getCoercedType(*vp);
>+            if ((type == JSVAL_INT) &&
>+                oracle.isGlobalSlotUndemotable(mCx, slot))
>+                type = JSVAL_DOUBLE;
>+            JS_ASSERT(type != JSVAL_BOXED);
>+            debug_only_v(nj_dprintf("capture type global%d: %d=%c\n",
>+                                    n, type, typeChar[type]);)
>+            *mPtr++ = type;
>+    }
>+
>+    JS_ALWAYS_INLINE bool visitStackSlots(jsval *vp, int count,
>+                                          JSStackFrame* fp) {
>+        for (int i = 0; i < count; ++i) {
>+            uint8 type = getCoercedType(vp[i]);
>+            if ((type == JSVAL_INT) &&
>+                oracle.isStackSlotUndemotable(mCx, length()))
>+                type = JSVAL_DOUBLE;
>+            JS_ASSERT(type != JSVAL_BOXED);
>+            debug_only_v(nj_dprintf("capture type %s%d: %d=%c\n",
>+                                    mStackSlotKind, i, type, typeChar[type]);)
>+            *mPtr++ = type;
>+        }
>+        return true;
>+    }
>+
>+    JS_ALWAYS_INLINE uintptr_t length() {
>+        return mPtr - mTypeMap;
>+    }
>+};
>+
(In reply to comment #14)
> (From update of attachment 384202 [details] [diff] [review])
> mCx? mTypeMap? mPtr? Did you work for Microsoft before? Is this some funky name
> mangling convention I don't know about? :) I usually just use cx, since cx(cx)
> is well defined in the constructor init. But up to you.

Heh. No, I work for Mozilla, and this is the Mozilla C++ member-naming convention. See also 6000+ other classes we maintain, including the recently-landed JSString class in jsstr.h. I hope you don't cry too much when I start posting patches to clean up the member-naming for the rest of jstracer.cpp! :)
http://hg.mozilla.org/tracemonkey/rev/5e1b444c803c
Whiteboard: fixed-in-tracemonkey
Meh, backed out. Made static analysis fail. REQUIRE_STACK issue.
Add this to the previous patch and the static analysis tinderbox should be happy. Sorry, forgot to check my dehydra builddir.
Attachment #384777 - Flags: review?(gal)
Comment on attachment 384777 [details] [diff] [review]
Fix the static analysis aspect.

r=me, assuming that perf is not regressed.
Attachment #384777 - Flags: review?(gal) → review+
Flags: wanted1.9.2? → wanted1.9.2+
The SS compare script on mac said perf was +/- noise. A couple faster, a couple slower. Nothing conclusive.

http://hg.mozilla.org/tracemonkey/rev/6d41b2eaaae1
Attached patch nit police raid (obsolete) — Splinter Review
The StudlyCaps naming convention for static top-level functions seems beneficial, you can distinguish methods that otherwise have the same name when searching.

/be
Attachment #384973 - Flags: review?(graydon)
Comment on attachment 384973 [details] [diff] [review]
nit police raid

Sorry I missed this during review-processing. Thanks.
Attachment #384973 - Flags: review?(graydon) → review+
I forgot to police the other nits mentioned in this bug...

/be
Attachment #384973 - Attachment is obsolete: true
Attachment #385004 - Flags: review?(graydon)
Attachment #384975 - Attachment is obsolete: true
Attachment #385004 - Flags: review?(graydon) → review+
http://hg.mozilla.org/mozilla-central/rev/5e1b444c803c
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Flags: in-testsuite-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: