Closed Bug 483723 Opened 16 years ago Closed 14 years ago

Trace string.indexOf

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: bent.mozilla, Assigned: bent.mozilla)

Details

Attachments

(2 files, 4 obsolete files)

This is bumping us off trace in dromaeo.
Assignee: general → jwalden+bmo
OS: Mac OS X → All
Hardware: x86 → All
Attached patch Patch, v1 (obsolete) — Splinter Review
Traces indexOf and lastIndexOf.
Assignee: jwalden+bmo → bent.mozilla
Status: NEW → ASSIGNED
Attachment #367843 - Flags: review?(gal)
Attached patch Patch, v1.1 (obsolete) — Splinter Review
Removes two fallback cases.
Attachment #367843 - Attachment is obsolete: true
Attachment #367850 - Flags: review?(gal)
Attachment #367843 - Flags: review?(gal)
Comment on attachment 367850 [details] [diff] [review] Patch, v1.1 Nit: all traceable natives should end with _tn, so str_indexOfWithStat_tn (no need to add more underscores in the middle!). Is this patch up to date? I will let gal review in more detail. /be
Attachment #367850 - Flags: review?(brendan) → review+
(In reply to comment #3) > (From update of attachment 367850 [details] [diff] [review]) > Nit: all traceable natives should end with _tn, so str_indexOfWithStat_tn Er, str_indexOfWithStart_tn. /be
Attached patch Patch, v1.2 (obsolete) — Splinter Review
Updated with brendan's naming change.
Attachment #367850 - Attachment is obsolete: true
Attachment #368946 - Flags: review?(gal)
Attachment #367850 - Flags: review?(gal)
Attached patch Patch, v1.3Splinter Review
Removed unnecessary NaN check from indexOf.
Attachment #368946 - Attachment is obsolete: true
Attachment #368964 - Flags: review?(gal)
Attachment #368946 - Flags: review?(gal)
Comment on attachment 368964 [details] [diff] [review] Patch, v1.3 Can you add a couple test cases for it to trace-tests.js? Thanks.
Attachment #368964 - Flags: review?(gal) → review+
Whiteboard: fixed-in-tracemonkey
I backed this out of TM because jag landed a change to the indexOf algorithm that conflicts badly. I failed at merging it a few times, better to have the history clean instead of a questionable merge.
Keywords: 4xp
Whiteboard: fixed-in-tracemonkey
Keywords: 4xp
Let me know what you think of this approach. I think we can dispense with using |index| outside str_indexOf_common(...), but that's up to you.
Comment on attachment 369998 [details] [diff] [review] Patch, v1.4 (1.3 + merge + more refactoring) Ben could you take a look at this?
Attachment #369998 - Flags: review?(bent.mozilla)
Comment on attachment 369998 [details] [diff] [review] Patch, v1.4 (1.3 + merge + more refactoring) > str_indexOf(JSContext *cx, uintN argc, jsval *vp) > ... >+ d = js_DoubleToInteger(d); >+ if (d <= 0) >+ start = 0; >+ else if (d >= textlen) >+ start = textlen; >+ else >+ start = int32(d); These checks are going to be done in str_indexOf_common so no need for them here. Just do |start = int32(js_DoubleToInteger(d));| ? Same for the lastIndexOf case. >+str_indexOf_tn(JSString *str, JSString *str2) Why not just immediately |return str_indexOfWithStart_tn(str, str2, 0);| ? >+str_lastIndexOf_tn(JSString *str, JSString *str2) So here you could just |return str_lastIndexOfWithStart_tn(str, str2, JSSTRING_LENGTH(str));|. It doesn't look like this will be expensive to do twice and it cuts code duplication down a bit.
(In reply to comment #12) > (From update of attachment 369998 [details] [diff] [review]) > > str_indexOf(JSContext *cx, uintN argc, jsval *vp) > > ... > >+ d = js_DoubleToInteger(d); > >+ if (d <= 0) > >+ start = 0; > >+ else if (d >= textlen) > >+ start = textlen; > >+ else > >+ start = int32(d); > > These checks are going to be done in str_indexOf_common so no need for them > here. Just do |start = int32(js_DoubleToInteger(d));| ? Same for the > lastIndexOf case. If |d| is positive or negative infinity (or anything that won't fit in an int32; the infinity cases are explicitly tested for), that cast won't do what you'd want. I actually had made the change you suggested and had to undo it. Is there a simple way to cast a jsdouble to an int such that anything less than INT_MIN becomes that, and anything larger than INT_MAX becomes that? > >+str_indexOf_tn(JSString *str, JSString *str2) > > Why not just immediately |return str_indexOfWithStart_tn(str, str2, 0);| ? I wanted to avoid the unnecessary boundary checks, but perhaps it's not really worth the cost in code size. > >+str_lastIndexOf_tn(JSString *str, JSString *str2) > > So here you could just |return str_lastIndexOfWithStart_tn(str, str2, > JSSTRING_LENGTH(str));|. It doesn't look like this will be expensive to do > twice and it cuts code duplication down a bit. I could also pass in PR_INT32_MAX (either way will have us use textlen-patlen in the end). Again, was trying to avoid unnecessary boundary checks. What do you think of removing the |index| temporary var?
Or INT_MAX, for my last comment.
Attachment #369998 - Flags: review?(bent.mozilla) → review+
Comment on attachment 369998 [details] [diff] [review] Patch, v1.4 (1.3 + merge + more refactoring) > that cast won't do what you'd want. Ah, drat. That's unfortunate. > I wanted to avoid the unnecessary boundary checks, but perhaps it's not really > worth the cost in code size. I'd opt for reducing code complexity but a JS guru should really make that call. Brendan, sayrer? > What do you think of removing the |index| temporary var? I don't really care, personally. Advice from a JS person would be good here too. r=me whichever way you go. Thanks!
Attachment #369998 - Flags: superreview?(brendan)
Is this wanted1.9.1? /be
Flags: wanted1.9.1?
Review ping because this looks like it will help the blocker bug 480494.
Attachment #369998 - Flags: superreview?(brendan) → superreview-
Comment on attachment 369998 [details] [diff] [review] Patch, v1.4 (1.3 + merge + more refactoring) The int32* start / startPtr stuff is goofy, it requires more work in both callers and callee to synthesize the starting position some of the time, than if the caller always passed a jsint start param. Measure code size before prematurely optimizing at the price of code complexity. I bet getting rid of the indirection and maybe-null param testing would save more space than obligating the caller to pass start always would cost. /be
I was less worried about space than having to do some work needlessly, but there too I should measure if I really want to push that issue. I'll write up the simpler patch.
Attached patch Patch, v1.5Splinter Review
Attachment #369998 - Attachment is obsolete: true
Attachment #375567 - Flags: superreview?(brendan)
Attachment #375567 - Flags: review?(bent.mozilla)
Comment on attachment 375567 [details] [diff] [review] Patch, v1.5 This looks good to me.
Attachment #375567 - Flags: review?(bent.mozilla) → review+
Comment on attachment 375567 [details] [diff] [review] Patch, v1.5 >+JSValPtrToString(JSContext *cx, jsval *vp) > { > JSObject *obj; > JSString *str; > > if (JSVAL_IS_OBJECT(*vp)) { > obj = JSVAL_TO_OBJECT(*vp); > if (!obj) > return ATOM_TO_STRING(cx->runtime->atomState.nullAtom); > if (!OBJ_DEFAULT_VALUE(cx, obj, JSTYPE_STRING, vp)) > return NULL; > } > if (JSVAL_IS_STRING(*vp)) >@@ -267,16 +263,29 @@ ArgToRootedString(JSContext *cx, uintN a > str = js_NumberToString(cx, *JSVAL_TO_DOUBLE(*vp)); > } else if (JSVAL_IS_BOOLEAN(*vp)) { > return ATOM_TO_STRING(cx->runtime->atomState.booleanAtoms[ > JSVAL_TO_BOOLEAN(*vp)? 1 : 0]); Fix the else after return by setting str = ATOM_TO_STRING(...) and not returning early. Space nit before ? but really: don't use (x ? 1 : 0) where x is a boolean (0 or 1) already. > } else { > JS_ASSERT(JSVAL_IS_VOID(*vp)); > return ATOM_TO_STRING(cx->runtime->atomState.typeAtoms[JSTYPE_VOID]); > } >+ return str; Since *vp is never set, this function should take a jsval v parameter, not jsval *vp; therefore it should be named ValueToString. >+} >+ >+static JSString * >+ArgToRootedString(JSContext *cx, uintN argc, jsval *vp, uintN arg) >+{ >+ JSString *str; >+ >+ if (arg >= argc) >+ return ATOM_TO_STRING(cx->runtime->atomState.typeAtoms[JSTYPE_VOID]); >+ >+ vp += 2 + arg; >+ str = JSValPtrToString(cx, vp); > if (str) > *vp = STRING_TO_JSVAL(str); > return str; > } > > /* > * Forward declarations for URI encode/decode and helper routines > */ >@@ -635,16 +644,27 @@ JSClass js_StringClass = { > str = JSVAL_TO_STRING(vp[1]); \ > } else { \ > str = NormalizeThis(cx, vp); \ > if (!str) \ > return JS_FALSE; \ > } \ > JS_END_MACRO > >+#define ARG_TO_ROOTED_STRING(cx,argc,vp,arg,str) \ >+ JS_BEGIN_MACRO \ >+ if (argc > arg && JSVAL_IS_STRING(vp[2+arg])) { \ arg < argc for the first condition, to match number-line order used in ArgToRootedString. Don't add macros like this now, even though NORMALIZE_THIS is right above -- do use an inline static. >+inline jsint >+str_indexOf_common(const jschar *text, jsint textlen, >+ const jschar *pat, jsint patlen, >+ jsint start) >+{ >+ jsint i, lastMatchPos, index; >+ >+ /* last possible match position */ >+ lastMatchPos = textlen - patlen; >+ >+ /* any match attempts before this position are known to fail, skip them */ >+ if (start < 0) >+ start = 0; >+ >+ /* empty pattern always matches, but clamp to [0 .. textlen] */ Nit: comments shown above should be capitalized sentences (so full stops at end). >+ if (patlen == 0) { >+ if (start > textlen) >+ return textlen; >+ >+ return start; >+ } >+ >+ if (patlen > textlen || start > lastMatchPos) lastMatchPos means "greatest possible match position (index)" but it sounds like "last" in the "previous" sense. How about startmax to keep names consistently short and nocamelcaps? > static JSBool > str_indexOf(JSContext *cx, uintN argc, jsval *vp) > { > JSString *str, *str2; > const jschar *text, *pat; >+ jsint textlen, patlen, start; >+ jsdouble d; >+ >+ NORMALIZE_THIS(cx, vp, str); > text = JSSTRING_CHARS(str); >+ textlen = jsint(JSSTRING_LENGTH(str)); >+ >+ ARG_TO_ROOTED_STRING(cx, argc, vp, 0, str2); > pat = JSSTRING_CHARS(str2); >+ patlen = jsint(JSSTRING_LENGTH(str2)); >+ >+ start = 0; > if (argc > 1) { >+ if (JSVAL_IS_INT(vp[3])) { >+ start = JSVAL_TO_INT(vp[3]); >+ } else { >+ d = js_ValueToNumber(cx, &vp[3]); >+ if (JSVAL_IS_NULL(vp[3])) >+ return JS_FALSE; >+ d = js_DoubleToInteger(d); >+ if (d >= textlen) >+ start = textlen; >+ else if (d > 0) >+ start = jsint(d); >+ } >+ } >+ >+ *vp = INT_TO_JSVAL(str_indexOf_common(text, textlen, pat, patlen, start)); This patch adds redundant tests to the str_indexOf native, e.g. clamping start to textlen, but it seems necessary given the double domain vs. jsint parameter types. Just noting en passant. >+#ifdef JS_TRACER >+static int32 FASTCALL >+str_indexOfWithStart_tn(JSString *str, JSString *str2, int32 start) >+{ >+ const jschar *text, *pat; >+ jsint textlen, patlen; >+ >+ text = JSSTRING_CHARS(str); >+ textlen = jsint(JSSTRING_LENGTH(str)); >+ pat = JSSTRING_CHARS(str2); >+ patlen = jsint(JSSTRING_LENGTH(str2)); >+ >+ return int32(str_indexOf_common(text, textlen, pat, patlen, jsint(start))); >+} >+ >+static int32 FASTCALL >+str_indexOf_tn(JSString *str, JSString *str2) >+{ >+ return str_indexOfWithStart_tn(str, str2, 0); >+} >+#endif No clamping here, so the slightly redundancy for str_indexOf seems a good trade-off in favor of the traceable natives. >+ >+JS_DEFINE_TRCINFO_2(str_indexOf, >+ (3, (static, INT32, str_indexOfWithStart_tn, THIS_STRING, STRING, INT32, 0, 0)), >+ (2, (static, INT32, str_indexOf_tn, THIS_STRING, STRING, 0, 0))) >+ >+inline jsint >+str_lastIndexOf_common(const jschar *text, jsint textlen, >+ const jschar *pat, jsint patlen, >+ jsint start) >+{ >+ jsint i, lastMatchPos; >+ >+ /* last possible match position; is textlen for empty pattern */ >+ lastMatchPos = textlen - patlen; >+ >+ /* any match attempts after this position are known to fail, skip them */ >+ if (start > lastMatchPos) >+ start = lastMatchPos; >+ >+ /* empty pattern always matches, but clamp to [0 .. textlen] */ >+ if (patlen == 0) { >+ if (start < 0) >+ return 0; >+ >+ return start; >+ } Similar comments to above apply here. Will stamp r+ on a revised patch, no need to re-solicit bent review. /be
> Fix the else after return by setting str = ATOM_TO_STRING(...) and not > returning early. > > Space nit before ? but really: don't use (x ? 1 : 0) where x is a boolean > (0 or 1) already. > Since *vp is never set, this function should take a jsval v parameter, not > jsval *vp; therefore it should be named ValueToString. Heh, these are the parts of the patch I inherited (not an excuse). Will fix. > arg < argc for the first condition, to match number-line order used in > ArgToRootedString. > > Don't add macros like this now, even though NORMALIZE_THIS is right above > -- do use an inline static. I should've listened to that little voice nagging me about this. Mind if I switch the other macro too? > Nit: comments shown above should be capitalized sentences (so full stops at > end). Done. > lastMatchPos means "greatest possible match position (index)" but it sounds > like "last" in the "previous" sense. How about startmax to keep names > consistently short and nocamelcaps? Done. This file seems to mix nocamelcaps, no_camel_caps and camelCaps. It looked like nocamelcaps was only used for ***len. Should've looked around a bit more. I couldn't come up with a good name, but startmax works for me, though it's really the index after which to bail matching because it's known to fail. Any suggestion for a name for the largest index for which str.lastIndexOf(pat) could possibly match? (The bail index is of course 0). > This patch adds redundant tests to the str_indexOf native, e.g. clamping start > to textlen, but it seems necessary given the double domain vs. jsint parameter > types. Just noting en passant. Yeah. No clever way around this eh? > No clamping here, so the slightly redundancy for str_indexOf seems a good > trade-off in favor of the traceable natives. Well, in the old code there was no clamping at all unless a start position was specified. Now the common code always has to clamp. I could move the clamping out into the callers that need it at the cost of a bit of code duplication. Yea? Nay?
(In reply to comment #23) > Mind if I switch the other macro too? If it fits in the patch, i.e., doesn't increase the diffstats too badly, then I do not mind. > Any suggestion for a name for the largest index for which str.lastIndexOf(pat) > could possibly match? (The bail index is of course 0). startmax? :-P > > No clamping here, so the slightly redundancy for str_indexOf seems a good > > trade-off in favor of the traceable natives. > > Well, in the old code there was no clamping at all unless a start position was > specified. Now the common code always has to clamp. I could move the clamping > out into the callers that need it at the cost of a bit of code duplication. > Yea? Nay? Oh sure -- that would make the traceable native with the fewest args the fastest. We want that, we do not mind a bit of code dup in general, and it can give perf boosts due to reduced minimal-cost path wins and also better branch prediction opportunities. /be
> Since *vp is never set, this function should take a jsval v parameter, not > jsval *vp; therefore it should be named ValueToString. It looks like |OBJ_DEFAULT_VALUE| needs vp.
Is the |JSVAL_IS_STRING(...)| optimization in those macros worth it? I guess it optimizes for a common case and lets you avoid null-testing str when you know it can't be. I'm leaning towards dropping it though.
(In reply to comment #25) > > Since *vp is never set, this function should take a jsval v parameter, not > > jsval *vp; therefore it should be named ValueToString. > > It looks like |OBJ_DEFAULT_VALUE| needs vp. Good point. On the JSVAL_IS_STRING inline optimization, can you measure a difference? Micro-benchmark that traces or otherwise? /be
Flags: in-testsuite+
cvsroot/mozilla/js/tests/js1_8_1/trace/trace-test.js,v <-- trace-test.js new revision: 1.14; previous revision: 1.13 /cvsroot/mozilla/js/tests/shell.js,v <-- shell.js
Given that we now trace fastnatives, do we still want to make indexof/lastindexof traceable natives?
Comment on attachment 375567 [details] [diff] [review] Patch, v1.5 No sr? in js/src, and stale bug/patch. Please resolve if you can. /be
Attachment #375567 - Flags: superreview?(brendan)
Hrm, so I've unbitrotted the tracing bits of the patch, and there are the expected speed-ups on micro benchmarks, but is it worth the extra code? Does dromaeo still need this? Does anyone know of other benchmarks or real world uses that would benefit?
> Does dromaeo still need this? Define "need"? See comment 30. As for real-world or other benchmarks... http://www.multivax.com/last_question.html
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: