Closed
Bug 483723
Opened 16 years ago
Closed 14 years ago
Trace string.indexOf
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: bent.mozilla, Assigned: bent.mozilla)
Details
Attachments
(2 files, 4 obsolete files)
|
9.04 KB,
patch
|
gal
:
review+
|
Details | Diff | Splinter Review |
|
13.62 KB,
patch
|
bent.mozilla
:
review+
|
Details | Diff | Splinter Review |
This is bumping us off trace in dromaeo.
Updated•16 years ago
|
Assignee: general → jwalden+bmo
OS: Mac OS X → All
Hardware: x86 → All
| Assignee | ||
Comment 1•16 years ago
|
||
Traces indexOf and lastIndexOf.
| Assignee | ||
Comment 2•16 years ago
|
||
Removes two fallback cases.
Attachment #367843 -
Attachment is obsolete: true
Attachment #367850 -
Flags: review?(gal)
Attachment #367843 -
Flags: review?(gal)
| Assignee | ||
Updated•16 years ago
|
Attachment #367850 -
Flags: review?(brendan)
Comment 3•16 years ago
|
||
Comment on attachment 367850 [details] [diff] [review]
Patch, v1.1
Nit: all traceable natives should end with _tn, so str_indexOfWithStat_tn (no need to add more underscores in the middle!). Is this patch up to date? I will let gal review in more detail.
/be
Attachment #367850 -
Flags: review?(brendan) → review+
Comment 4•16 years ago
|
||
(In reply to comment #3)
> (From update of attachment 367850 [details] [diff] [review])
> Nit: all traceable natives should end with _tn, so str_indexOfWithStat_tn
Er, str_indexOfWithStart_tn.
/be
| Assignee | ||
Comment 5•16 years ago
|
||
Updated with brendan's naming change.
Attachment #367850 -
Attachment is obsolete: true
Attachment #368946 -
Flags: review?(gal)
Attachment #367850 -
Flags: review?(gal)
| Assignee | ||
Comment 6•16 years ago
|
||
Removed unnecessary NaN check from indexOf.
Attachment #368946 -
Attachment is obsolete: true
Attachment #368964 -
Flags: review?(gal)
Attachment #368946 -
Flags: review?(gal)
Comment 7•16 years ago
|
||
Comment on attachment 368964 [details] [diff] [review]
Patch, v1.3
Can you add a couple test cases for it to trace-tests.js? Thanks.
Attachment #368964 -
Flags: review?(gal) → review+
| Assignee | ||
Comment 8•16 years ago
|
||
Pushed to tracemonkey, http://hg.mozilla.org/tracemonkey/rev/3d4b09aa09bd
Whiteboard: fixed-in-tracemonkey
Comment 9•16 years ago
|
||
I backed this out of TM because jag landed a change to the indexOf algorithm that conflicts badly. I failed at merging it a few times, better to have the history clean instead of a questionable merge.
Keywords: 4xp
Whiteboard: fixed-in-tracemonkey
Comment 10•16 years ago
|
||
Let me know what you think of this approach.
I think we can dispense with using |index| outside str_indexOf_common(...), but that's up to you.
Comment 11•16 years ago
|
||
Comment on attachment 369998 [details] [diff] [review]
Patch, v1.4 (1.3 + merge + more refactoring)
Ben could you take a look at this?
Attachment #369998 -
Flags: review?(bent.mozilla)
| Assignee | ||
Comment 12•16 years ago
|
||
Comment on attachment 369998 [details] [diff] [review]
Patch, v1.4 (1.3 + merge + more refactoring)
> str_indexOf(JSContext *cx, uintN argc, jsval *vp)
> ...
>+ d = js_DoubleToInteger(d);
>+ if (d <= 0)
>+ start = 0;
>+ else if (d >= textlen)
>+ start = textlen;
>+ else
>+ start = int32(d);
These checks are going to be done in str_indexOf_common so no need for them here. Just do |start = int32(js_DoubleToInteger(d));| ? Same for the lastIndexOf case.
>+str_indexOf_tn(JSString *str, JSString *str2)
Why not just immediately |return str_indexOfWithStart_tn(str, str2, 0);| ?
>+str_lastIndexOf_tn(JSString *str, JSString *str2)
So here you could just |return str_lastIndexOfWithStart_tn(str, str2, JSSTRING_LENGTH(str));|. It doesn't look like this will be expensive to do twice and it cuts code duplication down a bit.
Comment 13•16 years ago
|
||
(In reply to comment #12)
> (From update of attachment 369998 [details] [diff] [review])
> > str_indexOf(JSContext *cx, uintN argc, jsval *vp)
> > ...
> >+ d = js_DoubleToInteger(d);
> >+ if (d <= 0)
> >+ start = 0;
> >+ else if (d >= textlen)
> >+ start = textlen;
> >+ else
> >+ start = int32(d);
>
> These checks are going to be done in str_indexOf_common so no need for them
> here. Just do |start = int32(js_DoubleToInteger(d));| ? Same for the
> lastIndexOf case.
If |d| is positive or negative infinity (or anything that won't fit in an int32; the infinity cases are explicitly tested for), that cast won't do what you'd want. I actually had made the change you suggested and had to undo it. Is there a simple way to cast a jsdouble to an int such that anything less than INT_MIN becomes that, and anything larger than INT_MAX becomes that?
> >+str_indexOf_tn(JSString *str, JSString *str2)
>
> Why not just immediately |return str_indexOfWithStart_tn(str, str2, 0);| ?
I wanted to avoid the unnecessary boundary checks, but perhaps it's not really worth the cost in code size.
> >+str_lastIndexOf_tn(JSString *str, JSString *str2)
>
> So here you could just |return str_lastIndexOfWithStart_tn(str, str2,
> JSSTRING_LENGTH(str));|. It doesn't look like this will be expensive to do
> twice and it cuts code duplication down a bit.
I could also pass in PR_INT32_MAX (either way will have us use textlen-patlen in the end). Again, was trying to avoid unnecessary boundary checks.
What do you think of removing the |index| temporary var?
Comment 14•16 years ago
|
||
Or INT_MAX, for my last comment.
| Assignee | ||
Updated•16 years ago
|
Attachment #369998 -
Flags: review?(bent.mozilla) → review+
| Assignee | ||
Comment 15•16 years ago
|
||
Comment on attachment 369998 [details] [diff] [review]
Patch, v1.4 (1.3 + merge + more refactoring)
> that cast won't do what you'd want.
Ah, drat. That's unfortunate.
> I wanted to avoid the unnecessary boundary checks, but perhaps it's not really
> worth the cost in code size.
I'd opt for reducing code complexity but a JS guru should really make that call. Brendan, sayrer?
> What do you think of removing the |index| temporary var?
I don't really care, personally. Advice from a JS person would be good here too.
r=me whichever way you go. Thanks!
Updated•16 years ago
|
Attachment #369998 -
Flags: superreview?(brendan)
Comment 17•16 years ago
|
||
Review ping because this looks like it will help the blocker bug 480494.
Updated•16 years ago
|
Attachment #369998 -
Flags: superreview?(brendan) → superreview-
Comment 18•16 years ago
|
||
Comment on attachment 369998 [details] [diff] [review]
Patch, v1.4 (1.3 + merge + more refactoring)
The int32* start / startPtr stuff is goofy, it requires more work in both callers and callee to synthesize the starting position some of the time, than if the caller always passed a jsint start param.
Measure code size before prematurely optimizing at the price of code complexity. I bet getting rid of the indirection and maybe-null param testing would save more space than obligating the caller to pass start always would cost.
/be
Comment 19•16 years ago
|
||
I was less worried about space than having to do some work needlessly, but there too I should measure if I really want to push that issue. I'll write up the simpler patch.
Comment 20•16 years ago
|
||
Attachment #369998 -
Attachment is obsolete: true
Attachment #375567 -
Flags: superreview?(brendan)
Attachment #375567 -
Flags: review?(bent.mozilla)
| Assignee | ||
Comment 21•16 years ago
|
||
Comment on attachment 375567 [details] [diff] [review]
Patch, v1.5
This looks good to me.
Attachment #375567 -
Flags: review?(bent.mozilla) → review+
Comment 22•16 years ago
|
||
Comment on attachment 375567 [details] [diff] [review]
Patch, v1.5
>+JSValPtrToString(JSContext *cx, jsval *vp)
> {
> JSObject *obj;
> JSString *str;
>
> if (JSVAL_IS_OBJECT(*vp)) {
> obj = JSVAL_TO_OBJECT(*vp);
> if (!obj)
> return ATOM_TO_STRING(cx->runtime->atomState.nullAtom);
> if (!OBJ_DEFAULT_VALUE(cx, obj, JSTYPE_STRING, vp))
> return NULL;
> }
> if (JSVAL_IS_STRING(*vp))
>@@ -267,16 +263,29 @@ ArgToRootedString(JSContext *cx, uintN a
> str = js_NumberToString(cx, *JSVAL_TO_DOUBLE(*vp));
> } else if (JSVAL_IS_BOOLEAN(*vp)) {
> return ATOM_TO_STRING(cx->runtime->atomState.booleanAtoms[
> JSVAL_TO_BOOLEAN(*vp)? 1 : 0]);
Fix the else after return by setting str = ATOM_TO_STRING(...) and not returning early.
Space nit before ? but really: don't use (x ? 1 : 0) where x is a boolean (0 or 1) already.
> } else {
> JS_ASSERT(JSVAL_IS_VOID(*vp));
> return ATOM_TO_STRING(cx->runtime->atomState.typeAtoms[JSTYPE_VOID]);
> }
>+ return str;
Since *vp is never set, this function should take a jsval v parameter, not jsval *vp; therefore it should be named ValueToString.
>+}
>+
>+static JSString *
>+ArgToRootedString(JSContext *cx, uintN argc, jsval *vp, uintN arg)
>+{
>+ JSString *str;
>+
>+ if (arg >= argc)
>+ return ATOM_TO_STRING(cx->runtime->atomState.typeAtoms[JSTYPE_VOID]);
>+
>+ vp += 2 + arg;
>+ str = JSValPtrToString(cx, vp);
> if (str)
> *vp = STRING_TO_JSVAL(str);
> return str;
> }
>
> /*
> * Forward declarations for URI encode/decode and helper routines
> */
>@@ -635,16 +644,27 @@ JSClass js_StringClass = {
> str = JSVAL_TO_STRING(vp[1]); \
> } else { \
> str = NormalizeThis(cx, vp); \
> if (!str) \
> return JS_FALSE; \
> } \
> JS_END_MACRO
>
>+#define ARG_TO_ROOTED_STRING(cx,argc,vp,arg,str) \
>+ JS_BEGIN_MACRO \
>+ if (argc > arg && JSVAL_IS_STRING(vp[2+arg])) { \
arg < argc for the first condition, to match number-line order used in ArgToRootedString.
Don't add macros like this now, even though NORMALIZE_THIS is right above -- do use an inline static.
>+inline jsint
>+str_indexOf_common(const jschar *text, jsint textlen,
>+ const jschar *pat, jsint patlen,
>+ jsint start)
>+{
>+ jsint i, lastMatchPos, index;
>+
>+ /* last possible match position */
>+ lastMatchPos = textlen - patlen;
>+
>+ /* any match attempts before this position are known to fail, skip them */
>+ if (start < 0)
>+ start = 0;
>+
>+ /* empty pattern always matches, but clamp to [0 .. textlen] */
Nit: comments shown above should be capitalized sentences (so full stops at end).
>+ if (patlen == 0) {
>+ if (start > textlen)
>+ return textlen;
>+
>+ return start;
>+ }
>+
>+ if (patlen > textlen || start > lastMatchPos)
lastMatchPos means "greatest possible match position (index)" but it sounds like "last" in the "previous" sense. How about startmax to keep names consistently short and nocamelcaps?
> static JSBool
> str_indexOf(JSContext *cx, uintN argc, jsval *vp)
> {
> JSString *str, *str2;
> const jschar *text, *pat;
>+ jsint textlen, patlen, start;
>+ jsdouble d;
>+
>+ NORMALIZE_THIS(cx, vp, str);
> text = JSSTRING_CHARS(str);
>+ textlen = jsint(JSSTRING_LENGTH(str));
>+
>+ ARG_TO_ROOTED_STRING(cx, argc, vp, 0, str2);
> pat = JSSTRING_CHARS(str2);
>+ patlen = jsint(JSSTRING_LENGTH(str2));
>+
>+ start = 0;
> if (argc > 1) {
>+ if (JSVAL_IS_INT(vp[3])) {
>+ start = JSVAL_TO_INT(vp[3]);
>+ } else {
>+ d = js_ValueToNumber(cx, &vp[3]);
>+ if (JSVAL_IS_NULL(vp[3]))
>+ return JS_FALSE;
>+ d = js_DoubleToInteger(d);
>+ if (d >= textlen)
>+ start = textlen;
>+ else if (d > 0)
>+ start = jsint(d);
>+ }
>+ }
>+
>+ *vp = INT_TO_JSVAL(str_indexOf_common(text, textlen, pat, patlen, start));
This patch adds redundant tests to the str_indexOf native, e.g. clamping start to textlen, but it seems necessary given the double domain vs. jsint parameter types. Just noting en passant.
>+#ifdef JS_TRACER
>+static int32 FASTCALL
>+str_indexOfWithStart_tn(JSString *str, JSString *str2, int32 start)
>+{
>+ const jschar *text, *pat;
>+ jsint textlen, patlen;
>+
>+ text = JSSTRING_CHARS(str);
>+ textlen = jsint(JSSTRING_LENGTH(str));
>+ pat = JSSTRING_CHARS(str2);
>+ patlen = jsint(JSSTRING_LENGTH(str2));
>+
>+ return int32(str_indexOf_common(text, textlen, pat, patlen, jsint(start)));
>+}
>+
>+static int32 FASTCALL
>+str_indexOf_tn(JSString *str, JSString *str2)
>+{
>+ return str_indexOfWithStart_tn(str, str2, 0);
>+}
>+#endif
No clamping here, so the slightly redundancy for str_indexOf seems a good trade-off in favor of the traceable natives.
>+
>+JS_DEFINE_TRCINFO_2(str_indexOf,
>+ (3, (static, INT32, str_indexOfWithStart_tn, THIS_STRING, STRING, INT32, 0, 0)),
>+ (2, (static, INT32, str_indexOf_tn, THIS_STRING, STRING, 0, 0)))
>+
>+inline jsint
>+str_lastIndexOf_common(const jschar *text, jsint textlen,
>+ const jschar *pat, jsint patlen,
>+ jsint start)
>+{
>+ jsint i, lastMatchPos;
>+
>+ /* last possible match position; is textlen for empty pattern */
>+ lastMatchPos = textlen - patlen;
>+
>+ /* any match attempts after this position are known to fail, skip them */
>+ if (start > lastMatchPos)
>+ start = lastMatchPos;
>+
>+ /* empty pattern always matches, but clamp to [0 .. textlen] */
>+ if (patlen == 0) {
>+ if (start < 0)
>+ return 0;
>+
>+ return start;
>+ }
Similar comments to above apply here.
Will stamp r+ on a revised patch, no need to re-solicit bent review.
/be
Comment 23•16 years ago
|
||
> Fix the else after return by setting str = ATOM_TO_STRING(...) and not
> returning early.
>
> Space nit before ? but really: don't use (x ? 1 : 0) where x is a boolean
> (0 or 1) already.
> Since *vp is never set, this function should take a jsval v parameter, not
> jsval *vp; therefore it should be named ValueToString.
Heh, these are the parts of the patch I inherited (not an excuse). Will fix.
> arg < argc for the first condition, to match number-line order used in
> ArgToRootedString.
>
> Don't add macros like this now, even though NORMALIZE_THIS is right above
> -- do use an inline static.
I should've listened to that little voice nagging me about this.
Mind if I switch the other macro too?
> Nit: comments shown above should be capitalized sentences (so full stops at
> end).
Done.
> lastMatchPos means "greatest possible match position (index)" but it sounds
> like "last" in the "previous" sense. How about startmax to keep names
> consistently short and nocamelcaps?
Done. This file seems to mix nocamelcaps, no_camel_caps and camelCaps. It looked like nocamelcaps was only used for ***len. Should've looked around a bit more. I couldn't come up with a good name, but startmax works for me, though it's really the index after which to bail matching because it's known to fail.
Any suggestion for a name for the largest index for which str.lastIndexOf(pat) could possibly match? (The bail index is of course 0).
> This patch adds redundant tests to the str_indexOf native, e.g. clamping start
> to textlen, but it seems necessary given the double domain vs. jsint parameter
> types. Just noting en passant.
Yeah. No clever way around this eh?
> No clamping here, so the slightly redundancy for str_indexOf seems a good
> trade-off in favor of the traceable natives.
Well, in the old code there was no clamping at all unless a start position was specified. Now the common code always has to clamp. I could move the clamping out into the callers that need it at the cost of a bit of code duplication. Yea? Nay?
Comment 24•16 years ago
|
||
(In reply to comment #23)
> Mind if I switch the other macro too?
If it fits in the patch, i.e., doesn't increase the diffstats too badly, then I do not mind.
> Any suggestion for a name for the largest index for which str.lastIndexOf(pat)
> could possibly match? (The bail index is of course 0).
startmax? :-P
> > No clamping here, so the slightly redundancy for str_indexOf seems a good
> > trade-off in favor of the traceable natives.
>
> Well, in the old code there was no clamping at all unless a start position was
> specified. Now the common code always has to clamp. I could move the clamping
> out into the callers that need it at the cost of a bit of code duplication.
> Yea? Nay?
Oh sure -- that would make the traceable native with the fewest args the fastest. We want that, we do not mind a bit of code dup in general, and it can give perf boosts due to reduced minimal-cost path wins and also better branch prediction opportunities.
/be
Comment 25•16 years ago
|
||
> Since *vp is never set, this function should take a jsval v parameter, not
> jsval *vp; therefore it should be named ValueToString.
It looks like |OBJ_DEFAULT_VALUE| needs vp.
Comment 26•16 years ago
|
||
Is the |JSVAL_IS_STRING(...)| optimization in those macros worth it? I guess it optimizes for a common case and lets you avoid null-testing str when you know it can't be. I'm leaning towards dropping it though.
Comment 27•16 years ago
|
||
(In reply to comment #25)
> > Since *vp is never set, this function should take a jsval v parameter, not
> > jsval *vp; therefore it should be named ValueToString.
>
> It looks like |OBJ_DEFAULT_VALUE| needs vp.
Good point.
On the JSVAL_IS_STRING inline optimization, can you measure a difference? Micro-benchmark that traces or otherwise?
/be
Comment 28•16 years ago
|
||
js1_8_1/trace/trace-test.js
http://hg.mozilla.org/tracemonkey/rev/61892f57b46a
Flags: in-testsuite+
Comment 29•16 years ago
|
||
cvsroot/mozilla/js/tests/js1_8_1/trace/trace-test.js,v <-- trace-test.js
new revision: 1.14; previous revision: 1.13
/cvsroot/mozilla/js/tests/shell.js,v <-- shell.js
Comment 30•16 years ago
|
||
Given that we now trace fastnatives, do we still want to make indexof/lastindexof traceable natives?
Comment 31•15 years ago
|
||
Comment on attachment 375567 [details] [diff] [review]
Patch, v1.5
No sr? in js/src, and stale bug/patch. Please resolve if you can.
/be
Attachment #375567 -
Flags: superreview?(brendan)
Comment 32•15 years ago
|
||
Hrm, so I've unbitrotted the tracing bits of the patch, and there are the expected speed-ups on micro benchmarks, but is it worth the extra code?
Does dromaeo still need this? Does anyone know of other benchmarks or real world uses that would benefit?
Comment 33•15 years ago
|
||
> Does dromaeo still need this?
Define "need"? See comment 30.
As for real-world or other benchmarks... http://www.multivax.com/last_question.html
Updated•14 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•