Closed Bug 503952 Opened 11 years ago Closed 10 years ago

replace JSStringBuffer/JSCharBuffer with JSTempVector

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta1-fixed

People

(Reporter: luke, Assigned: luke)

References

Details

Attachments

(6 files, 16 obsolete files)

67.93 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
28.73 KB, patch
sayrer
: review+
Details | Diff | Splinter Review
58.58 KB, patch
brendan
: review+
Details | Diff | Splinter Review
33.59 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
45.27 KB, patch
brendan
: review+
Details | Diff | Splinter Review
839 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
JSStringBuffer like the new JSTempVector except that (1) JSStringBuffer is specific to jschars while JSTempVector is generic, and (2) JSTempVector is a C++-style container, so when it is an auto variable, the destructor call is implicit and thus less bug-prone.  I also plan to update JSTempVector as needed.

JSStringBuffer is only used in json.cpp and jsxml.cpp and JSCharBuffer only in jsstr.cpp.  JSTempVector should have similar performance, but its template-y inlined-ness may yield a small speedup.

I'll break the update into separate patches for each piece updated.
fyi: performance testing showed speedups for the same JSON benchmarks Brendan and I used on json.cpp last time.
Attached patch WIP v.1 (obsolete) — Splinter Review
This patch:
 - replaces JSStringBuffer in jscon.cpp.  On sayrer's json benchmarks (financial, test-pattern, tinderbox), this runs 4.5%, 17%, and 30% faster, respectively.  Some (most?) of this speedup is related to removing indirection by calling JSTempVector directly and calling the user's callback only once.
 - adds the ability for JSTempVector<X,N> to store first N X's in-place before switching to dynamic memory.  Most benchmarks don't show any difference because they use huge lists, but if you shrink the list size you can see a definite speedup. (e.g. in peacekeeper's array join, changing the array sizes to 4, 8, and 16 (instead of 10, 100, 500) you see a 20% speedup.

Because with bugs like bug 503981 flying around, who *doesn't* want to use JSTempVector! :-/
Attached patch WIP v.2 (obsolete) — Splinter Review
Attachment #388371 - Attachment is obsolete: true
This patch replaces JSStringBuffer in jsxml.cpp.

I know E4X is on its way to self-hosted, so perf isn't important, but as a general note: there are a number of uses of JSStringBuffer which postpone OOM discovery until after a batch of operations.  JSTempVector requires testing every operation.  This is more verbose and perhaps slower because of branching (branch prediction may make it go away, I don't know).  This suggests adding a new "fail" member which is set when any JSTempVector operation returns false.  For now I'll leave it as is, but it might be useful; I'll see if it comes up again and I'd be happy to hear comments.
Attached patch whole thing (obsolete) — Splinter Review
This patch replaces JSCharBuffer, used only in jsstr.cpp.  I ran some simple performance tests of decodeURI(encodeURI(s)) for tiny, medium, and long strings and got speedups of 8%, 5%, and 1%, respectively.

I also think JSCharBuffer is a good name for the typedef, in jsprvtd.h:

  typedef JSVector<jschar, 32> JSCharBuffer

(since "StringBuffer" might imply an automatic terminating '\0'), so I substituted JSCharBuffer for JSTempVector throughout.  This touch most files so I included the sum of all the patches so far.
Attachment #388569 - Attachment is obsolete: true
Attachment #388725 - Attachment is obsolete: true
(In reply to comment #4)
> I know E4X is on its way to self-hosted, so perf isn't important, but as a
> general note: there are a number of uses of JSStringBuffer which postpone OOM
> discovery until after a batch of operations.  JSTempVector requires testing
> every operation.  This is more verbose and perhaps slower because of branching
> (branch prediction may make it go away, I don't know).  This suggests adding a
> new "fail" member which is set when any JSTempVector operation returns false. 

See bug 489419 and follow the links. There's a trap here, which is if you sometimes skip the error checks, even if the abstraction is encapsulated well enough to cope, you may skip the "final" check, or *a* check before using the now-malformed data (transient OOM, recover with undetected dataloss).

Nanojit is trying via its new Allocator a custom OOM handling design but it depends on details of the application (compiling, which uses temporaries but can afford to "leak" them if the compiler finishes soon enough and frees allocated memory, indeed skipping running destructors for soon-to-be-dead intermediates on error is a win).

There may be local reasoning like the Nanojit Allocator (an experiment still in progress) but maybe not. If the only use-case is only jsxml.cpp then my advice is don't bother.

/be
(In reply to comment #6)
> See bug 489419 and follow the links. There's a trap here, which is if you
> sometimes skip the error checks, even if the abstraction is encapsulated well
> enough to cope, you may 

I mean accidentally here, but the whole issue is about API convenience vs. error handling. We're dealing with tendencies or temptations to skip error checks. Or just skipping checks correctly, but then the code is refactored and the final check is not early enough.

/be

> skip the "final" check, or *a* check before using the
> now-malformed data (transient OOM, recover with undetected dataloss).
Attached patch whole thing v.2 (obsolete) — Splinter Review
Replaced JSStringBuffer in jsscan.h/.cpp, removed JSStringBuffer entirely.

A general comment: I had to put jsvector.h in jsscan.h because JSTokenStream stores JSTempVector by value, requiring a complete type.  This goes against my preference to keep template-y headers out of common headers to avoid the template-inclusion explosion I've seen elsewhere.  Just this once is ok, but if this should happen more, we might need to stratify headers a bit like so:

jsscan.h   // contains data type declarations and API
   ^
   |
jsscanX.h  // (for some suffix X like 'full') contains data type definitions
   ^       // included by the few .cpps that actually use the data types
   |
jsscan.cpp, jsparse.cpp, jsregexp.cpp, jsfun.cpp, jsemit.cpp

, for each header that contains data type definitions that contain templates.  This 2-level scheme can be good for non-template reasons too, as it can allow the top-level headers (which are widely included) to be significantly shrunk by moving more than just data type definitions but all manor of mostly-local declarations to the X.h header.  Anyhow, not necessary now.

In reply to comment 7: Agreed; the "batched error" style does seem to require more mental control flow analysis to judge correct error handling.
Attachment #388764 - Attachment is obsolete: true
Comment on attachment 388934 [details] [diff] [review]
whole thing v.2

> JS_Stringify(JSContext *cx, jsval *vp, JSObject *replacer, jsval space,
>              JSONWriteCallback callback, void *data)
> {
>     CHECK_REQUEST(cx);
>-    return js_Stringify(cx, vp, replacer, space, callback, data);
>+    JSCharBuffer cb(cx);
>+    JSBool ok = js_Stringify(cx, vp, replacer, space, cb);
>+    if (!ok)
>+        return JS_FALSE;

No need for single-use ok, but if there were, it could be bool

Anent that, use false in new code, not JS_FALSE -- fine with JSBool return type, shorter and less ugly, etc. Ditto true/JS_TRUE, obviously!

>+static inline JSBool
>+BufferToString(JSContext *cx, JSCharBuffer &cb, jsval *rval)

Can cb be a const JSCharBuffer& here?

> {
>-    size_t length = buf.size() - 1;
>-    jschar *chars = buf.extractRawBuffer();
>-    JSString *str = js_NewString(cx, chars, length);
>-    if (!str) {
>-        JS_free(cx, chars);
>+    JSString *str = js_NewStringFromCharBuffer(cb);
>+    if (!str)
>         return JS_FALSE;
>-    }
>     *rval = STRING_TO_JSVAL(str);
>     return JS_TRUE;
> }

New-enough code to use bool/true/false, we are mixing to convert as we go and make incremental progress. Could have a flag day, ideally based on a dehydra-generated patch, but need some care due to ABI requirement for JSBool not bool, etc. Anyway, want to spread the meme.

>-        if (!(ok = buf.pushBack(chars, chars + charlen)))
>+        if (!(ok = cb.pushBack(chars, charlen)))

Missed this nit earlier: we avoid nesting assignment in 'if' conditions unless there's a compelling reason (macro madness, probably not necessary now with C++ and inline).

>             goto done;
>         if (index + 1 != length) {
>-            if (!(ok = buf.pushBack(',')) || !(ok = buf.pushBack(' ')))
>+            if (!(ok = cb.pushBack(',')) || !(ok = cb.pushBack(' ')))
>                 goto done;
>         } else if (hole) {
>-            if (!(ok = buf.pushBack(',')))
>+            if (!(ok = cb.pushBack(',')))
>                 goto done;

Here the nested ok = looks better -- why waste extra lines on ok = ...; and split if-goto sequences instead of || -- but from Igor's work optimizing jsinterp.cpp we found it was even better to make an error: label that sets ok = false; and then does a goto done, or inline customized return failure sequence, at the bottom of the function.

All the ok = ... source noise, even minimized via nested assignments, adds up and can cost optimized code quality.

I still can't believe STL namers of things used "pushBack" for "append" :-/.

>   make_string:
>-    if (!(ok = BufferToString(cx, buf, vp)))
>+    if (!(ok = BufferToString(cx, cb, vp)))
>         goto done;
> 
>   done:

No need for the last if-goto here, just set ok = BufferToString(cx, cb, vp) and fall through the done: label.

>     if (!initiallySharp)
>         js_LeaveSharpObject(cx, NULL);
>     return ok;
> }

Maybe Igor should have a look.

/be
(In reply to comment #9)
> >+static inline JSBool
> >+BufferToString(JSContext *cx, JSCharBuffer &cb, jsval *rval)
> 
> Can cb be a const JSCharBuffer& here?

js_NewStringFromCharBuffer actually yoinks the buffer from within 'cb'.

> I still can't believe STL namers of things used "pushBack" for "append" :-/.

I'll just rename it.  I've already "fixed" a few other STL annoyances (no ranged push_back, requiring verbose v.insert(v.end(), src.begin(), src.end()), no "do N times" version of push_back/pop_back) and the identifier style is different.  STL users won't be unduly jarred.
P.S., after I address comment 9, I'm going to post individual review requests for each area affected.
Attachment #388934 - Attachment is obsolete: true
Attachment #389037 - Flags: review?(jwalden+bmo)
Attachment #389038 - Flags: review? → review?(brendan)
Attachment #389039 - Flags: review? → review?(bclary)
Attachment #389044 - Flags: review?(brendan)
Attachment #389043 - Flags: review? → review?(jwalden+bmo)
Attachment #389039 - Flags: review?(bclary) → review?(igor)
Comment on attachment 389039 [details] [diff] [review]
changes to jsxml.cpp (patch on top of json-changes)

>diff --git a/js/src/jsxml.cpp b/js/src/jsxml.cpp
>+static inline bool
>+append(JSCharVector &cb, JSString &str) {

Use JSString *, not JSString &, to emphasis that JSString is allocated on the heap.

>+static bool
>+AppendAttributeValue(JSContext *cx, JSCharVector &cb, JSString *valstr)
>+{
>+    JS_ASSERT(valstr != 0);
>+    if (!cb.append('='))
>+        return false;
>     valstr = js_EscapeAttributeValue(cx, valstr, JS_TRUE);
>-    if (!valstr) {
>-        if (STRING_BUFFER_OK(sb)) {
>-            free(sb->base);
>-            sb->base = STRING_BUFFER_ERROR_BASE;
>-        }
>-        return;
>-    }
>-    js_AppendJSString(sb, valstr);
>+    return valstr && append(cb, *valstr);
> }

This has to be return !valstr || append(cb, *valstr).

>+static JSString *
>+EscapeElementValue(JSContext *cx, JSCharVector &cb, JSString *str)
>+{
>+    size_t length;
>+    const jschar *start;
>+
>+    for (const jschar *cp = start, *end = start + length; cp != end; ++cp) {
>+        jschar c = *cp;
>+        if (c == '<') {
>+            if (!append(cb, js_lt_entity_str))
>+                return NULL;
>+        } else if (c == '>') {
>+            if (!append(cb, js_gt_entity_str))
>+                return NULL;
>+        } else if (c == '&') {
>+            if (!append(cb, js_amp_entity_str))
>+                return NULL;
>+        } else {
>+            if (!cb.append(c))
>+                return NULL;
>+        }
>+    }

Pre-existing nit: use switch, not ifs, to choose over c for better readability. 

>+static JSString *
>+EscapeAttributeValue(JSContext *cx, JSCharVector &cb, JSString *str,
>                      JSBool quote)
> {
...
>+    for (const jschar *cp = start, *end = start + length; cp != end; ++cp) {
>+        jschar c = *cp;
>+        if (c == '"') {
>+            if (!append(cb, js_quot_entity_str))
>+                return NULL;
>+        } else if (c == '<') {
>+            if (!append(cb, js_lt_entity_str))
>+                return NULL;
>+        } else if (c == '&') {
>+            if (!append(cb, js_amp_entity_str))
>+                return NULL;
>+        } else if (c == '\n') {
>+            if (!append(cb, "&#xA;"))
>+                return NULL;
>+        } else if (c == '\r') {
>+            if (!append(cb, "&#xD;"))
>+                return NULL;
>+        } else if (c == '\t') {
>+            if (!append(cb, "&#x9;"))
>+                return NULL;
>+        } else {
>+            if (!cb.append(c))
>+                return NULL;
>+        }
>+    }

Pre-existing nit: use switch, not ifs, to choose over c for better readability. 

>@@ -2556,96 +2492,83 @@ namespace_match(const void *a, const voi
...
>       case JSXML_CLASS_LIST:
>         /* ECMA-357 10.2.2. */
>         XMLArrayCursorInit(&cursor, &xml->xml_kids);
>         i = 0;
>         while ((kid = (JSXML *) XMLArrayCursorNext(&cursor)) != NULL) {
>-            if (pretty && i != 0)
>-                js_AppendChar(&sb, '\n');
>-
>-            kidstr = XMLToXMLString(cx, kid, ancestorNSes, indentLevel);
>+            if (pretty && i != 0) {
>+                if (!cb.append('\n'))
>+                    return NULL;
>+            }
>+
>+            JSString *kidstr = XMLToXMLString(cx, kid, ancestorNSes, indentLevel);
>             if (!kidstr)
>-                break;
>-
>-            js_AppendJSString(&sb, kidstr);
>+                return NULL;

This return on failure skips mandatory call to XMLArrayCursorFinish.
Attachment #389039 - Flags: review?(igor) → review-
(In reply to comment #17)
> >+static bool
> >+AppendAttributeValue(JSContext *cx, JSCharVector &cb, JSString *valstr)
> >+{
> >+    JS_ASSERT(valstr != 0);
> >+    if (!cb.append('='))
> >+        return false;
> >     valstr = js_EscapeAttributeValue(cx, valstr, JS_TRUE);
> >-    if (!valstr) {
> >-        if (STRING_BUFFER_OK(sb)) {
> >-            free(sb->base);
> >-            sb->base = STRING_BUFFER_ERROR_BASE;
> >-        }
> >-        return;
> >-    }
> >-    js_AppendJSString(sb, valstr);
> >+    return valstr && append(cb, *valstr);
> > }
> 
> This has to be return !valstr || append(cb, *valstr).

I'm not sure why you think so: this new function returns 'false' on failure, and the original function fails if valstr == NULL.  Your suggested change would return 'true'.

But now that you point out this function, I see that I forgot to change the call-sites to check the return value!

> >+static inline bool
> >+append(JSCharVector &cb, JSString &str) {
> 
> Use JSString *, not JSString &, to emphasis that JSString is allocated on the
> heap.

Perhaps you're referring to a SpiderMonkey pre-existing C++ style guideline, but, in general C++ style, '&' doesn't mean "not on the heap", it means "this parameter is non-null and doesn't transfer ownership".  Hence, it represents a stronger semantic guarantee to the reader than a pointer.
(In reply to comment #18)
> (In reply to comment #17)
> > >+static bool
> > >+AppendAttributeValue(JSContext *cx, JSCharVector &cb, JSString *valstr)
> > >+{
> > >+    JS_ASSERT(valstr != 0);

Generally don't assert non-null, you'll get a clear crash quickly enough -- but do assert just valstr, not valstr != 0, valstr != NULL, or !!valstr if you have a good reason to assert non-null.

> > >+static inline bool
> > >+append(JSCharVector &cb, JSString &str) {
> > 
> > Use JSString *, not JSString &, to emphasis that JSString is allocated on the
> > heap.
> 
> Perhaps you're referring to a SpiderMonkey pre-existing C++ style guideline,
> but, in general C++ style, '&' doesn't mean "not on the heap", it means "this
> parameter is non-null and doesn't transfer ownership".  Hence, it represents a
> stronger semantic guarantee to the reader than a pointer.

We're still not sure in the style guide about references. For GC-things, there is no transfer of ownership but there also no transfer of rootedness, so you're right that a ref wins there.

For non-null guarantees, we generally avoid overloading NULL for poor-man's "Maybe" or "option type" use cases. Those are uncommon, so an accidental null is usually due to a failure to OOM check, or possibly a failure to initialize at all. But these are exceedingly rare in our experience (I'm talking about mainly long term maintainers: igor, mrbkap, shaver, myself) -- dumb bring-up bugs. Not to say cost-free, just not a huge driver toward refs over ptrs.

References have their uses. They are certainly a different model in other ways (init vs. assignment, this tripped up dmandelin the other month). Const refs are pretty cool (I recall getting the cfront 0.9 magtape while still at UIUC and liking the synergy with operators for complex number types, etc.)

But for pellucid call expressions where foo(inParam, &outParam) telegraphs the direction and effect, compared to bar(inParam, amIanInOrAnOutParam) which leads to some small increase in cross-referencing, and a reduction in easy grep for canonical name in out param context, we've been sticking with pointer types in some new (C++ era) code.

We should try to converge on the best style in the wiki style page; newsgroup first if needed.

/be
(In reply to comment #18)
> I'm not sure why you think so: this new function returns 'false' on failure,
> and the original function fails if valstr == NULL.  Your suggested change would
> return 'true'.

You are right, that was a bogus comment.
(In reply to comment #19)
> For non-null guarantees, we generally avoid overloading NULL for poor-man's
> "Maybe" or "option type" use cases.

Perhaps you have already memoized all the places where pointer means "maybe", but I still discover them with frequency.  Since most functions are not documented as to whether the parameter/return value is nullable, I generally have to perform a manual backwards data-flow analysis.  E.g., take that JS_ASSERT you advised me to remove; the reason I added it was because AppendAttributeValue is called with GetURI, which returns GetSlotString, which may return NULL!  Since there are many unchecked uses of GetSlotString's return value, it appears the NULL doesn't appear in practice.  A '&' on AppendAttributeValue's in-param would have prevented this detective story.

> But for pellucid call expressions where foo(inParam, &outParam) telegraphs the
> direction and effect, compared to bar(inParam, amIanInOrAnOutParam) which leads
> to some small increase in cross-referencing, and a reduction in easy grep for
> canonical name in out param context, we've been sticking with pointer types in
> some new (C++ era) code.

Is what you're saying that '&' is the signal for "out param" while '*' is the signal for "in param"?  Isn't this accomplished by const-ness or the lack thereof?

Ultimately, one of the problems I've felt is that we have [at least] a 3x2 matrix of contractual meanings (nullability x ownership-transfer x scalar-or-array) which we are trying to project on {pointer, reference}.  One option I've used before is to synthesize more parameter semantics with tiny smart-pointer types.  E.g.

nonnull_xfer<Foo> foo(nullable_xfer<Bar> pc, nullable_ref<Baz>);

Furthermore, the pointers can assert on things like "forgot to transfer" or "passed NULL for non-null".  But this is probably overkill.  Plus (and I'm not up to date on this) compilers tend to pass structs on the stack, even if they're word-sized.
(In reply to comment #21)
> (In reply to comment #19)
> > For non-null guarantees, we generally avoid overloading NULL for poor-man's
> > "Maybe" or "option type" use cases.
> 
> Perhaps you have already memoized all the places where pointer means "maybe",
> but I still discover them with frequency.

Please list any that have a leading JSContext *cx parameter -- that param plus a pointer return type should say "fallible, null means failure either reported promptly (OOM, slow script killer dialog) or exception left pending in cx".

I know of a few, but they're not all over. We've started putting any cx param last, not first, for infallible predicates that (sometimes for only bad old API reasons) need a cx.

/be
(In reply to comment #21)
> Is what you're saying that '&' is the signal for "out param" while '*' is the
> signal for "in param"?  Isn't this accomplished by const-ness or the lack
> thereof?

No, because the const is in the declaration or definition, not in the call expression.

/be
Appending '\0' lazily doesn't work, because JS strings can contain a null character, so |[{toString: function() { return '\0'; } }].toString()| will produce the wrong value.  That's a substantial issue, so I'm going to cut off patch analysis at least until that issue's addressed -- have too many other things I need to do to eagerly spend time on this.
Attachment #389037 - Flags: review?(jwalden+bmo) → review-
Attachment #389043 - Flags: review?(jwalden+bmo) → review-
(In reply to comment #24)

Good catch!  However, the issue is not substantial; it only requires removing the |cb.back() != '\0'| clause from the condition in js_NewStringFromCharBuffer (and updating the interface description in jsstr.h) since all callers expect the mandatory '\0' to be appended by js_NewStringFromCharBuffer.
Attachment #389037 - Flags: review- → review?(jwalden+bmo)
Attachment #389043 - Flags: review- → review?(jwalden+bmo)
Comment on attachment 389037 [details] [diff] [review]
changes to JSTempVector and uses associated with Array.join

>diff --git a/js/src/jsvector.h b/js/src/jsvector.h

>-    bool pushBack(const T &);
>-    template <class U> bool pushBack(const U *begin, const U *end);
>+    bool append(const T &);
>+    bool appendN(const T &, size_t n);
>+    template <class U> bool append(const U *begin, const U *end);
>+    template <class U> bool append(const U *begin, size_t length);

I made a mistake when I reviewed the first jsvector patch, so to be clear pushBack is not supportive of continuing that mistake.

We're not hacking on or in Perl, so there should be one way to do it -- not append(start, sentinel) and append(start, length), just one or the other.  The former provokes a violent reaction from me (and should have with pushBack, I have no idea why I missed it as such), and the latter is how we've always done it, the mistake of pushBack excepted.  Just because STL may do it that way does not make it better; consider push_back!  I want opinions from other JS hackers on this before this proceeds further; if I am forced to stomach two forms, one of which I detest, I require that I be dragged kicking and screaming into it.
(In reply to comment #26)
>
> I want opinions from other JS hackers on this before this proceeds further; 

I may not count :)

But I claim it doesn't matter. At all.
(In reply to comment #26)
> We're not hacking on or in Perl, so there should be one way to do it -- not
> append(start, sentinel) and append(start, length), just one or the other.

This isn't potayto potahto.  Having used both forms throughout all these patches, it is often the case that it is natural/cleaner/faster to write it in one form or the other.  Furthermore, I'm not sure "there should be only one way to X" can be used as an axiom when evaluating a design.  Especially in C++.
Even the Zen of Python says only

There should be one-- and preferably only one --obvious way to do it.

and it waffles:

Although that way may not be obvious at first unless you're Dutch.

Waldo: please choose a better hill to die on. I'll stand with you shouting "Get to the choppa!" like Arnie S. in Predator when the time comes. Now here or now.

/be
I think they ought to have different names, though:

  pushRange(begin, end);
  pushUntil(begin, sentinel);
(In reply to comment #30)
> I think they ought to have different names, though:
> 
>   pushRange(begin, end);
>   pushUntil(begin, sentinel);

Well, they both append a range ((begin, length) vs. (begin, end)).  Adding a further-distinguishing suffix to "append" (like "BeginEndRange" and "BeginLengthRange") seems reminiscent of ye ol' MS apps-Hungarian tradition of prefixing a variable name with its type.
The "sentinel" idea is a red herring, this is the old [begin, end) half-open range alternative parameterization: [begin, begin+length).

JS has substr (from Perl) for the latter, substring (from Java 1.0) and slice (slice has extra features: negative indexes a la Python). substr/substring/slice don't exactly scream the differences among these, but they each were copied from a nearby language whose fans know what they each mean.

How about appendSlice(begin, end) vs. appendRange(begin, length)? Clear as mud!

/be
(In reply to comment #32)
Those make sense.  To justify ignoring our friend (and sometimes foe) overload-resolution, do you think there are really so many situations where, when you see:

vec.append(p, q)

it isn't clear whether q is one-past-the-end or a length?
I think the struggle to find pithy names suggests (but does not prove) that using overloading may win. However, is the overloading under the template type always unambiguous? I guess it must be due to pointer mode usage.

> it isn't clear whether q is one-past-the-end or a length?

Re: "end", bite your tongue. The "end" is the fencepost in slices, not the "last" item or its index. end or fencepost or limit :: last or max.

/be
(In reply to comment #34)
> Re: "end", bite your tongue.

Excuse me, "past-the-end" is standard C++ terminology (literally, 24.2.6).
As I clarified (in the finest sense that a politician uses it!) to Brendan in email, when I said sentinel I really meant "last element to be added", just misread slightly when looking at the patch.  :-)  Apologies for that...

The original case where I noticed this was in one of the conversions in attachment 389043 [details] [diff] [review] where a length argument was converted to an item argument, and it looked a lot like some scrambling had happened in the translation.  Only after I noticed one of the values in the argument was of pointer type did it start to make sense what had been done.  So that's one case where it's unclear -- maybe one that only matters once, but it indisputably exists.

I have no problem with appendSlice(begin, end) and a simple append(start, len).  I don't see the added typing of "Slice" as a sufficient incentive to keep the names the same, even if it may often be the case that the difference is apparent from the arguments -- you still have to read the arguments to be sure, reasoning backward to the call's effect rather than forward.
(In reply to comment #36)
> As I clarified (in the finest sense that a politician uses it!) to Brendan in
> email, when I said sentinel I really meant "last element to be added", just
> misread slightly when looking at the patch.  :-)  Apologies for that...
> 
> The original case where I noticed this was in one of the conversions in
> attachment 389043 [details] [diff] [review] where a length argument was converted to an item argument,
> and it looked a lot like some scrambling had happened in the translation.  Only
> after I noticed one of the values in the argument was of pointer type did it
> start to make sense what had been done.  So that's one case where it's unclear
> -- maybe one that only matters once, but it indisputably exists.

What you found is code that I wrote before I added the append(T*, size_t) overload.  I should change it to use the length argument directly.  This goes back to what I said earlier about it sometimes being more natural to use one form or the other.  Symmetrically, there were several cases where, by using the append(T*, T*) overload, I was able simplify logic.

> I have no problem with appendSlice(begin, end) and a simple append(start, len).
>  I don't see the added typing of "Slice" as a sufficient incentive to keep the
> names the same, even if it may often be the case that the difference is
> apparent from the arguments -- you still have to read the arguments to be sure,
> reasoning backward to the call's effect rather than forward.

That is the general argument used against overloading.  If you are glancing over the code, you don't have to know have to know which overload is called, you just know that it appends.  If you are reasoning about correctness, you will already know about the append arguments and their properties by the time your mental symbolic execution reaches the append, so you don't have to go backwards.
Comment on attachment 389037 [details] [diff] [review]
changes to JSTempVector and uses associated with Array.join

>diff --git a/js/src/jsarray.cpp b/js/src/jsarray.cpp
>-    JSBool initiallySharp = IS_SHARP(he) ? JS_TRUE : JS_FALSE;
>+    bool initiallySharp = IS_SHARP(he) ? true : false;

I don't believe the ?: is needed any more, won't assignment to a bool auto-!=0ify?


>+    /* After this point, all paths exit through the 'error' label. */
>+    MUST_FLOW_THROUGH("error");

This is a misnomer given that even success code flows through here; "out" is probably the most canonical name.


>+        if (!JS_CHECK_OPERATION_LIMIT(cx) ||
>+            !GetArrayElement(cx, obj, index, &hole, vp))
>+            goto error;

Should be braced -- or will that fit in 99 characters?  Too lazy to count.


>+            if (!buf.append(',') || !buf.append(' '))
>+                goto error;

This suggests adding |template<size_t N> bool append(const char data[N])|.


>-    JSBool ok = JS_TRUE;
>+    /* After this point, all paths exit through the 'error' label. */
>+    MUST_FLOW_THROUGH("error");

Again should be "out", or even leave as is; I don't remember which is more common.


>+        if (!JS_CHECK_OPERATION_LIMIT(cx) ||
>+            !GetArrayElement(cx, obj, index, &hole, rval))
>+            goto error;

Bracing.


>+                if (!js_ValueToObject(cx, *rval, &robj))
>+                    goto error;
>+                *rval = OBJECT_TO_JSVAL(robj);  /* protect object */

The rooting idiom is ubiquitous and key to understanding SpiderMonkey code; I don't think it's necessary to call this out as protecting robj.


>+                if (!js_TryMethod(cx, robj, atom, 0, NULL, rval))
>+                    goto error;

I'm not sure rooting via rval is going to work.  It would be perfectly acceptable for a fastnative to assign to *rval early and then call methods on robj or get properties or whatever to cause gc.  A JSAutoTempValueRooter seems in order here.


>diff --git a/js/src/jsbool.cpp b/js/src/jsbool.cpp

>-js_BooleanToStringBuffer(JSContext *cx, JSBool b, JSTempVector<jschar> &buf)
>+js_BooleanToCharBuffer(JSContext *cx, JSBool b, JSCharVector &buf)
> {
>     static const jschar trueChars[] = { 't', 'r', 'u', 'e' },
>                         falseChars[] = { 'f', 'a', 'l', 's', 'e' };
>-    return b ? buf.pushBack(trueChars, trueChars + JS_ARRAY_LENGTH(trueChars))
>-             : buf.pushBack(falseChars, falseChars + JS_ARRAY_LENGTH(falseChars));
>+    return b ? buf.append(trueChars, JS_ARRAY_END(trueChars))
>+             : buf.append(falseChars, JS_ARRAY_END(falseChars));
> }

Hm, I guess this wants a string-append method also templatized for jschar, in addition to char.


>diff --git a/js/src/jsprvtd.h b/js/src/jsprvtd.h

>+/* Common JSTempVector instantiations: */
>+typedef JSTempVector<jschar, 32> JSCharVector;

Sigh, C++ preventing JSCharVector and JSCharVector<64> but rather requiring JSCharVector<> is sadmaking.


>diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp

Down to here.  I said I was going to finish off this review tonight, but it's late enough that continuing the effort now is feasible -- has to wait for morning.  :-\
(In reply to comment #38)
> >+            if (!buf.append(',') || !buf.append(' '))
> >+                goto error;
> 
> This suggests adding |template<size_t N> bool append(const char data[N])|.

C turns T v[N] into T *v in parameter position, C++ too -- but I'm not a template guru. Does this really specialization accomplish anything?

> >+                if (!js_TryMethod(cx, robj, atom, 0, NULL, rval))
> >+                    goto error;
> 
> I'm not sure rooting via rval is going to work.  It would be perfectly
> acceptable for a fastnative to assign to *rval early and then call methods on
> robj or get properties or whatever to cause gc.  A JSAutoTempValueRooter seems
> in order here.

The |this| parameter is rooted by the frame for the method call. This is a dodgy case but the dodgyness is in the API to js_TryMethod. Its implementation looks safe. It would be better to comment this with a FIXME: citing a new bug.

/be
So um, on actually reading the patch I have to admit the word "sentinel" completely threw me off. I thought we were talking about:
  append(const T *begin, const T *end);
  append(const T *begin, const T& sentinel);  // like, cb.append(cstr, '\n')
which I thought sounded like a bit much.

I am still a little confused as to what Waldo means by this:

> when I said sentinel I really meant "last element to be added", just
> misread slightly when looking at the patch.  :-)  Apologies for that...

"last element to be added"?

Looking at the actual patch I don't see any places where it's not obvious what's going on. The call sites are all like `cb.append(buf, buflen)` so it seems plenty explicit enough in practice.
Thanks Waldo!

(In reply to comment #39)
> C turns T v[N] into T *v in parameter position, C++ too -- but I'm not a
> template guru. Does this really specialization accomplish anything?

Fortunately, the T[N] to T* change is defined as an implicit conversion that is only considered when the source and destination types do not exactly match.
(In reply to comment #38)
> >+/* Common JSTempVector instantiations: */
> >+typedef JSTempVector<jschar, 32> JSCharVector;
> 
> Sigh, C++ preventing JSCharVector and JSCharVector<64> but rather requiring
> JSCharVector<> is sadmaking.

While JSCharVector<64> would be nice (C++0x has template aliases), you can (only) write JSCharVector.
(In reply to comment #39)
> C turns T v[N] into T *v in parameter position, C++ too -- but I'm not a
> template guru. Does this really specialization accomplish anything?

With inlining in the relevant method you could check for space for all the characters at once, rather than one at a time.  Maybe compilers will optimize to one check for space, but I wouldn't bet on it.


> The |this| parameter is rooted by the frame for the method call. This is a
> dodgy case but the dodgyness is in the API to js_TryMethod. Its implementation
> looks safe. It would be better to comment this with a FIXME: citing a new bug.

Well noted, concern retracted.
Regarding the "FIXME" comment and new bug, what exactly is the fix here?  Cleaning up the semantics of js_TryMethod?
Comment on attachment 389037 [details] [diff] [review]
changes to JSTempVector and uses associated with Array.join

>diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp

> JSString *
>+js_NewStringFromCharBuffer(JSCharVector &cb)
>+{
>+    /* Add the missing terminating '\0' if missing. */
>+    if (cb.empty())
>+        return ATOM_TO_STRING(cb.jscontext()->runtime->atomState.emptyAtom);
>+    if (cb.back() != jschar('\0') && !cb.append('\0'))
>+        return NULL;

The above should be formatted this way:

>+    if (cb.empty())
>+        return ATOM_TO_STRING(cb.jscontext()->runtime->atomState.emptyAtom);
>+
>+    /* Add the missing terminating '\0' if missing. */
>+    if (cb.back() != jschar('\0') && !cb.append('\0'))
>+        return NULL;

You haven't removed the != '\0' here (or the 'if missing' extra bit), weren't you going to earlier?  Also along those lines, these earlier lines in jsarray.cpp:

        static const jschar arr[] = { '[', ']', 0 };
        if (!buf.append(arr, JS_ARRAY_END(arr)))

demonstrate a bug, although not one you can hit unless you disable sharp variables, which the browser doesn't do.  (I *wish* we could hit this error!)  This is another place that would benefit from passing JS_ARRAY_LENGTH, in my opinion.


>+    /* The length passed to js_NewString must not include the '\0'. */
>+    size_t length = cb.size() - 1;

It would be clearer if we extracted the length before appending '\0'.


>+    jschar *buf = cb.extractRawBuffer();
>+    if (!buf)
>+        return NULL;
>+    JSString *str = js_NewString(cb.jscontext(), buf, length);
>+    if (!str)
>+        free(str);

This can be cx->free(str) now, I believe.


>diff --git a/js/src/jsstr.h b/js/src/jsstr.h

>+ * If (cb.back() != '\0'), the function will add the mandatory '\0'.

Old comment.


>diff --git a/js/src/jsvector.h b/js/src/jsvector.h

>+/* Library of template meta-programs for use in the C++ JS data-structures.  */
>+namespace JSTMP {

JSTemplateUtils strikes me as a better name.


>+/* Statically compute floor(log2(i)). */
>+template <size_t i> struct FloorLog2 {
>+    static const size_t result = 1 + FloorLog2<i/2>::result;

>+/* Statically compute ceiling(log2(i)). */
>+template <size_t i> struct CeilingLog2 {
>+    static const size_t result = FloorLog2<2*i-1>::result;

Spaces around binary operators.


>+/* Statically compute the number of bits in the given unsigned type. */
>+template <class T> struct BitSize {
>+    static const int result = sizeof(T) * JS_BITS_PER_BYTE;

...like you did here!


>+    /*
>      * Grows the given buffer to have capacity newcap, preserving the objects
>-     * constructed in the range [begin, end) and updating vec.
>+     * constructed in the range [begin, end) and updating v.  Assumes that (1)
>+     * newcap has not overflowed, and (2) multiplying newcap by sizeof(T) will
>+     * not overflow.
>      */
>-    static inline bool growTo(JSTempVector<T> &vec, size_t newcap) {
>-        size_t bytes = sizeof(T) * newcap;
>-        T *newbuf = reinterpret_cast<T *>(malloc(bytes));
>+    static inline bool growTo(JSTempVector<T> &v, size_t newcap) {
>+        T *newbuf = reinterpret_cast<T *>(malloc(newcap * sizeof(T)));
>         if (!newbuf) {
>-            js_ReportOutOfMemory(vec.mCx);
>+            js_ReportOutOfMemory(v.mCx);
>             return false;
>         }
>-        for (T *dst = newbuf, *src = vec.mBegin; src != vec.mEnd; ++dst, ++src)
>+        for (T *dst = newbuf, *src = v.dynBegin(); src != v.dynEnd(); ++dst, ++src)
>             new(dst) T(*src);
>-        JSTempVectorImpl::destroy(vec.mBegin, vec.mEnd);
>-        free(vec.mBegin);
>-        vec.mEnd = newbuf + (vec.mEnd - vec.mBegin);
>-        vec.mBegin = newbuf;
>-        vec.mCapacity = newbuf + newcap;
>+        JSTempVectorImpl::destroy(v.dynBegin(), v.dynEnd());
>+        free(v.dynBegin());
>+        v.dynEnd() = newbuf + (v.dynEnd() - v.dynBegin());
>+        v.dynBegin() = newbuf;
>+        v.dynCapacity() = newcap;
>         return true;
>     }

This should use v.mCx->malloc and v.mCx->free, and note that the former reports OOM for you.  Also, I think we came to an agreement not to use 'm' prefixes on member names -- or maybe we didn't and just argued about it.  I'm not particular about it, just make sure it agrees with whatever we decided.


>+        /*
>+         * You would think that memset would be a big win (or even break even)
>+         * when we know T is a POD.  But currently its not.  This is probably
>+         * because |append| tends to be given small ranges and memset requires
>+         * a function call that doesn't get inlined.
>+         *
>+         * memset(begin, 0, sizeof(T) * (end-begin));  //SLOWER
>+         */

"currently it's not"

>         for (T *p = begin; p != end; ++p)
>             *p = 0;
>     }

This only works because all the manually-noted-as-POD types are builtin C++ types which contain the value 0, right?  That seems somewhat unfortunate.


>+    template <class U>
>+    static inline void copyConstruct(T *dst, const U *srcbeg, const U *srcend) {
>+        /*
>+         * See above memset comment.  Also, notice that copyConstruct is
>+         * currently templated (T != U), so memcpy won't work without
>+         * requiring T == U.
>+         *
>+         * memcpy(dst, srcbeg, sizeof(T) * (srcend-srcbeg));

Spaces around -.


>+    static inline bool growTo(JSTempVector<T,N> &v, size_t newcap) {
>         size_t bytes = sizeof(T) * newcap;
>-        T *newbuf = reinterpret_cast<T *>(realloc(vec.mBegin, bytes));
>+        T *newbuf = reinterpret_cast<T *>(realloc(v.dynBegin(), bytes));
>         if (!newbuf) {
>-            js_ReportOutOfMemory(vec.mCx);
>+            js_ReportOutOfMemory(v.mCx);
>             return false;
>         }
>-        vec.mEnd = newbuf + (vec.mEnd - vec.mBegin);
>-        vec.mBegin = newbuf;
>-        vec.mCapacity = newbuf + newcap;
>+        v.dynEnd() = newbuf + (v.dynEnd() - v.dynBegin());
>+        v.dynBegin() = newbuf;
>+        v.dynCapacity() = newcap;
>         return true;
>     }

More v.mCx->malloc/free changes.  Also, you should assert not being in inline mode at the start of this method.


>+    /* Use 'Impl' to statically specialize copy/move operations for PODs. */
>+    typedef JSTempVectorImpl<T, N, JSTMP::IsPodType<T>::result> Impl;
>+    friend class JSTempVectorImpl<T, N, JSTMP::IsPodType<T>::result>;

Given the IsPodType hackery in here, the comment seems mostly redundant; I'd remove it.


>+    bool checkOverflow(size_t newval, size_t oldval, size_t diff) const;
>+    bool dynGrowCapacityBy(size_t diff);
>+    bool inlineToDynamic(size_t newcap);

...and here it begins.  I'm not sure I like the names "inline" and "dynamic" -- and I *definitely* don't like inl or dyn prefixes.  "heap" would be a more fitting term, I believe, than "dynamic".  Om these two cases, I think |convertToHeapStorage| and |growHeapCapacityBy| are both clearer to me than your names.  More naming concerns related to this follow in subsequent comments...


>+    /* magic constants */
>+
>+    static const int sGrowthFactor = 2;
>+    static const int sMaxInlineBytes = 1024;

I happen to like all-caps names for constants, even when they're true constants and not just macros (and, I guess, even when they're not constants but only constants for a particular template specialization -- man, templates are complex).  I don't care enough to ask for this to be changed, but if other people feel the same it may be worth doing so.


>+    /*
>+     * Pointers to the dynamiclly allocated buffer.  Only [dynBegin, dynEnd)

Pointers to the heap buffer.


>+    /*
>+     * Since a vector either stores elements inline or in a dynamic buffer, but

...and now for the inline term.

Actually, on second thought that name's cool.  :-)  But do change that to heap buffer, please.


>+     * not both at the same time, reuse the storage.  mRawVal serves as the
>+     * union discriminator.  In 'inline mode' (when elements are stored in
>+     * u.mBuf), mRawVal holds the vector's size.  In 'dynamic mode' (when

heap mode (also nix the quotes around the two modes, don't see that they add any value and might perhaps take away value)


>+     * elements are stored in [ptrs.mBegin, ptrs.mEnd)), mRawVal holds the

Should be u.ptrs, if you're going to spell it out this far.  Also, returning to the name mRawVal, it seems mSizeOrCapacity (prefixed or not per the current rule of the day) would be clearer than "raw value".


>+    bool inlineMode() const { return mRawVal <= InlineCapacity; }

usingInlineStorage()?  Says what it is, seems better than indirecting through a mode whose definition one can find elsewhere.


>+    union {
>+        BufferPtrs ptrs;
>+        char mBuf[InlineCapacity * sizeof(T)];
>+    } u;

Boo-urns on not being able to have T mBuf[N].  We could go crazy and add

template<size_t N> T array[N];
template<size_t N> char mBuf[N * sizeof(T)];

to the appropriate specializations of IsPodType, but then IsPodType starts to permeate the rest of JSVectorMumble, which I guess is worse.


>+    /* Only valid in 'inline buffer' mode. */

/* Only valid when inline storage is used. */ or even /* Only valid when usingInlineStorage(). */.  Latter is more precise, so I think I like it slightly better.

>+    size_t &inlSize() { JS_ASSERT(inlineMode()); return mRawVal; }
>+    size_t inlSize() const { JS_ASSERT(inlineMode()); return mRawVal; }
>+    T *inlBegin() const { JS_ASSERT(inlineMode()); return (T *)u.mBuf; }
>+    T *inlEnd() const { JS_ASSERT(inlineMode()); return ((T *)u.mBuf) + mRawVal; }

Spell it out, inlineSize, inlineBegin, inlineEnd.  Also, I think I'd prefer multiline method definitions; this is a bit noisy with the assert.


>+    /* Only valid in 'dynamic buffer' mode. */

/* Only valid when !usingInlineStorage(). */

>+    size_t dynSize() { JS_ASSERT(!inlineMode()); return u.ptrs.mEnd - u.ptrs.mBegin; }
>+    size_t &dynCapacity() { JS_ASSERT(!inlineMode()); return mRawVal; }
>+    T *&dynBegin() { JS_ASSERT(!inlineMode()); return u.ptrs.mBegin; }
>+    T *&dynEnd() { JS_ASSERT(!inlineMode()); return u.ptrs.mEnd; }
>+    size_t dynCapacity() const { JS_ASSERT(!inlineMode()); return mRawVal; }
>+    T *const &dynBegin() const { JS_ASSERT(!inlineMode()); return u.ptrs.mBegin; }
>+    T *const &dynEnd() const { JS_ASSERT(!inlineMode()); return u.ptrs.mEnd; }

heapSize, heapCapacity, heapBegin, heapEnd.


>   public:
>+
>     JSTempVector(JSContext *cx)

I don't think the blank line is needed here.

>       :
>+        mCx(cx), mRawVal(0)
> #ifdef DEBUG
>-        mInProgress(false),
>+        , mInProgress(false)
> #endif
>-        mCx(cx), mBegin(0), mEnd(0), mCapacity(0)
>     {}
>     ~JSTempVector();

Generally I believe we've kept member constructor calls on the same line as ':'.


>     JSTempVector(const JSTempVector &);
>     JSTempVector &operator=(const JSTempVector &);

Hum, how'd I miss before that these methods aren't kosher?  If you have two vectors and assign them like this, you could duplicate or smash mBegin when not using inline storage, which is either a leak or a double free.  These shouldn't exist, or there should be a swap method, or something so that ownership transfers correctly.


>+    JSContext *jscontext() const {
>+        return mCx;
>+    }

Seems a 'js' prefix isn't really needed, is it?


>+    bool empty() const {
>+        return inlineMode() ? !inlSize() : (dynBegin() == dynEnd());

Would prefer inlineSize() == 0, personally.


>     T &back() {
>-        JS_ASSERT(!mInProgress);
>-        return *(mEnd - 1);
>+        JS_ASSERT(!mInProgress && !empty());
>+        return *(end() - 1);
>     }

>     const T &back() const {
>         JS_ASSERT(!mInProgress && !empty());
>-        return *(mEnd - 1);
>+        return *(end() - 1);
>     }

end()[-1]?  This is one reason I prefer keeping indexes rather than pointers around for bounds-tracking when using arrays.  (Also, indexes are arguably higher-level -- name a high-level language that doesn't use indexes, if you can.  None immediately spring to mind for me.)


> 
>     /* mutators */
> 
>     bool reserve(size_t);
>+    bool resize(size_t);
>+    void shrinkBy(size_t);
>     bool growBy(size_t);
>     void clear();

Why not give descriptive argument names so that someone reading the header doesn't have to guess about the meaning of each or spend extra time looking up the implementation after assuming the header was a one-stop shop (as it usually is)?  Throughout the patch, actually.


>+        Impl::destroy(dynBegin(), dynEnd());
>+        free(dynBegin());

mCx->free


> /*
>- * Check for overflow of an increased size or capacity (generically, 'value').
>- * 'diff' is how much greater newval should be compared to oldval.
>+ * Check for overflow of a new size/capacity.  Additionally, check that, when
>+ * newval is multiplied by sizeof(T), it doesn't overflow.  'diff' is how much
>+ * greater 'newval' should be compared to 'oldval'.  Only 'newval' is checked
>+ * for overflow; 'diff' is assumed valid.
>  */

increment might be a better name than diff; I think we use that name (actually the semi-gratuitous incr, if memory serves) in the interpreter when handling ++ and --, when we do +=2 or -=2 (for the JSVAL_INT tagbit).


>-template <class T>
>+template <class T, size_t N>
> inline bool
>-JSTempVector<T>::checkOverflow(size_t newval, size_t oldval, size_t diff) const
>+JSTempVector<T,N>::checkOverflow(size_t newval, size_t oldval, size_t diff) const
> {
>-    size_t newbytes = newval * sizeof(T),
>-           oldbytes = oldval * sizeof(T),
>-           diffbytes = diff * sizeof(T);
>-    bool ok = newbytes >= oldbytes && (newbytes - oldbytes) >= diffbytes;
>+    /* Check whether 'newval' has already overflowed, */
>+    bool ok = newval > oldval && (newval - oldval) >= diff &&
>+    /* and whether it will overflow when multiplied by sizeof(T) */
>+              ((newval & JSTMP::MulOverflowMask<sizeof(T)>::result) == 0);

I'm not a fan of inline dispersal of comments like this; just have one block comment for the entire assignment.


>+template <class T, size_t N>
>+inline bool
>+JSTempVector<T,N>::inlineToDynamic(size_t extra)
>+{
>+    /* Allocate dynamic buffer. */
>+    size_t newcap = extra + InlineCapacity * sGrowthFactor;

So if I want to reserve space for 33 characters in my JSCharVector, I get newcap = 1 + 32 * 2 = 65 characters?  Why 65?  I don't quite understand what you're trying to do here.  Why not simple doubling that the allocator probably handles better?


>+    T *newbuf = reinterpret_cast<T *>(malloc(newcap * sizeof(T)));
>+    if (!newbuf) {
>+        js_ReportOutOfMemory(mCx);
>+        return false;
>     }

mCx->malloc/free


>+    /* Switch in dynamic buffer. */
>+    mRawVal = newcap;  /* now we are in 'dynamic mode' */

/* marks us as !usingInlineStorage() */


>+template <class T, size_t N>
>+inline bool
>+JSTempVector<T,N>::growBy(size_t delta)
>+{
>+    ReentrancyGuard g(*this);
>+    if (inlineMode()) {
>+        size_t freespace = InlineCapacity - inlSize();
>+        if (delta <= freespace) {
>+            T *newend = inlEnd() + delta;
>+            Impl::initialize(inlEnd(), newend);
>+            inlSize() += delta;

JS_ASSERT(usingInlineStorage()) as a sanity check on up-to-the-limit growings.


>+template <class T, size_t N>
>+inline bool
>+JSTempVector<T,N>::append(const T &t)
>+{
>+    ReentrancyGuard g(*this);
>+    if (inlineMode()) {
>+        if (inlSize() < InlineCapacity) {
>+            new(inlEnd()) T(t);
>+            ++inlSize();

JS_ASSERT(usingInlineStorage()) as well for sanity


>+JSTempVector<T,N>::extractRawBuffer()
> {
>-    T *ret = mBegin;
>-    mBegin = mEnd = mCapacity = 0;
>+    if (inlineMode()) {
>+        T *ret = reinterpret_cast<T *>(malloc(inlSize() * sizeof(T)));
>+        if (!ret) {
>+            js_ReportOutOfMemory(mCx);
>+            return NULL;
>+        }

mCx->malloc


>+JSTempVector<T,N>::replaceRawBuffer(T *p, size_t length)
> {
>     ReentrancyGuard g(*this);
>-    Impl::destroy(mBegin, mEnd);
>-    free(mBegin);
>-    mBegin = p;
>-    mCapacity = mEnd = mBegin + length;
>+    /* Destroy what we have. */
>+    if (inlineMode()) {
>+        Impl::destroy(inlBegin(), inlEnd());
>+        inlSize() = 0;
>+    } else {
>+        Impl::destroy(dynBegin(), dynEnd());
>+        free(dynBegin());

mCx->free


>+    }
>+
>+    /* Take in the new buffer. */
>+    if (length <= InlineCapacity) {
>+        /*
>+         * (mRawVal <= InlineCapacity) means inline storage, so we MUST use
>+         * inline storage, even though memory is already allocated.
>+         */
>+        mRawVal = length;
>+        Impl::copyConstruct(inlBegin(), p, p + length);
>+        free(p);

mCx->free, and isn't a Impl::destroy(p, p + length) in order?
Attachment #389037 - Flags: review?(jwalden+bmo) → review-
(In reply to comment #44)
> Regarding the "FIXME" comment and new bug, what exactly is the fix here? 
> Cleaning up the semantics of js_TryMethod?

What did I mean, hmm. Probably the best followup bug would be to consider using JSVAL_ERROR_COOKIE so js_TryMethod and a great many other methods, possibly even native methods (incompatible API change, but conceivable for JSFastNatives) no longer return JSBool, rather jsval. Comments welcome but this seems like it will win some perf and reduce codesize.

/be
(In reply to comment #45)

Thanks!

> You haven't removed the != '\0' here (or the 'if missing' extra bit), weren't
> you going to earlier?

I did, I was just waiting for more changes before reattaching a patch.

>         static const jschar arr[] = { '[', ']', 0 };
>         if (!buf.append(arr, JS_ARRAY_END(arr)))
> 
> ...
> This is another place that would benefit from passing JS_ARRAY_LENGTH, in my
> opinion.

Following your previous suggestion, I changed this to a single-line append of a literal.

> >+    if (!str)
> >+        free(str);
> 
> This can be cx->free(str) now, I believe.

I believe you are referring to a patch that replaces JS_free(cx,p) with cx->free(p).  JSTempVector malloc's, so I think it should stay 'free'.  I had a recent discussion with brendan and andreas about JS_malloc (now cx->malloc) vs. plain malloc.  JS_malloc does OOM checking and "bogo" accounting which is about to be changed.  I do OOM checking, so we decided there was no need for cx->malloc.

> >diff --git a/js/src/jsvector.h b/js/src/jsvector.h
> 
> >+/* Library of template meta-programs for use in the C++ JS data-structures.  */
> >+namespace JSTMP {
> 
> JSTemplateUtils strikes me as a better name.

Well, whatever the name is, it gets used in a lot of places and a long name like 'JSTemplateUtils' will be most unpleasant since, being in a header file, I cannot shorten it with a namespace alias.  Perhaps JSTU?

> Also, I think we came to an agreement not to use 'm' prefixes on
> member names -- or maybe we didn't and just argued about it.  I'm not
> particular about it, just make sure it agrees with whatever we decided.

I already changed all these names once before in response to a style discussion on bug 200505.  JSString prepends 'm'.

> >         for (T *p = begin; p != end; ++p)
> >             *p = 0;
> >     }
> 
> This only works because all the manually-noted-as-POD types are builtin C++
> types which contain the value 0, right?  That seems somewhat unfortunate.

You're right, I should default-construct; it's equivalent for builtins and also works for user-defined PODs.

> >+    union {
> >+        BufferPtrs ptrs;
> >+        char mBuf[InlineCapacity * sizeof(T)];
> >+    } u;
> 
> Boo-urns on not being able to have T mBuf[N].  We could go crazy and add

I actually prefer the array of bytes; it makes it quite clear that we are asking for raw, uninitialized memory.  Once you start having arrays of T's, you start having extra lifetimes.  That PODs are flexible in this manner is a special case.

> >     JSTempVector(const JSTempVector &);
> >     JSTempVector &operator=(const JSTempVector &);
> 
> Hum, how'd I miss before that these methods aren't kosher?  If you have two
> vectors and assign them like this, you could duplicate or smash mBegin when not
> using inline storage, which is either a leak or a double free.

They aren't implemented.  You are referring to the compiler-generated defaults which, I assure you, I intended to avoid when implementing these operations.  However, now that I remember that we cannot use exception handling, there is no clean way to handle error cases for these two functions.  I'll make them private.

> >     T &back() {
> >-        JS_ASSERT(!mInProgress);
> >-        return *(mEnd - 1);
> >+        JS_ASSERT(!mInProgress && !empty());
> >+        return *(end() - 1);
> >     }
> 
> >     const T &back() const {
> >         JS_ASSERT(!mInProgress && !empty());
> >-        return *(mEnd - 1);
> >+        return *(end() - 1);
> >     }
> 
> end()[-1]?  This is one reason I prefer keeping indexes rather than pointers
> around for bounds-tracking when using arrays.  (Also, indexes are arguably
> higher-level -- name a high-level language that doesn't use indexes, if you
> can.  None immediately spring to mind for me.)

end()[-1] is much much weirder.  end() does not point to an array in any high level sense; or perhaps its points to a 0-length array.  We can only reason that it works by reasoning about its translation to pointer arithmetic, which I wrote directly.

> >+template <class T, size_t N>
> >+inline bool
> >+JSTempVector<T,N>::inlineToDynamic(size_t extra)
> >+{
> >+    /* Allocate dynamic buffer. */
> >+    size_t newcap = extra + InlineCapacity * sGrowthFactor;
> 
> So if I want to reserve space for 33 characters in my JSCharVector, I get
> newcap = 1 + 32 * 2 = 65 characters?  Why 65?  I don't quite understand what
> you're trying to do here.  Why not simple doubling that the allocator probably
> handles better?

To quote the Dude, "My thinking on this case has become very uptight."  I was stuck on avoiding an "expensive" rounding computation in the case of large 'extra' values.  But of course this computation is done rarely (that's the whole point of doubling).  So yeah, I'll make it double cleanly.
Attachment #389037 - Attachment is obsolete: true
Attachment #393042 - Flags: review?(jwalden+bmo)
Attached patch error from merge (obsolete) — Splinter Review
Made a mistake merging the patch.  Now good.
Attachment #393042 - Attachment is obsolete: true
Attachment #393053 - Flags: review?(jwalden+bmo)
Attachment #393042 - Flags: review?(jwalden+bmo)
Looking at these new cx->malloc/realloc/free calls in more detail, this is not just JS_malloc/JS_realloc/JS_free re-syntaxed.  Switched all calls to use the cx-> version.
Attachment #393053 - Attachment is obsolete: true
Attachment #393083 - Flags: review?(jwalden+bmo)
Attachment #393053 - Flags: review?(jwalden+bmo)
Comment on attachment 393083 [details] [diff] [review]
cx->{malloc,free,realloc} instead of {malloc,free,realloc}

>diff --git a/js/src/jsbool.cpp b/js/src/jsbool.cpp

>+    return b ? js_AppendLiteral(buf, "true") : js_AppendLiteral(buf, "false");

Shurely by now you'll agree it makes sense to have one common js_AppendLiteral call and make the last argument a ternary expression?  :-P


>diff --git a/js/src/jsregexp.cpp b/js/src/jsregexp.cpp

>@@ -2694,29 +2712,31 @@ class RegExpNativeCompiler {
> 
>             /* Collect misses. */
>             LIns *missLbl = lir->ins0(LIR_label);
>             asciiMissBr->setTarget(missLbl);
>             belowMissBr->setTarget(missLbl);
>             aboveMissBr->setTarget(missLbl);
>             LIns *missBr = lir->ins2(LIR_j, NULL, NULL);
>             if (node->op == REOP_SPACE)
>-                fails.pushBack(missBr);
>+                if (!fails.append(missBr))
>+                    return NULL;

Should be braced.


>diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp

>+js_NewStringFromCharBuffer(JSContext *cx, JSCharVector &cb)
>+{
>+    if (cb.empty())
>+        return ATOM_TO_STRING(cx->runtime->atomState.emptyAtom);
>+
>+    /* The length passed to js_NewString must not include the '\0'. */
>+    size_t length = cb.size();
>+
>+    /* Add the missing terminating '\0' if missing. */
>+    if (!cb.append('\0'))
>+        return NULL;

I'd remove the comments and put these three lines all together, seems clear enough to me what's actually being done:

>+    size_t length = cb.size();
>+    if (!cb.append('\0'))
>+        return NULL;


>+    if (!str)
>+        cx->free(str);

???


>diff --git a/js/src/jsstr.h b/js/src/jsstr.h

>+/*
>+ * GC-allocate a string descriptor and steal the char buffer held by |cb|.
>+ * This function takes responsibility for adding the terminating '\0' required
>+ * by js_NewString.
>+ */
>+extern JSString *
>+js_NewStringFromCharBuffer(JSContext *cx, JSCharVector &cb);

Some method declarations name their arguments; others don't.  I'd prefer if all were named, consistently (that is, ever method argument consistently is given a name), as hints to functionality when it may not be obvious.


>diff --git a/js/src/jsvector.h b/js/src/jsvector.h

>+    /*
>+     * Pointers to the heap-allocated buffer.  Only [heapBegin, heapEnd)
>+     * hold valid constructed T objects.  The range [heapEnd, heapBegin +
>+     * heapCapacity) holds uninitialized memory.
>+     */
>+    struct BufferPtrs {
>+        T *mBegin, *mEnd;
>+    };

Make heapBegin => heapBegin(), heapEnd => heapEnd(), heapCapacity => heapCapacity(), since none of them are variables but rather methods.


>+    static const size_t sInlineCapacity =
>+        JSUtils::min<JSUtils::max<N, sizeof(BufferPtrs) / sizeof(T)>::result,
>+                   sMaxInlineBytes / sizeof(T)>::result;

Need two more spaces of indentation for proper alignment.


>+    size_t size() const {
>+        return usingInlineStorage() ? inlineSize() : (heapEnd() - heapBegin());
>+    }
>+
>+    bool empty() const {
>+        return usingInlineStorage() ? inlineSize() == 0 : (heapBegin() == heapEnd());
>+    }

It seems more likely than not that an empty vector uses inline storage, so:

    return JS_LIKELY(usingInlineStorage()) ? ... : ...;

This would probably inline better, I'd think.  I'm not even sure I remember any of the current vector uses removing elements anyway, but I might be wrong about that.


>+/*
>+ * This helper function is specialized for appending the charactesr of a string

characters


>+    /* Round up to next power of 2. */
>+    size_t newcap;
>+    JS_CEILING_LOG2(newcap, mincap);
>+    JS_ASSERT(newcap < 32);
>+    newcap = 1u << newcap;

>+    /* Round up to next power of 2. */
>+    size_t newcap;
>+    JS_CEILING_LOG2(newcap, mincap);
>+    JS_ASSERT(newcap < 32);
>+    newcap = 1u << newcap;

Should probably be JS_ASSERT(newcap < JSUtils::BitSize<size_t>), right?  64-bit safe and all that, of course.


>+    /* Switch in heap buffer. */
>+    mSizeOrCapacity = newcap;  /* marks us as !usingInlineStorage */

Should put parens after !uIL.


>+        /*
>+         * (mSizeOrCapacity <= sInlineCapacity) means inline storage, so we MUST
>+         * use inline storage, even though memory is already allocated.
>+         */

even though p would otherwise be acceptable storage

Almost there, these changes are all pretty minimal...
Attachment #393083 - Flags: review?(jwalden+bmo) → review-
(In reply to comment #51)
> >+    return b ? js_AppendLiteral(buf, "true") : js_AppendLiteral(buf, "false");
> 
> Shurely by now you'll agree it makes sense to have one common js_AppendLiteral
> call and make the last argument a ternary expression?  :-P

Ironically, I do ;)  Recall that js_AppendLiteral takes a jschar (&)[N].

> >+    if (!str)
> >+        cx->free(str);
> 
> ???

Yep, pretty dim; good catch.

> >+    bool empty() const {
> >+        return usingInlineStorage() ? inlineSize() == 0 : (heapBegin() == heapEnd());
> >+    }
> 
> It seems more likely than not that an empty vector uses inline storage, so:
> 
>     return JS_LIKELY(usingInlineStorage()) ? ... : ...;
> 
> This would probably inline better, I'd think.  I'm not even sure I remember any
> of the current vector uses removing elements anyway, but I might be wrong about
> that.

This question *asks* whether the vector is empty, so I'm not sure its so likely to be using inline storage.  In fact, since its the large containers that, by definition, are using dynamic storage and executing many vector operations, there's reason to argue that the opposite is likely.  It's dynamic branch prediction that's going to make all these tests go away, anyhow.

> Almost there, these changes are all pretty minimal...

Yipee!  and thanks.
Attachment #393083 - Attachment is obsolete: true
Attachment #393313 - Flags: review?(jwalden+bmo)
Comment on attachment 393313 [details] [diff] [review]
v.5, all gussied up

*stamp*

Pushing anon!
Attachment #393313 - Flags: review?(jwalden+bmo) → review+
(In reply to comment #52)
> Ironically, I do ;)  Recall that js_AppendLiteral takes a jschar (&)[N].

Gag, C++ should support this.

Push wasn't quite as anon as expected, jsregexp.cpp had bitrotted a bunch somehow, but I disentangled it.  I also changed BitSize<T>::result to size_t, because otherwise growHeapCapacityTo() had signedness problems comparing that value to a size_t.  Easy enough to fix...

Hoping to get more reviewed over the weekend maybe -- we'll see!
Thanks Waldo!  I'll post post a fresh patch for the json changes; the attached one is très old.
Attachment #389038 - Attachment is obsolete: true
Attachment #393542 - Flags: review?(brendan)
Attachment #389038 - Flags: review?(brendan)
Comment on attachment 393542 [details] [diff] [review]
json-changes v.3, fresh for reviewer

Looks good, and I will review an updated jsscan.cpp but I'd like to duck this so it hits sayrer ;-).

/be
Attachment #393542 - Flags: review?(brendan) → review?(sayrer)
Comment on attachment 393542 [details] [diff] [review]
json-changes v.3, fresh for reviewer

>
>+    JSObject *objectStack;
>+    JSTempVector<jschar> objectKey;
>+    JSTempVector<jschar> buffer;

With some rooting similar to this:

http://mxr.mozilla.org/mozilla/source/dom/src/json/nsJSON.cpp#714

can objectStack be a vector as well? There's a lot of ugly code in here because objectStack is a JS Array.
(In reply to comment #59)
> can objectStack be a vector as well? There's a lot of ugly code in here because
> objectStack is a JS Array.

I think it definitely could.  I'm thinking that would involve building a JSTempRootingVector template which wraps a JSTempVector.  Perhaps I should make a separate bug to do that though?
Comment on attachment 393542 [details] [diff] [review]
json-changes v.3, fresh for reviewer

file a follow up on exiting from that loop via return.
Attachment #393542 - Flags: review?(sayrer) → review+
Committed: http://hg.mozilla.org/tracemonkey/rev/2a5544bd7012
bug 510360 is for the return-from-loop, although it looks like its not a bug-bug.
Attached patch xml changes, v.2 (obsolete) — Splinter Review
Rebased and with corrections from first review.
Attachment #389039 - Attachment is obsolete: true
Attachment #394599 - Flags: review?(igor)
Comment on attachment 394599 [details] [diff] [review]
xml changes, v.2

>diff --git a/js/src/jsxml.cpp b/js/src/jsxml.cpp
>@@ -2812,122 +2742,126 @@ XMLToXMLString(JSContext *cx, JSXML *xml
...
>         /* 17(b)(iii). */
>         if (!prefix->empty()) {
>-            js_AppendJSString(&sb, prefix);
>-            js_AppendChar(&sb, ':');
>+            if (!AppendString(cb, prefix) || !cb.append(':'))
>+                goto out;
>         }
> 
>         /* 17(b)(iv). */
>-        js_AppendJSString(&sb, GetLocalName(attr->name));
>+        if (!AppendString(cb, GetLocalName(attr->name)))
>+            goto out;
> 
>         /* 17(d-g). */
>-        AppendAttributeValue(cx, &sb, attr->xml_value);
>+        if (!AppendAttributeValue(cx, cb, attr->xml_value))
>+            goto out;
>     }
>     XMLArrayCursorFinish(&cursor);

This skips on failure XMLArrayCursorFinish. 

>     /* Step 17(c): append XML namespace declarations. */
>     XMLArrayCursorInit(&cursor, &decls);
>     while ((ns2 = (JSObject *) XMLArrayCursorNext(&cursor)) != NULL) {
>         JS_ASSERT(IsDeclared(ns2));
> 
>-        js_AppendCString(&sb, " xmlns");
>+        if (!js_AppendLiteral(cb, " xmlns"))
>+            goto out;
> 
>         /* 17(c)(ii): NULL means *undefined* here. */
>         prefix = GetPrefix(ns2);
>         if (!prefix) {
>             prefix = GeneratePrefix(cx, GetURI(ns2), &ancdecls);
>             if (!prefix)
>                 break;
>             ns2->fslots[JSSLOT_PREFIX] = STRING_TO_JSVAL(prefix);
>         }
> 
>         /* 17(c)(iii). */
>         if (!prefix->empty()) {
>-            js_AppendChar(&sb, ':');
>-            js_AppendJSString(&sb, prefix);
>+            if (!cb.append(':') || !AppendString(cb, prefix))
>+                goto out;
>         }
> 
>         /* 17(d-g). */
>-        AppendAttributeValue(cx, &sb, GetURI(ns2));
>+        if (!AppendAttributeValue(cx, cb, GetURI(ns2)))
>+            goto out;
>     }
>     XMLArrayCursorFinish(&cursor);

The same here: goto out skips XMLArrayCursorFinish.

>         XMLArrayCursorInit(&cursor, &xml->xml_kids);
>         while ((kid = (JSXML *) XMLArrayCursorNext(&cursor)) != NULL) {
>-            if (pretty && indentKids)
>-                js_AppendChar(&sb, '\n');
>-
>-            kidstr = XMLToXMLString(cx, kid, &ancdecls, nextIndentLevel);
>+            if (pretty && indentKids) {
>+                if (!cb.append('\n'))
>+                    goto out;
>+            }
>+
>+            JSString *kidstr = XMLToXMLString(cx, kid, &ancdecls, nextIndentLevel);
>             if (!kidstr)
>                 break;
> 
>-            js_AppendJSString(&sb, kidstr);
>+            if (!AppendString(cb, kidstr))
>+                goto out;
>         }
>         XMLArrayCursorFinish(&cursor);
>         if (kid)
>             goto out;

The same bug happens here.
Attachment #394599 - Flags: review?(igor) → review-
Attached patch now with less flagrant leakage (obsolete) — Splinter Review
Oops, should have caught these after your first review.
Attachment #394599 - Attachment is obsolete: true
Attachment #394857 - Flags: review?(igor)
Comment on attachment 394857 [details] [diff] [review]
now with less flagrant leakage

Some     MUST_FLOW_THROUGH("out"); and even a second time with a closer label for the XMLCursorFinish would help the static analysis box catch such errors in the future.

/be
Alternatively, (I was thinking about doing this, but thought I might be spending too much time on to-be-selfhosted code), we could eliminate the potential for error by using RAII to close the cursor.
Yes, I really wanted C++ RAII support when I did XMLCursor -- but even so the patch has other, existing goto out patterns that lack the static analysis markup. We can fix all of these with RAII, in due course. I'm looking for more coverage in the mean time.

Followup bugs as needed.

/be
Attached patch RAII cursor guards (obsolete) — Splinter Review
Attachment #394890 - Flags: review?
Comment on attachment 394890 [details] [diff] [review]
RAII cursor guards

Why not just make JSXMLArrayCursor itself RAII? The one non-auto use of it is js_EnumerateXMLValues, IIRC, but that would want dtor => Finish too.

/be
Comment on attachment 394890 [details] [diff] [review]
RAII cursor guards

(I accidentally hit 'Enter' and submitted the attachment too early.  My punishment is typing this comment in a 5x5 box.  Why is that?)

I started to annotate with MUST_FLOW_THROUGH wherever it applied and I realized I was doing the same work as putting in the RAII guards, so I just did that.

Also in the patch: while I was twiddling with XMLArrayCursorNext-loops, I switched them to use the C++ declarations-as-conditions feature.
Attachment #394890 - Flags: review? → review?(igor)
Attached patch even RAIIerSplinter Review
What comment 70 suggested.  The only work was getting the dynamically scoped cursors to have their constructors/destructors called.  Altogether simpler code, though.
Attachment #394857 - Attachment is obsolete: true
Attachment #394890 - Attachment is obsolete: true
Attachment #394954 - Flags: review?(igor)
Attachment #394857 - Flags: review?(igor)
Attachment #394890 - Flags: review?(igor)
Attachment #394954 - Flags: review?(igor) → review+
Comment on attachment 394954 [details] [diff] [review]
even RAIIer


>+static inline bool
>+AppendString(JSCharVector &cb, JSString *str) {

Left brace on its own line.

>+    /* NOTE: not appending terminating '\n'. */

Why is this note-worthy? An "AppendString" helper taking a JSString would seem to do just what its name says, no added newlines or other filigree implied.

>+    ~JSXMLArrayCursor() {
>+        disconnect();
>+    }

One-liner? No big, up to you.

>+    void disconnect() {
>+        if (!array)
>+            return;
>+        if (next)
>+            next->prevp = prevp;
>+        *prevp = next;
>+        array = NULL;

New school PGO-friendly way to write this does overindent the common case: if (array) {...} with ... the real guts of the method. Just a thought.

>+    static JSXMLFilter *allocate(JSContext *cx, JSXML *list, JSXMLArray *array) {
>+        void *memory = cx->malloc(sizeof(JSXMLFilter));
>+        if (!memory)
>+            return NULL;
>+        return new(memory) JSXMLFilter(list, array);
>+    }
>+
>+    static void deallocate(JSContext *cx, JSXMLFilter *filter) {
>+        filter->~JSXMLFilter();
>+        cx->free(filter);
>+    }

Familiar ;-). Any way to templateize and share?

MEGO effect (my eyes glaze over) took hold while I was reading, but it looks great. Main thing is to run js/tests/e4x tests and compare to unpatched baseline results.

/be
(In reply to comment #73)

Thanks!

> >+    void disconnect() {
> >+        if (!array)
> >+            return;
> >+        if (next)
> >+            next->prevp = prevp;
> >+        *prevp = next;
> >+        array = NULL;
> 
> New school PGO-friendly way to write this does overindent the common case: if
> (array) {...} with ... the real guts of the method. Just a thought.

I do so love to avoid indentation... so you're saying that PGO wouldn't be able to transform the above code?

> >+    static JSXMLFilter *allocate(JSContext *cx, JSXML *list, JSXMLArray *array) {
> >+        void *memory = cx->malloc(sizeof(JSXMLFilter));
> >+        if (!memory)
> >+            return NULL;
> >+        return new(memory) JSXMLFilter(list, array);
> >+    }
> >+
> >+    static void deallocate(JSContext *cx, JSXMLFilter *filter) {
> >+        filter->~JSXMLFilter();
> >+        cx->free(filter);
> >+    }
> 
> Familiar ;-). Any way to templateize and share?

Good idea.  I'll file another bug so that all instances of the pattern can be replaced in one patch.

> MEGO effect (my eyes glaze over) took hold while I was reading, but it looks
> great. Main thing is to run js/tests/e4x tests and compare to unpatched
> baseline results.

Done and will do again before push.
PGO can do anything based on the profile, but maybe the profile we feed it won't have any E4X -- I'd bet it won't.

Who cares about E4X perf anyway (I've been heard to say).

Still, as a general style point, the "always return early rather than overindent the common case" style we've espoused, perhaps too much, is yielding to what is both poor-man's-PGO and often a preferred-by-straight-line-code-readers (same thing, really) style.

/be
Attached patch bit-freshened-upSplinter Review
Attachment #389043 - Attachment is obsolete: true
Attachment #395174 - Flags: review?(jwalden+bmo)
Attachment #389043 - Flags: review?(jwalden+bmo)
Comment on attachment 395174 [details] [diff] [review]
bit-freshened-up

>diff --git a/js/src/json.cpp b/js/src/json.cpp

> JSBool
> js_Stringify(JSContext *cx, jsval *vp, JSObject *replacer, jsval space,
>-             JSCharVector &cb) {
>+             JSCharBuffer &cb) {

Fix the brace here while you're touching the line.
Attachment #395174 - Flags: review?(jwalden+bmo) → review+
Last patch, woohoo!
Attachment #389044 - Attachment is obsolete: true
Attachment #395324 - Flags: review?(brendan)
Attachment #389044 - Flags: review?(brendan)
Comment on attachment 395324 [details] [diff] [review]
jsscan-changes, fresh for the reviewer

>-        /* Token stream ensures that tokenbuf is NUL-terminated. */
>-        JS_ASSERT(*ts->tokenbuf.ptr == (jschar) 0);
>+        // (TODO remove comment) note to reviewer: this function no longer requires terminating '\0'

These (TODO remove comments) markers are not needed -- I'm hip. One tardy nit on a jsvector.h API name:

>         obj = js_NewRegExpObject(cx, ts,
>-                                 ts->tokenbuf.base,
>-                                 ts->tokenbuf.ptr - ts->tokenbuf.base,
>+                                 ts->tokenbuf.begin(),
>+                                 ts->tokenbuf.size(),

SpiderMonkey tries to use size to mean bytes, as in size_t; in contrast, count or length means number of elements in an array or vector. Would it break STL brains to rename to length() instead of size()?

..12345678901234567890123456789012345678901234567890123456789012345678901234567890
>+    /*
>+     * To construct a JSTokenStream, first call the constructor, which is
>+     * infallible, then call |init|, which can fail.  To destroy a

Prevailing style avoids French spacing, and if you follow it here then JSTokenStream fits on the previous line, making for a less ragged-right comment. God I'm fussy!

>+     * JSTokenStream, first call |close| then call the destructor.  If |init|
>+     * fails, do not call |close|.
>+     *
>+     * This class uses JSContext.tempPool to allocate internal buffers.  The
>+     * caller should JS_ARENA_MARK sometime before calling |init| and
>+     * JS_ARENA_RELEASE sometime after calling |close|.

The "sometime" uses are a bit coy. Lose them and you don't lose meaning, but again you get nicer spacing:

..12345678901234567890123456789012345678901234567890123456789012345678901234567890
>+     * This class uses JSContext.tempPool to allocate internal buffers. The
>+     * caller should JS_ARENA_MARK before calling |init| and JS_ARENA_RELEASE
>+     * after calling |close|.

r=me with nits picked, thanks!

/be
Attachment #395324 - Flags: review?(brendan) → review+
(In reply to comment #79)
> >+        // (TODO remove comment) note to reviewer: this function no longer requires terminating '\0'
> 
> These (TODO remove comments) markers are not needed -- I'm hip.

I know you're hip, the comment is more so that I don't forget to remove them before committing.  I always grep TODO before committing.

> >         obj = js_NewRegExpObject(cx, ts,
> >-                                 ts->tokenbuf.base,
> >-                                 ts->tokenbuf.ptr - ts->tokenbuf.base,
> >+                                 ts->tokenbuf.begin(),
> >+                                 ts->tokenbuf.size(),
> 
> SpiderMonkey tries to use size to mean bytes, as in size_t; in contrast, count
> or length means number of elements in an array or vector. Would it break STL
> brains to rename to length() instead of size()?

I'd noticed the size/length distinction in the code and had actually thought about this.  So, done.  I also renamed all the JSTempVector internal variables accordingly.

> Prevailing style avoids French spacing, and if you follow it here then
> JSTokenStream fits on the previous line, making for a less ragged-right
> comment.

I'm trying to switch to Freedom spacing, but .<space><space> is a hard mechanical habit to break.

> God I'm fussy!

Sometimes I reword comments so that they fit on a single line.
http://hg.mozilla.org/mozilla-central/rev/2a5544bd7012
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
I noticed that JS_CEILING_LOG2 is defined for a 32-bit value, and I am passing size_t.  I should have used JS_CEILING_LOG2W, which does word-sized log2.  This will crash any 64-bit systems that allocate >= 2^32 elements.
Attachment #396472 - Flags: review?(jwalden+bmo)
Attachment #396472 - Flags: review?(jwalden+bmo) → review+
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 514139
Flags: wanted1.9.2+
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.