split off JSRegExpObject and JSDateObject from JSObject

RESOLVED WONTFIX

Status

()

Core
JavaScript Engine
RESOLVED WONTFIX
8 years ago
6 years ago

People

(Reporter: njn, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 5 obsolete attachments)

Created attachment 446137 [details] [diff] [review]
patch 1

This patch starts splitting class-specific things out of JSObject into
subclasses of JSObject.  It's based on discussions from a while back with
Brendan, Waldo and Luke.

I've started looking at this because it'll help a lot with the type-specific
arrays work (bug 566767).  But I started with RegExp and Date objects, which
are simpler cases;  arrays will be more complicated because we have dense
vs. sparse, and there are many more array functions.

Notable features:

- In each case there are two subclasses:  JSFooObjectInner, and JSFooObject.
  (Kibitzing on the names is welcome.)  The reason is that JSFooObject could
  end up with a lots of methods (esp. in the case of arrays), but I want to
  expose JSObject's private fields (esp. fslots/dslots) to as little code as
  possible.

  So JSFooObjectInner is both a friend and a sub-class of JSObject.  It
  contains just methods for low-level stuff, like getting/setting
  fslots/dslots.  JSFooObject is then a sub-class of JSFooObjectInner;  it
  can only access private JSObject data fields via getters/setters in
  JSFooObjectInner.

- JSObject has a toFoo() method which static_casts a JSObject to a
  JSFooObject.  So when we see this a lot:

    obj->toFoo()

  I could have instead made it a static method within JSFooObject, viz:

    JSFooObject::to(obj)

  That's nicer in one way -- it results in less Foo-specific code in
  JSObject -- but the conversions are much uglier and there are potentially
  many conversion points (but not as many as I thought there might be, which
  is nice).  There's a similar story with the isFoo() functions.

- The JSFooObjectInner getters/setters have JS_ASSERT(isFoo()) within them.
  This is arguably unnecessary since these calls should have been preceded
  with a JSFooObject::to() call which itself contains JS_ASSERT(isFoo()),
  but I kept them in for now to be extra careful.

- There are a lot of functions that seem suitable for pulling into
  JSFooObject, but cannot for one of two reasons.

  - Ones like regexp_getProperty() and regexp_finalize() are put into tables
    of functions, so their prototypes cannot be changed easily.

  - Ones like GetUTCTime() and js_regexp_toString() aren't
    guaranteed to receive a JSDateObject, ie. they have a JS_InstanceOf()
    test at the start.

    One result of this is that in some places we have an awkward mix because
    we're using a JSObject* in some places and then a JSFooObject* version
    of the same object in others nearby (eg. see date_setYear()).  I'll post
    a follow-up patch that implements one idea for improving this.

This is exploratory hacking and I may have misinterpreted some of the
ideas that have been floated, so comments are welcome!
Attachment #446137 - Flags: feedback?(brendan)
Attachment #446137 - Flags: feedback?(jwalden+bmo)
Attachment #446137 - Flags: feedback?(lw)
Created attachment 446139 [details] [diff] [review]
patch 1, v2

minor naming tweaks over v1
Attachment #446137 - Attachment is obsolete: true
Attachment #446139 - Flags: feedback?(brendan)
Attachment #446137 - Flags: feedback?(lw)
Attachment #446137 - Flags: feedback?(jwalden+bmo)
Attachment #446137 - Flags: feedback?(brendan)
Attachment #446139 - Flags: feedback?(jwalden+bmo)
Attachment #446139 - Flags: feedback?(lw)

Comment 2

8 years ago
Awesome!  I'm excited to see that you are pursuing this.  From a quick scan of the patch, it looks nice.  Also, I don't think the issues you mentioned in your final bullet are that unpleasant.  If I may go all Oprah-feely, a JSFooObject feels like a thin abstract data type, whereas fast natives like date_setYear feel like clients since, as you pointed out, there may not be any JSFooObjects present.

One question: could getPrivate() be overridden in JSFooObject to return a specific type, instead of a void*?  If done for all the sub-classes, then JSObject::getPrivate could even be made private.  This wold have really helped me in, e.g., bug 540706, where I had to track down all the getPrivate calls for Block/With objects.
Created attachment 446170 [details] [diff] [review]
patch 1, v3
Attachment #446139 - Attachment is obsolete: true
Attachment #446170 - Flags: feedback?(brendan)
Attachment #446139 - Flags: feedback?(lw)
Attachment #446139 - Flags: feedback?(jwalden+bmo)
Attachment #446139 - Flags: feedback?(brendan)
Attachment #446170 - Flags: feedback?(jwalden+bmo)
Attachment #446170 - Flags: feedback?(lw)
Created attachment 446174 [details] [diff] [review]
patch 2, v1

This patch partly fixes the mixed use of JSObject and JSDateObject in
jsdate.cpp.  It:

- Adds JSObject::maybeToDate(), which is like toDate() but returns NULL if
  the object isn't an instance of Date.

- Replaces some JS_InstanceOf() calls with maybeToDate().

- Removes the instance check from GetUTCTime(), merges it into 
  JSDateObject::getUTCTime(), and precedes all changed calls to GetUTCTime()
  with a maybeToDate() call.

The main benefit is that prior to this patch, we have to do some kind of
check before calling toDate(), and often that check is the one that's folded
into GetUTCTime(), eg:

    JSObject *obj = JS_THIS_OBJECT(cx, vp);
    jsdouble result;
    if (!GetUTCTime(cx, obj, vp, &result))
        return false;
    JSDateObject *dobj = obj->toDate();     // must come after GetUTCTime()!

That's easy to get wrong;  indeed I got it wrong in earlier versions of 
patch 1.  With this patch, that becomes this:

    JSDateObject *dobj = JS_THIS_OBJECT(cx, vp)->maybeToDate(cx, vp);
    if (!dobj)
        return false;
    jsdouble result = dobj->getUTCTime();

which is harder to get wrong.  And there's no JSObject* variable created in
some of the cases, which is nice.

There are other cases where we could do the same transformation, esp.
GetAndCacheLocalTime().  But that one is more difficult, because we'd need
to add a "if (!obj)" test prior to all calls to it if it is put inside
JSDateObject.  (I'm not sure why GetUTCTime() doesn't have a !obj check but 
GetAndCacheLocalTime() does.)  So I'm uncertain about this patch.
Attachment #446174 - Flags: feedback?(brendan)
Attachment #446174 - Flags: feedback?(jwalden+bmo)
Attachment #446174 - Flags: feedback?(lw)
Comment on attachment 446170 [details] [diff] [review]
patch 1, v3

(Watch as I make a statement by *clearing* the feedback flag rather than +ing or -ing it when the feedback is both good and bad!  ;-) )

Like the general approach, have a few issues, some I-think-this-is-betters with it.

I am utterly unclear as to why you would have both DateObject and DateObjectInner.  If you want conceptual separation, I prefer delineation by comments in the class definition.

Lose the JS prefix, put the class in the js namespace.

SetUTCTime and setUTCTime?  Differentiation by capitalization, and by violating existing class-method camelCaps style, is right out.  setUTCTimeValue, perhaps.

The get/set accessors should also assert the type of the passed-in or sent-out value when possible: JSVAL_IS_NUMBER for dates, for example.

As far as conversion, fallible and infallible, back in the day one strawman version of ES4 had "as", "is", and "to" operators.  I really liked those names, and I think they would be great here, certainly better than "maybe".  So then, isDate, asDate that asserts/casts/returns, and toDate that checks and casts or reports/returns null.
Attachment #446170 - Flags: feedback?(jwalden+bmo)
Comment on attachment 446174 [details] [diff] [review]
patch 2, v1

Looks reasonable, more or less, but a huge problem (hence the -): JS_THIS_OBJECT is fallible and may return NULL.  This was previously caught by JS_InstanceOf where obj == NULL throwing (an API misfeature if you ask me, but c'est la vie).

It would be nice to get an asserting-not-null version of instanceof at some point as a helper, perhaps that JS_InstanceOf might call (right now error-reporting is shared between null and not-instance-of cases, but that might be fixable without code bloat), but that's a task for another day.
Attachment #446174 - Flags: feedback?(jwalden+bmo) → feedback-
(In reply to comment #6)
> (From update of attachment 446174 [details] [diff] [review])
> Looks reasonable, more or less, but a huge problem (hence the -):
> JS_THIS_OBJECT is fallible and may return NULL.

Good catch.

> This was previously caught by
> JS_InstanceOf where obj == NULL throwing (an API misfeature if you ask me, but
> c'est la vie).

The null check came in with the patch for bug 417893. See the comment around "INCOMPATIBILITY NOTICE" in that bug's patch. It doesn't seem worth revisting but there's the bug to cite/block if you do.

/be
Nick: I'm busy, but I will make time to look in detail; relying on Waldo and Luke in the interim -- general approach sounds good.

/be
(In reply to comment #5)
>
> I am utterly unclear as to why you would have both DateObject and
> DateObjectInner.  If you want conceptual separation, I prefer delineation by
> comments in the class definition.

I explained in comment 0, but I mustn't have been clear enough.  Think about arrays.  We will end up with dozens of methods in JSArrayObject.  I've been doing a lot of work recently encapsulating fslots and dslots in order to minimize the amount of code that touches them.  If we don't have JSArrayObjectInner, then every one of these dozens of array-manipulating methods will have full access to fslots and dslots, effectively undoing all my encapsulation work.

In other words, I want a small inner circle of methods that can touch slots directly, and a large outer circle of methods that do array stuff, but can't touch slots directly.  Reasonable?
 
> Lose the JS prefix, put the class in the js namespace.
> 
> SetUTCTime and setUTCTime?  Differentiation by capitalization, and by violating
> existing class-method camelCaps style, is right out.  setUTCTimeValue, perhaps.

Both SetUTCTime and setUTCTime already existed.  I added setUTCTime() recently, SetUTCTime() is from someone else, it predated my work on this stuff.  I was trying to minimize naming changes (this is relevant to "JS" prefixes as well) to avoid blowing out the patch.  I'd prefer to do those things as follow-ups.

> The get/set accessors should also assert the type of the passed-in or sent-out
> value when possible: JSVAL_IS_NUMBER for dates, for example.

Good idea, I'll add that.

> As far as conversion, fallible and infallible, back in the day one strawman
> version of ES4 had "as", "is", and "to" operators.  I really liked those names,
> and I think they would be great here, certainly better than "maybe".  So then,
> isDate, asDate that asserts/casts/returns, and toDate that checks and casts or
> reports/returns null.

Sounds fine, I'll change.


(In reply to comment #6)
> Looks reasonable, more or less, but a huge problem (hence the -):
> JS_THIS_OBJECT is fallible and may return NULL.  This was previously caught by
> JS_InstanceOf where obj == NULL throwing (an API misfeature if you ask me, but
> c'est la vie).

Yes, you're right.  Back to the drawing board on that one.


(In reply to comment #8)
> Nick: I'm busy, but I will make time to look in detail; relying on Waldo and
> Luke in the interim -- general approach sounds good.

I'm not planning to land this stuff any time soon.  It's just that it's a necessary refactoring for the type-specific arrays -- I need lots of array-manipulating functions in a class so I can then templatize it as <typename T, T HoleValue>.  So I started with some simple cases to work through complaints like Waldo's in comment 5.
I'm not super-worried about more methods having access to fslots/dslots, to be honest; I think tight code layout, ordering, and commenting is sufficient segregation.

Comment 11

8 years ago
(In reply to comment #10)
> I'm not super-worried about more methods having access to fslots/dslots, to be
> honest; 

I think we should give njn room to operate here. We can always change the arrangement later if it becomes inconvenient.
Not inconvenient, just more layers to obscure things.  I guess I can live with it, but I still don't think the privacy win trumps the simplicity loss.

Comment 13

8 years ago
(In reply to comment #12)
> I guess I can live with
> it, but I still don't think the privacy win trumps the simplicity loss.

Of course you can live with it, it's just a small and easily-changed code style question. I agree there's a danger in too much ceremony and indirection, but I don't think js/src will have that problem soon, if ever.
(In reply to comment #13)
> (In reply to comment #12)
> > I guess I can live with
> > it, but I still don't think the privacy win trumps the simplicity loss.
> 
> Of course you can live with it, it's just a small and easily-changed code style
> question. I agree there's a danger in too much ceremony and indirection, but I
> don't think js/src will have that problem soon, if ever.

lol. srsly.

Waldo and I agree again in principle, but I want njn to run ahead. I am a C luddite, I admit it.

/be
(In reply to comment #9)
> >
> > As far as conversion, fallible and infallible, back in the day one strawman
> > version of ES4 had "as", "is", and "to" operators.  I really liked those names,
> > and I think they would be great here, certainly better than "maybe".  So then,
> > isDate, asDate that asserts/casts/returns, and toDate that checks and casts or
> > reports/returns null.
> 
> Sounds fine, I'll change.

I've been working on these changes for arrays.  I tried applying that naming scheme and have found it very easy to get wrong.  I repeatedly typed toDenseArray() (the can-fail one) when I really wanted asDenseArray() (the cannot-fail one).  I had to go back and fix them all up.

The cannot-fail conversion is much more commonly used than the can-fail conversion, so I'd prefer to make the can-fail conversion have a harder-to-type name, like... maybeToDenseArray().
Sometimes we have too little indirection (fslots/dslots direct access prior to njn's efforts comes to mind).  Sometimes we have too much (OBJ_GET_{CLASS,SLOT} calls to STOBJ_GET_{CLASS,SLOT} as detritus from ancient threadsafety considerations, again prior to njn's fine work).  The aim is to get it Just Right(tm), Goldilocks-style.  :-)

More tangentially, though...

I accepted that njn wanted this particular setup, even if I disagreed with him that it was best.  I moved on; one interstitial class (even a handful if all object types undergo this treatment) is not the end of the world.  Yet comment 13 appears to assume this was some sort of passive-aggressive complaint designed to sow lingering doubt, or perhaps that it was made petulantly, or perhaps something else entirely (I hesitate to presumptively read a particularized sentiment into it) -- "Of course you can live with it" might imply one of a number of different things absent some evidence (smiley or something, or maybe I'm just not discerning enough) that it was made with some levity.

It does seem as though I've been involved in every technical dispute under the sun in the last couple days with respect to SpiderMonkey.  Yet I do not think I should not be arguing what I think.  Our code quality, our allocation of effort, and our users do best when we fully consider our options.  I can't apologize for arguing for what I believe to be correct, or for not being persuaded by counterarguments even if we go along with them.  Still, I get the feeling that some people are getting excessively frustrated at me in the process, and that's not good.  To the best of my ability I have made every effort to criticize ideas but not persons, to do so calmly and clearly, and to not say what someone else has already said in argument.  I have always assumed opponents in arguments are arguing in good faith even when I believe they argue for the wrong choices.  What am I doing wrong?

I'm particularly worried given we're all (well, everyone but njn, who will be doing much more important work than hacking SpiderMonkey with us) meeting up next week to get stuff done.  I have no desire for this to continue into next week, or even to continue beyond the extent it already has.  We will be most productive if we don't have frustration or hurt feelings or whatever getting in the way of discussion, and I'd like to see us avoid such distraction.
(In reply to comment #15)
> The cannot-fail conversion is much more commonly used than the can-fail
> conversion, so I'd prefer to make the can-fail conversion have a harder-to-type
> name, like... maybeToDenseArray().

Was the leading-cx-means-failures-reported idiom insufficiently clear from the syntax of such uses, for some reason?  I'm trying to figure out exactly what made this susceptible to misuse, and whether it might not just be the newness of the constructs in SpiderMonkey.  If "maybe" helps with that, and if it's not just a passing problem that would quickly disappear with experience, go for it.
Maybe is not quite right because it does not imply failure, error reporting or exception throwing. It comes from the functional languages that have convenient categorical sum types, including the special case of T | nil (nil the bottom type).

I agree with Waldo, leading cx is enough. Any convention would do if clear enough. Alas we are losing leading cx with SystemAllocPolicy, but there's a bug on an explicit fix.

Waldo, don't worry about disagreements next week. I believe face-to-face time helps. In that light, bug comment controversies can generate more heat than they should. Maybe they add light, on re-reading after the heat dissipates. Hope so, anyway. Your contributions along with Luke's to getting our Object C++ veneer just right are appreciated.

/be
And of course we owe Nick a lot for diving in -- I'm thrilled, the jsobj.h macrology, once used consistently out of necessity after the obj->slots split, had become overgrown. Plus, C++!

/be
(In reply to comment #18)
> Maybe is not quite right because it does not imply failure, error reporting or
> exception throwing. It comes from the functional languages that have convenient
> categorical sum types, including the special case of T | nil (nil the bottom
> type).

I may have confused you all.  For RegExp, the maybeToRegExp() function I experimented with involved a call to JS_InstanceOf().  But for arrays, the maybeTo{Slow,Dense}Array() functions just check the clasp and return NULL if it isn't the right one.  

I now think that the simpler array-style maybeToFoo() functions are what is needed in most cases, not the RegExp-style ones, so I don't plan to use the RegExp-style ones.

In which case the "maybe" language does seem appropriate, because we return NULL on failure.  And the position of 'cx' is irrelevant because the simple maybeToFoo() functions don't take a 'cx' argument.
(In reply to comment #16)
> Sometimes we have too little indirection (fslots/dslots direct access prior to
> njn's efforts comes to mind).  Sometimes we have too much (OBJ_GET_{CLASS,SLOT}
> calls to STOBJ_GET_{CLASS,SLOT} as detritus from ancient threadsafety
> considerations, again prior to njn's fine work).  The aim is to get it Just
> Right(tm), Goldilocks-style.  :-)

Ok, I'm glad you agree widespread direct access to fslots/dslots is not good.  With respect to my splitting of these classes into inner circle and outer circle, do you think it's overkill because we should be able to rely on people using the slot getters/setters without mechanically enforcing it?


> More tangentially, though...

I wouldn't worry about the disagreements/discussions, that's the point of code review.  Naming in particular is something nobody ever agrees on.

And I suspect comment 13 was Sayre's attempt to quickly cut off any "get the last word in" comment sequences that sometimes occur, where everybody restates their original position, which can bloat out a bug for little benefit.  He may have been more blunt than ideal, but then, his less blunt effort in comment 11 didn't appear blunt enough :)
(In reply to comment #20)
> 
> I may have confused you all.  For RegExp, the maybeToRegExp() function I
> experimented with involved a call to JS_InstanceOf().

And I may have confused you more because it was actually maybeToDate() that involved JS_InstanceOf();  the patch didn't have a maybeToRegExp().  Sorry for any confusion.
Created attachment 446627 [details] [diff] [review]
patch 1, v4

Changes in this version:

- JS{Date,RegExp}Object{,Inner} become js::{Date,RegExp}Object{,Inner}.
  (Waldo, I see now it makes sense to do this now when the names are first
  introduced.)

- Renamed JSSLOT_FOO_BAR as BAR_SLOT in FooObject.  (I like having _SLOT at
  the end, it reads nicer that way.)

- Moved toFoo() into jsfoo.h;  although it is a member of JSObject it
  conceptually belongs in FooObject.  This matches isFoo(), which is already
  in jsfoo.h.

- Added jsval type-checks in the getters/setters.  They all ended up
  JSVAL_IS_NUMBER.  I didn't bother with the addressOfFoo() ones, though.

I haven't addressed the setUTCTime()/SetUTCTime() ickiness, I'm not sure how
best to handle it and, as I said, it predates this patch.
Attachment #446170 - Attachment is obsolete: true
Attachment #446174 - Attachment is obsolete: true
Attachment #446627 - Flags: feedback?(jwalden+bmo)
Attachment #446170 - Flags: feedback?(lw)
Attachment #446170 - Flags: feedback?(brendan)
Attachment #446174 - Flags: feedback?(lw)
Attachment #446174 - Flags: feedback?(brendan)
Comment on attachment 446627 [details] [diff] [review]
patch 1, v4

I should have tried to push quicker on my end, this is probably not going to be pleasant to deal with in concert with bug 563938 given your time constraints.  :-\


(In reply to comment #23)
> - Renamed JSSLOT_FOO_BAR as BAR_SLOT in FooObject.  (I like having _SLOT at
>   the end, it reads nicer that way.)

Hm, I shouldn't have let that slip by before!  Indeed better, at least if you're not of the nsWalletlibService::WALLET_PreEdit persuasion.  :-D  (Context: <http://mxr.mozilla.org/mozilla/source/extensions/wallet/src/nsWalletService.cpp> and cry.)


> - Moved toFoo() into jsfoo.h;  although it is a member of JSObject it
>   conceptually belongs in FooObject.  This matches isFoo(), which is already
>   in jsfoo.h.

Righto.


> - Added jsval type-checks in the getters/setters.  They all ended up
>   JSVAL_IS_NUMBER.  I didn't bother with the addressOfFoo() ones, though.

I'd be paranoid and check just to be safe, but it probably doesn't matter.


> I haven't addressed the setUTCTime()/SetUTCTime() ickiness, I'm not sure how
> best to handle it and, as I said, it predates this patch.

I'm confused.  Previously SetUTCTime was a static method outside of any class.  After, it's a non-static class method of a class that never previously existed.  How is that "predat[ing]"?


>diff --git a/js/src/jsdate.h b/js/src/jsdate.h

> JS_BEGIN_EXTERN_C
> 
>+namespace js {
>+
>+/* Low-level Date object interface. */
>+class DateObjectInner : public JSObject
>+{

Hum.  |namespace| inside an extern "C".  That didn't -- and doesn't -- make much sense.  Move the JS_BEGIN_EXTERN_C down so it encloses only C-ful symbols and not classes or namespaces?


>+/* High-level Date object interface. */
>+class DateObject : public DateObjectInner
>+{
>+  public:
>+    void SetDateToNaN(JSContext *cx, jsval *vp = NULL);
>+    JSBool SetUTCTime(JSContext *cx, jsdouble t, jsval *vp = NULL);
>+};

Again, I'm not sure why these have to be capitalized like this.

And, come to think of it, if you really want to have two setUTCTime methods, that isn't actually a problem here -- just requires using DateObjectInner::setUTCTime(...) inside the "outermost" method.  It's less pleasant than just giving it a different name, I think, but it would be an acceptable solution.  The GC rooter bits have at least one explicitly-qualified lookup like this for extracting the context from a GC rooter, I think, so there's precedent for it.


>diff --git a/js/src/jsregexp.cpp b/js/src/jsregexp.cpp

>+JSBool
>+RegExpObject::SetLastIndex(JSContext *cx, jsdouble lastIndex)
>+{
>+    return JS_NewNumberValue(cx, lastIndex, addressOfLastIndex());
> }

JS_NewNumberValue imposes symbol overhead, and we can go one better and use js_NewWeaklyRootedDouble (?) from the definition directly -- even better when this method is exposed for inlining.


>diff --git a/js/src/jsregexp.h b/js/src/jsregexp.h

> 
> JS_BEGIN_EXTERN_C
> 
>-namespace js { class AutoValueRooter; }
>+namespace js {
>+

Same pre-existing C++-stuff-inside-extern "C", move JS_BEGIN_EXTERN_C below the class and namespace declarations/definitions.


>+/* High-level RegExp object interface. */
>+class RegExpObject : public RegExpObjectInner
>+{
>+  public:
>+    JSBool SetLastIndex(JSContext *cx, jsdouble lastIndex);
>+};

Why isn't this setLastIndex?

This looks pretty inline-able, should make it so (and put definition in jsregexpinlines.h and all that).
Attachment #446627 - Flags: feedback?(jwalden+bmo) → feedback+
(In reply to comment #24)
> 
> > I haven't addressed the setUTCTime()/SetUTCTime() ickiness, I'm not sure how
> > best to handle it and, as I said, it predates this patch.
> 
> I'm confused.  Previously SetUTCTime was a static method outside of any class. 
> After, it's a non-static class method of a class that never previously existed.
>  How is that "predat[ing]"?

I was inaccurate previously.  To clarify... SetUTCTime() predated this patch and all my slot-related work (so don't blame me for the capital 'S' at the start :)  I moved it into DateObject as part of this patch.

Separately, I previously added setDateUTCTime() as part of my slots encapsulation work.  When moved into DateObjectInner as part of this patch I again removed the now-superfluous "Date", resulting in setUTCTime().


> >+/* High-level Date object interface. */
> >+class DateObject : public DateObjectInner
> >+{
> >+  public:
> >+    void SetDateToNaN(JSContext *cx, jsval *vp = NULL);
> >+    JSBool SetUTCTime(JSContext *cx, jsdouble t, jsval *vp = NULL);
> >+};
> 
> Again, I'm not sure why these have to be capitalized like this.

Again, these functions already existed, I just moved them into DateObject.  I'd be happy to lose the capitalization.


> >+/* High-level RegExp object interface. */
> >+class RegExpObject : public RegExpObjectInner
> >+{
> >+  public:
> >+    JSBool SetLastIndex(JSContext *cx, jsdouble lastIndex);
> >+};
> 
> Why isn't this setLastIndex?

Again, it comes from a pre-existing name, in this case SetRegExpLastIndex().


So we have two similar naming schemes in place.  I've been using {get,set}Foo() for low-level slot getters/setters.  And there are numerous existing {Get,Set}Foo() functions that tend to be higher-level.

One possible improvement would be to change slot getters/setters to {get,set}FooSlot(), which would be clearer but there are a lot calls to them (esp. array ones) so that would be a lot of extra typing.  Suggestions?
The capitalization distinction is conditioned upon the method's location.  Global functions are capitalized: SetUTCTime, SetRegExpLastIndex, and so on.  Method functions are not capitalized: setUTCTime, setLastIndex, and so on.  Moving a function from one location to another means the proper style is different for each, so the name capitalization should change.

The lower-level function, when you have two setting a field as here, should indeed be the one with the grodier name, the one far fewer users have to put up with.  setUTCTimeSlot is a reasonable name, for example, if you'd rather not have explicitly-qualified DateObjectInner::setUTCTime(v).  Me, I see no reason not to explicitly qualify, but up to you.
Created attachment 448919 [details] [diff] [review]
patch 1, v5

(In reply to comment #24)
> >
> > - Added jsval type-checks in the getters/setters.  They all ended up
> >   JSVAL_IS_NUMBER.  I didn't bother with the addressOfFoo() ones, though.
> 
> I'd be paranoid and check just to be safe, but it probably doesn't matter.

I don't think it's safe, it's possible the slot hasn't been set.


> Hum.  |namespace| inside an extern "C".  That didn't -- and doesn't -- make
> much sense.  Move the JS_BEGIN_EXTERN_C down so it encloses only C-ful symbols
> and not classes or namespaces?

Fixed in jsdate.h and jsregexp.h.


> >+/* High-level Date object interface. */
> >+class DateObject : public DateObjectInner
> >+{
> >+  public:
> >+    void SetDateToNaN(JSContext *cx, jsval *vp = NULL);
> >+    JSBool SetUTCTime(JSContext *cx, jsdouble t, jsval *vp = NULL);
> >+};
> 
> Again, I'm not sure why these have to be capitalized like this.
> 
> And, come to think of it, if you really want to have two setUTCTime methods,
> that isn't actually a problem here -- just requires using
> DateObjectInner::setUTCTime(...) inside the "outermost" method.  It's less
> pleasant than just giving it a different name, I think, but it would be an
> acceptable solution.  The GC rooter bits have at least one explicitly-qualified
> lookup like this for extracting the context from a GC rooter, I think, so
> there's precedent for it.

Requiring 'DateObjectInner::' offended me.  In this patch I've changed the
names of those two functions as follows:

  SetDateToNaN --> invalidateUTCTimeAndLocalTime
  SetUTCTime   --> setUTCTimeAndInvalidateLocalTime

Verbose, yes, but also accurate.  (Feel free to kibitz, but SetUTCTime() in
particular was a highly misleading name.)


> >+JSBool
> >+RegExpObject::SetLastIndex(JSContext *cx, jsdouble lastIndex)
> >+{
> >+    return JS_NewNumberValue(cx, lastIndex, addressOfLastIndex());
> > }
> 
> JS_NewNumberValue imposes symbol overhead, and we can go one better and use
> js_NewWeaklyRootedDouble (?) from the definition directly -- even better when
> this method is exposed for inlining.

Done.  A CHECK_REQUEST(cx) was also required.


> >+/* High-level RegExp object interface. */
> >+class RegExpObject : public RegExpObjectInner
> >+{
> >+  public:
> >+    JSBool SetLastIndex(JSContext *cx, jsdouble lastIndex);
> >+};
> 
> Why isn't this setLastIndex?

I changed it accordingly.

> This looks pretty inline-able, should make it so (and put definition in
> jsregexpinlines.h and all that).

I didn't do that because it's only used within jsregexp.cpp and the
definition precedes all its uses so it's likely the compiler will inline it
anyway.
Attachment #446627 - Attachment is obsolete: true
Attachment #448919 - Flags: review?(jwalden+bmo)
Comment on attachment 448919 [details] [diff] [review]
patch 1, v5

(In reply to comment #27)
> I don't think it's safe, it's possible the slot hasn't been set.

Oh, nice catch.  And I guess an is-void-or-is-* check is stretching things a bit.


> Requiring 'DateObjectInner::' offended me.  In this patch I've changed the
> names of those two functions as follows:
> 
>   SetDateToNaN --> invalidateUTCTimeAndLocalTime
>   SetUTCTime   --> setUTCTimeAndInvalidateLocalTime
> 
> Verbose, yes, but also accurate.  (Feel free to kibitz, but SetUTCTime() in
> particular was a highly misleading name.)

Nothing better immediately springs to mind.  I have no particular qualms about long names when they're the right tool for the task.


> Done.  A CHECK_REQUEST(cx) was also required.

The point of CHECK_REQUEST(cx) is to do in-request checks at API entry points, when you're required to be in a request when calling them.  (We may also have one or two places where C_R is used in a method that itself is at the very start of many API entry points -- js_AtomizeChars being such a case, since so many of the JS_*Property methods start with it.)  This method isn't part of public API, so it doesn't need it.  We *should* be in one when it's called, to be sure, but it isn't a precondition in the same way it is elsewhere.

So basically, it doesn't hurt, but stylistically it's not quite proper to have it.


> I didn't do that because it's only used within jsregexp.cpp and the
> definition precedes all its uses so it's likely the compiler will inline it
> anyway.

Cool.
Attachment #448919 - Flags: review?(jwalden+bmo) → review+
Created attachment 449510 [details] [diff] [review]
all-static patch, v1

(In reply to comment #0)
> 
> - There are a lot of functions that seem suitable for pulling into
>   JSFooObject, but cannot for one of two reasons.
> 
>   - Ones like regexp_getProperty() and regexp_finalize() are put into tables
>     of functions, so their prototypes cannot be changed easily.
> 
>   - Ones like GetUTCTime() and js_regexp_toString() aren't
>     guaranteed to receive a JSDateObject, ie. they have a JS_InstanceOf()
>     test at the start.
> 
>     One result of this is that in some places we have an awkward mix because
>     we're using a JSObject* in some places and then a JSFooObject* version
>     of the same object in others nearby (eg. see date_setYear()).

Attached is an alternative patch that avoids this problem.  It does this by
making all the class-specific methods static.  Eg. instead of this:

  int getFoo();

we have:

  static int getFoo(JSObject *obj);

this avoids the awkward mixing of class and non-class methods and allows
many more specific functions to be pulled into the relevant class (eg. many
more Date-related functions get pulled into class DateObject).  (But note
that I've only pulled some of these functions into the relevant classes for
demonstration purposes, there are many more that could be done, esp. for
Date.)

One advantage is that there's no need for isFoo/toFoo stuff in JSObject --
JSObject::isFoo() becomes Foo::is() and JSObject::toFoo() disappears.  The
only class-specific bits in JSObject is a single 'friend class js::FooInner'
declaration.  I dismissed this advantage previously as small but it is a
real win for templatized type-specific dense arrays, because including the
necessary bits and pieces in jsobj.h is a real pain, so much so that I
haven't got it to compile successfully yet -- I've been using static
to()/is() methods instead.

The syntax is shorter with static methods when we do a single Foo operation,
eg:

    Date::setFoo(obj, x);

vs

    obj->toDate()->setFoo(x);

There is some repetition of "Foo::" in the case where we do multiple Foo
operations:

    Foo::bar(obj);
    Foo::baz(obj);

instead of:

    FooObject *fobj = obj->toFoo();
    fobj->bar();
    fobj->baz();

The static syntax also doesn't match JSObject methods, eg.

    obj->setPrivate();
    RegExp::setLastIndex(obj);

I guess the JSObject methods could be changed to be static as well.

You can see I also changed from using "FooObject" to just "Foo".  This saves
a lot of typing and I reckon it's a good idea even if "String" could be
briefly misinterpreted by newbies as a generic string class.

Overall I'm leaning towards the all-static approach now.  I got the idea
from Gal and Shaver who used it in their array-ng prototype.  Although it's
arguably a bit inelegant/not standard OO, I very much like the simplicity
and the fact that it allows all relevant functions to be pulled into the
appropriate class instead of having this awkward mix of in-class and
not-in-class functions.
Attachment #449510 - Flags: review?(jwalden+bmo)
Attachment #449510 - Flags: feedback?(brendan)
Attachment #449510 - Flags: feedback?(lw)

Comment 30

8 years ago
Comment on attachment 449510 [details] [diff] [review]
all-static patch, v1

To be honest, I'm not a big fan of this direction.  It seems that the basic problem addressed by the bug is breaking up the plethora of accessors in JSObject.  IIUC, the reason for making everything a static function is to be able to stick more stuff in these type-specific classes.  However, this seems to be solving a different and unspecified problem.  E.g., I'm not sure what is gained by having SetRegExpLastIndex and SetUCTime as static members vs. static non-members.  (In fact, it seems a little worse, as it increases the visibility of these functions.)  If this static-memberization was applied throughout SM, it seems like it would cause a lot of textual churn (and unpleasant conflicts for everyone, especially newjsvals) without much benefit.

Also, there is something appealing about having isX() and toX() be non-static members, more than just drinking the OOP-koolaid.  For one thing, having to do obj->toX() before using X members makes it clear that obj must be an X, whereas X::foo(obj) is less clear about this.

On a side note, perhaps you would consider using "asX" instead of "toX", since "to" implies conversion (e.g., "ValueToString") and "as" implies reinterpretation.  Also, newjsvals are using v.isInt32()/v.asInt32().
Attachment #449510 - Flags: feedback?(lw) → feedback-
Comment on attachment 449510 [details] [diff] [review]
all-static patch, v1

What Luke said, especially on the you-know-you-have-an-X contention.  Also, I don't think the OOP kool-aid here is in any way a bad thing -- better, really, for the x-is-X reason.  Last, and least important, X::y seems more chatty than necessary -- members are shorter, more idiomatic, what's not to like and all.
Attachment #449510 - Flags: review?(jwalden+bmo) → review-
Right, looks like the all-static approach isn't popular!  I'll go back to the original approach.

> One advantage is that there's no need for isFoo/toFoo stuff in JSObject --
> JSObject::isFoo() becomes Foo::is() and JSObject::toFoo() disappears.  The
> only class-specific bits in JSObject is a single 'friend class js::FooInner'
> declaration.  I dismissed this advantage previously as small but it is a
> real win for templatized type-specific dense arrays, because including the
> necessary bits and pieces in jsobj.h is a real pain, so much so that I
> haven't got it to compile successfully yet -- I've been using static
> to()/is() methods instead.

I've got these bits and pieces working now.  It required some fiddling, as I had to get right various interactions between template classes, namespaces and #include ordering.  (It'd be nice if GCC said "identifier Foo doesn't exist, but you do have a js::Foo hanging around, did you mean that?".)
(In reply to comment #30)
> 
> On a side note, perhaps you would consider using "asX" instead of "toX", since
> "to" implies conversion (e.g., "ValueToString") and "as" implies
> reinterpretation.  Also, newjsvals are using v.isInt32()/v.asInt32().

I'm happy with that so long as I can use maybeAsFoo() for the can-fail version;  see comment 15 about my bad experience with the is/as/to naming convention.
(In reply to comment #33)
> (In reply to comment #30)
> > 
> > On a side note, perhaps you would consider using "asX" instead of "toX", since
> > "to" implies conversion (e.g., "ValueToString") and "as" implies
> > reinterpretation.  Also, newjsvals are using v.isInt32()/v.asInt32().
> 
> I'm happy with that so long as I can use maybeAsFoo() for the can-fail version

Strongly suggest avoiding "fail" here -- failure means false or null return with an error reported (OOM, slow script dialog with user choosing to stop the script), or converted to a pending exception (typical case). A "maybe" or nullable type is not failing in any useful sense by being null. It's succeeding at being null ;-).

/be
Attachment #449510 - Flags: feedback?(brendan)
I started writing a patch for functions and I have to say I really like it.

It's a little surprising to see these explicit `toFunction()` casts all over the codebase that have been lurking there implicitly, occasionally enforced by assertions, all along. This kind of change gives every one of them an assertion, which is quite nice.

Updated

7 years ago
Blocks: 666042
Someone (probably Waldo) created RegExpObject in another patch.  DateObject hasn't been done yet but that can be done in a separate bug.
Assignee: n.nethercote → general
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.