Closed Bug 513788 Opened 15 years ago Closed 14 years ago

Revise js-facing API for js-ctypes

Categories

(Core :: js-ctypes, defect, P1)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: dwitte, Assigned: dwitte)

References

(Blocks 6 open bugs, )

Details

(Keywords: dev-doc-complete)

Attachments

(9 files, 7 obsolete files)

299.94 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
195.46 KB, patch
Details | Diff | Splinter Review
8.65 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
11.49 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
1.96 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
12.64 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
2.13 KB, patch
dwitte
: review+
Details | Diff | Splinter Review
2.06 KB, patch
dwitte
: review+
Details | Diff | Splinter Review
75.80 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
We should do this before release. Changes bsmedberg has indicated we should make:

* Imported using Cu.import... client code should not see XPCOM at all
* small API improvements of the sort "ctypes.loadLibrary("user32.dll")
instead of lib = ctypes.library(); lib.load('user32.dll');
Regarding import... how exactly do we want to do this? For the JSAPI bits, we need a place to hook it all up. I'm not sure how that'd work with Cu.import, but maybe bsmedberg has a plan?

We could stick a method on Cu itself, say Cu.importCTypes, which would go here:
http://mxr.mozilla.org/mozilla-central/source/js/src/xpconnect/idl/xpccomponents.idl#127

How does that sound?
You could do the initial import using xpconnect/nsIXPCScriptable and then ignore xpconnect after that... mrbkap may have other ideas. I don't really care as long as the user never sees it.
I've written up a very-much-draft spec of how integer, pointer, and struct types should work from js, and (vaguely) how we can implement them with JSAPI: https://wiki.mozilla.org/Jsctypes/api

Comments are very much appreciated!
concerns:

* InitCaps feels very non-JSy and doesn't match the python API either. Can we s/Pointer/pointer/ and s/Struct/struct/ unless those are future JS keywords or something?

* the doc is not clear about which host API some of these things hang from: ctypes.types.Int32? or ctypes.Int32? Why not just ctypes.types.int32_t(4) instead of having int32_t *and* Int32?

* python has a distinction between POINTER(), which is used to construct pointer types, and pointer(), which is used to construct pointer data. Currently it seems you're using JS pointer() for both, based on whether the thing being passed in is a type or data.

* I think `new ctypes.pointer()` should be disallowed. If you want a void*, you should be required to do `new ctypes.pointer(ctypes.types.VOID)`

* let i = p.contents(); // i = *p (by reference)
What does this mean? python ctypes .contents is *not* by reference. If you want to assign, you have to use the index, e.g. ptr[0] = value; (other indexes are also allowed)

* Will there be a way to take an opaque struct type (`new ctypes.struct()`) and give it structure after the fact? structtype.define([struct definition array])?

* The python method for defining structs uses tuples, not objects, and seems slightly preferable to me:

struct([['x', int32_t], ['y', int64_t]])

(Although in python you subclass the Structure type, so the exact analogy in JS would be:

function MyStruct()
{
}
MyStruct.prototype = new ctypes.Structure();
MyStruct.prototype.fields = [['x', ctypes.types.int32_t], ['y', ctypes.types.int64_t]];

* Are we going to allow unions? i.e.

union({a: Int32, b: Int64})

* structure alignment will be important on Windows where WINAPI structs don't follow the standard C structure alignment. See http://docs.python.org/library/ctypes.html#structure-union-alignment-and-byte-order

* arrays are really important. Python just uses ctypes.int32_t * 10, but I don't think we can override JS operators like that, so maybe ctypes.array(ctypes.int32_t, 10)? Passing a string buffer into a function is pretty core functionality. With arrays I think you can do it with ctypes.array(ctypes.char, 255) and then pass that into a parameter which is declared with char*. Python also allows you to resize arrays with a global resize method (although arraytype.resize() sounds more natural to me). Python has a helper function .create_string_buffer which does the same thing and initializes with a string, e.g. create_string_buffer("initial_data" [, 255])

* callback functions and function pointers later?
* saved errno and set_errno/set_last_error
* cast (for pointers at least)
* memmove
* memset
* sizeof
Wow. So I have a lot to say, but my thoughts aren't really settled yet. The proposed API is both promising and confusing.

Let me first try to explain where I'm coming from. A key problem here is
that C++ has objects, pointers, and a stack. An int is an object in C. An array
is an object. An array element is an object. JS on the other hand has a few
primitive types, and references to heap objects. (A value is never an Object
but only a reference to an Object).

We all know this of course, but I think this is more than just important--it's
the whole problem. To me the main challenge in this design is to keep people
from getting confused and lost trying to figure out why the types don't match
or why the result is always `undefined`, owing to a slightly wrong mental model
of how we map references to pointers, structs to Objects, etc.


So here's my blow-by-blow summary of the proposed API as I understand it. (well, at least the parts I focused on, the problematic ones)

1. Some pointers are of course returned by foreign functions. We don't really
know what they point at, and we have no way of trying to keep the referent
alive.

2. But then ctypes can also create objects that represent addressable values
(of any C type) and create pointers to them; and *those* pointers keep the
referent alive, right?

3. Some objects created by ctypes own their own backing store; others are
essentially references to subobjects (in the C++ sense) of backing objects. To
the user these are supposed to behave more or less the same, right?

4. It also sounds like you want pointers inside the backing store to behave
like JS references for the purpose of keeping other objects from being GC'd. To
me that sounds difficult to implement, given that foreign functions can modify
those buffers randomly.  Once you have unions, or types like jsval that don't
look like pointers but secretly are, it seems impossible.

5. Of course a whole ton of stuff isn't there yet, like what .constructor
returns for all these objects, and what .toString() does.


Here are a few concrete proposals.

I think we should just drop item #4.

I think there should be a JSClass for "malloc'd buffer of a given C type".  It
might be an array, a struct, an int, a pointer. Call this class "Buffer"-- but
it's an implementation detail; we never have to expose buffers directly to the
ctypes user.

Then there should be a separate JSClass whose objects are *references* to
objects that live in Buffers. They can refer to the "complete object" or any
"subobject", even a deeply nested one. They keep it alive. Call this class
ctypes.Reference. It has a strong reference to the Buffer via a reserved slot.
(References never contain pointers to each other, only to the Buffer and to
their Type.)

When you say `new MyStruct()`, ctypes allocates a new Buffer and returns a new
Reference to the complete object that lives in the buffer.

If ref is a Reference that refers to a struct, then ref.fieldName should return
a primitive JS value if the type of the field is a pointer, double, boolean, or
string (something we can happily autoconvert), otherwise a Reference to the
field. Same for ref[index] where ref is a Reference to an array.

Now that I've looked at it, I don't like the `new Pointer(x)` syntax, even
though that's what I was going to propose too.  Instead, let's do this:

  ctypes.addressOf(ref) --> a new Pointer
  ctypes.addressOfMember(ref, name) --> a new Pointer
  ctypes.addressOfElement(ref, index) --> a new Pointer
  ctypes.cast(pointertype, ptr) --> a new Pointer

  ptr.toString()  -->  "(FILE *) 0xfca8f740"
                       // note: doesn't look like an Object

This proposal comes from a deep belief that the less people think of Pointers
as JS objects, the better off they'll be.

For the same reason, I propose that pointers not act like references at all.
Instead they're always just a scramble of hex digits with a type.

Now this leaves open the question of how you're supposed to dereference a
pointer handed to you by a library, one that isn't backed by a ctypes
Buffer; or a pointer fetched out of a struct. I'm not sure yet, and this is as far as I got tonight. More tomorrow.
No longer blocks: 513783
Blocks: 429551
Attached patch wip patch v0.1 (obsolete) — Splinter Review
First stab at implementing ctypes.CType, ctypes.CData, and ctypes.int32_t.

I'm not really sure how you want the prototype/constructor stuff laid out in these objects - I've made a guess, and got something working, but it's far from complete. (For one, I haven't implemented the .constructor properties.) Not really sure whether I should be using JS_InitClass or doing things manually, but for ctypes.int32_t I took the latter approach.

This also stubs out CData, ConvertToJS, and ExplicitConvert. Once we get the type objects figured out I'll get to work on those.

Would be nice to know if I'm on the right track here. :)
Assignee: nobody → dwitte
Status: NEW → ASSIGNED
Attachment #403940 - Flags: review?(jorendorff)
(This patch is on top of the s/xpconnect/JSAPI/ work in bug 518721, FYI.)
Comment on attachment 403940 [details] [diff] [review]
wip patch v0.1

new patch coming later today.
Attachment #403940 - Attachment is obsolete: true
Attachment #403940 - Flags: review?(jorendorff)
Product: Other Applications → Core
dwitte, do you have a guess when this patch will land? Want to be sure I get the doc updated when it does.
OS: Linux → All
Hardware: x86 → All
Attached patch patch v1 (obsolete) — Splinter Review
Fín.

There are a few TODOs left, which we would preferably leave for followup patches:
* Fill out the string autoconversion API a little. (I'm thinking readString/writeString here, and perhaps being a bit stricter on char<->jschar single-character conversions. jorendorff and I discussed doing IsASCII checks, for instance.)
* Extending PointerType to hold its CData referent alive, which in turn would allow us to autoconvert JS strings to char.ptr's. (With this patch we only do this for ffi function params, where we have an opportunity to manually manage lifetime.)
* Fill out the unit tests more.
* Switch all the JS_ReportError's to JS_ReportErrorNumber instead, so we can throw TypeErrors and RangeErrors. This depends on bug 518621.
* Probably a million other little things.
Attachment #410916 - Flags: review?(jorendorff)
Do we want ptrdiff_t as well?
Attached patch patch v2 (obsolete) — Splinter Review
Makes some minor fixes, and adds a lot more tests. This now passes tests on all platforms. (Sadly, it turns out libffi_msvc doesn't deal with returning structs by value - a side-effect of being an old, unmaintained port. Filed bug 528351 on rectifying that, but for now that feature is disabled on Windows.)
Attachment #410916 - Attachment is obsolete: true
Attachment #415294 - Flags: review?(jorendorff)
Attachment #410916 - Flags: review?(jorendorff)
Blocks: 513798
OK, I'm looking at this now. I won't finish tonight though.
Blocks: 535378
I still have a few thousand lines to go on this review, but it looks very very good.
Attached patch patch v2, updated (obsolete) — Splinter Review
Updated to trunk.
Attachment #415294 - Attachment is obsolete: true
Attachment #422437 - Flags: review?(jorendorff)
Attachment #415294 - Flags: review?(jorendorff)
A few very broad comments about this patch:

* Generally instead of defining properties and methods fresh on every object,
  we should use prototypes. I should have said so in the spec.

  CData objects should have a prototype hierarchy similar to what CType objects
  have. Like this:

    var struct_tm = ctypes.StructType(...);
    var t = struct_tm({...});
    t.__proto__ === struct_tm.prototype
      ===> true   // common ancestor of all CData objects of type struct_tm
    t.__proto__.__proto__  // common ancestor of all CData objects of struct type
      ===> {addressOfField: function ...}
    t.__proto__.__proto__.__proto__ === ctypes.CData.prototype
      ===> true   // common ancestor of all CData objects

    t.hasOwnProperty("address")
      ===> false
    ctypes.CData.prototype.hasOwnProperty("address")
      ===> true

* I think it would be better if the methods of CData objects were
  non-enumerable.  That way you could use for-in to enumerate over structs and
  arrays.  (Normal arrays we discourage people from enumerating, but ctypes
  arrays' prototypes are all sealed, or that's the intent I think. So it should
  be OK.)

* It looks like we don't seal the globals. I'm not sure how much XPConnect sets
  up for us. Probably it at least creates a separate Object.prototype for the
  module. We might want to seal that, for good measure. Worth a look.

* When a function returns true to indicate success and false to indicate that
  something failed and a JSAPI error has been reported, the return type should
  be JSBool, which strongly suggests this precise kind of API behavior. 

  For example, TypeError and DefineABIConstant should return JSBool.

  As a side benefit, you could then change code like
    >+  return JS_SealObject(cx, obj, JS_FALSE) != JS_FALSE;
  to
    >+  return JS_SealObject(cx, obj, JS_FALSE);

  (Note: SpiderMonkey house style considers it OK to `return false` rather than
  JS_FALSE even if the return type is JSBool.)

  An aside: whether a function reports an error on failure is, of course, part
  of its contract. In SpiderMonkey, the answer is almost always yes, and this
  consistency is useful. It means there's one less thing to learn about how
  each function behaves. It helps us avoid returning false without reporting an
  error (an easy mistake to make). You can look at code that says

    JSBool
    Monkey(JSContext *cx, ...)
    {
        if (!JSVAL_IS_OBJECT(vp[1]))
            return false;
        if (!js_DefineMonkey(cx, vp[1], argc, vp + 2, pobjp, propp))
            return false;
        ...
    }

  and (once you train your eyes) immediately see that the first is a bug and
  the second is OK; there is little danger that Monkey is actually intended to
  return false without reporting an error, or that js_DefineMonkey can do so.

* The error messages when type conversion fails are pretty weak! I expect users
  to bump into these pretty often, so please do another pass over them and see
  if there's any helpful information you can add. A decent type error is like,

    can't pass the string "xyzzy" to argument 2 of hypot(double, double)

* Generally the code that munges strings should use 16-bit characters. For
  example, all the BuildTypeName and BuildTypeSource stuff where we use
  user-provided JS strings, like the names of fields, should use 16-bit strings
  to avoid garbling those names. This is not a huge deal but now that we all
  live in the future, it seems like we should get past the ASCII-only thing.
  This applies most places where the code uses nsCAutoString.

In CTypes.cpp:
>+// Class representing ctypes.CType.prototype.
>+// This exists to provide three reserved slots for stashing
>+// ctypes.{Pointer,Array,Struct}Type.prototype.

Nit: This comment is stale. There are 5 reserved slots.

>+static JSClass sCTypeProto = {
>+  "CType",
>+  JSCLASS_HAS_RESERVED_SLOTS(CTYPEPROTO_SLOTS),
>+  JS_PropertyStub, JS_PropertyStub, JS_PropertyStub, JS_PropertyStub,
>+  JS_EnumerateStub, JS_ResolveStub, JS_ConvertStub, JS_FinalizeStub,
>+  NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
>+};

Style nit: Rename the Proto classes to sCTypeProtoClass and so on?

>+static JSClass sCDataClass = {
>+  "CData",
>+  JSCLASS_HAS_RESERVED_SLOTS(CDATA_SLOTS),
>+  JS_PropertyStub, JS_PropertyStub, ArrayType::Getter, ArrayType::Setter,

Here's to the day we can swap out these class-wide getters and setters for
something faster and better-behaved. :-|

>+static const char*
>+ToSource(JSContext* cx, jsval vp)
>+{
>+  JSString* str = JS_ValueToSource(cx, vp);
>+  if (str)
>+    return JS_GetStringBytesZ(cx, str);
>+
>+  JS_ClearPendingException(cx);
>+  return "<<error converting value to string>>";
>+}

I probably already reviewed this exact code, but note that the string returned
by ToSource is not necessarily GC-reachable. It would be better to use a
JSAutoTempValueRooter.

Also, a style nit: the name `vp` shouldn't be used for jsvals, only jsval*s.

>+bool
>+TypeError(JSContext* cx, const char* expected, jsval actual)

Another style nit: the return type of TypeError should be JSBool, for the same
reason as DefineABIConstant...

>+static bool
>+DefineABIConstant(JSContext* cx,
>+                  JSObject* parent,
>+                  const char* name,
>+                  ABICode code)

Style nit: 
In InitInt64Class:
>+  // Set up ctypes.{Int64,UInt64}.prototype.__proto__ === Function.prototype.

We don't want this for Int64 and UInt64; it wouldn't make sense to do
`Int64(0).apply(...)` or for `Int64(0) instanceof Function` to be true.

>+static JSObject*
>+InitSpecialType(JSContext* cx,
>+                JSObject* parent,
>+                JSObject* CTypeProto,
>+                JSFunctionSpec spec)

This is vaguely named. You could call it "InitTypeConstructor". Perhaps the
only real remedy, though, is a one-line comment above it:

  // Set up one of the type constructors ctypes.{Pointer,Array,Struct}Type.

I think the PointerType-specific, ArrayType-specific, and StructType-specific
properties (.targetType, .elementType, .length, .fields) should be defined here
on common prototypes shared by all pointers/arrays/structs. These properties
should be JSPROP_SHARED.

>+  // Set up the .prototype and .prototype.constructor properties.
>+  JSObject* prototype = JS_NewObject(cx, NULL, CTypeProto, obj);

Nit: Instead of using `obj` as the parent here, please use `parent`. The
convention is for constructors and prototypes to have the same parent, not for
one to be parented to the other.

Micro-nit: Either all four of {C,Pointer,Array,Struct}Type.prototype should
have [[Class]] "CType" or all four should have [[Class]] "Object". Your
call. The convention in JavaScript is like so:

  js> Object.prototype.toString.call(Date.prototype)
  "[object Date]"
  js> Object.prototype.toString.call(Array.prototype)
  "[object Array]"
  js> Object.prototype.toString.call(Function.prototype)
  "[object Function]"

...which would suggest [[Class]] "CType", but XPConnect makes no pretense of
sticking to this convention, so feel free to do what's easiest.

>+  // Create and attach the special class constructors:
>+  // ctypes.PointerType, ctypes.ArrayType, and ctypes.StructType.
>+  // Each of these has, respectively:
>+  //   * Class [[Function]]
>+  //   * __proto__ === Function.prototype
>+  //   * A constructor that creates a user-defined type.
>+  //   * 'prototype' property:
>+  //     * Class [[Object]]

Reminder to change this comment if you decide to make these have [[Class]] "CType".

>+  // Create and attach the ctypes.{Int64,UInt64} constructors.
>+  // Each of these has, respectively:
         ...
>+  //   * 'prototype' property:
>+  //     * Class [[{Int64Proto,UInt64Proto}]]
>+  //     * __proto__ === Function.prototype === ctypes.{Int64,UInt64}.__proto__

Reminder to change this comment if you agree Function.prototype is not what we
want here.

>+  // Attach the five prototypes just created to ctypes.CType.prototype,
>+  // so we can access them when constructing instances of those types. An
>+  // instance 't' of a ctypes.{Pointer,Array,Struct}Type will have, resp.:
>+  //   * Class [[CType]]
>+  //   * __proto__ === ctypes.{Pointer,Array,Struct}Type.prototype

Micro-nit: "an instance of a ctypes.{Pointer,Array,Struct}Type" to me means "an
instanceof any ctypes.{Pointer,Array,Struct}Type object", i.e., a CData object
of pointer type, which isn't what you mean.

I think just removing the word "a" from that phrase makes it work.

In jsvalToBool:
>+  if (JSVAL_IS_DOUBLE(val)) {
>+    jsdouble d = *JSVAL_TO_DOUBLE(val);
>+    *result = d != 0;
>+    return d == 1 || d == 0;
>+  }

I think it's worth pointing out in a comment that this code is necessary to
handle -0, even if +0 and 1 are always represented JSVAL_IS_INT.

In jsvalToInteger:
>+  if (JSVAL_IS_OBJECT(val) && !JSVAL_IS_NULL(val)) {

The idiom for this is !JSVAL_IS_PRIMITIVE(val). Please change it (several
places).

>+      // Check whether the source type is always representable, with exact
>+      // precision, by the target type. If it is, convert the value.
>+      switch (CType::GetTypeCode(cx, typeObj)) {
>+#define DEFINE_INT_TYPE(name, fromType, ffiType)                               \
>+      case TYPE_##name:                                                        \
>+        if (IsUnsigned<fromType>() && IsUnsigned<IntegerType>() &&             \
>+            sizeof(IntegerType) < sizeof(fromType))                            \
>+          return false;                                                        \
>+        if (!IsUnsigned<fromType>() && !IsUnsigned<IntegerType>() &&           \
>+            sizeof(IntegerType) < sizeof(fromType))                            \
>+          return false;                                                        \
>+        if (IsUnsigned<fromType>() && !IsUnsigned<IntegerType>() &&            \
>+            sizeof(IntegerType) <= sizeof(fromType))                           \
>+          return false;                                                        \
>+        if (!IsUnsigned<fromType>() && IsUnsigned<IntegerType>())              \
>+          return false;                                                        \
>+        *result = *static_cast<fromType*>(data);                               \
>+        break;

How about factoring the "always representable with exact precision" check into
a separate function?

  #define DEFINE_INT_TYPE(name, fromType, ffiType) \
        case TYPE_##name: \
          if (IsWider<IntegerType, fromType>()) \
             *result = *static_cast<fromType*>(data); \
          return IsWider<IntegerType, fromType>();

Just a thought.

In jsvalToFloat: The spec says (and your patch agrees) that
ImplicitConvert(UInt64(x), double) is OK, even though it may lose bits. I think
we did this for convenience, since e.g. even on a 32-bit platform size_t values
are UInt64. But really the only consistent thing is to throw a TypeError here,
just like all the other cases where we detect a potentially lossy conversion,
and make the user convert explicitly. Can we change it?

In jsvalToBigInteger:
>+// Implicitly convert val to IntegerType, allowing jsint, jsdouble,
>+// Int64, UInt64, and optionally a decimal or hexadecimal string argument.
>+// (This is used where a primitive is expected and we are converting to a
>+// size_t value or constructing an Int64 or UInt64 object, and thus it is
>+// implied that IntegerType is one of the wrapped integer types.)

This threw me off a little bit, as I was taking "implicitly convert" to mean
ImplicitConvert. It seems like the sentence in parentheses could just say
"common code shared by jsvalToSize and the Int64/UInt64 constructors."

>+    if (Int64::IsInt64(cx, obj)) {
>+      PRInt64 i = Int64Base::GetInt(cx, obj);
>+      *result = IntegerType(i);
>+
>+      // Make sure the integer is nonnegative and fits in IntegerType.

The comment isn't quite right; I think it just means "Make sure the integer
fits in IntegerType". It can be negative, as in Int64(Int64(-1)).

>+// Forcefully convert val to IntegerType when explicitly requested.
>+template<class IntegerType>
>+static bool
>+jsvalToIntegerExplicit(JSContext* cx, jsval val, IntegerType* result)
>+{
>+  if (JSVAL_IS_DOUBLE(val)) {
        [...]
>+  if (JSVAL_IS_OBJECT(val) && !JSVAL_IS_NULL(val)) {
        [...]
>+  return false;
>+}

Might as well assert !JSVAL_IS_DOUBLE(val) since we require that jsvalToInteger
was already called.

In jsvalToPtrExplicit:
>+    if (Int64::IsInt64(cx, obj)) {
>+      PRInt64 i = Int64Base::GetInt(cx, obj);
>+      if (i < 0) {
>+        // Cast through an intptr_t intermediate to sign-extend.
>+        if (PRInt64(intptr_t(i)) != i)
>+          return false;
>+
>+        *result = uintptr_t(intptr_t(i));
>+        return true;
>+      }
>+
>+      // Make sure the integer fits in the alotted precision.
>+      *result = uintptr_t(i);
>+      return PRInt64(*result) == i;
>+    }

It looks like you can save a branch here:

      PRInt64 i = Int64Base::GetInt(cx, obj);
      intptr_t p = intptr_t(i);

      // Make sure the integer fits in the alotted precision.
      if (PRInt64(p) != i)
        return false;
      *result = uintptr_t(p);
      return true;

IntegerToString:
>+  // Build the string in reverse. We use multiplication and subtraction
>+  // instead of modulus because that's much faster.
>+  bool isNegative = !IsUnsigned<IntegerType>() && i < 0;
>+  size_t sign = isNegative ? -1 : 1;
>+  do {
>+    IntegerType ii = i / IntegerType(radix);
>+    size_t index = sign * size_t(i - ii * IntegerType(radix));
>+    *--cp = "0123456789abcdefghijklmnopqrstuvwxyz"[index];
>+    i = ii;
>+  } while (i != 0);

You can eliminate another integer multiplication here by flipping the
sign first before entering the loop.

Same goes for StringToInteger.

In ConvertToJS:
>+    // Promote the unsigned 1-byte character type to a jschar, regardless of
>+    // encoding, and convert to a 1-character string.
>+    // TODO: Check IsASCII(data)?

Hmm. I offer two alternatives here and ask you to pick:

 1. Accept values 0..127, reflecting them as one-character JS strings; and
    reject everything else with a TypeError.

 2. Make char, signed_char, and unsigned_char integer types; that is, reflect
    them as JS numbers.

These may both sound too strict, but I expect converting to and from char to be
really rare in practice. I prefer #2 here, because then ConvertToJS always
works on chars, and people who want to treat chars as text are nudged toward
the readString API.

The root problem, of course, is who knows what a given C/C++ char is supposed
to mean?

What I want to avoid is reflecting ((char) 0xe1) as JavaScript '\u00e1'. That
is going to be flat-out wrong (I think) on Mac, Linux, and even on Windows
outside the U.S. and Western Europe.

>+  case TYPE_pointer: {
>+    // We're about to create a new CData object to return. If the caller doesn't
>+    // want this, return early.
>+    if (wantPrimitive) {
>+      JS_ReportError(cx, "cannot convert to primitive value");
>+      return false;
>+    }
>+
>+    JSObject* obj = PointerType::ConstructInternal(cx, typeObj, parentObj, data);
>+    if (!obj)
>+      return false;
>+
>+    *result = OBJECT_TO_JSVAL(obj);
>+    break;
>+  }
>+  case TYPE_array: {
>+    // We're about to create a new CData object to return. If the caller doesn't
>+    // want this, return early.
>+    if (wantPrimitive) {
>+      JS_ReportError(cx, "cannot convert to primitive value");
>+      return false;
>+    }
>+
>+    JSObject* obj = ArrayType::ConstructInternal(cx, typeObj, parentObj, data);
>+    if (!obj)
>+      return false;
>+
>+    *result = OBJECT_TO_JSVAL(obj);
>+    break;
>+  }
>+  case TYPE_struct: {
>+    // We're about to create a new CData object to return. If the caller doesn't
>+    // want this, return early.
>+    if (wantPrimitive) {
>+      JS_ReportError(cx, "cannot convert to primitive value");
>+      return false;
>+    }
>+
>+    JSObject* obj = StructType::ConstructInternal(cx, typeObj, parentObj, data);
>+    if (!obj)
>+      return false;
>+
>+    *result = OBJECT_TO_JSVAL(obj);
>+    break;
>+  }

This seems a little repetitive. Refactor? Just a thought.

In ImplicitConvert:
>+#define DEFINE_CHAR_TYPE(name, type, ffiType)                                  \
>+  case TYPE_##name: {                                                          \
>+    /* Convert from a 1-character string, regardless of encoding, */           \
>+    /* or from an integer, provided the result fits in 'type'. */              \
>+    /* TODO: Check IsASCII(chars) for 8-bit char types? */                     \

Even if (as I hope) you decide the char types should be integer types, I'm OK
with ImplicitConverting characters in the range '\0'..'\x7f' to these types.
Let's reject characters outside that range.

>+      jschar c = *JS_GetStringCharsZ(cx, str);                                 \

Check the return value of JS_GetStringCharsZ for null. The same change is
needed in quite a few other places.

>+    if (JSVAL_IS_NULL(val)) {
>+      // Convert to a null pointer.
>+      *static_cast<void**>(buffer) = NULL;
>+      break;
>+
>+    } else if (sourceData) {

break followed by else is against house style, right?

>+      while (JS_NextProperty(cx, iter, &id) && !JSVAL_IS_VOID(id)) {

If JS_NextProperty returns false, that's an error, and it should be propagated
without reporting another error.

>+        if (!JS_IdToValue(cx, id, &fieldVal) || !JSVAL_IS_STRING(fieldVal)) {
>+          JS_ReportError(cx, "property name is not a string");

Same goes for JS_IdToValue.

>+        jsval prop;
>+        if (!JS_GetProperty(cx, obj, name, &prop))
>+          return false;

I think you need a JSAutoTempValueRooter for prop here.

In ExplicitConvert:
>+#define DEFINE_WRAPPED_INT_TYPE(name, type, ffiType)                           \
>+  case TYPE_##name: {                                                          \
>+    /* Convert numeric values with a C-style cast, and */                      \
>+    /* allow conversion from a base-10 or base-16 string. */                   \
>+    type result;                                                               \
>+    if (!jsvalToIntegerExplicit(cx, val, &result) &&                           \
>+        (!JSVAL_IS_STRING(val) ||                                              \
>+         !StringToInteger(cx, JSVAL_TO_STRING(val), &result)))                 \

...Huh. It's good to be able to convert strings to integer types just by doing
ctypes.size_t("0xffff"), by analogy to Number("0xffff") ...but why do we limit
it to wrapped int types?

(The spec actually goes so far crazy as to specify only hex strings with "0x",
such that

    ctypes.size_t("-0x1")   // ok, size_t(-1)
    ctypes.size_t("-1")     // error

which seems pretty clearly around the bend to me. Can we just make all the int
types behave this way?)

>+  result.Insert(JS_GetStringBytesZ(cx, baseName), 0);

Throughout this code, the return value of JS_GetStringBytesZ is not checked for
NULL. Please check it.

BuildTypeSource:
>+// Given a CType 'typeObj', generate a string 'result' such that 'eval(result)'
>+// would construct the same CType. If 'makeShort' is true, assume that any
>+// StructType 't' is bound to an in-scope variable of name 't.name', and use
>+// that variable in place of generating a string to construct the type 't'.
>+// (This means the type comparison function CType::TypesEqual will return true
>+// when comparing the input and output of BuildTypeSource, since struct
>+// equality is determined by strict JSObject pointer equality.)

I'm not sure propagating makeShort to recursive calls is a good idea here; if
we ever support types like

  struct Event {
      Event *next;
  };

then we'll need to stop the runaway recursion somehow. A related problem is

  struct Behemoth { ...50 fields... };

  struct BehemothArray {
      Behemoth *begin;
      Behemoth *end;
      Behemoth *endCapacity;
  };

BehemothArray.toSource() shouldn't emit three StructType("Behemoth", ...)
types for the three fields.

Perhaps just pass makeShort=true when emitting the fields.

(The alternative is to use let-expressions to emit really precisely correct
code, but don't do that on my account!)

>+static nsCAutoString
>+BuildTypeSource(JSContext* cx, JSObject* typeObj, bool makeShort)

Can we build a 16-bit string here instead?  (The same comment applies in lots
of places, just about everywhere we build a string.)

In BuildTypeSource:
>+  case TYPE_void_t:
>+    result.Append("ctypes.void_t");
>+    break;
>+#define DEFINE_TYPE(name, type, ffiType)  \
>+  case TYPE_##name:                       \
>+    result.Append("ctypes." #name);       \
>+    break;
>+#include "typedefs.h"

Could save some code here by generating only case labels, followed by:

    result.Append("ctypes.");
    const char* name = JS_GetStringBytesZ(CType::GetName(cx, typeObj));
    if (!name)
      return JS_FALSE;
    result.Append(name);

In BuildDataSource:

A simpler way to implement this would be to use ConvertToJS and then call
JS_ValueToSource on the result. But if you want to keep this code, a few nits:

>+  case TYPE_array: {
>+    // Serialize each element of the array recursively. Each element must
>+    // be able to ImplicitConvert successfully.
>+    JSObject* baseType = ArrayType::GetBaseType(cx, typeObj);
>+    result.Append("[ ");
>+
>+
[...]
>+    result.Append(" ]");

Nit: Remove the spaces in "[ " and "] ", to conform with what
Array.prototype.toSource does.

Micro-nit: There's a double blank line.

>+  case TYPE_struct: {
>+    if (isImplicit) {
>+      // The result must be able to ImplicitConvert successfully.
>+      // Serialize the data as an object with properties, rather than
>+      // a sequence of arguments to the StructType constructor.
>+      result.Append("{ ");

Nit: Remove the extra space here too.

In CType::ConstructBasic:
>+    // perform explicit conversion
>+    if (!ExplicitConvert(cx, argv[0], obj, CData::GetData(cx, result)))

Style nit - This comment maybe can be deleted.

In CType::Create:
>+  // Define properties common to all CTypes.
>+  if (name && 
>+      !JS_DefineProperty(cx, typeObj, "name", STRING_TO_JSVAL(name),
>+         NULL, NULL, JSPROP_ENUMERATE | JSPROP_READONLY | JSPROP_PERMANENT))
>+    return NULL;
>+  if (!JS_DefineProperty(cx, typeObj, "ptr", JSVAL_VOID,
>+         PtrGetter, NULL, JSPROP_ENUMERATE | JSPROP_READONLY | JSPROP_PERMANENT))
>+    return NULL;
>+  if (!JS_DefineProperty(cx, typeObj, "size", size,
>+         NULL, NULL, JSPROP_ENUMERATE | JSPROP_READONLY | JSPROP_PERMANENT))
>+    return NULL;

I would like to see this data stored in reserved slots, and for these
properties to be SHARED PERMANENT READONLY properties of CType.prototype, with
getters. JS_GetReservedSlot is a lot faster than JS_GetProperty.

>+  // Set up the 'prototype' and 'prototype.constructor' properties.
>+  JSObject* prototype = JS_NewObject(cx, NULL, NULL, typeObj);

Nit again: Please use JS_GetParent(typeObj) instead of typeObj as the parent
here.

>+  // Seal the 'prototype' property.
>+  if (!JS_SealObject(cx, prototype, JS_FALSE))
>+    return NULL;

Comment nit:  "Seal the prototype object", not "property".

In CType::Finalize:
>+  // Make sure our TypeCode slot is legit. If it's not, bail.
>+  jsval slot;
>+  if (!JS_GetReservedSlot(cx, obj, SLOT_TYPECODE, &slot) || JSVAL_IS_VOID(slot))
>+    return;
>+
>+  // The contents of our slots depends on what kind of type we are.
>+  switch (GetTypeCode(cx, obj)) {

You can avoid calling JS_GetReservedSlot again by using JSVAL_TO_INT(slot).

>+      delete ffiType->elements;

delete[]

In CType::TypesEqual:
>+  case TYPE_array: {
>+    // Compare length, then base types.
>+    // An undefined length array matches any length.

This strikes me as too tricky. As far as I can tell there's no place in the
code where we need undefined-length array types to match other types in this
way, so let's delete this feature.

In PointerType::CreateInternal:
>+  // Finally, cache our newly-created PointerType on our pointed-to CType,
>+  // if we have one.
>+  if (baseType &&
>+      !JS_SetReservedSlot(cx, baseType, SLOT_PTR, OBJECT_TO_JSVAL(typeObj)))
>+    return NULL;

This can just go in the preceding `if (baseType)` block.

In PointerType::ContentsSetter:
>+  // Get pointer type and base type.
>+  JSObject* typeObj = CData::GetCType(cx, obj);
>+  if (CType::GetTypeCode(cx, typeObj) != TYPE_struct) {
>+    JS_ReportError(cx, "not a PointerType");

TYPE_struct is wrong here.

In ArrayType::Create:
>+  if (argc > 2) {
>+    JS_ReportError(cx, "ArrayType takes one or two arguments");

Need to check for 0 arguments, then!

>+  if (!JSVAL_IS_OBJECT(argv[0]) || JSVAL_IS_NULL(argv[0] ||
>+      !CType::IsCType(cx, JSVAL_TO_OBJECT(argv[0])))) {

One of those parentheses has wandered over to the next line! It looks like
ctypes.ArrayType(null) will crash as a result of this bug, and
ctypes.ArrayType({}) should assert, or worse.

In ArrayType::ConstructData:
>+    } else if (JSVAL_IS_OBJECT(argv[0])) {
>+      // We were given an object with a .length property.

Need to check for JSVAL_NULL, so the test should be

      } else if (!JSVAL_IS_PRIMITIVE(argv[0])) {

In ArrayType::ConstructInternal:
>+  if (!JS_DefineFunctions(cx, result, sArrayInstanceFunctions))
>+    return NULL;
>+
>+  // Duplicate the 'constructor.length' property as a 'length' property,
>+  // for consistency with JS array objects.
>+  jsval lengthVal;
>+  if (!JS_GetProperty(cx, typeObj, "length", &lengthVal) ||
>+      !JS_DefineProperty(cx, result, "length", lengthVal,
>+         NULL, NULL, JSPROP_ENUMERATE | JSPROP_READONLY | JSPROP_PERMANENT))
>+    return NULL;

Same comment here about how I'd rather see this as a reserved slot, with
"length" defined SHARED PERMANENT on the common prototype of all array CDatas.

sArrayInstanceFunctions belong on a common prototype for sure.

In ArrayType::GetLength:
>+  JS_GetProperty(cx, obj, "length", &length);

This is guaranteed to succeed; still it seems best to JS_ASSERT that it did.

If this turns into a call to JS_GetReservedSlot, then no assertion is
warranted.

In AraryType::Getter and Setter:
>+  if (!CData::IsCData(cx, obj))
>+    return JS_TRUE;

Most other places we report an error if `this` isn't the expected type. Why is
it different here? It's not immediately obvious how this can even happen.  (I
think it can only happen in bizarre corner cases involving watchpoints.)

This may be the right thing. If so, a comment would help. :)

In ExtractStructField:
>+  if (!JS_GetProperty(cx, obj, name, &propVal) ||
>+      !JSVAL_IS_OBJECT(propVal) || JSVAL_IS_NULL(propVal) ||
>+      !CType::IsCType(cx, JSVAL_TO_OBJECT(propVal)))
>+    return false;

This function may return false with an error reported or not (a problem not
just in the code quoted above, but throughout). It should be changed so that it
always reports an error on failure. Better error messages would be a nice side
effect.

In StructType::Create:
>+  if (argc < 2) {
>+    JS_ReportError(cx, "StructType takes at least two arguments");

Exactly two, right?

>+  if (!JSVAL_IS_OBJECT(argv[1]) ||
>+      !JS_IsArrayObject(cx, JSVAL_TO_OBJECT(argv[1]))) {

The obj argument to JS_IsArrayObject must not be NULL.
`StructType("x", null)` would crash.

>+    elements = new ffi_type*[len + 1];
>+    if (!elements) {
>+      JS_ReportOutOfMemory(cx);
>+      return JS_FALSE;
>+    }

Nit: This could be JS_ReportAllocationOverflow(cx).

>+    for (jsuint i = 0; i < len; ++i) {
>+      jsval item;
>+      JS_GetElement(cx, fieldsObj, i, &item);

JS_GetArrayLength is guaranteed to succeed for arrays, but JS_GetElement is
not. Ugly corner cases. :-P  So we need to check the return here.

>+      size_t delta = padding + CType::GetSize(cx, info->mType);
>+      size_t oldSize = structSize;
>+      structSize = structSize + delta;
>+      if (structSize - delta != oldSize) {
>+        JS_ReportError(cx, "size overflow");
>+        return JS_FALSE;
>+      }

This overflow check doesn't look right to me. It seems like it'll never emit
the error. A test would help convince me otherwise.

If we don't have to check for overflow when calculating delta itself, a very
brief comment explaining why would be nice. Something like:

    // No overflow here because padding is less than the alignment of
    // info->mType, and the size is a multiple of the alignment.

This should define properties on the new StructType's .prototype object as we
go through the loop. That way we can avoid doing so for each object, in
StructType::ConstructInternal.

In StructType::ConstructData:
>+  if (argc != 0 && argc != fields->Length()) {
>+    JS_ReportError(cx, "constructor takes zero or %u arguments", fields->Length());
>+    return JS_FALSE;
>+  }

The spec says that for any CType object T, T(x) essentially does
ExplicitConvert(x, T). I think this is a good feature. This should work:

  Pt = StructType("Pt", [{x: ctypes.int, y: ctypes.int}]);
  p = Pt({x: 12, y: -1})

Now I also like the shorthand `p = Pt(12, -1)`, even though the spec hasn't
caught up with you there yet. Unfortunately these conflict when a struct has
only one field:

  Vat = StructType("Vat", [{capacity: Measure}]);

  Vat({capacity: 22.7})
  Vat(22.7)
  Vat(Measure(22.7))

It gets worse if Measure is also a struct with a single field.

To resolve this, let's add this to the specification of t(val):

    If t is a struct type with exactly one field, and val is a primitive value
    or a CData object, then this is shorthand; ImplicitConvert val to the type
    of the field and initialize the new CData object that way.

Otherwise, we'll call ExplicitConvert(val, t) as normal. The rationale for this
special rule is that we will always call ExplicitConvert if it has a chance to
succeed, but in this special case it's only going to fail, and there is another
possible interpretation of the argument.

>+    // convert each field
>+    char* buffer = static_cast<char*>(CData::GetData(cx, result));
>+    for (PRUint32 i = 0; i < fields->Length(); ++i) {
>+      FieldInfo& field = fields->ElementAt(i);
>+      if (!ExplicitConvert(cx, argv[i], field.mType, buffer + field.mOffset))

Definitely ImplicitConvert here. Rationale: ExplicitConvert is reserved for
places where the user's code very likely has the type name appearing right next
to the expression being converted, as in

    ctypes.char(0xf0c3)     // ok, explicit enough context to discard bits
    Event.ptr(0xff3c7892)   // ok, explicit enough context to make unsafe ptr

So far there's only one such case: the one shown, an explicit constructor call.

When calling a StructType, the field types are not present in typical user
code, errors are fairly likely (and potentially very hard to debug), and so we
don't want those forceful coercions by default. We'll let the user do them, but
only by explicitly asking for them:

    // given struct Token { pid_t p; char c; };
    Token(pid, 0xf0c3)   // TypeError, don't silently throw away bits here
    Token(pid, ctypes.char(0xf0c3))  // ok

    // given struct Msg { time_t t; Event *e; };
    Msg(timestamp, 0xff3c7892) // TypeError, don't implicitly make unsafe ptr
    Msg(timestamp, Event.ptr(0xff3c7892)) // ok

In StructType::ConstructInternal:
>+  // add getters/setters for the fields

As mentioned above, these should be inherited from a prototype object. Same
goes for the methods.

In StructType::GetFieldInfo:
>+  for (i = 0; i < fields->Length(); ++i) {
>+    if (fields->ElementAt(i).mName.Equals(name))
>+      break;
>+  }
>+
>+  if (i == fields->Length())
>+    return NULL;
>+  return &fields->ElementAt(i);

Style nit: you could early return when a match is found and save a couple lines
of code.

In StructType::FieldGetter:
>+  FieldInfo* field = LookupField(cx, typeObj, idval);
>+  JS_ASSERT(field);

I don't think we can assert this. obj can really be any arbitrary object
(thanks, assignable __proto__ and watchpoints!), even a CData object of a
different struct type.

Same thing in FieldSetter.

In CData::Create:
>+// Create a new CData object of type 'typeObj' containing binary data supplied
>+// in 'source', optionally with a referent CData object 'baseObj'. The following
>+// semantics apply:

Nit: The words "The following semantics apply:" can just be deleted, I think.

>+// * 'typeObj' must be a CType of defined (but possibly zero) size.
>+// * If a CData object 'parentObj' is supplied, the new CData object becomes
>+//   dependent on the given parent and its buffer refers to the parent's buffer,
>+//   supplied in 'source'. [...]

Nit: "refers to a slice of the parent's buffer" (or "a subobject of" etc.)

>+  if (!JS_DefineFunctions(cx, dataObj, sCDataFunctions))
>+    return NULL;

As elsewhere, these functions should be inherited from a prototype, not defined
fresh on every CData object.

As with the CType prototypes, the CData prototypes should either all have
[[Class]] "Object", or all have [[Class]] "CData" (you pick).

>+  if (!JS_DefineProperty(cx, dataObj, "value", JSVAL_VOID,
>+         ValueGetter, ValueSetter, JSPROP_ENUMERATE | JSPROP_PERMANENT))
>+    return NULL;

This too can be a SHARED PERMANENT property of CData.prototype.

>+  // attach the buffer. since it might not be 2-byte aligned, we need to
>+  // allocate an aligned space for it and store it there. :(
>+  char** buffer = new char*;
>+  if (!buffer) {
>+    JS_ReportOutOfMemory(cx);
>+    return NULL;
>+  }

Gross! This isn't necessary on any platform I know about. We should use malloc
and overallocate if necessary to ensure proper alignment. (For example, I don't
know if jemalloc returns 2-byte-aligned memory when you just ask for 1 byte. So
ctypes might have to pad 1-byte allocations to 2.)

>+  char* data;
>+  if (baseObj) {
>+    data = static_cast<char*>(source);

Might as well assert here that baseObj actually owns its buffer and that source
points into it.

In CTypes.cpp:
>+/*******************************************************************************
>+** Int64 and UInt64 implementation
>+*******************************************************************************/

This is pretty gross, but it's the JSAPI's fault.  :-P There's plenty of room
in a sealed JSObject for 64 bits of data. The JSAPI just doesn't let you use
it.

In Int64Base::Finalize:
>+  delete static_cast<char*>(JSVAL_TO_PRIVATE(slot));

Nit: It would make more sense to static_cast<PRUint64*> here, to match the
`new PRUint64` in Int64Base::construct.

In Int64Base::ToSource:
>+  if (isUnsigned) {
>+    source.Append("ctypes.UInt64(\"");
>+    source.Append(IntegerToString(GetInt(cx, obj), 10));
>+  } else {
>+    source.Append("ctypes.Int64(\"");
>+    source.Append(IntegerToString(static_cast<PRInt64>(GetInt(cx, obj)), 10));
>+  }
>+  source.Append(')');

The result will be missing a quote mark. This needs a test.

In Int64::Lo:
>+  jsval arg = JS_ARGV(cx, vp)[0];
>+  if (argc != 1 || !JSVAL_IS_OBJECT(arg) || JSVAL_IS_NULL(arg) ||

Check argc before reading from argv[0]. Same in Int64::Hi and the UInt
functions.

In UInt64::Hi:
>+    JS_ReportError(cx, "lo takes one UInt64 argument");

Typo: "hi" not "lo".

In UInt64::Join:
>+  // Get Int64.prototype from the function's reserved slot.

Typo: "Get UInt64.prototype..."

In CTypes.h:
>+class CType {
>+public:
>+  static JSObject* Create(JSContext* cx, JSObject* proto, JSString* name, TypeCode type, jsval size, jsval align, ffi_type* ffiType);
>+  static JSObject* DefineBuiltin(JSContext* cx, JSObject* parent, const char* propName, JSObject* proto, const char* name, TypeCode type, jsval size, jsval align, ffi_type* ffiType);
    ...

All these vacant, uninstantiated classes seem a little... odd? This could be a
namespace. Another nice possibility would be to hide all this stuff in
CTypes.cpp and expose only functions that are intended as APIs.

>+  static JSBool ConstructAbstract(JSContext* cx, JSObject* obj, uintN argc, jsval* argv, jsval* rval);
>+  static JSBool ConstructData(JSContext* cx, JSObject* obj, uintN argc, jsval* argv, jsval* rval);
>+  static JSBool ConstructBasic(JSContext* cx, JSObject* obj, uintN argc, jsval* argv, jsval* rval);

In particular, I don't think functions that only make sense as JSAPI callbacks
should be exposed via CTypes.h. These can be plain old static functions in CTypes.cpp.

In Function.cpp:
> namespace mozilla {
> namespace ctypes {

Style micro-nit: I think it makes more sense to say

  using namespace mozilla::ctypes;
  using mozilla::ctypes;

and then put ctypes:: in front of definitions that need it. This is for a
couple reasons; one is that if by accident your definition doesn't match the
type/signature of the declaration, a definition with ctypes:: will be a
compiler error.

What you've written is fine with me though.

In PrepareType:
>+    typeObj = PointerType::CreateInternal(aContext, NULL, baseType, NULL);
>+    if (!typeObj) {
>+      JS_ReportError(aContext, "couldn't create pointer type from array");
>+      return false;
>+    }

PointerType::CreateInternal has already reported an error; it's a bug to call
JS_ReportError again.

In PrepareType:
>+  // libffi cannot pass types of zero size by value.
>+  JS_ASSERT(CType::GetSize(aContext, typeObj) != 0);

Can't this happen with empty structs or arrays? I think you must check, not
just assert. Same thing in PrepareResultType.

In Function.h, ~AutoValue():
>+    delete static_cast<char*>(mData);

delete[]

In Module.cpp:
> bool
> Module::Init(JSContext* cx, JSObject* aGlobal)

JSBool here (as in other places).

In typedefs.h:
>+// MSVC doesn't have ssize_t. Help it along a little.
>+#ifndef _MSC_VER
>+#define CTYPES_SSIZE_T ssize_t
>+#else
>+#define CTYPES_SSIZE_T intptr_t
>+#endif

Can we instead feature-test for this in the configury? I think you can just add
a line in configure.in that says `AC_TYPE_SSIZE_T` (right next to the one that
says `AC_TYPE_SIZE_T`).

>+// Some #defines to make handling of types whose length varies by platform
>+// easier. These could be implemented as configure tests, but the expressions
>+// are all statically resolvable so there's no need.
>+#define CTYPES_FFI_BOOL      (sizeof(bool)      == 1 ? ffi_type_uint8  : ffi_type_uint32)
>+#define CTYPES_FFI_LONG      (sizeof(long)      == 4 ? ffi_type_sint32 : ffi_type_sint64)
>+...

Maybe add a comment mentioning that the appropriate PR_STATIC_ASSERTs are in
CTypes.cpp (and why it would be wrong to put them here--obvious once you see
how this header is used, of course).

>+DEFINE_INT_TYPE        (unsigned_int,       unsigned int,       ffi_type_uint32)
>+DEFINE_INT_TYPE        (unsigned,           unsigned,           ffi_type_uint32)

unsigned should just be an alias for unsigned_int, not a separate CType (they
are the same type in C, not just compatible).
Depends on: 541212
Awesome review, jorendorff!

/be
Indeed. Most awesome review ever! Thanks for the attention -- the scope from "broad comments" to "you missed a closing paren" is somewhat epic.

I'm about halfway through addressing the comments. Want to get a new patch out ASAP, since there's talk of going alpha RSN.
Blocks: 513778
OK! New patch coming. Addressed all your comments, except as noted below. I'm gonna post a new patch, and an interdiff patch. You probably want to read the interdiff; it's big, but there's a lot of context overhead, so it's really not that scary.

This patch also removes the MSVC lame-duck "I can't return structs by value" bits and adds relevant tests; depends on bug 538216.

> A few very broad comments about this patch:
> 
> * Generally instead of defining properties and methods fresh on every object,
>   we should use prototypes. I should have said so in the spec.

Awesome! This simplified a bunch of code.

>   CData objects should have a prototype hierarchy similar to what CType objects
>   have. Like this:
> 
>     var struct_tm = ctypes.StructType(...);
>     var t = struct_tm({...});
>     t.__proto__ === struct_tm.prototype
>       ===> true   // common ancestor of all CData objects of type struct_tm
>     t.__proto__.__proto__  // common ancestor of all CData objects of struct
> type
>       ===> {addressOfField: function ...}
>     t.__proto__.__proto__.__proto__ === ctypes.CData.prototype
>       ===> true   // common ancestor of all CData objects
> 
>     t.hasOwnProperty("address")
>       ===> false
>     ctypes.CData.prototype.hasOwnProperty("address")
>       ===> true

Understood. I'll split this out into a separate patch, lest this one grow more enormous.

> * I think it would be better if the methods of CData objects were
>   non-enumerable.  That way you could use for-in to enumerate over structs and
>   arrays.  (Normal arrays we discourage people from enumerating, but ctypes
>   arrays' prototypes are all sealed, or that's the intent I think. So it should
>   be OK.)

Done. But -- for arrays, we cheat with the JSClass.{get,set}Property hooks, so there exist no properties to enumerate. :(

> * It looks like we don't seal the globals. I'm not sure how much XPConnect sets
>   up for us. Probably it at least creates a separate Object.prototype for the
>   module. We might want to seal that, for good measure. Worth a look.

OK. I'll look at this in a followup.

> * The error messages when type conversion fails are pretty weak! I expect users
>   to bump into these pretty often, so please do another pass over them and see
>   if there's any helpful information you can add. A decent type error is like,
> 
>     can't pass the string "xyzzy" to argument 2 of hypot(double, double)

Yeah, totally agree. I plan to have a followup that improves these a bunch, if you don't mind that.

> * Generally the code that munges strings should use 16-bit characters. For
>   example, all the BuildTypeName and BuildTypeSource stuff where we use
>   user-provided JS strings, like the names of fields, should use 16-bit strings
>   to avoid garbling those names. This is not a huge deal but now that we all
>   live in the future, it seems like we should get past the ASCII-only thing.
>   This applies most places where the code uses nsCAutoString.

Done. Things got a bit ugly here, though, as they are wont to do with 16-bit strings. :/

Two things:
1) All this nsAutoString/NS_LITERAL_STRING grossness will have to be changed soon anyway, if this is going jsengine-like.
2) I switched most of the JS_GetString* uses over to JS_GetStringChars, which kinda mallocs and fails to report failure. But whatever. We can fix this by just using JSString::{chars,length} later; in most (all?) places it should be safe.

> >+static JSObject*
> >+InitSpecialType(JSContext* cx,
> >+                JSObject* parent,
> >+                JSObject* CTypeProto,
> >+                JSFunctionSpec spec)
> 
> Micro-nit: Either all four of {C,Pointer,Array,Struct}Type.prototype should
> have [[Class]] "CType" or all four should have [[Class]] "Object". Your
> call. The convention in JavaScript is like so:
> 
>   js> Object.prototype.toString.call(Date.prototype)
>   "[object Date]"
>   js> Object.prototype.toString.call(Array.prototype)
>   "[object Array]"
>   js> Object.prototype.toString.call(Function.prototype)
>   "[object Function]"
> 
> ...which would suggest [[Class]] "CType", but XPConnect makes no pretense of
> sticking to this convention, so feel free to do what's easiest.

OK. I gave the type constructors a JSClass of sCTypeProto, same as ctypes.CType.prototype, which has a JSClass.name of "CType".

  Object.prototype.toString.call(ctypes.StructType.prototype)
    ===> [object CType]
  ctypes.StructType.prototype instanceof ctypes.CType
    ===> true
  ctypes.StructType("foo", []) instanceof ctypes.StructType
    ===> true
  ctypes.StructType("foo", []) instanceof ctypes.CType
    ===> true
  ctypes.int32_t instanceof ctypes.CType
    ===> true

Using 'instanceof' on ctypes.CType.prototype results in an 'invalid operand' TypeError. Is this desired?

> In jsvalToFloat: The spec says (and your patch agrees) that
> ImplicitConvert(UInt64(x), double) is OK, even though it may lose bits. I think
> we did this for convenience, since e.g. even on a 32-bit platform size_t values
> are UInt64. But really the only consistent thing is to throw a TypeError here,
> just like all the other cases where we detect a potentially lossy conversion,
> and make the user convert explicitly. Can we change it?

Done. But, I'm not sure I agree. In jsvalToInteger, we convert from CData integers that are always exactly representable. But we convert from jsint, jsdouble, and UInt64/Int64's if the value happens to fit. So we're treating Int64/UInt64 as a primitive, which sorta makes sense. Should we not do the same when converting to float?

> IntegerToString:
> >+  // Build the string in reverse. We use multiplication and subtraction
> >+  // instead of modulus because that's much faster.
> >+  bool isNegative = !IsUnsigned<IntegerType>() && i < 0;
> >+  size_t sign = isNegative ? -1 : 1;
> >+  do {
> >+    IntegerType ii = i / IntegerType(radix);
> >+    size_t index = sign * size_t(i - ii * IntegerType(radix));
> >+    *--cp = "0123456789abcdefghijklmnopqrstuvwxyz"[index];
> >+    i = ii;
> >+  } while (i != 0);
> 
> You can eliminate another integer multiplication here by flipping the
> sign first before entering the loop.

Not so! If 'i' is the minimum signed integer (i.e. -2147483648 for int32_t), negating it is a nop. The alternative is to upconvert to PRUint64 before entering the loop, which I can do if you'd like. But then we're doing 64-bit multiplies for 32, 16, and 8 bit integer types.

> In ConvertToJS:
> >+    // Promote the unsigned 1-byte character type to a jschar, regardless of
> >+    // encoding, and convert to a 1-character string.
> >+    // TODO: Check IsASCII(data)?
> 
> Hmm. I offer two alternatives here and ask you to pick:
> 
>  1. Accept values 0..127, reflecting them as one-character JS strings; and
>     reject everything else with a TypeError.
> 
>  2. Make char, signed_char, and unsigned_char integer types; that is, reflect
>     them as JS numbers.
> 
> These may both sound too strict, but I expect converting to and from char to be
> really rare in practice. I prefer #2 here, because then ConvertToJS always
> works on chars, and people who want to treat chars as text are nudged toward
> the readString API.

I like 2). Do we still want jschar to become a 1-char string? It might be nice to have that become an int, too, but I don't have a strong opinion.

If we ever do wchar_t we'll want to have that become an int, too.

> In ConvertToJS:
> >+  case TYPE_pointer: {
> >+  case TYPE_array: {
> >+  case TYPE_struct: {
> This seems a little repetitive. Refactor? Just a thought.

Brilliant suggestion about moving properties to the prototypes with JSPROP_SHARED - all this disappeared. :)

> >+        jsval prop;
> >+        if (!JS_GetProperty(cx, obj, name, &prop))
> >+          return false;
> 
> I think you need a JSAutoTempValueRooter for prop here.

Done, but - what are the rules here? JS_GetProperty is used all over the place in this patch, and I didn't think a root was needed for this use. :/

> In ExplicitConvert:
> >+#define DEFINE_WRAPPED_INT_TYPE(name, type, ffiType)                           \
> >+  case TYPE_##name: {                                                          \
> >+    /* Convert numeric values with a C-style cast, and */                      \
> >+    /* allow conversion from a base-10 or base-16 string. */                   \
> >+    type result;                                                               \
> >+    if (!jsvalToIntegerExplicit(cx, val, &result) &&                           \
> >+        (!JSVAL_IS_STRING(val) ||                                              \
> >+         !StringToInteger(cx, JSVAL_TO_STRING(val), &result)))                 \
> 
> ...Huh. It's good to be able to convert strings to integer types just by doing
> ctypes.size_t("0xffff"), by analogy to Number("0xffff") ...but why do we limit
> it to wrapped int types?
> 
> (The spec actually goes so far crazy as to specify only hex strings with "0x",
> such that
> 
>     ctypes.size_t("-0x1")   // ok, size_t(-1)
>     ctypes.size_t("-1")     // error
> 
> which seems pretty clearly around the bend to me. Can we just make all the int
> types behave this way?)

Sounds good. Done!

> In BuildDataSource:
> A simpler way to implement this would be to use ConvertToJS and then call
> JS_ValueToSource on the result. But if you want to keep this code, a few nits:

Ah, cute! I'll leave the code here for now and follow up on this.

> In CType::TypesEqual:
> >+  case TYPE_array: {
> >+    // Compare length, then base types.
> >+    // An undefined length array matches any length.
> 
> This strikes me as too tricky. As far as I can tell there's no place in the
> code where we need undefined-length array types to match other types in this
> way, so let's delete this feature.

Not so! Consider:

  let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t, ctypes.size_t.array().ptr);
  let array = ctypes.size_t.array(3)();
  fn(array.address());

which fails without that hack. Now, the '.ptr' and '.address()' aren't necessary, because ImplicitConvert has a special case for treating arrays as pointers when used in a function argument, just like C. So this is really the same as

  let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t, ctypes.size_t.array());
  let array = ctypes.size_t.array(3)();
  fn(array);

which works fine without the hack. So I'm fine with removing it, if you agree that people shouldn't be doing '.ptr' on array types for function args.

Not sure if there are any other cases where we'd want this. I can't think of any.

> In StructType::ConstructData:
> >+  if (argc != 0 && argc != fields->Length()) {
> >+    JS_ReportError(cx, "constructor takes zero or %u arguments", fields->Length());
> >+    return JS_FALSE;
> >+  }
> 
> The spec says that for any CType object T, T(x) essentially does
> ExplicitConvert(x, T). I think this is a good feature. This should work:
> 
>   Pt = StructType("Pt", [{x: ctypes.int, y: ctypes.int}]);
>   p = Pt({x: 12, y: -1})
> 
> Now I also like the shorthand `p = Pt(12, -1)`, even though the spec hasn't
> caught up with you there yet. Unfortunately these conflict when a struct has
> only one field:
> 
>   Vat = StructType("Vat", [{capacity: Measure}]);
> 
>   Vat({capacity: 22.7})
>   Vat(22.7)
>   Vat(Measure(22.7))
> 
> It gets worse if Measure is also a struct with a single field.
> 
> To resolve this, let's add this to the specification of t(val):
> 
>     If t is a struct type with exactly one field, and val is a primitive value
>     or a CData object, then this is shorthand; ImplicitConvert val to the type
>     of the field and initialize the new CData object that way.
> 
> Otherwise, we'll call ExplicitConvert(val, t) as normal. The rationale for this
> special rule is that we will always call ExplicitConvert if it has a chance to
> succeed, but in this special case it's only going to fail, and there is another
> possible interpretation of the argument.

Done, but with a twist. I think the two possibilities (StructType with a single field, ImplicitConverting to that type; vs. ExplicitConverting a { ... }) are mutually exclusive, so there's no ambiguity. I *think*. But there are more conditions where we'd want the former to succeed:

0) the StructType in question must have 1 field; and the arg must be:
1) a CData of different type to the StructType in question; or
2) a primitive; or
3) an Int64 or UInt64; or
4) a JS array object; or
5) the singular field of the StructType must be a different StructType, and the arg is a regular JS object, containing fields that match the latter StructType.

Assuming we want the above: then rather than list all these conditions out, which is obviously fragile (we'd have to change it every time we change ImplicitConvert), I just kept it simple and said "let's ExplicitConvert first, and if that fails, and we have a single arg, ImplicitConvert it". This involves some messing with exceptions, but hopefully that's OK. (We do that in ExplicitConvert itself, anyway.)

(Also note that ExplicitConvert === ImplicitConvert for StructTypes, but that's not terribly relevant.)

I'll update the spec with whatever we decide to do here.

> in CData::Create:
> As with the CType prototypes, the CData prototypes should either all have
> [[Class]] "Object", or all have [[Class]] "CData" (you pick).

Right now they're all [[Object]], I think, as defined in CType::Create when we create the 'prototype' property. But I'll change them to be [[CDataProto]] or somesuch when I do the patch to fix up the CData prototype hierarchy.

> in CData::Create:
> >+  // attach the buffer. since it might not be 2-byte aligned, we need to
> >+  // allocate an aligned space for it and store it there. :(
> >+  char** buffer = new char*;
> >+  if (!buffer) {
> >+    JS_ReportOutOfMemory(cx);
> >+    return NULL;
> >+  }
> 
> Gross! This isn't necessary on any platform I know about. We should use malloc
> and overallocate if necessary to ensure proper alignment. (For example, I don't
> know if jemalloc returns 2-byte-aligned memory when you just ask for 1 byte. So
> ctypes might have to pad 1-byte allocations to 2.)

Yeah. :( But the problem isn't malloced buffers - it's when a CData buffer points into the middle of some other CData referent. The buffer could have arbitrary alignment, and PRIVATE_TO_JSVAL requires 2-byte minimum. Right?

> >+  char* data;
> >+  if (baseObj) {
> >+    data = static_cast<char*>(source);
> 
> Might as well assert here that baseObj actually owns its buffer and that source
> points into it.

But 'baseObj' doesn't have to own its buffer - could have a chain of referents (e.g. nested structs). And 'source' might not point into baseObj's buffer, either. Consider:

  let g_t = ctypes.StructType(...);
  let g = g_t(); // let's call the struct buffer g_buf.
  let g_a = g.address(); // creates a new CData object - note that it doesn't have 'g' as its referent, because PointerType CData's don't do that. the pointer buffer is 'ptr_buf = &g_buf'.
  let g_a_c = g_a.contents; // new CData object with 'g_a' as its referent, such that changes to 'g_a_c' are really changing 'g' itself. the buffer is '*ptr_buf', unrelated to 'ptr_buf' in its location.

Now, if myData.address() doesn't have myData as a referent, then maybe myData.address().contents shouldn't have myData.address() as a referent, either. (Who cares if the pointer stays alive? It won't affect myData.) So I'm open to suggestions here. (We could always decide to make .address() have a referent, in which case we could assert something meaningful as you suggest.)

> In CTypes.h:
> >+class CType {
> >+public:
> >+  static JSObject* Create(JSContext* cx, JSObject* proto, JSString* name, TypeCode type, jsval size, jsval align, ffi_type* ffiType);
> >+  static JSObject* DefineBuiltin(JSContext* cx, JSObject* parent, const char* propName, JSObject* proto, const char* name, TypeCode type, jsval size, jsval align, ffi_type* ffiType);
>     ...
> 
> All these vacant, uninstantiated classes seem a little... odd? This could be a
> namespace. Another nice possibility would be to hide all this stuff in
> CTypes.cpp and expose only functions that are intended as APIs.

Yeah :( The only purpose of the classes is to make it immediately obvious what kind of JSObject a given function operates on, but I guess a namespace would accomplish this just as well.

I'll play around with this and split it out into a separate patch. No point in inflating patch sizes with trivial changes.

> >+  static JSBool ConstructAbstract(JSContext* cx, JSObject* obj, uintN argc, jsval* argv, jsval* rval);
> >+  static JSBool ConstructData(JSContext* cx, JSObject* obj, uintN argc, jsval* argv, jsval* rval);
> >+  static JSBool ConstructBasic(JSContext* cx, JSObject* obj, uintN argc, jsval* argv, jsval* rval);
> 
> In particular, I don't think functions that only make sense as JSAPI callbacks
> should be exposed via CTypes.h. These can be plain old static functions in
> CTypes.cpp.

Likewise.

> In PrepareType:
> >+  // libffi cannot pass types of zero size by value.
> >+  JS_ASSERT(CType::GetSize(aContext, typeObj) != 0);
> 
> Can't this happen with empty structs or arrays? I think you must check, not
> just assert. Same thing in PrepareResultType.

Nope, because:

  a) structs with no fields have a size of 1 byte (just like C).
  b) structs cannot have fields of zero size (C allows this, but I decided to make it illegal. We can change this if you want. Requires some hoopjumping because libffi doesn't like zero-sized types.)
  c) arrays are converted to pointers a few lines above.

For PrepareResultType, same thing, except c) arrays are rejected a few lines above.

> In AraryType::Getter and Setter:
> >+  if (!CData::IsCData(cx, obj))
> >+    return JS_TRUE;
> 
> Most other places we report an error if `this` isn't the expected type. Why is
> it different here? It's not immediately obvious how this can even happen.  (I
> think it can only happen in bizarre corner cases involving watchpoints.)

We could assert and save some code. mrbkap agrees that it should only happen with watchpoints. For now, I added a JS_ReportError and a comment.

For most of the other getters & setters, they can be called on prototype objects, so they need to check. I'm not sure whether convention is to return JS_TRUE (-> 'undefined') or to throw here.

> In StructType::Create:
> >+      size_t delta = padding + CType::GetSize(cx, info->mType);
> >+      size_t oldSize = structSize;
> >+      structSize = structSize + delta;
> >+      if (structSize - delta != oldSize) {
> >+        JS_ReportError(cx, "size overflow");
> >+        return JS_FALSE;
> >+      }
> 
> This overflow check doesn't look right to me. It seems like it'll never emit
> the error. A test would help convince me otherwise.

Good catch. :( Fixed, but unfortunately it's hard to test. The only way to get massive type sizes is via ArrayType or StructType, which amount to the same thing - just creating the type will allocate a massive elements array for the ffi_type. Some architectures require this. We can fix it, but it will require a patch to libffi. Followup material. In the meantime, if you want, we can test it by progressively creating arrays of arrays to get a big enough one, but it won't be pretty.
Attached patch patch v3Splinter Review
Full patch.
Attachment #422437 - Attachment is obsolete: true
Attachment #422437 - Flags: review?(jorendorff)
Attached patch patch v2-v3 interdiff (obsolete) — Splinter Review
Interdiff patch.
Attachment #423873 - Flags: review?(jorendorff)
Forgot to update a comment.
Attachment #423873 - Attachment is obsolete: true
Attachment #423893 - Flags: review?(jorendorff)
Attachment #423873 - Flags: review?(jorendorff)
This fixes some rooting bugs I introduced in the recent patch.
Attachment #424109 - Flags: review?(jorendorff)
> > In StructType::Create:
> > >+      size_t delta = padding + CType::GetSize(cx, info->mType);
> > >+      size_t oldSize = structSize;
> > >+      structSize = structSize + delta;
> > >+      if (structSize - delta != oldSize) {
> > >+        JS_ReportError(cx, "size overflow");
> > >+        return JS_FALSE;
> > >+      }
> > 
> > This overflow check doesn't look right to me. It seems like it'll never emit
> > the error. A test would help convince me otherwise.
> 
> Good catch. :( Fixed, but unfortunately it's hard to test.

Recaffeinated and attacked. Turned out simple enough! This also tweaks the alignment arithmetic to be faster, and fixes an x86_64 overflow test failure. (Whoops. That test was on crack.)
Attachment #424110 - Flags: review?(jorendorff)
Working on CData proto hierarchy. Question about one of the details. (Tried to find precedent to go off, but none leapt out at me.)

So we want ctypes.CData === ctypes.CType.prototype. Now, we want ctypes.CType to be [[Function]], ctypes.CType.prototype to be [[CType]], thus it follows that we want ctypes.CData to be [[Function]] and ctypes.CData.prototype to be [[CData]].

There's a contradiction here. What should we do?

a) Make CType.prototype, and thus ctypes.CData, be [[CType]]. (Side note: this would require a custom JSClass.hasInstance hook on the CType.prototype class to make 'instanceof' work; [[Function]] gets it by default.)

b) Make them be [[Function]].

c) ???

I think I'm going to do a). This will make ctypes.CData look a bit funky, but it's only an abstract constructor that throws, anyway - not all that interesting. Having ctypes.CType.prototype and ctypes.CData.prototype have sensible classes seems more important to me. Other ideas welcome.
What's the rationale for the ctypes.CData === ctypes.CType.prototype requirement?

/be
As I understand it, to take a simple case, we want the following:

  ctypes.CType ===> abstract constructor that throws
  ctypes.CType.__proto__ === Function.prototype
  ctypes.CType.prototype class [[CType]]
  ctypes.CType.prototype.__proto__ === Function.prototype
  ctypes.CType.prototype.constructor === ctypes.CType
  ctypes.int32_t.__proto__ === ctypes.CType.prototype
  ctypes.int32_t.prototype class [[CData]]
  ctypes.int32_t.prototype.__proto__ === ctypes.CData.prototype
  ctypes.int32_t.prototype.constructor === ctypes.int32_t

such that

  ctypes.int32_t().__proto__ === ctypes.int32_t.prototype
  ctypes.int32_t().__proto__.__proto__ === ctypes.ctypes.CData.prototype

so in fact there's no need for

  ctypes.int32_t.__proto__.prototype === ctypes.CData.prototype
  ctypes.int32_t.__proto__.prototype.constructor === ctypes.CData

to hold true, since ctypes.int32_t has its very own 'prototype' property. Therefore we could simply have

  ctypes.CData ===> abstract constructor that throws
  ctypes.CData.__proto__ === ctypes.CType.prototype
  ctypes.CData.prototype class [[CData]]

and there's no need for object equality between ctypes.CType.prototype and ctypes.CData, and 'ctypes.CData instanceof ctypes.CType' will hold true, for whatever usefulness that holds.

I will still need to implement the JSClass.hasInstance hook on [[CType]] and [[CData]]. ctypes.CType, ctypes.CData, and the type constructors ctypes.{Struct,Pointer,Array}Type are exempt from this, since they're already [[Function]].

I hope jorendorff agrees!
Attached patch patch v3 part 4 - fix protos (obsolete) — Splinter Review
Implements what I said above, and gives us three levels of __proto__ for CData's coming from the special types, as suggested. Also tweaks parenting across the board so that everything is parented to 'ctypes'. (This seems to be convention elsewhere.)

About half of this patch is tests.

I didn't do anything fancy like ctypes.CType.prototype.prototype, or ctypes.StructType.prototype.prototype, but if you think there's value in it then I could scrounge something together in a followup followup.

For great justice!
Attachment #425080 - Flags: review?(jorendorff)
Seals Object, Function, Array and their prototypes; and then seals the global object. Sealing it recursively breaks xpconnect, but a shallow seal seems to work fine - this is purely an empirical observation, however. If you'd rather I didn't, we can always JS_SetPropertyAttributes to make Object, Function, and Array JSPROP_READONLY | JSPROP_PERMANENT. (As it stands, they're not.)

I toyed with the idea of creating a new global object, swapping it into the JSContext, setting up all the ctypes stuff, and recursively sealing the whole thing - then swapping the old global back in. I didn't try this; it struck me as a bit fantastical.
Attachment #425126 - Flags: review?(jorendorff)
> > >+        jsval prop;
> > >+        if (!JS_GetProperty(cx, obj, name, &prop))
> > >+          return false;
> > 
> > I think you need a JSAutoTempValueRooter for prop here.
> 
> Done, but - what are the rules here? JS_GetProperty is used all over the place
> in this patch, and I didn't think a root was needed for this use. :/

It's not the call to JS_GetProperty itself that requires a root but the
subsequent use of prop. JS_GetProperty does not promise that the value it
stores in *vp is rooted anywhere.

Offhand, I would write:

        JSAutoTempValueRooter prop(cx);
        if (!JS_GetUCProperty(cx, obj, name, namelen, prop.addr()))
          return false;

        // Convert the field via ImplicitConvert().
        char* fieldData = intermediate + field->mOffset;
        if (!ImplicitConvert(cx, prop.value(), field->mType, fieldData, false, NULL))
          return false;
> > * I think it would be better if the methods of CData objects were
> >   non-enumerable.  [...]
>
> Done. But -- for arrays, we cheat with the JSClass.{get,set}Property hooks, so
> there exist no properties to enumerate. :(

Rats. :(  Well, if anyone asks for it, we can add a newEnumerate hook. Later.

> > * The error messages when type conversion fails are pretty weak! [...]
>
> Yeah, totally agree. I plan to have a followup that improves these a bunch, if
> you don't mind that.

Great. Is the bug already filed?

> Using 'instanceof' on ctypes.CType.prototype results in an 'invalid operand'
> TypeError. Is this desired?

As you mentioned later, you could fix it by making the CData class and
sCTypeProto implement hasInstance. Not a priority. If you're interested,
I can file the followup bug.

> > You can eliminate another integer multiplication here by flipping the
> > sign first before entering the loop.
> 
> Not so! If 'i' is the minimum signed integer (i.e. -2147483648 for int32_t),
> negating it is a nop.

Oops! You're right of course.

> >  2. Make char, signed_char, and unsigned_char integer types; that is, reflect
> >     them as JS numbers.
> > [...]
> 
> I like 2). Do we still want jschar to become a 1-char string? It might be nice
> to have that become an int, too, but I don't have a strong opinion.

Changed the spec. I prefer for jschar to become a 1-char string.

> If we ever do wchar_t we'll want to have that become an int, too.

Agreed.

> > Gross! [...]
> 
> Yeah. :( But [...] Right?

Right. :(

> > >+  char* data;
> > >+  if (baseObj) {
> > >+    data = static_cast<char*>(source);
> > 
> > Might as well assert here that baseObj actually owns its buffer and that source
> > points into it.
> 
> But 'baseObj' doesn't have to own its buffer - could have a chain of referents
> (e.g. nested structs). And 'source' might not point into baseObj's buffer,
> either. Consider:

We discussed on IRC and it turns out that the baseObj paramter wasn't what I
thought. Filed bug 550982.

> For most of the other getters & setters, they can be called on prototype
> objects, so they need to check. I'm not sure whether convention is to return
> JS_TRUE (-> 'undefined') or to throw here.

The convention is to throw.
> > In jsvalToFloat: The spec says (and your patch agrees) that
> > ImplicitConvert(UInt64(x), double) is OK, even though it may lose bits. I think
> > we did this for convenience, since e.g. even on a 32-bit platform size_t values
> > are UInt64. But really the only consistent thing is to throw a TypeError here,
> > just like all the other cases where we detect a potentially lossy conversion,
> > and make the user convert explicitly. Can we change it?
> 
> Done. But, I'm not sure I agree. In jsvalToInteger, we convert from CData
> integers that are always exactly representable. But we convert from jsint,
> jsdouble, and UInt64/Int64's if the value happens to fit. So we're treating
> Int64/UInt64 as a primitive, which sorta makes sense. Should we not do the same
> when converting to float?

You're right, this is inconsistent. However, there is another design principle
in play there, which is that values should round-trip from C++ to JS and back
without errors or loss of data:

  size_t v = 77;                            // C++ variable
  ImplicitConvert(ConvertToJS(v), size_t)   // should return v

So the silliness of making a size_t into a UInt64 on 32-bit platforms begets
the silliness of accepting UInt64 values as 32-bit integers. Not ideal, but I
don't think we want to fix it (which we could do e.g. by introducing separate
ctypes.{IntPtr,UIntPtr} objects).

> > In CType::TypesEqual:
> > >+  case TYPE_array: {
> > >+    // Compare length, then base types.
> > >+    // An undefined length array matches any length.
> > 
> > This strikes me as too tricky. As far as I can tell there's no place in the
> > code where we need undefined-length array types to match other types in this
> > way, so let's delete this feature.
> 
> Not so! Consider:
> 
>   let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t,
> ctypes.size_t.array().ptr);
>   let array = ctypes.size_t.array(3)();
>   fn(array.address());
> 
> which fails without that hack. Now, the '.ptr' and '.address()' aren't
> necessary, because ImplicitConvert has a special case for treating arrays as
> pointers when used in a function argument, just like C. So this is really the
> same as
>
>   let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t,
> ctypes.size_t.array());
>   let array = ctypes.size_t.array(3)();
>   fn(array);
> 
> which works fine without the hack. So I'm fine with removing it, if you agree
> that people shouldn't be doing '.ptr' on array types for function args.
> 
> Not sure if there are any other cases where we'd want this. I can't think of
> any.

G++ at least doesn't allow this. You can't even declare such functions.

    #include <stdlib.h>
    typedef size_t sizet_arr[];
    int f(sizet_arr *h);         // line 3
    void g() {
	size_t data[3];
	sizet_arr *p = &data;    // line 6
	f(&data);                // line 7
    }

I get these errors with GCC 4.4.1:

    wtfarray.cpp:3: error: parameter ‘h’ includes pointer to array of unknown
                           bound ‘size_t []’
    wtfarray.cpp: In function ‘void g()’:
    wtfarray.cpp:6: error: cannot convert ‘size_t (*)[3]’ to ‘size_t (*)[]’
                           in initialization
    wtfarray.cpp:7: error: cannot convert ‘size_t (*)[3]’ to ‘size_t (*)[]’
                           for argument ‘1’ to ‘int f(size_t (*)[])’

Even trying to declare the function extern "C" doesn't change this. So I say we
get rid of the hack and shoot down such uses. I really want SameType be an
equivalence relation, for sanity's sake.
Blocks: 551057
(In reply to comment #31)
> > > * The error messages when type conversion fails are pretty weak! [...]
> >
> > Yeah, totally agree. I plan to have a followup that improves these a bunch, if
> > you don't mind that.
> 
> Great. Is the bug already filed?

Filed as bug 551057.

> > Using 'instanceof' on ctypes.CType.prototype results in an 'invalid operand'
> > TypeError. Is this desired?
> 
> As you mentioned later, you could fix it by making the CData class and
> sCTypeProto implement hasInstance. Not a priority. If you're interested,
> I can file the followup bug.

Fixed in attachment 425080 [details] [diff] [review] by way of CType::HasInstance, I believe.

(In reply to comment #32)
> > > In CType::TypesEqual:
> > > >+  case TYPE_array: {
> > > >+    // Compare length, then base types.
> > > >+    // An undefined length array matches any length.
> > > 
> > > This strikes me as too tricky. As far as I can tell there's no place in the
> > > code where we need undefined-length array types to match other types in this
> > > way, so let's delete this feature.
> > 
> > Not so! Consider:
> > 
> >   let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t,
> > ctypes.size_t.array().ptr);
> >   let array = ctypes.size_t.array(3)();
> >   fn(array.address());
> > 
> > which fails without that hack. Now, the '.ptr' and '.address()' aren't
> > necessary, because ImplicitConvert has a special case for treating arrays as
> > pointers when used in a function argument, just like C. So this is really the
> > same as
> >
> >   let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t,
> > ctypes.size_t.array());
> >   let array = ctypes.size_t.array(3)();
> >   fn(array);
> > 
> > which works fine without the hack. So I'm fine with removing it, if you agree
> > that people shouldn't be doing '.ptr' on array types for function args.
> > 
> > Not sure if there are any other cases where we'd want this. I can't think of
> > any.
> 
> G++ at least doesn't allow this. You can't even declare such functions.
> 
>     #include <stdlib.h>
>     typedef size_t sizet_arr[];
>     int f(sizet_arr *h);         // line 3
>     void g() {
>     size_t data[3];
>     sizet_arr *p = &data;    // line 6
>     f(&data);                // line 7
>     }
> 
> I get these errors with GCC 4.4.1:
> 
>     wtfarray.cpp:3: error: parameter ‘h’ includes pointer to array of unknown
>                            bound ‘size_t []’
>     wtfarray.cpp: In function ‘void g()’:
>     wtfarray.cpp:6: error: cannot convert ‘size_t (*)[3]’ to ‘size_t (*)[]’
>                            in initialization
>     wtfarray.cpp:7: error: cannot convert ‘size_t (*)[3]’ to ‘size_t (*)[]’
>                            for argument ‘1’ to ‘int f(size_t (*)[])’
> 
> Even trying to declare the function extern "C" doesn't change this. So I say we
> get rid of the hack and shoot down such uses. I really want SameType be an
> equivalence relation, for sanity's sake.

Good point, but I think we have our wires crossed: the code

> >   let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t,
> > ctypes.size_t.array().ptr);
> >   let array = ctypes.size_t.array(3)();
> >   fn(array.address());

was trying to declare the C prototype 'void fn(size_t a[])'. (The two snippets I originally posted were declaring and calling the same C prototype.) So my question was really whether we want the argument of said prototype to be declared as 'ctypes.size_t.array().ptr', or 'ctypes.size_t.array()'. The latter is more natural C, but the former is more strictly correct, given our model of an array() as actually being a chunk of data and never a pointer to a chunk. But, we're trying to match C as closely as possible, so let's do what's natural and fix TypesEqual() like you say. The new patch (v3) has this change.
Fixes JS_GetProperty rooting and friends, with a couple of bonus fixes as well.
Attachment #431250 - Flags: review?(jorendorff)
(In reply to comment #33)
> > >   let fn = library.declare("fn", ctypes.default_abi, ctypes.void_t,
> > > ctypes.size_t.array().ptr);
> > >   let array = ctypes.size_t.array(3)();
> > >   fn(array.address());
> 
> was trying to declare the C prototype 'void fn(size_t a[])'. (The two snippets
> I originally posted were declaring and calling the same C prototype.) So my
> question was really whether we want the argument of said prototype to be
> declared as 'ctypes.size_t.array().ptr', or 'ctypes.size_t.array()'. The latter
> is more natural C, but the former is more strictly correct, given our model of
> an array() as actually being a chunk of data and never a pointer to a chunk.

Oh - sorry, I didn't get that at all. Yes, following C/C++ here is the right thing.

In C/C++, as far as I can tell, the declaration
  int f(int x[]);
is interpreted as if it had read
  int f(int *x);

In other words the fact that (int x[]) is allowed there, and the way it is handled, are shallow surface things. The actual type of the parameter is int * either way. We should do the same thing--just convert such types to pointer types at an early stage in library.declare().
Rationale: following C/C++ is crucial in areas that affect linkage or calling conventions.

(In other places, like ImplicitConvert, I'm more willing to depart from both C/C++ and JS to pursue other wins -- such as easier debuggability.)
Attached patch mingw fix for patch v3 (obsolete) — Splinter Review
I've tested patch v3 with mingw and it requires some fixes. I've attached a patch. CTypes.cpp changes fix the problem of wchar_t != unsigned short. Also mingw seems to use macro for calloc (although I can't see it in its headers), to it needs #undef calloc before calloc implementation in JSRuntime.
In InitTypeClasses:
>   // Each of these has, respectively:
>   //   * Class [[Function]]
>   //   * __proto__ === Function.prototype
>   //   * A constructor that creates a user-defined type.
>   //   * 'prototype' property:
>-  //     * Class [[Object]]
>+  //     * Class [[CTypeProto]]

Punctuation nit: the ECMAScript magic property this comment refers to is called
[[Class]], and it's a string. So it would be:

   * [[Class]] "Function"
   ...
   * 'prototype' property:
     * [[Class]] "CTypeProto"

In jsvalToBool:
>   if (JSVAL_IS_DOUBLE(val)) {
>     jsdouble d = *JSVAL_TO_DOUBLE(val);
>     *result = d != 0;
>+    // Disallow -0.
>     return d == 1 || d == 0;

s/Disallow/Allow/   :-)

> #define DEFINE_CHAR_TYPE(name, type, ffiType)                                  \
>   case TYPE_##name:                                                            \
>     /* Serialize as an integer. */                                             \
>     result.Append(IntegerToString(*static_cast<type*>(data), 10));             \
>     break;
>+#define DEFINE_JSCHAR_TYPE(x, y, z) DEFINE_CHAR_TYPE(x, y, z)

This should generate the source code for a one-character JavaScript string.
Ideally it would work like JS_ValueToSource (which is ultimately implemented by
js_QuoteString in jsopcode.cpp).

>@@ -2092,17 +2143,17 @@ CType::TypesEqual(JSContext* cx, JSObjec
>   jsval size;
>-  JS_GetProperty(cx, obj, "size", &size);
>+  JS_GetReservedSlot(cx, obj, SLOT_SIZE, &size);

In DEBUG builds, please assert that this succeeds:

    Infallibly(JS_GetReservedSlot(cx, obj, SLOT_SIZE, &size));

where Infallibly is defined like this:

    JS_ALWAYS_INLINE void
    Infallibly(JSBool ok)
    {
	JS_ASSERT(ok);
    }

(Name the inline helper whatever you want - "Infallibly" is maybe too cute.)

Do the same in other places where JS_GetReservedSlot or other effectful
JSAPI functions are called without checking the return value.

In CType::GetProtoFromCtor:
>   // Look at the 'prototype' property of the type constructor.
>   jsval prototype;
>   JS_GetProperty(cx, obj, "prototype", &prototype);

This could get the reserved slot instead.

>   JSObject* proto = JSVAL_TO_OBJECT(prototype);
>   JS_ASSERT(proto);
>   JS_ASSERT(JS_GET_CLASS(cx, proto) == &sCTypeProtoClass);
> 
>   // Get the requested ctypes.{Pointer,Array,Struct}Type.prototype.
>   jsval result;
>   JS_GetReservedSlot(cx, proto, slot, &result);
>   return JSVAL_TO_OBJECT(result);
> }

Judging from the two places where this function is called, I think the slot
argument and this last step are unnecessary. I think proto is already the
desired object (either PointerType.prototype or StructType.prototype).

>+CType::ProtoGetter(JSContext* cx, JSObject* obj, jsval idval, jsval* vp)

PrototypeGetter would be a better name.

In PointerType::CreateInternal:
>+    // Determine the name of the PointerType, since it wasn't supplied.
>+    nsAutoString typeName = BuildTypeName(cx, typeObj);
>+    JSString* nameStr = JS_NewUCStringCopyN(cx,
>+      reinterpret_cast<const jschar*>(typeName.get()), typeName.Length());
>+    if (!nameStr ||
>+        !JS_SetReservedSlot(cx, typeObj, SLOT_NAME, STRING_TO_JSVAL(nameStr)))
>+      return NULL;

Can we compute the name on first use instead? Same question in
ArrayType::CreateInternal, and in StructType::Create regarding
t.fields. These introspection features will probably be used mainly during
development; it would be nice to optimize the cost away once the add-on is
deployed. If you agree, I can file a follow-up bug.

>-  if (!JS_SealObject(cx, typeObj, JS_FALSE))
>-    return NULL;

Why not seal the new type object? Same question for ArrayType::CreateInternal
and StructType::Create. I don't think types have any setters.

>+PointerType::TargetGetter(JSContext* cx, JSObject* obj, jsval idval, jsval* vp)

TargetTypeGetter, please.

In ExtractStructField:
>   jsid id;
>   if (!JS_NextProperty(cx, iter, &id))
>     return false;
>+  if (JSVAL_IS_VOID(id)) {

This check for undefined...

>+    JS_ReportError(cx, "struct field descriptors require a valid name and type");
>+    return false;
>+  }
>   jsval nameVal;
>+  if (!JS_IdToValue(cx, id, &nameVal))
>+    return false;
>+  if (!JSVAL_IS_STRING(nameVal)) {
>+    JS_ReportError(cx, "struct field descriptors require a valid name and type");
>+    return false;
>+  }

...is redundant with the subsequent JSVAL_IS_STRING test. And theoretically
it's bad to pass a jsid to JSVAL_IS_VOID.

>   JSAutoTempValueRooter nameroot(cx, nameVal);

Jeff landed a monster patch splitting this class into several smaller, more
type-safe ones. It breaks your patch, but the fix should be easy.

>+  if (JSVAL_IS_PRIMITIVE(propVal) ||
>+      !CType::IsCType(cx, JSVAL_TO_OBJECT(propVal))) {

We do this so often it seems like a CType::ValueIsCType(JSContext *, jsval)
method would be justified.

In StructType::ConstructData:
>+      if (fields->Length() != 1) {
>+        JS_ReportError(cx, "constructor takes 0, 1, or %u arguments",
>+          fields->Length());
>+        return JS_FALSE;
>+      }

This is the wrong error message for this branch. If we get here, clearing the
pending exception was a mistake; we should have propagated it.

To me, it seems like the ifs could be arranged so as to make the case analysis
a little more obvious.

    if (argc == 0)                   <--- CASE 0
      return true;

    if (argc == 1) {                 <--- CASE 1
      if (ExplicitConvert(...))
	return true;
      if (fields->Length() != 1)
	return false;  // propagate error

      ... (clean up)
      ... (try again, or just fall through)
    }

    if (argc == fields->Length()) {  <--- CASE n
      ... (ImplicitConvert on each field)
    }

    JS_ReportError(...0, 1, or %u...);
    return false;

Your call.

> // * 'typeObj' must be a CType of defined (but possibly zero) size.
> // * If a CData object 'parentObj' is supplied, the new CData object becomes
>+//   dependent on the given parent and its buffer refers to a slice of the
>+//   parent's buffer, supplied in 'source'. 'parentObj' will be held alive by
>+//   the resulting CData object.
> // * If 'parentObj' is null, the new CData object will create a new buffer of
> //   size given by 'typeObj'. If 'source' data is supplied, the data will be
> //   copied from 'source' into the new buffer; otherwise, the entirety of the
> //   new buffer will be initialized to zero.

Please update the comment--there's no 'parentObj' parameter here.

>+  JSString* result = JS_NewUCStringCopyN(cx,
>+    reinterpret_cast<const jschar*>(source.get()), source.Length());

This idiom seems worthy of an inline function--and you might as well
justify the reinterpret_cast by PR_STATIC_ASSERTing that sizeof jschar and
sizeof PRUnichar are both 2.

In Library::Create:
>     library = PR_LoadLibraryWithFlags(libSpec, 0);
>-    if (!library)
>+    if (!library) {
>+      JS_ReportError(cx, "couldn't open library");
>       return NULL;
>+    }

Can we get a useful error message out of NSPR? This is another error that's
going to be very, very common during development.

A follow-up bug is fine here.

>     nsCOMPtr<nsILocalFile> localFile = do_QueryInterface(file);
>-    if (!localFile)
>+    if (!localFile) {
>+      JS_ReportError(cx, "open takes a string or nsIFile argument");

Wrong interface name in error message. Should be nsILocalFile.

>     rv = localFile->Load(&library);
>-    if (NS_FAILED(rv))
>+    if (NS_FAILED(rv)) {
>+      JS_ReportError(cx, "couldn't open library");
>       return NULL;
>+    }

Another error message worth improving soon.

The dependency on nsILocalFile and XPConnect here is worth a closer look before we ship this. Is it safe to call PR_LoadLibraryWithFlags without notifying Gecko of what we're doing? Conversely, can we just JS_GetProperty(cx, aPath, "persistentDescriptor", &v) and thereby avoid the explicit dependency entirely?

Perhaps the js-ctypes entry point should just take a "callbacks" parameter for loading and unloading libraries. That way the exact behavior is up to the embedding (i.e., Gecko) which can then use its own longstanding custom code (i.e. nsLocalFile) without us having to put all that in js/ctypes.

(We should do this before we ship because subtle changes in this behavior after a version ships will be a huge pain for users.)
I'm not happy with our CData-to-CData conversions and I'm not really thrilled with the state of StructType::ConstructData, but we need to move forward. I'll file followup bugs if I find I'm sufficiently unhappy after playing around a bit.

Still looking at the tests.
I really dig all the added test coverage.

>+function run_jschar_tests(library, t, name, limits) {
>+  do_check_eq(t.name, name);
>+  do_check_eq(t.size, 2);
>+
>+  do_check_true(t instanceof ctypes.CType);
>+  do_check_true(t.constructor === ctypes.CType);
>+  do_check_true(t.__proto__ === ctypes.CType.prototype);
>+  do_check_true(t.prototype.constructor === t);
>+
>+  do_check_throws(function() { t.prototype.address(); }, Error);
>+  do_check_throws(function() { t.prototype.readString(); }, Error);
>+  do_check_throws(function() { t.prototype.toString(); }, Error);
>+  do_check_throws(function() { t.prototype.toSource(); }, Error);
>+
>+  do_check_eq(t.toString(), "type " + name);
>+  do_check_eq(t.toSource(), "ctypes." + name);
>+  do_check_true(t.ptr === ctypes.PointerType(t));
>+  do_check_eq(t.array().name, name + "[]");
>+  do_check_eq(t.array(5).name, name + "[5]");
>[...]

I'm not telling you anything you don't already know, but this test could be
factored better. This function is extremely similar to code
elsewhere in the file.

>   do_check_eq(p_t.toSource(),
>     "ctypes.StructType(\"g_t\", [{ \"a\": ctypes.int32_t }, { \"b\": ctypes.double }]).ptr");

Micro-nit: consider single quotes to make this a bit more readable.

     'ctypes.StructType("g_t", [{ "a": ctypes.int32_t }, { "b": ctypes.double }]).ptr'

Same comment in the other places where \" appears in the tests.
Comment on attachment 423893 [details] [diff] [review]
patch v2-v3 interdiff

r=me with the changes requested. Let's try to get this landed.
Attachment #423893 - Flags: review?(jorendorff)
(In reply to comment #38)
> > #define DEFINE_CHAR_TYPE(name, type, ffiType)                                  \
> >   case TYPE_##name:                                                            \
> >     /* Serialize as an integer. */                                             \
> >     result.Append(IntegerToString(*static_cast<type*>(data), 10));             \
> >     break;
> >+#define DEFINE_JSCHAR_TYPE(x, y, z) DEFINE_CHAR_TYPE(x, y, z)
> 
> This should generate the source code for a one-character JavaScript string.
> Ideally it would work like JS_ValueToSource (which is ultimately implemented by
> js_QuoteString in jsopcode.cpp).

Hum. This means the result won't be able to eval() and pass ImplicitConvert. (Unless we have a dequoting algorithm, and the result is a 1-char string?) I'll fix by calling JS_ValueToSource, for now.

> In CType::GetProtoFromCtor:
> >   // Look at the 'prototype' property of the type constructor.
> >   jsval prototype;
> >   JS_GetProperty(cx, obj, "prototype", &prototype);
> 
> This could get the reserved slot instead.

Fixed in attachment 425080 [details] [diff] [review]. But that required storing it on a reserved slot first. So I'll leave it til that lands.

> Judging from the two places where this function is called, I think the slot
> argument and this last step are unnecessary. I think proto is already the
> desired object (either PointerType.prototype or StructType.prototype).

Indeed, nice. :)

> Can we compute the name on first use instead? Same question in
> ArrayType::CreateInternal, and in StructType::Create regarding
> t.fields. These introspection features will probably be used mainly during
> development; it would be nice to optimize the cost away once the add-on is
> deployed. If you agree, I can file a follow-up bug.

Sure, followup. I tried doing that initially, but it led to some complications. I forget exactly what.

> Why not seal the new type object? Same question for ArrayType::CreateInternal
> and StructType::Create. I don't think types have any setters.

Because CType::Create does it now :)

> >+  if (JSVAL_IS_PRIMITIVE(propVal) ||
> >+      !CType::IsCType(cx, JSVAL_TO_OBJECT(propVal))) {
> 
> We do this so often it seems like a CType::ValueIsCType(JSContext *, jsval)
> method would be justified.

Mm, agreed. Saved for followup since this will rot my other patches.

> Can we get a useful error message out of NSPR? This is another error that's
> going to be very, very common during development.
> 
> A follow-up bug is fine here.

Will follow up in bug 551057.

> The dependency on nsILocalFile and XPConnect here is worth a closer look before
> we ship this. Is it safe to call PR_LoadLibraryWithFlags without notifying
> Gecko of what we're doing? Conversely, can we just JS_GetProperty(cx, aPath,
> "persistentDescriptor", &v) and thereby avoid the explicit dependency entirely?

Love that idea. Will follow up on it in bug 538324.

> Perhaps the js-ctypes entry point should just take a "callbacks" parameter for
> loading and unloading libraries. That way the exact behavior is up to the
> embedding (i.e., Gecko) which can then use its own longstanding custom code
> (i.e. nsLocalFile) without us having to put all that in js/ctypes.
> 
> (We should do this before we ship because subtle changes in this behavior after
> a version ships will be a huge pain for users.)

Yeah. We could do the same for the charset stuff. I'll get on that for beta.
Attached patch mingw fixSplinter Review
Attachment #431688 - Attachment is obsolete: true
Attachment #432122 - Flags: review?(dwitte)
(In reply to comment #42)
> (In reply to comment #38)
> > > #define DEFINE_CHAR_TYPE(name, type, ffiType)                                  \
> > >   case TYPE_##name:                                                            \
> > >     /* Serialize as an integer. */                                             \
> > >     result.Append(IntegerToString(*static_cast<type*>(data), 10));             \
> > >     break;
> > >+#define DEFINE_JSCHAR_TYPE(x, y, z) DEFINE_CHAR_TYPE(x, y, z)
> > 
> > This should generate the source code for a one-character JavaScript string.
> > Ideally it would work like JS_ValueToSource (which is ultimately implemented by
> > js_QuoteString in jsopcode.cpp).
> 
> Hum. This means the result won't be able to eval() and pass ImplicitConvert.
> (Unless we have a dequoting algorithm, and the result is a 1-char string?) I'll
> fix by calling JS_ValueToSource, for now.

What I was trying to say here was that:
  ctypes.jschar("\xfe").toSource() === 'ctypes.jschar("\\xfe")'

That should round-trip through eval.

Only for jschar, not the other character types.
Filed bug 551982 for calculating t.name and t.fields lazily, on first use, rather than eagerly.
Attachment #432122 - Flags: review?(dwitte) → review+
Comment on attachment 432122 [details] [diff] [review]
mingw fix

>diff --git a/js/ctypes/CTypes.cpp b/js/ctypes/CTypes.cpp

>@@ -958,8 +958,8 @@ IntegerToString(IntegerType i, jsuint radix)

>+  PRUnichar buffer[sizeof(IntegerType) * 8 + 1];
>+  PRUnichar *cp = buffer + sizeof(buffer) / sizeof(jschar);

Can you switch the remaining references to 'jschar' (just sizeof arguments, I think) to 'PRUnichar' in this function? Mix'n'matching types should be avoided, except where necessary...

Also, style in this file is for '*' to come after 'PRUnichar' with no space.

>@@ -3175,7 +3175,7 @@ AddFieldToArray(JSContext* cx,

>-  if (!JS_DefineUCProperty(cx, fieldObj, name.get(), name.Length(),
>+  if (!JS_DefineUCProperty(cx, fieldObj, reinterpret_cast<const jschar *>(name.get()), name.Length(),

Need to rewrap to 80 chars. Same '*' nit.

>@@ -3268,7 +3268,7 @@ StructType::Create(JSContext* cx, uintN argc, jsval* vp)

>+      instanceProp->name = reinterpret_cast<const jschar *>(info->mName.get());

Same here.

>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h

>+#undef calloc

Hmm. Is this still necessary? (I moved jscntxt.h to the top of the include list.)

If so, please request review on this bit from cjones.

r=dwitte on the other bits
Comment on attachment 424109 [details] [diff] [review]
patch v3 part 2 - fix rooting

(patch v3 part 2 - fix rooting)

>   // Set up the .prototype and .prototype.constructor properties.
>   JSObject* prototype = JS_NewObject(cx, &sCTypeProtoClass, CTypeProto, parent);
>   if (!prototype)
>     return NULL;
> 
>+  if (!JS_DefineProperty(cx, obj, "prototype", OBJECT_TO_JSVAL(prototype),
>+         NULL, NULL, JSPROP_ENUMERATE | JSPROP_READONLY | JSPROP_PERMANENT))
>+    return NULL;
>+
>   // Define properties on the type constructor's 'prototype' property using
>   // reserved slots and getters.
>   if (!JS_DefineProperties(cx, prototype, props))
>     return NULL;
> 
>-  if (!JS_DefineProperty(cx, obj, "prototype", OBJECT_TO_JSVAL(prototype),
>-         NULL, NULL, JSPROP_ENUMERATE | JSPROP_READONLY | JSPROP_PERMANENT))
>-    return NULL;
>-

Maybe worth a one-liner comment, at your discretion:
    // Define property before proceeding, for GC safety.

In PointerType::CreateInternal:
>     nsAutoString typeName = BuildTypeName(cx, typeObj);
>     JSString* nameStr = JS_NewUCStringCopyN(cx,
>       reinterpret_cast<const jschar*>(typeName.get()), typeName.Length());
>+    JSAutoTempValueRooter nameroot(cx, nameStr);
>     if (!nameStr ||
>         !JS_SetReservedSlot(cx, typeObj, SLOT_NAME, STRING_TO_JSVAL(nameStr)))
>       return NULL;

nameroot isn't necessary since the next JSAPI call roots nameStr.

In ArrayType::CreateInternal:
>   JSString* name = JS_NewUCStringCopyN(cx,
>     reinterpret_cast<const jschar*>(typeName.get()), typeName.Length());
>+  JSAutoTempValueRooter nameroot(cx, name);
>   if (!name ||
>       !JS_SetReservedSlot(cx, typeObj, SLOT_NAME, STRING_TO_JSVAL(name)))
>     return NULL;

Ditto.

r=me with the two extraneous nameroots removed.
Attachment #424109 - Flags: review?(jorendorff) → review+
Comment on attachment 424110 [details] [diff] [review]
patch v3 part 3 - add overflow tests


In CTypes.cpp:
>+#define ALIGN(ptr, align) (((static_cast<uintptr_t>(ptr) - 1) | ((align) - 1)) + 1)

The name of the 'ptr' parameter and the cast to uintptr_t are misleading, since
the actual parameter is never a pointer and is size_t. :-)

An inline function would be better than the macro for this.

In CType::Create:
>+  // Assert a sanity check on size and alignment: size % alignment should always
>+  // be zero.
>+  JS_ASSERT(!IsSizeDefined(cx, typeObj) ||
>+            GetSize(cx, typeObj) % GetAlignment(cx, typeObj) == 0);
>+

There is a handy JS_ASSERT_IF macro for this case:

    JS_ASSERT_IF(IsSizeDefined(cx, typeObj),
                 GetSize(cx, typeObj) % GetAlignment(cx, typeObj) == 0);

No benefit except readability. :)

In StructType::Create:
>       size_t fieldSize = CType::GetSize(cx, info->mType);
>       size_t fieldAlign = CType::GetAlignment(cx, info->mType);
>-      size_t padding = (fieldAlign - structSize % fieldAlign) % fieldAlign;
>-      if (structSize + padding < structSize ||
>-          structSize + padding + fieldSize < structSize) {
>+      size_t fieldOffset = ALIGN(structSize, fieldAlign);
>+      // Check for overflow. Since we hold invariant that fieldSize % fieldAlign
>+      // be zero, we can safely check fieldOffset + fieldSize without first
>+      // checking fieldOffset for overflow. Therefore, the latter is simply
>+      // an assertion.
>+      JS_ASSERT(fieldOffset >= structSize);

I think the assertion will fail if structSize == (size_t) -1 and fieldAlign > 1.
fieldOffset will be zero.

>     // Pad the struct tail according to struct alignment.
>-    size_t delta = (structAlign - structSize % structAlign) % structAlign;
>-    if (structSize + delta < structSize) {
>+    size_t structTail = ALIGN(structSize, structAlign);
>+    if (structTail < structSize) {
>       JS_ReportError(cx, "size overflow");
>       return JS_FALSE;
>     }

Here (just a few lines down) you do check the result of ALIGN, which I think is
the right thing.

In test_jsctypes.js.in:
>+    while (large_t.size != ctypes.UInt64("0xfffffffffffff800"))

I think ctypes.UInt64 is misleading here.  It suggests that this will be an
exact comparison. But the left-hand side is (according to the spec, anyway) a
primitive number, and the right-hand side will be converted to a primitive
number too.

An exact comparison could be achieved... but that number on the right is
exactly representable as a primitive number anyway, so let's just do that.

The same goes for the other loops.

>+    while (large_t.size != ctypes.UInt64("0x1fffffffffffff"))

This number is exactly representable too.

>+    while (large_t.size != ctypes.UInt64("0x8000000000000000"))

And this one.
Attachment #424110 - Flags: review?(jorendorff) → review+
I just noticed that 

  var buf_t = new ctypes.char.array();
  var buf = new buf_t(N);  // takes N * sizeof(void *) bytes for the type
                           // in addition to N bytes for the buffer

This is a reasonable thing to want to do without incurring that kind of overhead. And for this kind of use case we'll never need the ffiType, right? As a follow-up bug, we should find a way around it -- perhaps by generating the ffiType lazily.
Attachment #425126 - Flags: review?(jorendorff) → review+
Comment on attachment 431250 [details] [diff] [review]
patch v3 part 6 - fix rooting 2

In BuildTypeSource:
>-    result.AppendASCII("ctypes.");
>+    result.Append(NS_LITERAL_STRING("ctypes."));

Cool. It looks like you've still got a call to AppendASCII in BuildDataSource
(though it seems likely you've already fixed that and that the v3 patch is just
a touch out of date).

> bool
> CType::TypesEqual(JSContext* cx, JSObject* t1, JSObject* t2)
> {
>   JS_ASSERT(IsCType(cx, t1) && IsCType(cx, t2));
> 
>+  // Fast path: check for object equality.
>+  if (t1 == t2)
>+    return true;

Heh.  :-)

In Int64::Construct:
>   // Get ctypes.Int64.prototype from the 'prototype' property of the ctor.
>-  jsval slot;
>-  JS_GetProperty(cx, JSVAL_TO_OBJECT(JS_ARGV_CALLEE(argv)), "prototype", &slot);
>-  JSObject* proto = JSVAL_TO_OBJECT(slot);
>+  JSAutoTempValueRooter slot(cx);
>+  JS_GetProperty(cx, JSVAL_TO_OBJECT(JS_ARGV_CALLEE(argv)), "prototype",
>+    slot.addr());
>+  JSObject* proto = JSVAL_TO_OBJECT(slot.value());
>   JS_ASSERT(JS_GET_CLASS(cx, proto) == &sInt64ProtoClass);

If ctypes.Int64 is sealed, this TempValueRooter is unnecessary.

Same comment in UInt64::Construct.
Attachment #431250 - Flags: review?(jorendorff) → review+
Blocks: 552070
Blocks: 552076
(In reply to comment #49)
> This is a reasonable thing to want to do without incurring that kind of
> overhead. And for this kind of use case we'll never need the ffiType, right? As
> a follow-up bug, we should find a way around it -- perhaps by generating the
> ffiType lazily.

Filed bug 552070 on having the ffi_type not be O(n) in space. We could also generate the type lazily, you're right -- filed bug 552076 for that optimization.
(In reply to comment #26)
> What's the rationale for the ctypes.CData === ctypes.CType.prototype
> requirement?

Here's a purely aesthetic argument: http://pastebin.mozilla.org/707875

More next week; I'm out of time.
Depends on: 552155
Thanks for review.

(In reply to comment #46)
> (From update of attachment 432122 [details] [diff] [review])
> >diff --git a/js/ctypes/CTypes.cpp b/js/ctypes/CTypes.cpp
> 
> >@@ -958,8 +958,8 @@ IntegerToString(IntegerType i, jsuint radix)
> 
> >+  PRUnichar buffer[sizeof(IntegerType) * 8 + 1];
> >+  PRUnichar *cp = buffer + sizeof(buffer) / sizeof(jschar);
> 
> Can you switch the remaining references to 'jschar' (just sizeof arguments, I
> think) to 'PRUnichar' in this function? Mix'n'matching types should be avoided,
> except where necessary...

Done.

> >+#undef calloc
> 
> Hmm. Is this still necessary? (I moved jscntxt.h to the top of the include
> list.)

No, it's not, I must have tested it with other version. Thanks.
Attached patch mingw fix v2Splinter Review
Keywords: checkin-needed
Attachment #432321 - Flags: review+
Keywords: checkin-needed
Blocks: 552214
Blocks: 552215
Blocks: 552525
Blocks: 552558
Blocks: 552561
Updated for rot.

While this evidently doesn't implement jorendorff's ideal, it's still much better than the prototype situation as it stands, so hopefully we can get this in. :)
Attachment #425080 - Attachment is obsolete: true
Attachment #432915 - Flags: review?(jorendorff)
Attachment #425080 - Flags: review?(jorendorff)
Blocks: 554705
I agree actually -- let me review this tomorrow.
Blocks: 519474
P1, need this for 1.9.3.
Priority: -- → P1
I only have very minor nits here.

In InitTypeClasses:
>   //   * Class [[Function]]
> [...]
>   //     * Class [[CTypeProto]]

These are punctuated wrong. This is talking about the [[Class]] special
property mentioned in the ECMAScript standard. So it should say:
    //   * [[Class]] "Function"
But it's fine to omit the brackets if you think it's prettier:
    //   * Class "Function"

In CType::GetProtoFromCtor:
>+  JS_GetReservedSlot(cx, obj, SLOT_FN_CTORPROTO, &protoslot);

Please assert that this succeeds, in DEBUG builds.

(The idiom we use in js/src is:
 +#ifdef DEBUG
 +  JSBool ok =
 +#endif
 +    JS_GetReservedSlot(cx, obj, SLOT_FN_CTORPROTO, &protoslot);
 + JS_ASSERT(ok);

A new macro in jsutil.h, JS_ALWAYS_TRUE, is less clumsy:
 +  JS_ALWAYS_TRUE(JS_GetReservedSlot(cx, obj, SLOT_PROTO, &slot));
but I don't think m-c has it yet.)

In CType::HasInstance:
>+  JS_GetReservedSlot(cx, obj, SLOT_PROTO, &slot);

Assert here too.

In CData::Create:
>   JSObject* proto = JSVAL_TO_OBJECT(slot);
>-  JSObject* dataObj = JS_NewObject(cx, &sCDataClass, proto, NULL);
>+  JSObject* parent = JS_GetParent(cx, typeObj);
>+  JS_ASSERT(parent);
>+
>+  JSObject* dataObj = JS_NewObject(cx, &sCDataClass, proto, parent);

This change is unnecessary.

In Int64Base::Construct:
>-  JSObject* result = JS_NewObject(cx, clasp, proto, NULL);
>+  JSObject* result = JS_NewObject(cx, clasp, proto, JS_GetParent(cx, proto));

Same here, and one or two other places -- but you can leave these if you like.

>+  // Check that the shared properties and functions on 't.prototype.__proto__'
>+  // throw.
>+  for each (let p in instanceProps)
>+    do_check_throws(function() { t.prototype.__proto__[p]; }, Error);
>+  for each (let f in instanceFns)
>+    do_check_throws(function() { t.prototype.__proto__[f]() }, Error);

They should even throw when you try to access them via t.prototype itself,
if I understand correctly.

> function run_StructType_tests() {
>+  run_type_ctor_class_tests(ctypes.StructType,
>+    ctypes.StructType("s", [{"a": ctypes.int32_t}, {"b": ctypes.int64_t}]),
>+    ctypes.StructType("t", [{"c": ctypes.int32_t}, {"d": ctypes.int64_t}]),
>+    [ "fields" ], undefined, undefined, [ "addressOfField" ], [ "a", "b" ]);

You could just pass empty arrays instead of `undefined`. Same comment at each
other call site.
Comment on attachment 432915 [details] [diff] [review]
patch v3 part 4 - fix protos (take 2)

r=me with the nits picked.
Attachment #432915 - Flags: review?(jorendorff) → review+
Comment on attachment 432915 [details] [diff] [review]
patch v3 part 4 - fix protos (take 2)

http://hg.mozilla.org/mozilla-central/rev/c6cfe544d4b9
For great justice!
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
This stuff is all documented, in the MDC subtree starting here:

https://developer.mozilla.org/en/js-ctypes
No longer depends on: 541212
Blocks: 803855
You need to log in before you can comment on or make changes to this bug.