Last Comment Bug 756258 - Support union types in new DOM bindings
: Support union types in new DOM bindings
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla16
Assigned To: Peter Van der Beken [:peterv]
:
Mentors:
Depends on:
Blocks: 749866 762654 767546
  Show dependency treegraph
 
Reported: 2012-05-17 14:11 PDT by Peter Van der Beken [:peterv]
Modified: 2012-06-24 20:08 PDT (History)
2 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
v1 (39.98 KB, patch)
2012-05-17 14:11 PDT, Peter Van der Beken [:peterv]
no flags Details | Diff | Splinter Review
Example of codegen (3.67 KB, text/plain)
2012-05-17 14:39 PDT, Peter Van der Beken [:peterv]
bzbarsky: feedback+
Details
v1.1 (43.96 KB, patch)
2012-06-11 08:15 PDT, Peter Van der Beken [:peterv]
no flags Details | Diff | Splinter Review
v2 (60.47 KB, patch)
2012-06-15 11:57 PDT, Peter Van der Beken [:peterv]
bzbarsky: review-
Details | Diff | Splinter Review
v3 (69.59 KB, patch)
2012-06-21 16:01 PDT, Peter Van der Beken [:peterv]
peterv: review-
Details | Diff | Splinter Review
Attempted interdiff from v2 to v3, just in case I need it (59.74 KB, patch)
2012-06-21 18:26 PDT, Boris Zbarsky [:bz]
no flags Details | Diff | Splinter Review
v3.1 (75.75 KB, patch)
2012-06-22 07:52 PDT, Peter Van der Beken [:peterv]
bzbarsky: review+
Details | Diff | Splinter Review
Interdiff v2 to v3.1 (68.79 KB, patch)
2012-06-22 07:54 PDT, Peter Van der Beken [:peterv]
no flags Details | Diff | Splinter Review
Interdiff v3 to v3.1 (36.78 KB, patch)
2012-06-22 08:29 PDT, Boris Zbarsky [:bz]
no flags Details | Diff | Splinter Review

Description Peter Van der Beken [:peterv] 2012-05-17 14:11:43 PDT
Created attachment 624895 [details] [diff] [review]
v1
Comment 1 Peter Van der Beken [:peterv] 2012-05-17 14:39:20 PDT
Created attachment 624904 [details]
Example of codegen

Here's an example of the codegen for |(Blob or DOMString or object or long)|. Asking for feedback first, before we start the review of the patch.
Comment 2 Boris Zbarsky [:bz] 2012-05-17 17:12:37 PDT
Comment on attachment 624904 [details]
Example of codegen

I'm a little confuse about how SetAsString works: NonNull can be set to a _pointer_, but this looks like it's setting it to a reference?  How does that work?

I don't think the generated code at the very end there is right. The only way TrySetToString can fail is if there's a toString() method that throws, and in that case we should throw, not try to convert to a long.

Now the good news is that the IDL this code was produced from is invalid: DOMString and long are not distinguishable, so can't appear in the same union.  The codegen should probably be checking mutual distinguishability of union members and throwing if they're not distinguishable.

Similarly, "Blob" and "object" are not distinguishable.

I _think_ this codegen should be fine if the non-distinguishable pairs are excluded.
Comment 3 Peter Van der Beken [:peterv] 2012-05-18 03:44:43 PDT
(In reply to Boris Zbarsky (:bz) from comment #2)
> I'm a little confuse about how SetAsString works: NonNull can be set to a
> _pointer_, but this looks like it's setting it to a reference?  How does
> that work?

With an ugly hack, we'll get to that when you review :-). Union types have their own NonNull.

> I don't think the generated code at the very end there is right. The only
> way TrySetToString can fail is if there's a toString() method that throws,
> and in that case we should throw, not try to convert to a long.

I'll probably have to have Try* return true and use a |bool& failed| argument that signals whether we have to bail because a fatal condition (like an exception) occured.

> The codegen should probably be checking mutual distinguishability of
> union members and throwing if they're not distinguishable.

Yeah, I'd forgotten to add that.
Comment 4 Boris Zbarsky [:bz] 2012-05-18 06:35:14 PDT
> I'll probably have to have Try* return true and use a |bool& failed| argument 

You actually might not need to.  I think the only things that can throw in an intermittent way are conversions to primitive types, and those would be coming last anyway.
Comment 5 Peter Van der Beken [:peterv] 2012-06-11 08:15:51 PDT
Created attachment 631899 [details] [diff] [review]
v1.1
Comment 6 Peter Van der Beken [:peterv] 2012-06-15 11:57:20 PDT
Created attachment 633616 [details] [diff] [review]
v2

I've disabled the test related to interfaces and enums in unions, I'd need to export so that anything including UnionTypes.h has access to it. Maybe putting them all in one file wasn't such a good idea :-/.

Getting the name of different types is a bit of a pain (see typeName). Not sure how we can make that better.
Comment 7 Boris Zbarsky [:bz] 2012-06-15 20:09:26 PDT
Peter, what's the difference between UnionTypes and UnionArgumentTypes?  (And pleaes document it too, but for now I'm just trying to make sense of the code...)

As far as tests, you could just use actual web interfaces instead of test ones if the issue is header includes.  We have several interfaces and at least one enum now, and I think using all that stuff in the test IDL should be fine.
Comment 8 Peter Van der Beken [:peterv] 2012-06-16 00:47:46 PDT
(In reply to Boris Zbarsky (:bz) from comment #7)
> Peter, what's the difference between UnionTypes and UnionArgumentTypes? 

UnionTypes.h should be used by the API implementers, it contains the UnionTypes with the enum and the values union. UnionArgumentTypes.h contains the code to help unwrap arguments (all the TryToSet* methods) and the holders. I tried to separate the two because the UnionArgumentTypes code is not very useful outside of binding code (and uses JS API which most users of UnionTypes shouldn't need to include). Hope that makes sense.
I haven't really figured out return union types, especially when it comes to holders. It seems some APIs might need a holder and some not. The ones that don't should work fine right now, they should be able to use the union types generated in UnionTypes.h.

> As far as tests, you could just use actual web interfaces instead of test
> ones if the issue is header includes.  We have several interfaces and at
> least one enum now, and I think using all that stuff in the test IDL should
> be fine.

I'll keep that in mind. Right now I'm looking to see if I could generate the union headers with some kind of prefix (and use 'Test' as the prefix when generating the test bindings).

I also noticed I forgot to do the throwing that comment 2 talks about. I'll work on it, but I think it shouldn't block the review for now.
Comment 9 Boris Zbarsky [:bz] 2012-06-16 09:05:04 PDT
I wouldn't worry about the throwing for the moment; once my existing patches to move distinguishability stuff to the parser land, we can just do it in the parser.  Just file a followup?

So UnionTypes is the "public" header, basically, and UnionTypesArguments is the "private" one?

Would it make sense to call the latter UnionTypeConversions.h (akin to the existing PrimitiveConversions.h)?
Comment 10 Boris Zbarsky [:bz] 2012-06-18 23:13:50 PDT
Comment on attachment 633616 [details] [diff] [review]
v2

>+++ b/dom/bindings/Codegen.py
>@@ -334,30 +348,20 @@ class CGHeaders(CGWrapper):
>+                if t.unroll().isUnion():
>+                    bindingHeaders.add("mozilla/dom/UnionArgumentTypes.h")

Do we not need to add the headers for the types in the union because UnionArgumentTypes.h is doing that?  Might be worth documenting, if so.

>+def UnionTypes(descriptors):

Could you please document what the return value is?

Do we need to worry about sequences or dictionaries containing unions here?  I guess t = t.unroll() deals with sequences, but what about dictionaries?

Though actually, checking "t.isUnion()" will fail for sequences of unions, right?  You want to move the "t = t.unroll()" outside that check, at least.

>+                        if f.isSpiderMonkeyInterface():

Could you refactor the code from here and CGHeaders into a separate function as well?  The version in CGHeaders got updated to deal with dictionaries, but this did not....

>+    return (headers, declarations, CGList(unionStructs.values(), "\n"))

You probably want to sort the unionStructs.values() somehow so the order is the same across codegen passes if nothing has changed..

>+def UnionArgumentStructs(descriptors):
>+    # Now find all the things we'll need as arguments because we
>+    # need to unwrap them.

Please clearly document how this differs from UnionTypes?  I guess the renaming might help with that.

>+            if type.isUnion():
>+                type = type.unroll()

Again, for sequences, you want to unroll() before checking isUnion().

>+    return CGWrapper(CGList(unionArgumentStructs.values(), "\n"), post="\n\n")

Again, sort please.

> def getJSToNativeConversionTemplate(type, descriptorProvider, failureCode=None,

>+        unionArgumentObj = "${holderName}"

Hmm.  This is using ${holderName} even when there was no holder CGThing returned, which makes argument unwrapping fail for sequence-of-unions types (because sequences do not have a holder to work with, so don't pass in a holderName at all).  Union members of dictionaries would have similar issues.

I'm not quite sure what to do about that, given the requirement that the types in the FooOrBar actually be unionable (and in particular the requirement that they not have constructors).

I wonder whether we can get around it by making the FooOrBar just not use C++ unions directly and instead have a union of memory chunks of the right sizes and alignments (a la what Maybe does) and reinterpret_cast as needed?  Then we can use any decltype we want in there, and can have a separate holder struct for any holder types we need...  We wouldn't need the separate NonNull either, then.  We'd just need to be a bit more careful in our public setters.

A followup is probably ok for that, I guess.
    
>+        nonArrayObjectMemberTypes = filter(lambda t: t.isGeckoInterface(), memberTypes)

I don't think this is right.  In particular, consider the case of NodeList.  That's an isGeckoInterface(), but IsArrayLike() for it is supposed to return true (though it doesn't yet in our impl).  Conversely, ArrayBuffer is a spidermonkey interface, but not IsArrayLike.  That last we could add to tests, probably (at least in terms of inspecting the codegen to make sure it's correct for that case).

I believe the right test here is to just look for types that are isInterface() and not isCallbackInterface() (assuming bug 764698 lands first; otherwise you have to check for callbacks the hard way).  And for those types, just try to set to the value.  That's what overload resolution does.

Note that part of the confusion here is that IsArrayObject() is meant to be the "a platform array object, a native Array object, or a platform object that supports indexed properties" test from WebIDL, not the "a platform array object" test.

>+        if nonArrayObject is not None or arrayObject is not None:
...
>+                                       pre="if (IsPlatformObject(cx, &argObj)) {\n",

A JS Array is not IsPlatformObject() but should unwrap to a sequence or IDL array, whever we get those.  Though note that it looks like a union that includes a sequence fails codegen right now anyway....  Followup bug on that, please.

Again, it's worth seeing how overload resolution deals here.

>+        dateObjectMemberTypes = filter(lambda t: t.isDate(), memberTypes)
...
>+            memberType = dateObjectMemberTypes[0]
>+            (memberTemplate, memberDeclType,
>+             memberHolderType) = getJSToNativeConversionTemplate(
>+                memberType, descriptorProvider)

I think those 4 lines can go away: nothing uses the result, afaict.

>+            dateObject = CGGeneric("%s.SetTo%s(cx, ${val}, ${valPtr});" % (unionArgumentObj, name))

There is no |name| defined in this codepath, so if we hit this, it will throw, I think.  Might as well just make the Date case throw explicitly and write the code if/when we need to.

>+        callbackMemberTypes = filter(lambda t: t.isCallback(), memberTypes)

This needs to test for callback interfaces too, I think.

>+        if len(callbackMemberTypes) > 0:
>+            callbackObject = []

You can't have more than one callback or callback interface in a union.  So you don't need to worry about having a list here; just assert len(callbackMemberTypes) == 1.

And this should only be set if we didn't already set an interface type, just like Object.  Though I guess callback interfaces and objects and normal interfaces are not distinguishable from each other, so we'll never have more than one of them around...

It may, again, beworth looking at how overload resolution does it, but that's also suboptimal in various ways.  Maybe we should just land this as-is and do a followup to clean them both up?

Either way, you don't want to allow unwrapping platform objects to callback or callback interface or dictionary arguments.  The conversions for those three should be behind a !IsPlatformObject(), I think.

>+        # Need to deal with dictionaries here.

Please just throw if there's a dictionary type in there, for now.

>+        otherMemberTypes.extend(t for t in memberTypes if t.isPrimitive())
>+        if len(otherMemberTypes) > 0:

Then the length is 1: strings, enums, and primitives are not distinguishable from each other.

>+        def handleNull(templateBody, setToNullVar):
>+            null = CGGeneric("if (${val}.isNull()) {\n"

isNullOrUndefined(), please.

>+        assert not (nullable and hasNullableType)

If the parser doesn't prevent this, this should probably be an explicit TypeError, not an assert.

The optional/nullable handling here is really complicated, but I don't see an obvious way to make it simpler.  I'll think about it a bit.  For one thing, the "use Maybe for the holder" code pattern is a good one to hoist up into the main place we handle this stuff; I'd used Optional for the holder when the decl is Optional, but Maybe is a better fit.  Maybe a followup bug on this?

> def typeNeedsCx(type):

This should probably be adjusted for sequences and dictionaries too, right?  Followup bug for that, please.

>+def getSignatureType(type, descriptorProvider):

How about getUnionAccessorSignatureType, and document that it returns a CGThing?

>+    if type.isSequence():

This seems to reimplement getJSToNativeConversionTemplate for sequences, pretty much, modulo the const.  Why can we not do the same "call getJSToNativeConversionTemplate and then strip off the const" trick we do later on?

I think we should move the const-adding to instantiateJSToNativeConversionTemplate, by the way, in a followup; that would enable us to drop the const-stripping bit.

>+    if type.isUnion():

Again, this does exactly what getJSToNativeConversionTemplate does, I think.

>+    if type.isGeckoInterface():

This one is tougher because of the InterfaceName& bit, I guess.  I still wish we could merge it with the other somehow, but I don't see how.  Just document why this is not using getJSToNativeConversionTemplate?

>+    if type.isEnum():
>+        if type.nullable():
>+            raise TypeError("We don't support nullable enumerated arguments "

s/arguments/union members/

>+    if type.isObject():

This doesn't match what getJSToNativeConversionTemplate does, but I'm starting to think that was a mistake and we should pass JSObject& for non-nullable object.  Again, a followup?

>+    typeName = CGGeneric(builtinNames[type.tag()])
>+    if type.nullable():
>+        typeName = CGWrapper(typeName, pre="Nullable< ", post=" >")

Again, this is what getJSToNativeConversionTemplate does, right?

>+def getUnionTypeTemplateVars(type, descriptorProvider):

>+    if type.isDictionary() and type.isSequence():

s/and/or/, please.

>+        raise TypeError("Can't handle dictionaries in unions")

And s/dictionaries/sequences/?

>+    if type.isGeckoInterface():
>+        name = type.unroll().inner.identifier.name
>+    elif type.isEnum():
>+        name = type.inner.identifier.name

The latter should also be type.unroll() etc, no?  If not, why do we need it for the former?

Not needing it for either one is plausible: This is only called on flatMemberTypes, which in particular are never nullable.

By the way, that also means, I think, that the type can never be nullable in getSignatureType, so the branches on nullability there are dead code anyway.

>+    (template, declType, holderType,
>+     dealWithOptional) = getJSToNativeConversionTemplate(
>+        type, descriptorProvider, failureCode="return false;",

Hmm.  So the only reason we want a failureCode here is so that we only throw at the end in the union conversion, after we try doing all the members?

The problem is failureCode only really means something for nonfatal (i.e. non-throwing) conversion errors right now.  So if I have a union that includes "DOMString", and an object is passed in whose toString throws, then the conversion code will have that exception on the JSContext.  But the union code will then try to also throw one of our DOM exceptions.  Does that actually work out right?

In practice, I don't think using a failureCode that returns false is really distinguishable from the "exception thrown" case.

What we really want for these setters is some of tristate return value: "exception thrown, stop now", "conversion failed, try something else", "conversion succeeded" and the failureCode should return the second of these.  I just tested, and something like this:

  struct X {
    X(bool arg) : mBoolSet(true), mBool(arg) {}
    X() : mBoolSet(false) {} 
   bool mBoolSet;
    bool mBool;
  };

can be used as the return value of a function, and returning "false" will do what we'd want.  So something like that is an option here, I guess, if we don't want to mess with out parameters.

>+    # This is ugly, but this will be used for the member of a union so it can't
>+    # be const.

Yeah, please file a followup on moving the const bits into instantiateJSToNativeConversionTemplate.

>+        jsConversion = instantiateJSToNativeConversionTemplate(
>+            (template, None, None, False),

Given that "None, None" bit, isn't this just a fancy way of doing

  jsConversion = CGGeneric(
      string.Template(template).substitute(

and then passing the exact same dictionary?  If so, I would prefer that, I think.

>+class CGUnionStruct(CGThing):
>+    // The cast to ${externalType} is needed to work around a bug in Apple's
>+    // clang compiler, for some reason it doesn't call |S::operator T&| when
>+    // casting S<T> to T& and T is forward declared.

Can we get followups filed to remove these workarounds at some point?  We're really not supporting building with old clang versions going forward, as I understand....

>+class CGUnionArgumentStruct(CGThing):
>+        if not self.type.nullableType is None:

"if self.type.nullableType is not None", please.

>+    def UnionTypes(config):
>+        curr = CGNamespace.build(['mozilla', 'dom'], unions)
>+
>+        # Wrap all of that in our namespaces.

That comment should be a few lines earlier, right?

>+                # namesapce

"namespace"

>+    def UnionArgumentTypes(config):
>+        # Wrap all of that in our namespaces.

Likewise here?

>+++ b/dom/bindings/Nullable.h
>+  Nullable(const T& aValue, bool aIsNull)

Worth documenting why we need such a thing.

>@@ -966,16 +973,19 @@ class IDLSequenceType(IDLType):
>+    def isUnion(self):
>+        return False

Do we really need this here?  Seems like the IDLObject version should suffice.

>+class IDLUnionType(IDLType):
>+        self.builtin = False

The superclass already does this, but I guess other places are doing it too.  Maybe a followup to clean this up?

>+    def __str__(self):
>+        return self.name

This is already covered by IDLType, no?

>+    def nullable(self):
>+    def isPrimitive(self):
>+    def isString(self):

And these.

>+    def isSequence(self):
>+    def isArray(self):
>+    def isDictionary(self):
>+    def isInterface(self):

And these.

>+    def isEnum(self):

And this by IDLObject.

>+    def isComplete(self):
>+        return False

Why not actually return whether we had complete() called?

>+    def complete(self, scope):
>+        while i < len(self.flatMemberTypes):
>+            while self.flatMemberTypes[i].nullable() or self.flatMemberTypes[i].isUnion():
>+                if self.flatMemberTypes[i].nullable():

I think it would be clearer to take out that middle while loop and make this more like so:

  while i < len(self.flatMemberTypes):
      if self.flatMemberTypes[].nullable():
          # deal with it as needed
          continue
      if self.flatMemberTypes[i].isUnion():
          self.flatMemberTypes[i:i + 1] = self.flatMemberTypes[i].memberTypes
          continue
      i += 1

Is there a reason we're storing the actual nullable type?  It seems a bit odd to do that; we only ever check whether it's None.  Why not store a boolean hasNullableType instead?  (Note that the current nullableType can be a union type, which would not be in self.flatMemberTypes, which is a bit surprising.)

>+                        raise WebIDLError("Can't have more than one nullable types in a union",
>+                                          self.location)

This would actually be where you can make use of the self.nullableType bit, by passing self.nullableType.location as the second arg and "extraLocation=self.flatMemberTypes[i].location" as the third arg.  I think that would give more readable error messages where typedefs are involved (as I'm sure they will be sometime).

>+                    raise WebIDLError("Flat member types of a union should be "
>+                                      "distinguishable, " + str(t) + " is not "
>+                                      "distinguishable from " + str(u),
>+                                      self.location)

Again here we can use the actual locations of t and u if desired.  Unless IDLType.location is broken for non-nullable primitive types (which is possible!).

>+    def isDistinguishableFrom(self, other):

This implements the spec as written, but I believe the spec as written is wrong.  In particular, in the case when other is a nullable union type you want to return false when we have a nullable member.  Conversely, for IDLNullableType's implementation it needs to check for unions.  Or it needs to forward to other.isDistinguishableFrom when other is not nullable or somethin (though that sort of thing can be dangeours if other also relies on reflexivity for its implementation).  I raised a spec issue at http://lists.w3.org/Archives/Public/public-script-coord/2012AprJun/0231.html

>+        return reduce(lambda b, t: reduce(lambda c, u: c and t.isDistinguishableFrom(u), self.memberTypes, b), otherTypes, True)

This is really hard to follow.  I tried using clearer variable names, but it's still kinda hard to grok:

  return reduce(lambda distinguishable, otherType:
    reduce(lambda otherTypeDistinguishableFromAllOfOurs, ourType:
      otherTypeDistinguishableFromAllOfOurs and otherType.isDistinguishableFrom(ourType),
           self.memberTypes, distinguishable),
                otherTypes, True)

Maybe just do this instead:

  for ourType in self.memberTypes:
    for otherType in otherTypes:
      if not ourType.isDistinguishableFrom(otherType):
        return False
  return True

?  Seems way more readable to me.

> class IDLArrayType(IDLType):
>+    def isUnion(self):
>+        return False

I don't think this is needed.

>@@ -1170,16 +1287,19 @@ class IDLWrapperType(IDLType):
>+    def isUnion(self):
>+        return False

Or this.

>@@ -1188,17 +1308,17 @@ class IDLWrapperType(IDLType):
>     def isDistinguishableFrom(self, other):
>-        assert self.isInterface() or self.isEnum()
>+        assert self.isInterface() or self.isEnum() or self.isDictionary()

This is actually fixed in bug 764698, so if it's not strictly necessary for this patch, I'd prefer to not have the merge conflict... ;)

>     def p_Attribute(self, p):
>-            Attribute : Inherit ReadOnly ATTRIBUTE AttributeType IDENTIFIER SEMICOLON
>+            Attribute : Inherit ReadOnly ATTRIBUTE Type IDENTIFIER SEMICOLON

I believe this introduces a regression.  The set of actual types allowed for attributes (and exception fields and constants) is restricted in prose such that sequences, for example are not allowed.  We actually implemented that on the grammar level using this AttributeType production, but you're changing that.  Unfortunately, we seem to not have a test that verifies that "attribute sequence<long> foo" throws a parser exception.... we should add one.

If we move this restriction out of the grammar, we should do it in the attribute/exception/constant parser code as needed.  We already do for dictionaries to some extent, since we can't tell at parse time whether an identifier is a dictionary.

I also just raised a spec issue about attribute types that are unions of sequence and something else, because it's not clear to me whether the spec is correct on that count.  Depending on how that goes, we might need non-grammar checking here anyway, so I would be fine with just moving this out of the grammar.  But check with Kyle?

>+    def p_TypeUnionType(self, p):
>+        for (modifier, modifierLocation) in p[2]:

Can you please file a followup to coalesce the 6 instances we have of this code now into a function?

>+    def p_UnionMemberTypeAny(self, p):

I'd actually call this UnionMemberTypeArrayOfAny

>+    def p_NonAnyType(self, p):
>+                       | ARRAYBUFFER TypeSuffix

I think the ARRAYBUFFER bit here is just leftovers that don't matter anymore.  Please file a followup bug to remove that?

>+++ b/dom/bindings/test/TestBindingHeader.h
>+    } else {
>+      int32_t i = arg.GetAsLong();
>+    }

Use i for something?  This code triggers compiler warnings.  ;)
Comment 11 Boris Zbarsky [:bz] 2012-06-20 22:24:51 PDT
> I also just raised a spec issue about attribute types that are unions of sequence 

Turns out the spec covered this already.  They're not allowed.  So we need to do this check in the actual data model, not the parser per se, once we have unions around.
Comment 12 Peter Van der Beken [:peterv] 2012-06-21 16:01:02 PDT
Created attachment 635515 [details] [diff] [review]
v3

(In reply to Boris Zbarsky (:bz) from comment #10)
> Do we not need to add the headers for the types in the union because
> UnionArgumentTypes.h is doing that?  Might be worth documenting, if so.

Done.

> Could you please document what the return value is?

Done.

> Do we need to worry about sequences or dictionaries containing unions here? 
> I guess t = t.unroll() deals with sequences, but what about dictionaries?

Followup bug will be filed.

> Though actually, checking "t.isUnion()" will fail for sequences of unions,
> right?  You want to move the "t = t.unroll()" outside that check, at least.

Done.

> Could you refactor the code from here and CGHeaders into a separate function
> as well?  The version in CGHeaders got updated to deal with dictionaries,
> but this did not....

They look too diferent to refactor. I added dictionary support but unions of dictionaries aren't supported yet.

> You probably want to sort the unionStructs.values() somehow so the order is
> the same across codegen passes if nothing has changed..

Done.

> Please clearly document how this differs from UnionTypes?  I guess the
> renaming might help with that.

I've renamed UnionArgumentTypes to UnionConversions.

> Again, for sequences, you want to unroll() before checking isUnion().

Done.

> Again, sort please.

Done.

> I wonder whether we can get around it by making the FooOrBar just not use
> C++ unions directly and instead have a union of memory chunks of the right
> sizes and alignments (a la what Maybe does) and reinterpret_cast as needed?

I did this, but I'm not sure how it's related to sequence-of-unions or union members of dictionaries. I thought holderType was supposed to be for a local variable that holds the data and declType the public type that we want to pass in the API. I don't really want to expose the FoorOrBarArgument type to the API, and I don't really want to move the holders to the declType. We don't expose holders for union member types to the API either if they're not part of a union.

I did

    if type.isUnion():
        if isMember:
            raise TypeError("Can't handle unions as members, we have a "
                            "holderType")

and we can figure it out later.

> I believe the right test here is to just look for types that are
> isInterface() and not isCallbackInterface() (assuming bug 764698 lands
> first; otherwise you have to check for callbacks the hard way).

Made this check |t.isNonCallbackInterface()|.

> A JS Array is not IsPlatformObject() but should unwrap to a sequence or IDL
> array, whever we get those.  Though note that it looks like a union that
> includes a sequence fails codegen right now anyway....  Followup bug on
> that, please.

sequence in union throws now. Followup bug will be filed.

> I think those 4 lines can go away: nothing uses the result, afaict.

Done.

> There is no |name| defined in this codepath, so if we hit this, it will
> throw, I think.

Fixed.

> This needs to test for callback interfaces too, I think.

Made this check |t.isCallback() or t.isCallbackInterface()|.

> You can't have more than one callback or callback interface in a union.  So
> you don't need to worry about having a list here; just assert
> len(callbackMemberTypes) == 1.

Done.

> Either way, you don't want to allow unwrapping platform objects to callback
> or callback interface or dictionary arguments.  The conversions for those
> three should be behind a !IsPlatformObject(), I think.

I've tried to change it to this:

    if (isObject()) {
        try to convert to interfaces
        if (!set) {
            if (IsArrayLike(...))  // This doesn't handle list objects
                try to convert to array or sequence
            else if (JS_ObjectIsDate(...))
                convert to date
            else if (!IsPlatformObject(...))
                try to convert to callback or callback interface or dictionary
            if (!set)
                set to object
        }
    }

> Please just throw if there's a dictionary type in there, for now.

Done.

> Then the length is 1: strings, enums, and primitives are not distinguishable
> from each other.

Done.

> isNullOrUndefined(), please.

Done.

> If the parser doesn't prevent this, this should probably be an explicit
> TypeError, not an assert.

Made parser handle this, added test.

> The optional/nullable handling here is really complicated, but I don't see
> an obvious way to make it simpler.  I'll think about it a bit.  For one
> thing, the "use Maybe for the holder" code pattern is a good one to hoist up
> into the main place we handle this stuff; I'd used Optional for the holder
> when the decl is Optional, but Maybe is a better fit.  Maybe a followup bug
> on this?

Followup bug will be filed.

> This should probably be adjusted for sequences and dictionaries too, right? 
> Followup bug for that, please.

Followup bug will be filed.

> How about getUnionAccessorSignatureType, and document that it returns a
> CGThing?

I wasn't sure whether we could eventually use this to generate a comment documenting the types of the call right before it or even a skeleton class with the API that the binding expects. Let me know if you still want to rename it.

> This seems to reimplement getJSToNativeConversionTemplate for sequences,
> pretty much, modulo the const.  Why can we not do the same "call
> getJSToNativeConversionTemplate and then strip off the const" trick we do
> later on?

> Again, this does exactly what getJSToNativeConversionTemplate does, I think.

> This one is tougher because of the InterfaceName& bit, I guess.  I still
> wish we could merge it with the other somehow, but I don't see how.  Just
> document why this is not using getJSToNativeConversionTemplate?

> This doesn't match what getJSToNativeConversionTemplate does, but I'm
> starting to think that was a mistake and we should pass JSObject& for
> non-nullable object.  Again, a followup?

> Again, this is what getJSToNativeConversionTemplate does, right?

Given that getSignatureType is about the types that the APIs should use I'm not sure it makes sense to use getJSToNativeConversionTemplate to implement it.

> s/arguments/union members/

did s/arguments/arguments or union members/

> s/and/or/, please.

Done.

> >+        raise TypeError("Can't handle dictionaries in unions")
> 
> And s/dictionaries/sequences/?

Did s/dictionaries/dictionaries and sequences/

> The latter should also be type.unroll() etc, no?  If not, why do we need it
> for the former?
> 
> Not needing it for either one is plausible: This is only called on
> flatMemberTypes, which in particular are never nullable.

Removed .unroll().

> By the way, that also means, I think, that the type can never be nullable in
> getSignatureType, so the branches on nullability there are dead code anyway.

Sure, when called from here.

> The problem is failureCode only really means something for nonfatal (i.e.
> non-throwing) conversion errors right now.  So if I have a union that
> includes "DOMString", and an object is passed in whose toString throws, then
> the conversion code will have that exception on the JSContext.  But the
> union code will then try to also throw one of our DOM exceptions.  Does that
> actually work out right?

It does because we check the context, but I ended up adding a |bool failed| argument to the Try* functions (it allows the or'ing of calls to Try* functions because they return whether they're done).

> Yeah, please file a followup on moving the const bits into
> instantiateJSToNativeConversionTemplate.

Followup bug will be filed.

> Given that "None, None" bit, isn't this just a fancy way of doing
> 
>   jsConversion = CGGeneric(
>       string.Template(template).substitute(
> 
> and then passing the exact same dictionary?  If so, I would prefer that, I
> think.

Done.

> Can we get followups filed to remove these workarounds at some point?  We're
> really not supporting building with old clang versions going forward, as I
> understand....

This is the default clang on OS X Lion, and clang is the only default compiler supported there afaik.

> "if self.type.nullableType is not None", please.

Replaced with |if self.type.hasNullableType:|.

> That comment should be a few lines earlier, right?

Done.

> >+                # namesapce
> 
> "namespace"

Done.

> >+    def UnionArgumentTypes(config):
> >+        # Wrap all of that in our namespaces.
> 
> Likewise here?

Done.

> Worth documenting why we need such a thing.

Removed.

> Do we really need this here?  Seems like the IDLObject version should
> suffice.

Ok. I'll file a followup to clean the other places where we do this, I tried to follow existing code.

> The superclass already does this, but I guess other places are doing it too.
> Maybe a followup to clean this up?

Followup bug will be filed.

> >+    def __str__(self):
> >+    def nullable(self):
> >+    def isPrimitive(self):
> >+    def isString(self):
> >+    def isSequence(self):
> >+    def isArray(self):
> >+    def isDictionary(self):
> >+    def isInterface(self):
> >+    def isEnum(self):

Removed.

> Why not actually return whether we had complete() called?

Done.

> I think it would be clearer to take out that middle while loop and make this
> more like so:
> 
>   while i < len(self.flatMemberTypes):
>       if self.flatMemberTypes[].nullable():
>           # deal with it as needed
>           continue
>       if self.flatMemberTypes[i].isUnion():
>           self.flatMemberTypes[i:i + 1] = self.flatMemberTypes[i].memberTypes
>           continue
>       i += 1

Done.

> Is there a reason we're storing the actual nullable type?  It seems a bit
> odd to do that; we only ever check whether it's None.

I thought it might actually be useful to know the type. Changed to store a bool.

> This would actually be where you can make use of the self.nullableType bit,
> by passing self.nullableType.location as the second arg and
> "extraLocation=self.flatMemberTypes[i].location" as the third arg.

I stored the type in a local var and did this.

> Again here we can use the actual locations of t and u if desired.  Unless
> IDLType.location is broken for non-nullable primitive types (which is
> possible!).

Done.

> 
> >+    def isDistinguishableFrom(self, other):
> 
> This implements the spec as written, but I believe the spec as written is
> wrong.  In particular, in the case when other is a nullable union type you
> want to return false when we have a nullable member.  Conversely, for
> IDLNullableType's implementation it needs to check for unions.  Or it needs
> to forward to other.isDistinguishableFrom when other is not nullable or
> somethin (though that sort of thing can be dangeours if other also relies on
> reflexivity for its implementation).  I raised a spec issue at
> http://lists.w3.org/Archives/Public/public-script-coord/2012AprJun/0231.html

Done.

> This is really hard to follow.  I tried using clearer variable names, but
> it's still kinda hard to grok:

Made it:

        # For every type in otherTypes, check that it's distinguishable from
        # every type in our types
        for u in otherTypes:
            if any(not t.isDistinguishableFrom(u) for t in self.memberTypes):
                return False
        return True

> I don't think this is needed.

> Or this.

Removed.

> This is actually fixed in bug 764698, so if it's not strictly necessary for
> this patch, I'd prefer to not have the merge conflict... ;)

Rebased on top of bug 764698.

> If we move this restriction out of the grammar, we should do it in the
> attribute/exception/constant parser code as needed.  We already do for
> dictionaries to some extent, since we can't tell at parse time whether an
> identifier is a dictionary.

I started redoing the grammar, but in the end I think it makes more sense to do this in the parser code. Restricting union member types (to exclude sequences in unions) was making things rather complicated, so we need to do that check outside of the parser anyway. The const code doesn't need changes (it doesn't use Type), the exception code doesn't exist afaict, but I added restrictions to the attribute code.

> Can you please file a followup to coalesce the 6 instances we have of this
> code now into a function?

I've added:

    @ staticmethod
    def handleModifiers(type, modifiers)
        for (modifier, modifierLocation) in modifiers:
            assert modifier == IDLMethod.TypeSuffixModifier.QMark or \
                   modifier == IDLMethod.TypeSuffixModifier.Brackets

            if modifier == IDLMethod.TypeSuffixModifier.QMark:
                type = IDLNullableType(modifierLocation, type)
            elif modifier == IDLMethod.TypeSuffixModifier.Brackets:
                type = IDLArrayType(modifierLocation, type)

        return type


> I'd actually call this UnionMemberTypeArrayOfAny

Done.

> >+    def p_NonAnyType(self, p):
> >+                       | ARRAYBUFFER TypeSuffix
> 
> I think the ARRAYBUFFER bit here is just leftovers that don't matter
> anymore.  Please file a followup bug to remove that?

Followup bug will be filed.
Comment 13 Boris Zbarsky [:bz] 2012-06-21 18:26:47 PDT
Created attachment 635562 [details] [diff] [review]
Attempted interdiff from v2 to v3, just in case I need it
Comment 14 Peter Van der Beken [:peterv] 2012-06-22 00:49:57 PDT
Comment on attachment 635515 [details] [diff] [review]
v3

Actually, nevermind that one. Looks pretty broken.
Comment 15 Boris Zbarsky [:bz] 2012-06-22 01:44:08 PDT
Comment on attachment 635515 [details] [diff] [review]
v3

> I thought holderType was supposed to be for a local variable that holds the data and declType the
> public type that we want to pass in the API.

That works for basic arguments.  But for a sequence, that setup doesn't work very well because you'd need an entire array of holders or something.  So for the sequence (and dictionary) case what we've done is that the isMember code for things that can be members uses a declType that doesn't need a holderType at all.

So for example, for strings the declType is normally something that can produce a |const nsAString&| and the holderType is an nsDependentString.  But for strings-in-sequence we just use nsString for the declType and skip the holder altogether.

Similar for interface arguments: in the sequence case we just use a declType that owns the data (nsRefPtr) so we can make the holder, if any is needed at all, just a temporary that's allowed to go out of scope.

All of which is to say that for sequence each of the sequence elements needs to own its data, the way things are set up right now.

We _could_ try to set up something where we have a parallel array of holders, but some of the types involved just can't really be used in an nsTArray anyway (e.g. you can't do nsTArray<const nsAString&> or really nsTArray<nsAString> either).

What explicitly using memory chunks gives you is the ability to have a union struct which actually owns its data (because you can effectively use types with ctors/dtors in there) and hence doesn't need any sort of separate holder long-term.  The separate FoorOrBarArgument can still be used to do the _conversion_ in the sequence/dictionary case but it can't be relied on to keep data alive.  Note that the user-facing API for the union struct can stay the same even if it owns the data itself.  The only caveat is that the owning and non-owning versions would need different typenames.

In any case, just explicitly disallowing union members for now and sorting this out in a followup sounds good, in any case.

>+++ b/dom/bindings/Codegen.py

>+def UnionConversions(descriptors):
>+        def addUnionTypes(type):
>+            if type.isUnion():
>+                type = type.unroll()

You still need to move that unroll() call outside the isUnion() check for cases when type is a sequence of unions.

>+        nonArrayPlatformObjectMemberTypes = filter(lambda t: t.isNonCallbackInterface(), memberTypes)

How about interfaceMemberTypes?  That's what they are, really.  Also maybe s/nonArrayPlatformObject/interfaceMembers/ ?

And you really do need the isInterface() there.

>+                if type.isGeckoInterface():
>+                    name = memberType.unroll().inner.identifier.name

I don't think you need the unroll() there.

I don't understand the done/failed setup.  Looking at the codegen for a TrySetToLong, for example, it never sets failed but returns whether the conversion succeeded.  So we get this:

  if (!done) {
    done = arg0_holder.TrySetToLong(cx, argv[0], &argv[0], failed);
  }
  if (!done) {
    if (!failed) {
      Throw<false>(cx, NS_ERROR_XPC_BAD_CONVERT_JS);
    }
    return false;
  }

If the conversion succeeds, life is fine.  But if it fails, we'll do the Throw<> thing, which is not needed at all, since TrySetToLong already throws if it fails to convert.  What's the actual contract here for "failed" and "done"?  I'd like to understand that before I try to make sense of this code.

>+        arrayObjectMemberTypes = filter(lambda t: t.isArray() or t.isSequence(), memberTypes)
>+        if len(arrayObjectMemberTypes) > 0:
>+            arrayObject = []

You can assert there's at most one item in the list here, since sequence or array types that are not equal are not distinguishable.  And then simplify the conversion code here.

> Let me know if you still want to rename it.

I think so, since so far it's definitely very union-specific.  If we want to make it about the type of the arguments on the C++ side, maybe call it getNativeArgumentType?

> Given that getSignatureType is about the types that the APIs should use I'm not sure it makes sense to
> use getJSToNativeConversionTemplate to implement it.

Hmm.  I would still prefer to share the code where we can, but I guess I can go either way....  It _really_ bothers me to have copy/pasted code around.  :(

Maybe we can just refactor common code into helper functions?  Followup OK for that.

> and clang is the only default compiler supported there afaik.

We don't support building with gcc 4.2 on Lion?  In any case, I'm no going to worry about this too much.  ;)

>+++ b/dom/bindings/parser/WebIDL.py

>@@ -1628,16 +1726,27 @@ class IDLAttribute(IDLInterfaceMember):
>+        if self.type.isSequence():

Could you please either move this up to where the t.isDictionary check is or move that check down here?

>+        if self.type.isUnion():
>+            for f in self.type.flatMemberTypes:
>+                if f.isSequence():

Or f.isDictionary().

For the exception location, I think self.location make more sense here, since it doesn't matter where the type is defined so much; what matters is the attribute it's on.

>+    @ staticmethod
>+    def handleModifiers(type, modifiers):

This is not used so far.  I assume there's going to be a followup on that?

>+++ b/dom/bindings/parser/tests/test_attr_sequence_type.py

Different strings on the different ok() calls would be nice.

The rest looks great.  r- for the fail/done thing, but we're really close!
Comment 16 Peter Van der Beken [:peterv] 2012-06-22 04:19:48 PDT
(In reply to Boris Zbarsky (:bz) from comment #15)
> I don't understand the done/failed setup.

Yep, that was totally broken. I realized it while trying to get some sleep.

> > Given that getSignatureType is about the types that the APIs should use I'm not sure it makes sense to
> > use getJSToNativeConversionTemplate to implement it.
> 
> Hmm.  I would still prefer to share the code where we can, but I guess I can
> go either way....  It _really_ bothers me to have copy/pasted code around. 

There's almost as many that are the same as that are different, but the differences are non-trivial to deal with. Optional isn't that hard to deal with, but I'm not going to try converting NonNull, OwningNonNull or nsRefPtr back to pointers and references. It seems to me that there's a clear difference between what the APIs should use and how we implement the conversion. I could add the API types to getJSToNativeConversionTemplate but I think that function is already pretty hard to follow as is.

> We don't support building with gcc 4.2 on Lion? 

https://bugzilla.mozilla.org/show_bug.cgi?id=555727#c107

> >+        if self.type.isUnion():
> >+            for f in self.type.flatMemberTypes:
> >+                if f.isSequence():
> 
> Or f.isDictionary().

Is that in the spec? I don't see it.

> This is not used so far.  I assume there's going to be a followup on that?

Seems I forgot to attach the right patch somehow.
Comment 17 Peter Van der Beken [:peterv] 2012-06-22 07:52:54 PDT
Created attachment 635751 [details] [diff] [review]
v3.1

(In reply to Boris Zbarsky (:bz) from comment #15)
> You still need to move that unroll() call outside the isUnion() check for
> cases when type is a sequence of unions.
> 
> >+        nonArrayPlatformObjectMemberTypes = filter(lambda t: t.isNonCallbackInterface(), memberTypes)
> 
> How about interfaceMemberTypes?  That's what they are, really.  Also maybe
> s/nonArrayPlatformObject/interfaceMembers/ ?
> 
> And you really do need the isInterface() there.

isNonCallbackInterface() doesn't mean isInterface() is true?

> 
> >+                if type.isGeckoInterface():
> >+                    name = memberType.unroll().inner.identifier.name
> 
> I don't think you need the unroll() there.

Done.

> I don't understand the done/failed setup.

I've replaced it with failed/tryNext. A TryToSet* method returns whether a failure happened, if no failure happened it sets tryNext to whether the conversion succeeded or the next conversion should be tried. done is set to |failed || !tryNext|.

> You can assert there's at most one item in the list here, since sequence or
> array types that are not equal are not distinguishable.  And then simplify
> the conversion code here.

Done.

> > Let me know if you still want to rename it.
> 
> I think so, since so far it's definitely very union-specific.

Renamed to getUnionAccessorSignatureType.

> Could you please either move this up to where the t.isDictionary check is or
> move that check down here?

Moved isDictionary check.

> For the exception location, I think self.location make more sense here,
> since it doesn't matter where the type is defined so much; what matters is
> the attribute it's on.

Done.

> Different strings on the different ok() calls would be nice.

Done.
Comment 18 Peter Van der Beken [:peterv] 2012-06-22 07:54:01 PDT
Created attachment 635752 [details] [diff] [review]
Interdiff v2 to v3.1
Comment 19 Boris Zbarsky [:bz] 2012-06-22 08:23:27 PDT
> There's almost as many that are the same as that are different

Yeah, fair.  If we do want to do helper functions for the ones that are the same, that's great, but followup is fine.

> Is that in the spec? I don't see it.

It is as of last night (it was one of the spec issues I raised while reviewing this initially).  ;)  http://dev.w3.org/2006/webapi/WebIDL/#idl-attributes and you may have to force-reload if you have a cached version.

> isNonCallbackInterface() doesn't mean isInterface() is true?

It does, sorry.  I thought I'd deleted that comment, but clearly not....
Comment 20 Boris Zbarsky [:bz] 2012-06-22 08:29:26 PDT
Created attachment 635764 [details] [diff] [review]
Interdiff v3 to v3.1
Comment 21 Boris Zbarsky [:bz] 2012-06-22 09:12:03 PDT
Comment on attachment 635751 [details] [diff] [review]
v3.1

>+++ b/dom/bindings/Codegen.py

>+            callbackObject = CGGeneric("done = (failed = !%s.TrySetTo%s(cx, ${val}, ${valPtr}, tryNext)) || !tryNext;" % (unionArgumentObj, name))

Likewise: there is no non-fatal way to fail to wrap a JSObject to a callback type.  This one matters more, since I think we _can_ run this code.

>+            if arrayObject or dateObject or nonPlatformObject:
>+                templateBody = CGList([arrayObject, dateObject, nonPlatformObject], " else ")

This worried me, since IsArrayObject and !IsPlatformObject can be true for the same object, but it works in practice, because arrayObject and nonPlatformObject can't both be set because those are not distinguishable (precisely because those tests can both be true at once).  Might be worth a comment and assert.

>+            if interfaceObject:
>+                if (arrayObject or dateObject or nonPlatformObject):

That can just test for templateBody, since that's set if and only if that condition above.

>+def getUnionTypeTemplateVars(type, descriptorProvider):
>+    tryNextCode = """if (mUnion.mType != mUnion.eUninitialized) {
>+  mUnion.Destroy%s();

This destroy bit is only needed for the isGeckoInterface case, I believe.  Worth resricting it to that case?

>+++ b/dom/bindings/parser/WebIDL.py
>@@ -1624,21 +1722,33 @@ class IDLAttribute(IDLInterfaceMember):
>+        if self.type.isUnion():
>+            for f in self.type.flatMemberTypes:
>+                if f.isSequence():

Do please add the isDictionary() check here.

r=me with that.  Hurrah!
Comment 22 Boris Zbarsky [:bz] 2012-06-22 14:24:44 PDT
> Likewise: there is no non-fatal way to fail to wrap a JSObject to a callback type.  This
> one matters more, since I think we _can_ run this code.

Please ignore this part: I figured out that you were just trying to keep a consistent signature for TrySetTo* and that this simplified the code...

Though if you do want to nix the tryNext bits where they're not needed, let me know.  ;)
Comment 23 Peter Van der Beken [:peterv] 2012-06-24 05:14:51 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/f84e59e1fd66
Comment 24 Ryan VanderMeulen [:RyanVM] 2012-06-24 20:08:22 PDT
https://hg.mozilla.org/mozilla-central/rev/f84e59e1fd66

Note You need to log in before you can comment on or make changes to this bug.