Closed Bug 1025183 Opened 6 years ago Closed 6 years ago

support ScalarValueString type in webidl

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: bkelly, Assigned: bkelly)

References

Details

Attachments

(5 files, 16 obsolete files)

15.29 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
1.85 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
1.39 KB, patch
mshal
: review+
Details | Diff | Splinter Review
15.66 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
12.14 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
ScalarValueString is defined in the encoding spec[1] as:

"ScalarValueString is identical to DOMString except that convert a DOMString to a sequence of Unicode characters is used subsequently when converting to an IDL value. Ergo, the IDL value is a sequence of scalar values."

As a temporary work-around webidl files are just using a DOMString typedef.

This bug is to implement ScalarValueString in our webidl compiler as a type that can be referenced directly and automatically performs the necessary conversion before calling the binding class.
I've been told in IRC this will be similar to implementing ByteString.
See Also: → 796850
Depends on: 1027256
Interim patch that allows ScalarValueString to compile, although it basically just does exactly what DOMString does.  This compiles with the patches from bug 1027256.

I'll now go back and figure out the diffs to make ScalarValueString do the right thing.  I believe on IRC we decided to use a new ScalarValueString class for arguments and a DOMString for return values.
I believe this is mostly correct.  Still todo:

1) Clean up the patch a bit by moving a few changes into separate patches.
2) Write tests to confirm conversion to unicode characters is correct.

Thoughts on the implementation so far?
Attachment #8442527 - Attachment is obsolete: true
Attachment #8443743 - Flags: feedback?(bzbarsky)
Attachment #8441597 - Attachment is obsolete: true
Attachment #8443743 - Attachment is obsolete: true
Attachment #8443743 - Flags: feedback?(bzbarsky)
Attachment #8444708 - Attachment is obsolete: true
So, these patches have bit rotted. In particular, I made ScalarValueString mimic FakeDependentString.  When SetData() is called, then the unicode conversion is executed.

After bug 1032748, however, SetData() is gone.  Instead there is a Rebind() method and a BeginWriting() method.  I can perform the conversion in Rebind(), but not sure how to do it with BeginWriting().

Boris, what do you think?
Flags: needinfo?(bzbarsky)
See Also: → 1032748
(In reply to Ben Kelly (PTO / back July 21) [:bkelly] from comment #9)
> After bug 1032748, however, SetData() is gone.  Instead there is a Rebind()
> method and a BeginWriting() method.  I can perform the conversion in
> Rebind(), but not sure how to do it with BeginWriting().

SetData was renamed to Rebind, but it's just a naming change. Doing s/SetData/Rebind/ in your patches should be trivial.

Note that BeginWriting was added by bug 1032726, not bug 1032748.
I think you have two options.

Option 1 is to do the conversion in SetLength, which is the last thing AssignJSString will call on your string class.

Option 2 is to do the conversion in codegenned code, immediately after doing the ConvertJSValueToString.  Now that ConvertJSValueToString always makes a copy, there is no issue with just making the modification after the fact.
Flags: needinfo?(bzbarsky)
Thanks.  I think option 2 sounds good.  I could use my ScalarVauleString class at that point or just have a helper method.  I'd like to avoid making another copy if I can help it.  Not sire.if I can replace inline.
You can replace inline, since the replacement is of unpaired surrogates (which are by definition a single 16-bit code unit) with U+FFFD, which is also a single 16-bit code unit.  So no extra copy needed.
Attachment #8444706 - Attachment is obsolete: true
Attachment #8465117 - Flags: review?(bzbarsky)
Comment on attachment 8465117 [details] [diff] [review]
P1 Add ScalarValueString to webidl parser (v2)

Sorry for the flag spam.  Realized I need to tweak a few more things in the following patches.
Attachment #8465117 - Flags: review?(bzbarsky)
Attachment #8444707 - Attachment is obsolete: true
Attachment #8465127 - Flags: review?(bzbarsky)
Attachment #8444716 - Attachment is obsolete: true
Attachment #8465128 - Flags: review?(bzbarsky)
Attachment #8465117 - Flags: review?(bzbarsky)
Remove some stale code I accidentally left commented in.
Attachment #8465127 - Attachment is obsolete: true
Attachment #8465127 - Flags: review?(bzbarsky)
Attachment #8465131 - Flags: review?(bzbarsky)
Try build:

  https://tbpl.mozilla.org/?tree=Try&rev=5254401e0d52

Also, I believe P3 will need to be rebased after bug 1036214, but it appears to be pretty minor.  No major conflicts that I can see.
Comment on attachment 8465117 [details] [diff] [review]
P1 Add ScalarValueString to webidl parser (v2)

>@@ -2594,16 +2622,18 @@ class IDLValue(IDLObject):
>+        elif self.type.isString() and type.isString():
>+            return self

Should we not return a new IDLValue of type "type" instead?  Please document why "self" is the right thing to return here.  In practice, is it just that self.type is DOMString if we're taking this return and that's what callers expect to see for a string default value?  If so, worth asserting that self.type.isDOMString(), or just checking for self.type.isDOMString().

This will allow things like void method(optional ByteString arg = "foo"), right?  Is that purposeful?

>+    harness.ok(True, "TestScalarValueString interface parsed without error.")

No need, since if an exception got thrown that will fail the test automatically.

>+    harness.check(len(members), 1, "Should be one production")

"one member"?

Please add the new type to dom/bindings/parser/tests/test_distinguishability.py

r=me with those nits picked.
Attachment #8465117 - Flags: review?(bzbarsky) → review+
Comment on attachment 8465128 [details] [diff] [review]
P3 Add tests for ScalarValueString. (v2)

Oh, here's where test_distinguishability.py gets changed.  OK, great.

Don't you nee to:

  setDistinguishable("ScalarValueString", nonStrings)

?  How did the test pass without that?

>+  void passUnionWithByteString((ByteString or long) arg);

Could you please put this in the same place in the list in all three test interfaces?

r=me
Attachment #8465128 - Flags: review?(bzbarsky) → review+
Comment on attachment 8465131 [details] [diff] [review]
P2 Add ScalarValueString to webidl Codegen.py (v4)

I'm not sure I follow NormalizeScalarValueStringInternal. Why dose it ever need to change lengths?  I didn't think ScalarValueString ever changed the length in 16-bit units of the input.

Seems to me like a fairly faithful implementation of http://heycam.github.io/webidl/#dfn-obtain-unicode (well, but using NS_IS_LOW_SURROGATE/NS_IS_HIGH_SURROGATE) would just work in-place and not even require using UTF16CharEnumerator.  Am I just missing something?

The rest looks good, but I'd like to understand this part.
Flags: needinfo?(bkelly)
(In reply to Boris Zbarsky [:bz] (On vacation Aug 5-18) from comment #22)
> I'm not sure I follow NormalizeScalarValueStringInternal. Why dose it ever
> need to change lengths?  I didn't think ScalarValueString ever changed the
> length in 16-bit units of the input.

Yea, I realized this was the case after I logged off last night.  My original understanding of the algorithm was a bit off, but I came to this conclusion as I worked through the test cases.

I'll go back and make it work completely in-place.

> Seems to me like a fairly faithful implementation of
> http://heycam.github.io/webidl/#dfn-obtain-unicode (well, but using
> NS_IS_LOW_SURROGATE/NS_IS_HIGH_SURROGATE) would just work in-place and not
> even require using UTF16CharEnumerator.  Am I just missing something?

I was trying to use UTF16CharEnumerator to avoid duplicating code within the tree.  It seemed to be implementing the same algorithm to me, but perhaps I missed something subtle.
Flags: needinfo?(bkelly)
(In reply to Boris Zbarsky [:bz] (On vacation Aug 5-18) from comment #21)
> Oh, here's where test_distinguishability.py gets changed.  OK, great.
> 
> Don't you nee to:
> 
>   setDistinguishable("ScalarValueString", nonStrings)
> 
> ?  How did the test pass without that?

This might explain the huge amount of orange during make check in the try build. :-)  I guess I never re-ran these tests after doing the P1 patch.  I was only testing mochitest.  Thanks for pointing out the error!
I moved the test_distinguishability.py changes to this patch.

Also, I restricted the default type coercion to only ScalarValueString as including ByteString was incorrect.  I left the return as self since Codegen.py expects string default values to be exactly DOMString.  Added a comment and assert to explain.
Attachment #8465117 - Attachment is obsolete: true
Attachment #8465452 - Flags: review+
Attachment #8465128 - Attachment is obsolete: true
Attachment #8465454 - Flags: review+
This is a short patch the documents how UTF16CharEnumerator treats the buffer position when it returns the UCS2_REPLACEMENT_CHAR.  I also clarified that the final error condition is actually unreachable.
Attachment #8465520 - Flags: review?(dbaron)
Greatly simplify the ScalarValueString code now that its clear to me that UTF16CharEnumerator simply replaces orphaned surrogates with the replacement character and does not change the length of the string.
Attachment #8465131 - Attachment is obsolete: true
Attachment #8465131 - Flags: review?(bzbarsky)
Attachment #8465523 - Flags: review?(bzbarsky)
Renumber since I inserted a new patch.
Attachment #8465454 - Attachment is obsolete: true
Attachment #8465525 - Flags: review+
Comment on attachment 8465520 [details] [diff] [review]
P2 Document how UTF16CharEnumerator treats buffer position on error. (v0)

r=me; I verified that this is in fact what happens.
Attachment #8465520 - Flags: review?(dbaron) → review+
Comment on attachment 8465523 [details] [diff] [review]
P3 Add ScalarValueString to webidl Codegen.py (v5)

>+  const nsString::char_type* nextChar = aString.BeginWriting();

s/nsString::char_type/char16_t/, please.  And document that it's const because you can't pass a foo** as a const foo**?

>+      MOZ_ASSERT(lastChar == (nextChar - 1));

Given this, could also just stash a non-const version of BeginWriting() in writableChars at the very beginning, nix lastChar, and here do:

  writableChars[nextChar - 1 - writableChars] = static_cast<uint16_t>(enumerated);

or some such...  Does lose the assert, though.

r=me either way
Attachment #8465523 - Flags: review?(bzbarsky) → review+
Had to include mozilla/Assertions.h from nsUTF8Utils.h since the try build was failing (on windows only for some reason).
Attachment #8465520 - Attachment is obsolete: true
Attachment #8465758 - Flags: review+
Updated to use char16_t and try to avoid const_cast.  I still use a separate lastCharIndex variable, though, as I think it makes the code more readable.  Also added a comment about the strangeness of not being able to pass char** for const char**.

Last try build:

https://tbpl.mozilla.org/?tree=Try&rev=c0ccbfe9bedd

Since I've had a few red tries, I'll wait until this is fairly green to push to inbound.
Attachment #8465523 - Attachment is obsolete: true
Attachment #8465773 - Flags: review+
The b2g and android package Makefiles don't set MOZ_DEBUG properly.  This means TestInterfaceJS.manifest is not properly included in these builds.
This should fix the b2g and android package Makefiles so that TestInterfaceJS is properly included.

https://tbpl.mozilla.org/?tree=Try&rev=3ad12594451d
Attachment #8466167 - Flags: review?(mshal)
Renumber.
Attachment #8465525 - Attachment is obsolete: true
Attachment #8466168 - Flags: review+
Does that mean you can re-enable test_getWebIDLCaller.html on android?  Seems like you should be able to...
Attachment #8466167 - Flags: review?(mshal) → review+
Blocks: 1047514
(In reply to Boris Zbarsky [:bz] (On vacation Aug 5-18) from comment #39)
> Does that mean you can re-enable test_getWebIDLCaller.html on android? 
> Seems like you should be able to...

Looks like it.  Follow-up in bug 1047514.
Catch up with bug 1035654 which landed in mozilla-inbound last night.  This creates another spot within Codegen.py where we need to treat ScalarValueString the same as DOMString.
Attachment #8465773 - Attachment is obsolete: true
Attachment #8466401 - Flags: review+
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.