Closed Bug 262385 Opened 20 years ago Closed 20 years ago

nsIScriptableUnicodeConverter is not very scriptable

Categories

(Core :: Internationalization, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla1.8alpha5

People

(Reporter: darin.moz, Assigned: Biesinger)

References

Details

Attachments

(1 file, 1 obsolete file)

nsIScriptableUnicodeConverter is not very scriptable.

this interface is used by some JS code to convert unicode data:

  http://lxr.mozilla.org/mozilla/search?string=nsIScriptableUnicodeConverter

but, the interface has methods like this:

  string ConvertFromUnicode(in wstring aSrc);

|string| is only capable of representing ascii codepoints, and xpconnect will
decimate non-ASCII codepoints, so this interface is completely useless for
javascript applications.

cc'ing rginda since venkman is affected.  calendar is apparently also affected.
Darin, |string ConvertFromUnicode(in wstring aSrc)| is inevitable because what
we want out of that is a sequence of bytes(octets) that represent a multi-byte
string ('multi' including one). Crossing XPCOM, it will be "inflated" (with
zero-padding) but the information is still there. Needless to say, clients have
to be careful. Javascript being typeless, I don't know what better way there is
for this. 


ACString may be slightly better... or [array,size_is(foo)] out octet
jshin: i totally understand how it manages to work today, but xpconnect asserts
(in debug mode) that the high-bit of any element of a |string| will be zero. 
it's a hack to rely on this behavior, and we should do something better.  an
octet array is probably the best option (disregarding performance issues) since
you really don't want to treat the resulting character array as a JS string. 
another option would be to use a nsISupportsCString instance to represent the
non-unicode encoded string.
I wrote a comment concurring with comment #2 (and now with comment #3 as well),
but somehow I didn't press the submit button :-).  In comment #1, I didn't
mention using an octet array because I saw once a method in an interface being
marked as unscriptable for using it, which made me (incorrectly) believe that an
octet array is no better than |string|  when crossing xpconnect. 

Btw, I heard a complaint that the calendar in Mozilla-suite has problems with
non-ASCII characters while Sunbird doesn't. It occurred to me that this
discrepancy may have to do with different treatments of 'pseudo-Unicode' JS
strings (for multi-byte encoded C strings) in two versions of calendar (if
there's a fork between two.). Chatzilla seems to have managed to work so far
(this interface was added for it in bug 54857). 
*** Bug 269649 has been marked as a duplicate of this bug. ***
Attached patch suggested patch (obsolete) — Splinter Review
how about this?
Assignee: smontagu → cbiesinger
Status: NEW → ASSIGNED
Comment on attachment 165841 [details] [diff] [review]
suggested patch

r=jshin

thanks.
>+  wstring convertByteArrayToUnicode(in unsigned long aCount,
>+                                    [const,array,size_is(aCount)] in octet aData);

>+  void convertToByteArray(in wstring aString,
>+                          out unsigned long aLen,
>+                          [array, size_is(aLen),retval] out octet aData);

How about |convertFromByteArray| and |convertToByteArray| these methods being
always used in scripts where 'Unicode' is implicitly assumed? It's unfortunate
that two existing methods did exactly the opposite.
Alternatively, |convertByteArrayToUnicode| and |convertUnicodeToByteArray|
would do.
Attachment #165841 - Flags: review?(jshin) → review+
Attached patch patch v2Splinter Review
now using convertFromByteArray/convertToByteArray. this also improves the
documentation in the idl file.
Attachment #165841 - Attachment is obsolete: true
Comment on attachment 165906 [details] [diff] [review]
patch v2

since I'm not quite sure my Finish comment change is correct, re-requesting
review
Attachment #165906 - Flags: review?(jshin)
Target Milestone: --- → mozilla1.8alpha5
Comment on attachment 165906 [details] [diff] [review]
patch v2

r=jshin
Attachment #165906 - Flags: review?(jshin) → review+
Attachment #165906 - Flags: superreview?(darin)
So, since STRICT_CHECK_OF_UNICODE is not defined in xpcconvert.cpp (contrary to
my comment about it being defined in DEBUG builds), what's the real advantage to
using these octet array methods over the ones that use |string|?  I've heard
that octet arrays are pretty slow in JS :(
well, for example, you can convert to UTF-16 or UTF-32... (where string would be
truncated due to the null byte)
It could be used in nsIStreamLoaderObserver.onStreamComplete. That gives on
array of octets, not a string. (i guess because it isn't a string, it can
contain nulls)
Comment on attachment 165906 [details] [diff] [review]
patch v2

OK, those sound like good reasons :)


>Index: idl/nsIScriptableUConv.idl

>   /**
>    * Converts the data from Unicode to one Charset.
>    */
>   string ConvertFromUnicode(in wstring aSrc);

Bonus points: It would be good to better document this result.


>   wstring ConvertToUnicode([const] in string aSrc);

[const] here is redundant.


sr=darin
Attachment #165906 - Flags: superreview?(darin) → superreview+
Comment on attachment 165906 [details] [diff] [review]
patch v2

Darin: let's optimize JS byte arrays (ECMA TG1 is back in the game, working on
262 Edition 4); someone file a bug if performance is urgently bac.

Why does this interface have StudlyCaps method names?  IDL, especially
scriptable IDL, should use camelCaps.

/be
brendan: why it is using them, I don't know. but I left them unchanged to keep
compat with existing JS code using this interface. if people don't think that's
necessary, sure, I can change it.
(In reply to comment #14)
> Bonus points: It would be good to better document this result.

I added this commnet:
     56    * Converts the data from Unicode to one Charset.
     57    * Returns the converted string. After converting, Finish should be called
     58    * and its return value appended to this return value.

Checking in idl/nsIScriptableUConv.idl;
/cvsroot/mozilla/intl/uconv/idl/nsIScriptableUConv.idl,v  <-- 
nsIScriptableUConv.idl
new revision: 1.8; previous revision: 1.7
done
Checking in src/nsScriptableUConv.cpp;
/cvsroot/mozilla/intl/uconv/src/nsScriptableUConv.cpp,v  <--  nsScriptableUConv.cpp
new revision: 1.15; previous revision: 1.14
done
Checking in src/nsScriptableUConv.h;
/cvsroot/mozilla/intl/uconv/src/nsScriptableUConv.h,v  <--  nsScriptableUConv.h
new revision: 1.8; previous revision: 1.7
done
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: