Closed Bug 262385 Opened 21 years ago Closed 21 years ago

nsIScriptableUnicodeConverter is not very scriptable

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla1.8alpha5

People

(Reporter: darin.moz, Assigned: Biesinger)

References

Details

Attachments

(1 file, 1 obsolete file)

suggested patch 21 years ago Christian :Biesinger (don't email me, ping me on IRC) 8.11 KB, patch	jshin1987 : review+	Details \| Diff \| Splinter Review
patch v2 21 years ago Christian :Biesinger (don't email me, ping me on IRC) 8.31 KB, patch	jshin1987 : review+ darin.moz : superreview+	Details \| Diff \| Splinter Review

Darin Fisher

Reporter

Description

•

21 years ago

nsIScriptableUnicodeConverter is not very scriptable. this interface is used by some JS code to convert unicode data: http://lxr.mozilla.org/mozilla/search?string=nsIScriptableUnicodeConverter but, the interface has methods like this: string ConvertFromUnicode(in wstring aSrc); |string| is only capable of representing ascii codepoints, and xpconnect will decimate non-ASCII codepoints, so this interface is completely useless for javascript applications. cc'ing rginda since venkman is affected. calendar is apparently also affected.

Jungshik Shin

Comment 1

•

21 years ago

Darin, |string ConvertFromUnicode(in wstring aSrc)| is inevitable because what we want out of that is a sequence of bytes(octets) that represent a multi-byte string ('multi' including one). Crossing XPCOM, it will be "inflated" (with zero-padding) but the information is still there. Needless to say, clients have to be careful. Javascript being typeless, I don't know what better way there is for this.

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 2

•

21 years ago

ACString may be slightly better... or [array,size_is(foo)] out octet

Darin Fisher

Reporter

Comment 3

•

21 years ago

jshin: i totally understand how it manages to work today, but xpconnect asserts (in debug mode) that the high-bit of any element of a |string| will be zero. it's a hack to rely on this behavior, and we should do something better. an octet array is probably the best option (disregarding performance issues) since you really don't want to treat the resulting character array as a JS string. another option would be to use a nsISupportsCString instance to represent the non-unicode encoded string.

Jungshik Shin

Comment 4

•

21 years ago

I wrote a comment concurring with comment #2 (and now with comment #3 as well), but somehow I didn't press the submit button :-). In comment #1, I didn't mention using an octet array because I saw once a method in an interface being marked as unscriptable for using it, which made me (incorrectly) believe that an octet array is no better than |string| when crossing xpconnect. Btw, I heard a complaint that the calendar in Mozilla-suite has problems with non-ASCII characters while Sunbird doesn't. It occurred to me that this discrepancy may have to do with different treatments of 'pseudo-Unicode' JS strings (for multi-byte encoded C strings) in two versions of calendar (if there's a fork between two.). Chatzilla seems to have managed to work so far (this interface was added for it in bug 54857).

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 5

•

21 years ago

*** Bug 269649 has been marked as a duplicate of this bug. ***

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 6

•

21 years ago

Attached patch suggested patch (obsolete) — Details — Splinter Review

how about this?

Assignee: smontagu → cbiesinger

Status: NEW → ASSIGNED

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Updated

•

21 years ago

Attachment #165841 - Flags: review?(jshin)

Jungshik Shin

Comment 7

•

21 years ago

Comment on attachment 165841 [details] [diff] [review] suggested patch r=jshin thanks. >+ wstring convertByteArrayToUnicode(in unsigned long aCount, >+ [const,array,size_is(aCount)] in octet aData); >+ void convertToByteArray(in wstring aString, >+ out unsigned long aLen, >+ [array, size_is(aLen),retval] out octet aData); How about |convertFromByteArray| and |convertToByteArray| these methods being always used in scripts where 'Unicode' is implicitly assumed? It's unfortunate that two existing methods did exactly the opposite. Alternatively, |convertByteArrayToUnicode| and |convertUnicodeToByteArray| would do.

Attachment #165841 - Flags: review?(jshin) → review+

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Updated

•

21 years ago

Blocks: 248884

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 8

•

21 years ago

Attached patch patch v2 — Details — Splinter Review

now using convertFromByteArray/convertToByteArray. this also improves the documentation in the idl file.

Attachment #165841 - Attachment is obsolete: true

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 9

•

21 years ago

Comment on attachment 165906 [details] [diff] [review] patch v2 since I'm not quite sure my Finish comment change is correct, re-requesting review

Attachment #165906 - Flags: review?(jshin)

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Updated

•

21 years ago

Target Milestone: --- → mozilla1.8alpha5

Jungshik Shin

Comment 10

•

21 years ago

Comment on attachment 165906 [details] [diff] [review] patch v2 r=jshin

Attachment #165906 - Flags: review?(jshin) → review+

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Updated

•

21 years ago

Attachment #165906 - Flags: superreview?(darin)

Darin Fisher

Reporter

Comment 11

•

21 years ago

So, since STRICT_CHECK_OF_UNICODE is not defined in xpcconvert.cpp (contrary to my comment about it being defined in DEBUG builds), what's the real advantage to using these octet array methods over the ones that use |string|? I've heard that octet arrays are pretty slow in JS :(

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 12

•

21 years ago

well, for example, you can convert to UTF-16 or UTF-32... (where string would be truncated due to the null byte)

Michiel van Leeuwen (email: mvl+moz@)

Comment 13

•

21 years ago

It could be used in nsIStreamLoaderObserver.onStreamComplete. That gives on array of octets, not a string. (i guess because it isn't a string, it can contain nulls)

Darin Fisher

Reporter

Comment 14

•

21 years ago

Comment on attachment 165906 [details] [diff] [review] patch v2 OK, those sound like good reasons :) >Index: idl/nsIScriptableUConv.idl > /** > * Converts the data from Unicode to one Charset. > */ > string ConvertFromUnicode(in wstring aSrc); Bonus points: It would be good to better document this result. > wstring ConvertToUnicode([const] in string aSrc); [const] here is redundant. sr=darin

Attachment #165906 - Flags: superreview?(darin) → superreview+

Brendan Eich [:brendan]

Comment 15

•

21 years ago

Comment on attachment 165906 [details] [diff] [review] patch v2 Darin: let's optimize JS byte arrays (ECMA TG1 is back in the game, working on 262 Edition 4); someone file a bug if performance is urgently bac. Why does this interface have StudlyCaps method names? IDL, especially scriptable IDL, should use camelCaps. /be

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 16

•

21 years ago

brendan: why it is using them, I don't know. but I left them unchanged to keep compat with existing JS code using this interface. if people don't think that's necessary, sure, I can change it.

Christian :Biesinger (don't email me, ping me on IRC)

Assignee

Comment 17

•

21 years ago

(In reply to comment #14) > Bonus points: It would be good to better document this result. I added this commnet: 56 * Converts the data from Unicode to one Charset. 57 * Returns the converted string. After converting, Finish should be called 58 * and its return value appended to this return value. Checking in idl/nsIScriptableUConv.idl; /cvsroot/mozilla/intl/uconv/idl/nsIScriptableUConv.idl,v <-- nsIScriptableUConv.idl new revision: 1.8; previous revision: 1.7 done Checking in src/nsScriptableUConv.cpp; /cvsroot/mozilla/intl/uconv/src/nsScriptableUConv.cpp,v <-- nsScriptableUConv.cpp new revision: 1.15; previous revision: 1.14 done Checking in src/nsScriptableUConv.h; /cvsroot/mozilla/intl/uconv/src/nsScriptableUConv.h,v <-- nsScriptableUConv.h new revision: 1.8; previous revision: 1.7 done

Status: ASSIGNED → RESOLVED

Closed: 21 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.