Closed
Bug 332646
Opened 18 years ago
Closed 13 years ago
Investigate using native uconv on Win32
Categories
(Core :: Internationalization, defect)
RESOLVED
WONTFIX
People
(Reporter: dougt, Unassigned)
I wrote a native Unicode converter for Windows CE which also works on Win32. This allows us to remove a lot of code and data from Firefox (on ARM I saved 0.67 MB). This might be something we could enable on the trunk, since we have dropped Windows 98 support.

To create a build, just add this line to your mozconfig:

ac_add_options --enable-native-uconv

The code for this converter lives here: http://lxr.mozilla.org/mozilla/source/intl/uconv/native/nsWinCEUConvService.cpp
Comment 1•18 years ago
This is what I sent to Doug in response to his email seeking my opinion (with emphasis added around 'consistently'):

I'm not sure what you had in mind when you wrote 'This might be something we could enable on the trunk since we have dropped Windows 98 support.' When we interact with the OS, we rely on OS APIs (nsNativeCharsetUtils.cpp). But unless there's an acute need to reduce memory footprint, as in Minimo, we need our own converters when dealing with web pages, forms, and mail messages that come from outside and that we send out into the wild, because our converters do more than the OS APIs can and do some things a little differently. They also enable us to handle all sorts of character encodings **consistently** across platforms.

In short, I don't see any connection between dropping support for Win98 and using the native uconv. Please 'enlighten' me if I'm missing anything; I'd be glad to stand corrected.
Comment 2•18 years ago
(In reply to comment #1)
> In short, I don't see any connection between dropping support for
> Win98 and using the native uconv.

OK, I can see some connection, in that WideCharToMultiByte and MultiByteToWideChar on Win 2k or later support a lot more encodings than on Win 9x/ME. Still, I don't like some of their converters and I prefer ours to theirs. Moreover, I'm loath to give up the consistency across platforms. Nor do I want to let go of our control over the way the incoming data stream is interpreted and outgoing data is encoded.
Comment 3 (Reporter)•18 years ago
Thanks for your response. Are there any encoding/decoding tests that we can run to see if there are significant differences between the native uconv and the Mozilla converters?
Comment 4•18 years ago
(In reply to comment #3)
> are there any encoding/decoding tests that we can
> run to see if there are significant differences between the native uconv and
> the mozilla converters?

http://smontagu.damowmow.com/encodingtest.html
Comment 5•18 years ago
I don't like using native uconv either. If the native converters don't behave compatibly across all versions of Windows (including future releases), we won't be happy...
Comment 6 (Reporter)•18 years ago
Maybe this isn't so much "I am demanding that we use native uconv", but rather "is there a way to reduce code+data bloat in the uconv code?" Your help and ideas for making improvements in this area are important. What can we do?
Comment 7•18 years ago
By the way, does MultiByteToWideChar emit UTF-16 or UCS-2? Testcase: http://www.i18nguy.com/unicode-plane1-utf8.html
Comment 8•18 years ago
MultiByteToWideChar emits UTF-16 from Windows 2000 upward. It might break on 9x, but it will be very hard to display non-BMP characters on those anyway. 670 KB just for encodings is really a lot. It might be worth thinking of a way to do better on this point.
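For readers unfamiliar with the UTF-16 versus UCS-2 distinction raised in comment 7, here is a minimal Python sketch. It does not call any Windows API; it only shows what a UTF-16 encoder produces for a Plane 1 character. The helper name `utf16_code_units` and the sample characters are illustrative assumptions, not part of the bug.

```python
import struct

def utf16_code_units(ch):
    """Return the 16-bit code units that UTF-16 uses for the string ch."""
    data = ch.encode("utf-16-le")
    return list(struct.unpack(f"<{len(data) // 2}H", data))

# A BMP character fits in one 16-bit unit, so UCS-2 and UTF-16 agree on it.
bmp = utf16_code_units("\u00e9")

# A Plane 1 character (U+10300) needs a surrogate pair in UTF-16;
# strict UCS-2 has no way to represent it at all.
astral = utf16_code_units("\U00010300")

if __name__ == "__main__":
    print([hex(u) for u in bmp])
    print([hex(u) for u in astral])
```

A converter that only produces UCS-2 would lose or mangle the second character, which is what the Plane 1 testcase linked in comment 7 is meant to expose.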
Comment 9•18 years ago
dougt: what sort of performance improvement do you see from this change? Is it just a code size savings? Does that translate to performance in this case? I tend to agree with jshin+masayuki+smontagu. i18n consistency across Firefox builds is important, so this change sounds risky.
Comment 10 (Reporter)•18 years ago
I have not measured perf. I will post an engineering build shortly. Also, I am not advocating breaking consistency for the sake of it. See comment #6.
Comment 11 (Reporter)•18 years ago
I'm not sure what it means, but a build running native uconv against the tests yields these failures:

iso-8859-3: 28 codepoint(s) failed
iso-8859-6: 48 codepoint(s) failed
iso-8859-7: 5 codepoint(s) failed
iso-8859-8: 32 codepoint(s) failed
iso-8859-10: 46 codepoint(s) failed
iso-8859-11: 87 codepoint(s) failed
iso-8859-13: 56 codepoint(s) failed
iso-8859-14: 32 codepoint(s) failed
iso-8859-16: 40 codepoint(s) failed
Shift-JIS: 3 codepoint(s) failed
windows-936: 5 codepoint(s) failed

I didn't run anything past the Windows-949 Korean testcase.
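The kind of per-codepoint comparison reported above can be sketched roughly as follows. This is not the test page from comment 4 and it does not touch the Mozilla or Windows converters: Python's built-in codecs stand in for the two implementations, only single-byte charsets are covered, and the helper names `decode_byte` and `failed_codepoints` are hypothetical.

```python
def decode_byte(encoding, value):
    """Decode a single byte; return None if the codec rejects it."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None

def failed_codepoints(reference, candidate, byte_values=range(0x100)):
    """Return the byte values that the two codecs decode differently."""
    return [b for b in byte_values
            if decode_byte(reference, b) != decode_byte(candidate, b)]

if __name__ == "__main__":
    # Stand-in pair: two codecs that are often confused with each other.
    failures = failed_codepoints("iso-8859-1", "cp1252")
    print(f"{len(failures)} codepoint(s) failed")
```

A real comparison would replace the two codec names with calls into the converters under test. Multibyte charsets such as Shift-JIS would need a sequence-based harness rather than a byte-by-byte one.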
Comment 12•18 years ago
A build with --enable-native-uconv doesn't get a scriptable unicode converter, which is pretty essential for various code. Lack of it completely breaks ChatZilla and Venkman, for example (bug 327835, bug 327827).
Comment 13•18 years ago
(In reply to comment #11)
> Shift-JIS 3 codepoint(s) failed
> windows-936 5 codepoint(s) failed

Native uconv randomly fails when converting multibyte charsets because MultiByteToWideChar can't hold state between calls. At the least, you will have to use IMultiLanguage. (I don't know whether WinCE supports IMultiLanguage.)
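The state problem described in this comment can be illustrated without any Windows API. In the sketch below, Python's `shift_jis` codec stands in for the converter, and the buffer is deliberately split in the middle of a two-byte character. The sample text, the split position, and the function names are assumptions made for illustration.

```python
import codecs

# Three kanji, each encoded as two bytes in Shift_JIS.
text = "\u65e5\u672c\u8a9e"
data = text.encode("shift_jis")

# Split after byte 3, i.e. in the middle of the second character,
# as can happen when input arrives in network-sized buffers.
chunks = [data[:3], data[3:]]

def stateless_decode(chunks, encoding):
    """Decode every chunk independently, as a stateless API call would."""
    return "".join(c.decode(encoding, errors="replace") for c in chunks)

def stateful_decode(chunks, encoding):
    """Decode with an incremental decoder that carries partial characters over."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    out = [decoder.decode(c) for c in chunks]
    out.append(decoder.decode(b"", final=True))
    return "".join(out)

if __name__ == "__main__":
    print(stateless_decode(chunks, "shift_jis") == text)
    print(stateful_decode(chunks, "shift_jis") == text)
```

Whether a failure shows up depends on where the buffer boundaries fall, which would be consistent with the "random" failures described above.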
Comment 14•18 years ago
(In reply to comment #13)
> At least, you will have to use IMultiLanguage. (I don't know whether WinCE
> supports IMultiLanguage)

That's not a good idea either, if it means we have to rely on the presence of MS IE. It seems the trunk build already does (it uses Mlang), but IMHO we should try to get rid of that dependency.
Comment 15•18 years ago
On trunk the minimum system requirements are Win2k, so we can rely on any components which are shipped with that.
Comment 16 (Reporter)•18 years ago
Ideally, this code would only depend on things that Windows CE 4.2 has, so that I don't get left implementing something on my own.
Comment 17•18 years ago
IMultiLanguage requires Windows CE .NET 4.0 or later, per MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wceielng/html/cerefimultilanguageiunknown.asp

Nonetheless, if we don't use IMultiLanguage, I propose WONTFIXing this bug:

1. Both intl owners oppose using native uconv.
2. The code size savings will be small unless we use native uconv for multibyte charsets (namely GBK, UHC, and Shift_JIS).
Comment 18•18 years ago
(In reply to comment #15)
> On trunk the minimum system requirements are Win2k, so we can rely on any
> components which are shipped with that.

What I was implying was that if we rely on MS IE's presence (in this case 'Mlang'), the whole point of developing Firefox becomes moot, in a sense. To me, our depending on Mlang looks like MS IE depending on our intl library.
Comment 19 (Reporter)•18 years ago
If you mark this WONTFIX, please open a new bug to address my comment #6.
Comment 20•18 years ago
I opened bug 336553.
Updated•15 years ago
QA Contact: amyy → i18n
Comment 22•13 years ago
Native uconv is gone.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX