Closed Bug 154426 Opened 22 years ago Closed 21 years ago

Make mfcembed build with Unicode

Categories

(Core Graveyard :: Embedding: APIs, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: teruko, Assigned: adamlock)

References

()

Details

(Keywords: intl)

Attachments

(2 files, 2 obsolete files)

Japanese file name in Save As dialog of MFCEmbed are displayed as "..."

Steps of reproduce
1. Copy ucvja.dll from Netscape build (component directory) to paste to 
Mfcembed build (component directory)
2. Launch MFCEmbed build
3. Go to http://www.asahi.com
4. Select menu File | Save Page as... to open the Save As dialog

Actual result
Japanese title name are displayed as "..." in File Name.

Expected result
Japanese title name are displayed correctly in File Name.

Tested 6-26 branch MfcEmbed build on Win2K-J.
Keywords: intl
QA Contact: mdunn → teruko
I tried this with the 7/3/2002 nightly from the trunk without the DLL file and 
was prompted with this name:

¥¢¥µ¥Ò¡¦¥³¥àc
[They look like Japanese characters to me] 
Might this just be a problem in the branch?
This was on WindowsXP.
I can't confirm or test this since I don't have the Japanese edition of Windows.

Mfcembed uses this as the suggested file name in the save-as dialog. Noteworthy
is the fact that the code to retrieve the title assigns it to a C string and is
therefore lossy.

http://lxr.mozilla.org/seamonkey/source/embedding/tests/mfcembed/BrowserView.cpp#619

I will attach that cleans up this code somewhat, deferring the conversion into a
C string, using TCHARs etc. Can you apply it and verify whether or not it helps?
Attached patch Patch (obsolete) — Splinter Review
Cleanup file save as code, use W2T macro instead of lossy assignment.
Has anyone had a chance to try the patch out yet? Thanks
Adam,

I'll try and test this out in K-Meleon. I'll let you know how it goes.

Adam,

OK, I tried this out on K-Meleon but it still doesn't seem to be handling all 
of the characters correctly. For some, like Czech, it seems to do OK. For 
Japanese and Arabic, it's still not displaying the characters. Just to make 
sure this wasn't an OS thing, I checked against Mozilla 1.3 beta and IE 6 and 
they both displayed the correct characters for Japanese and Arabic in the Save 
As dialog. 

We have a test page for Unicode here:

http://kmeleon.freewebsites.com/unicode/bookmarks.html

I'm wondering if these needs to be tied into this:

http://bugzilla.mozilla.org/show_bug.cgi?id=127801

and this:

http://bugzilla.mozilla.org/show_bug.cgi?id=92601

as they all seem to be interrelated. We are currently working on revamping the 
K-Meleon code to be better able to handle UTF-8 characters so we'll be very 
interested in seeing where this goes. 
amutch: making MFCEmbed to an Unicode app would fix this bug.

>I checked against Mozilla 1.3 beta
moz1.3 displays the non-ACP text correctly in the dlgbox;
but it doesn't save the file.  NSPR and nsIURI need to 
be changed in order to support Unicode file i/o.
related bugs are 162361, 104305, 169712
http://www.nissan.co.jp/ seems like a useful test page for saving titles.
Mozilla displays the Japanese title in the save as dialog but cops out when
saving to disk by replacing all the characters with underscores. If possible I
would like mfcembed to at least do the same. IE saves it properly.

Are there specific issues I should be aware about here for mfcembed? Presumably
it should work whether mfcembed is built with UNICODE defined or not right?

My code tried to make the save-as dialog more correct by removing an unnecessary
conversion when meant the save as text went from PRUnichar to ascii to TCHAR.
Should I also be UTF-8 encoding it too or something of that nature?
cc'ing chak
I'm trying to trace this issue and I thought I'd address the title bar first. In
Mozilla it displays Japanese chars, but I see question marks with mfcembed. If I
can understand this behaviour, then the save as dialog solution will probably
fall out too.

I've traced the title setting code and the actual W2T macros are lousing things
up. When _UNICODE is not defined, the W2T macro maps onto WideCharToMultiByte,
however it converts it to all to question mark characters. I noticed that the
mfcembed makefile didn't define _MBCS so I tried adding that but it still
doesn't work.

So I hacked the SetBrowserFrameTitle function to directly call SetWindowTextW
with the Unicode title and it *still* comes out as question marks, i.e.

SetWindowTextW(pThis->GetSafeHwnd(), aTitle);

Obviously Mozilla manages to set its title bar properly which is half the battle
so I'm wondering why it doesn't work with mfcembed. Roy or Frank, what is the
story with that?

Do we have to set some kind of code page for this to work?
adam:
SetWindowTextW(pThis->GetSafeHwnd(), aTitle); causes to send a
WM_SETTEXT _windows msg_.

The key here is that Window _decides_ the content of the lParam 
of WM_SETEXT despite the fact that aTitle has a correct unicode text.

Windows sends an unicode text "if and only if" the application is an unicode app.
Otherwise, Window sends an ACP text.  ( in your case, ACP is english and
title text contains japanese text.  japanese text will be converted to
unicode undefined char, '?', for you automatically. courtesy of Windows system)

>Do we have to set some kind of code page for this to work?
set the system to japanese code page for Japanese text.  
But others (eg. korean) won't work.

To sum up: we have to convert MFCEmbed to Unicode app.
I understand it would work if mfcembed becomes Unicode, but are you saying you
*can't* be work when it's mbcs? How is Mozilla able to display the proper
Japanese page title seeing as it builds with the same settings as mfcembed?

If I could get at least that functionality working and the save as dialog I
would be happy.
Adam: I'm guessing because Mozilla is using XUL for the interface, there's not 
the same conversion issues that we see with the Windows-specific code.

Roy: My big concern is the statement on the MS-site that Unicode is not 
supported on Win9x platforms. That might not be an issue for Mozilla testing 
but it is a huge issue for our embedded app. and a lot of our users on Win9x 
PCs. One of our coders found some support for Unicode apps on the MS site that 
would allow us to extend that to Win9x users if they downloaded the support 
files. Any idea if this is something that can be addressed or not?
adam:
>are you saying you *can't* be work when it's mbcs? 
yes, we can't not make it work with mbcs.  mbcs (aka multi byte char set) is 
usually refers to one of CJK code pages.  MultiByteToWideChar() is from
one of CJK code page to Unicode.  The source string can't contain
mixture of CJK.  First param of MultiByteToWideChar() defines the codepage.

>How is Mozilla able to display the proper Japanese page title 
>seeing as it builds with the same settings as mfcembed?
We decided _not_ to use UNICODE compiler flag.  It required to do
a lot of work......   So instead, we register moz differently at runtime.
See http://lxr.mozilla.org/seamonkey/source/widget/src/windows/nsToolkit.cpp#626

amutch:
>That might not be an issue for Mozilla testing 
It was a big issue while we implemented the runtime registration.
We have number of Windows 95 users/testers and we needed to support them.

Just to recap:
mozilla is an unicode app in NT base system and it is also an ACP (one codepage)
application in Win9x base system.  I was going to present a paper about this
in the upcomming Unicode conference. :-)
Well it sounds like this bug won't go away unless mfcembed can be built as
unicode, or unless it does similar clever tricks as Mozilla.

I've tried building with _UNICODE defined and there is a lot of build errors
which I'm fixing one at a time. Most of these are _T / TCHAR issues with a few
#ifdefs too.

I don't mind renaming the bug to cover that work, however it's obvious that
other embedders are going to run into similar issues. At the very least we'll
need to document what is going on here to explain the limitations of building in
ansi mode. Better yet, Roy if you could put your presentation up on the web so
that embedders might know how to deal with similar issues for themselves.
Adam: I'm not sure if this should be the meta bug or a new bug should be opened.
If it goes that route, and it sounds like it should, there's several existing
bugs related to this issue. I'm sure there are going to be others that will come
up from this change. So, we need somewhere to add dependencies and blockers so a
meta bug would be in order.
Attached patch Work in progress (obsolete) — Splinter Review
New patch fixes up mfcembed to build with UNICODE defined (and also when it's
not defined). There is a lot of junk code in mfcembed that assumes TCHAR is
char and most of the patch is fixup for that and some other dodgy looking
string conversions.

While all the source compiles, it still doesn't link when UNICODE is defined
and I haven't been able to test it yet in this mode. Until I figure out the
issues it is still work in progress.
Attachment #113734 - Attachment is obsolete: true
Adam: Let us know if there is anything we can do to assist.
Renaming to cover the work. Patch & picture follows.
Summary: Japanese file name in Save As dialog of MFCEmbed are not displayed correctly → Make mfcembed build with Unicode
Attached patch PatchSplinter Review
This patch is as far as I know complete. It allows mfcembed to be built as a
Unicode app, or as an ANSI app. By default, it builds as ANSI, but you may
build as Unicode with the following from mozilla/embedding/tests.

make BUILD_UNICODE_MFCEMBED=1

Note that mfcembed and the mfcembed components must both be built with this
flag or MFC gets a bit upset.

I've tested the major functionality such as editing, search, url bar, print &
save in both modes and it seems to be working. I would appreciate people who
care about this mode to try it out for themselves. As far as saving goes, the
behaviour is now the same as Mozilla, i.e. Japanese window titles, Japanese
file names but when you come to save it replaces the chars with underscores.

Reviews please?

A picture follows
Attachment #116108 - Attachment is obsolete: true
Attached image Screenshot
Title & save as displaying Japanese characters
Blocks: 92601, 127801
adam: the patch looks good. I would prefer if you could run some plugin tests.
      I had problem with Flash when I converted moz to unicode.
The issues with plugins may stem from the same issues I encountered with
mfcEmbedComponents.dll. Namely, that MFC extension DLLs may not be able to load
correctly against a Unicode executable and Unicode mfc DLL if they are built in
ANSI mode.

With that said, Shockwave sites such as these appear to function correctly.

http://www.shockwave.com/sw/content/jigsawpuzzles

Short of specific crash issues you know of with the patch applied I would like
to check this in as soon as possible.

If the patch looks okay and appears to work as far as you can tell, can I have
an r on this?

I am considering raising a new bug to cover the issue of making mfcembed do what
Mozilla does, namely being able to run in ANSI mode but still offer certain
Unicode features. Is this feasible in an MFC app? Presumably it boils down to
setting certain runtime values on the .exe and calling the wide-char version of
certain Win32 functions right?
Roy: Any chance that a set of tests could be put together to check out all of 
these changes? I know that as far as character handling, etc. you seem to have 
a good handle on these. We will need them for K-Meleon too when we incorporate 
these changes into our code.
Most of this patch is just fixing busted use of TCHAR which causes compiler
errors when _UNICODE is defined and TCHARs map onto wchar_t. Unless K-Meleon is
a Unicode app I don't think this patch would mean much.
Adam: We are looking at going that direction. Right now, i18n issues are a big 
problem with the current code and like Roy, we've been incrementally trying to 
address issues from the underlying MFCEmbed code that we use. Doesn't this 
address a lot of those issues in a more global fashion?
It certainly cleans up a lot of mess in the code and at least gives the option
of building ansi or Unicode. What it doesn't do is fix the existing ansi
behaviour so that it suddenly displays Japanese characters in the title bar or
save as dialog where it didn't before.

From comment 14, it seems that it would be possible to fold certain Unicode
behaviour into the ansi build mode, but this patch is about general cleanup,
fixing broken code which assumes TCHAR is a char when it isn't and other abuse.
If it is  possible and reasonably clean for an MFC app to do what Mozilla does,
then I think it would be a valid and desirable next stage in this process.
Adam: Maybe I missed something. From your screenshot, it is displaying Japanese 
characters. This is something that MFCEmbed did not do with either the title 
bar or the Save dialog. 
Yes, it does display Japanese chars but only when you build in Unicode mode.

make BUILD_UNICODE_MFCEMBED=1

Without defining BUILD_UNICODE_MFCEMBED you're stuck with the existing ansi
behaviour. Only NT, W2K and XP support Unicode so you can't build this way if
you have to support Windows 95/98/Me too.

It is my understanding that Japanese / Thai / Chinese etc. versions of Windows
95/98/Me use MBCS for their encodings and unless you are running mfcembed on one
of these systems you will not see the proper characters in the title bar or the
save as dialog no matter what you try because of code page issues. 

So my patch is basically clean up. It makes it possible to build Unicode and see
all the proper characters but it shouldn't affect the existing ansi mode.

It may be possible for my patch to do one further thing, namely if
BUILD_UNICODE_MFCEMBED is not defined, to define _MBCS instead. This might mean
that certain behaviour such as title bars and save as dialogs is correct in Ansi
mode on Japanese / Thai / Chinese versions of Windows 95/98/Me but I have no way
of verifying that. It will probably never render correctly in the US versions.
Adam: Thanks for the clarification. I better understand what is going on now. 

Roy or anyone else: Does anyone know if this MS patch:

http://msdn.microsoft.com/library/default.asp?url=/library/en-
us/mslu/winprog/microsoft_layer_for_unicode_on_windows_95_98_me_systems.asp

would allow us to run a Unicode app. on Win9x? It appears to say so but I 
haven't tested it and don't have access to a Win9x machine handy to check it 
out.
amutch: MSLU (Microsoft Layer for Unicode) is a set of libraries which allows
        an Unicode application to run under Win9x OSs.  It requires the application
        to link with MSLU libraries and distribute the MSLU with the application.
        We didn't want to introduce a new library dependency for mozilla; so 
        we decided to go with the approach as in the comment #14
        Related bugs are : 118013, 162362
 
adam: i will review your patch today.
Comment on attachment 116198 [details] [diff] [review]
Patch

/r=yokoyama
Attachment #116198 - Flags: review+
Roy: What are the limitations of going the route used in #14?
Comment on attachment 116198 [details] [diff] [review]
Patch

Hi Chris, can you sr this patch to make mfcembed build in Unicode mode? Mostly
it's cleanup to ensure it compiles cleanly when TCHAR maps onto wchar_t.
Attachment #116198 - Flags: superreview?(blizzard)
>Roy: What are the limitations of going the route used in #14?
No limitations.  As in comment #23,  it boils down to setting certain runtime 
values on the .exe and calling the wide-char version of certain Win32 
functions.  (I copy/past from comment 23.  copyright....)

Having said, it may be easir for MFCEmbed to provide both UNICODE and ANSI
versions for embedders.  More applications are moving away from ANSI
and Microsoft is not going to add any more new functionalities and supports 
for ANSI applications.


> +    void HandleFlag(
> +#ifdef _UNICODE
> +        const nsAString& flag,
> +#else
> +        const nsACString& flag,
> +#endif
> +        const TCHAR * param = nsnull)

Just write the function twice.  We won't mind.  Really.

+            if (!str.IsEmpty())
+            {
+                USES_CONVERSION;
+                m_strHomePage = A2CT(str.get());
+            }
+            else
+                m_strHomePage.Empty();

if an if() has {} the else should too.  But that's just personal style.

Other than that, sr=blizzard
Attachment #116198 - Flags: superreview?(blizzard) → superreview+
Thanks all, patch is checked into trunk. If there is a clean way that we can
enable at least some Unicode support (e.g. titlebars, save as box) in the ansi
build on NT platforms, please raise a new bug on the issue.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Verified as fixed in 3-11 mfcembed trunk build.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: