Closed Bug 6672 Opened 21 years ago Closed 20 years ago

Japanese text is removed from HTML mail

Categories

(MailNews Core :: Internationalization, defect, P3)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: nhottanscp, Assigned: bugzilla)

References

Details

Attachments

(2 files)

Tested with 1999-05-17-09-M6 build on Japanese NT.

I created a new mail windows and typed both Japanese and ascii, messenger only
gets text in ascii range from Ender.
So I cannot send Japanese text with HTML mail.
Target Milestone: M6
M6, cannot send Japanese HTML mails (blocks the testing).
Assignee: kostello → ducarroz
A few weeks back we added a new signature to OutputHTML to take in an
nsIOStream* object. This new method returned back a stream of HTML in the
document specified charset. The old OutputHTML method is still there, but it
only returns back Unicode characters. The mail code need only call the method
using the new parameter to get the properly encoded document.

Reassigning to Jean-Francois to make the change in the mail/news code.
Status: NEW → ASSIGNED
nsIDOMEditorAppCore.h does not contain the stream version.
nsEditorAppCore.h does have the stream version but it does not take the charset
arguement.
Ender interface should also change if messenger change to use the stream
version.
Target Milestone: M6 → M7
Changing Messenger for using nsIOstream will imply a lot of work that we will have to trash anyway in the next
milestone as we are rewriting the backend of message compose. Sorry but too late for this change. If HTML is really
a blocking problem especially for Japanese, we should maybe change the default for message composition to be plain
text and not HTML
Target Milestone: M7 → M8
Has the nsIOstream version been provided by Ender? Otherwise, we would like
nsString version to be fixed.
Status: ASSIGNED → NEW
I am transitioning the Ender code to Akkana. We should have the stream API
completed by the end of M7.
Blocks: 7228
Status: NEW → ASSIGNED
Depends on: 5894
The new nsIStream methods are in place in the editor interface.  I'm not quite
sure what's being asked for in this bug -- could someone please take a look at
the interfaces that are there now and let me know if we need to do something
else?

For instance, the new editor shell doesn't have any stream methods; does it need
to?  I think I had some in the editor app core, but then the app core went away.
Target Milestone: M8 → M9
Target Milestone: M9 → M10
M10
This still happens (with my local build pulled 8/1).
I put three Japanese characters in the editor (0x3042, 0x3044, 0x3046) but those
are not in the unicode text returned from the following call,
m_editor->GetContentsAs(format.GetUnicode(), flags, &bodyText);
in nsMsgCompose::SendMsg.
Other text (e.g. html tags) were returned from the editor.
I have Japanese NT, let me know if further testing is needed.
I debugged with my Japanese NT and patched nsHTMLContentSinkStream::Write to
make Japanese to work. Note that this is not a fix proposal, I am not familiar
with the code. Someone in Ender should fix this.
In the code, there are two cases handled (stream and nsString). For nsString, I
think we don't need conversion since both source and target are UCS2. So I
changed to just append the string. Japanese was treated as Latin1 and converted
wrong.
In order to make Japanese to work I had to change one more thing. The same
problem as bug#10940, nbsp 0x00A0 was returned to messenger and Japanese
(ISO-2022-JP) converter cannot map this character. So I changed to replace nbsp
to space. This should be done somewhere else to create an entity (&nbsp)
instead.
I'll attach my diff.
Attached file diff file
I looked at the patch, and I don't understand it.  We send the buffer through
the unicode encoder, then if we're doing the string (rather than stream) case we
have special-case logic for that one html entity.

Shouldn't the unicode encoder, or a service related to it, handle that?  Isn't
this wrapped up with the ongoing issue of having a service to encode entities,
bug 8865?  It shouldn't be the job of the nsHTMLContentSinkStream, because then
we'll have to duplicate the logic in the nsHTMLToTextSinkStream and any other
output content sink that might appear in the future.
>I looked at the patch, and I don't understand it.
If you're talking about the entity, I put the hack there just make it work.
Ender eng. should find out the right solution.
> Note that this is not a fix proposal, I am not familiar
> with the code. Someone in Ender should fix this.
Please let me know for any help for debugging/testing (I have Japanese NT).
Depends on: 8865
I don't see that this is an Ender issue.  It looks to me like an entity/I18n
issue, related to bug 8865 and dependant on that bug.
> I don't see that this is an Ender issue.  It looks to me like an entity/I18n
> issue, related to bug 8865 and dependant on that bug.
There are two problems, the entity bug is a secondary. How about the problem of
applying incorrect (and probably unnecessary) conversion to get UCS2 from UCS2?
I think that needs to be fixed, is that also related to an entity/I18n issue?
I don't see in the patch where you've avoided doing any conversions; all it does
is loop over the string before it's appended to mString.  (I wonder if bugzilla
isn't showing me the whole patch, or something?)

Which conversion are you saying doesn't need to be there -- the EncodeToBuffer
call?  That could well be right -- I haven't been able to find anyone who admits
to understanding what EncodeToBuffer does or when it needs to be called.  If you
know of a case where it's being called and shouldn't be, I'm quite willing to
believe you.
> I don't see in the patch where you've avoided doing any conversions;
No, I didn't but I ignored the result of the conversion then appened the
original string instead of converted one.
> Which conversion are you saying doesn't need to be there -- the EncodeToBuffer
> call?
Yes, I think that's not needeed when appending to mString because both strings
are UCS2.
Okay, thanks for the explanation.  I've attached a possible rewrite of
nsHTMLContentSinkStream::Write.  Could you take a look at it and see if you
think it would solve the problem, and/or plug it in and see if it works okay?
Or, if it's easier (let me know), I can check it in inside an ifdef.
CC'd tague because of the entity issue.  I agree with Akkana, the entity
issue probably should be resolved by the fix for bug 8865, but that is planned
for M10.  If we could apply a M9 workaround, then it would allow the testing
of the mail sending code.  Would it be possible to modify the code that already
generates other entities (e.g., < or &eacute) to also generate   ?
That code will probably be modified in conjuction with the fix for bug 8865.
This temporary fix would give us 3 more weeks of testing Japanese mail sending.
Tague, Can you look into this and confer with Akkana?
> and/or plug it in and see if it works okay?
Yes, I plugged in the code and tested a simple ascii html mail, worked fine.
Japanese mail cannot be tested/verified because of the entity issue. But the
change should handle any unicode equally ascii or Japanese, so I think that's no
problem. I would like this to be checked in (no ifdef) so if the entity issue is
resolved then we can send non-ascii html correctly.
I didn't know how to check stream case so please check if that part is also
working before check in.
The fix to not convert in the nsString case is checked in (no ifdef).
by removing the two nbsp entities in defaultHTMLBody.html, nhotta and myself
were able to send HTML Japanese message. The only problem left is if you delete
the default body before typing your message, ender(?) will add a nbsp at the end
of the data and therefore Japanese mail composition will failed. it's same than
for plain text japanese message (bug 10940)
Except entity issue with nbsp character in the body, HTML Japanese mail works
now. We need to wait for bug 8865 to be fixed in order to be able to close this
bug report.
Target Milestone: M10 → M11
Move for M11. Still waiting on 8865 to close it...
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Hooked up the entity converter to nsMsgSend.cpp.
Status: RESOLVED → VERIFIED
** Compared 9/15/99 Win32 M11 build to 9/16/99 Win32 M11 build **

Beginning with the 9/16 build, I see &nbsp's in the Japanese HTML
mail body when I input ASCII spaces. The number of SPACES match
the number of NBSP's. Whereas in the 9/15 build, I see only spaces
in HTML (i.e. workaround) for the same ASCII space input.
I'll mark fix verified.
No longer depends on: 8865
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.