Bug 8343 - UTF-7 optionally encoded characters need to be accommodated
Status: VERIFIED FIXED
Product: MailNews Core
Classification: Components
Component: Internationalization
Version: Trunk
Platform: x86 Windows NT
Importance: P3 normal
Target Milestone: M10
Assigned To: nhottanscp
QA Contact: Katsuhiko Momoi
Mentors:
URL: http://rocknroll/users/momoi/publish/...
Depends on:
Blocks:
Reported: 1999-06-16 15:35 PDT by Katsuhiko Momoi
Modified: 2008-07-31 01:22 PDT

Attachments
zipped mailbox file contains UTF-7 encoded Latin1 character (13.70 KB, application/octet-stream)
1999-06-30 13:35 PDT, nhottanscp

Description Katsuhiko Momoi 1999-06-16 15:35:15 PDT
** Observed with 6/16/99 Win32 build **

Here's a part of a UTF-7 mail sent from OutlookExpress:

+ADwAIQ-DOCTYPE HTML PUBLIC +ACI--//W3C//DTD W3 HTML//EN+ACIAPg-

The above is how it is displayed in Messenger 5.0.

Communicator 4.6 does not display this at all since it is part of
the HTML structure, and that is correct.

Here, UTF-7 Set O characters (optional direct characters) are encoded by Outlook Express,
but we apparently expect all Set O characters to be directly represented.
We should be able to deal with optionally encoded characters also.
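
For illustration only, here is a minimal Python sketch (not the Messenger converter code) that decodes the string above with Python's built-in "utf-7" codec and lists which decoded characters belong to Set O. The SET_O constant is transcribed from RFC 2152; the variable names are made up for this example.

```python
# Set O ("optional direct characters") as listed in RFC 2152.
# A sender may write these directly or base64-encode them; a decoder
# has to accept both forms.
SET_O = set("!\"#$%&*;<=>@[]^_`{|}")

# The fragment quoted in the description above.
sample = b"+ADwAIQ-DOCTYPE HTML PUBLIC +ACI--//W3C//DTD W3 HTML//EN+ACIAPg-"

decoded = sample.decode("utf-7")

# Set O characters that appear in the decoded text (all of them were
# base64-encoded by the sender in this sample).
optional_chars = sorted({ch for ch in decoded if ch in SET_O})

print(decoded)         # <!DOCTYPE HTML PUBLIC "-//W3C//DTD W3 HTML//EN">
print(optional_chars)  # ['!', '"', '<', '>']
```

Every character that was shifted into base64 here is a Set O character, which matches the observation that Outlook Express encodes the optional set.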
Comment 1 Katsuhiko Momoi 1999-06-17 00:36:59 PDT
Here's additional info. I took the above UTF-7 string, put it into
an unlabeled .txt file, and placed it at the following URL:

1. http://rocknroll/users/momoi/publish/seamonkey/tests/optcharutf7.txt

Also I replaced the optionally encoded characters with direct
representations and placed the string in a .txt file at:

2. http://rocknroll/users/momoi/publish/seamonkey/tests/optcharutf7b.txt

Under Communicator 4.6, I can see both of them correctly under UTF-7 as:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD W3 HTML//EN">

With 5.0 Browser, under Latin 1 encoding, I see

+ADwAIQ-DOCTYPE HTML PUBLIC +ACI--//W3C//DTD W3 HTML//EN+ACIAPg-

for URL 1 but NOTHING for URL 2.

Under UTF-7, I see NOTHING for either URL 1 or URL 2.

These are text files and we should be able to show the string
under either Latin 1 (incorrectly) or UTF-7 (correctly).
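
As a rough cross-check of the expected behavior, the following Python sketch decodes both variants under UTF-7 and under Latin 1. The two byte strings are reconstructed from the descriptions of URL 1 and URL 2 above (the files themselves are not reproduced here), and the names are hypothetical.

```python
# Assumed content of test file 1: Set O characters base64-encoded.
encoded = b"+ADwAIQ-DOCTYPE HTML PUBLIC +ACI--//W3C//DTD W3 HTML//EN+ACIAPg-"
# Assumed content of test file 2: the same text with direct representations.
direct = b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD W3 HTML//EN">'

# Under UTF-7 both variants should yield the same text.
as_utf7 = [data.decode("utf-7") for data in (encoded, direct)]

# Under Latin 1 the bytes are shown as-is, so file 1 keeps its "+...-" runs.
as_latin1 = [data.decode("latin-1") for data in (encoded, direct)]

print(as_utf7[0] == as_utf7[1])  # True
print(as_latin1[0])              # the raw "+ADwAIQ-DOCTYPE ..." text
```

This only shows what a tolerant UTF-7 decoder produces for the two inputs; it says nothing about how the 5.0 browser renders the result.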
Comment 2 cata 1999-06-29 14:20:59 PDT
I just checked. Actually the UTF-7 decoder is accepting anything base64-encoded,
including Set O. So, the converter seems to be ok.

About the two test URLs: they are .txt files with HTML inside. But that HTML is
only a header, so of course we don't show anything in the page. If there is a
problem, it is that the .txt file is parsed as HTML. But the very fact that we
don't show anything for URL 1 with the encoding set to UTF-7 proves that the
converter is working right: the text got converted into that HTML header!
Adding a single valid HTML tag in there puts something visible in the page,
proving once again that the converter is OK.
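
The same reasoning can be sketched outside the browser. The Python snippet below is only an approximation of the idea: it uses the standard library's html.parser rather than the actual layout engine, and the VisibleText class and visible_text helper are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects non-blank character data, a rough stand-in for visible text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def visible_text(markup):
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return parser.chunks


# Decode the test string, then treat the result as HTML.
raw = b"+ADwAIQ-DOCTYPE HTML PUBLIC +ACI--//W3C//DTD W3 HTML//EN+ACIAPg-"
header = raw.decode("utf-7")

print(visible_text(header))                  # []
print(visible_text(header + "<p>hello</p>")) # ['hello']
```

A successfully decoded header is a declaration with no character data, so an empty page is consistent with a working decoder; once a tag with content is added, text appears.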

Now, about the original issue. That piece of encoded text, tested in the browser,
works. So I guess it is some other problem. Maybe the encoding is not set
correctly in the mail? I do not know. But we should reassign this bug to the right
owner.
Comment 3 Katsuhiko Momoi 1999-06-29 15:08:59 PDT
OK. Let's send this over to nhotta then.
What should we do about the .txt file being interpreted as an .html file in 5.0?
A separate bug?
Comment 4 nhottanscp 1999-06-30 13:35:59 PDT
Created attachment 634
zipped mailbox file contains UTF-7 encoded Latin1 character
Comment 5 nhottanscp 1999-06-30 13:38:59 PDT
I was able to see a UTF-7 encoded Latin1 character mailed by OE5 (attachment
created).
We need more data for this (the original bug was filed with OE4, so we also need 4.x
data).
Viewing UTF-7 is not a major requirement for M8. Moving to M10.
Comment 6 nhottanscp 1999-07-09 16:09:59 PDT
Marking as FIXED. I saw 6983 was fixed and verified. With the converter's fix,
this should be resolved now.
Comment 7 Katsuhiko Momoi 1999-07-12 17:50:59 PDT
** Checked with 7/12/99 Win32 build **

The original problem, i.e. some HTML structures being
displayed rather than suppressed, is now gone.
So in this sense, the bug has been fixed, though I don't
know which check-in solved the problem.

The 2nd problem is that of treating .txt files as if they were
.html files. This is not directly relevant to the UTF-7 converter
and will be filed as a separate bug.

Marking it verified/fixed.
