Closed
Bug 26734
Opened 25 years ago
Closed 9 years ago
Implement RFC 3676 delsp=yes for format=flowed ISO-2022-JP CTE 7bit plaintext messages for sending (comment #148). Already implemented for reading in bug 231701.
Categories
(MailNews Core :: MIME, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
Thunderbird 45.0
People
(Reporter: momoi, Assigned: jorgk-bmo)
References
(Blocks 1 open bug, )
Details
(Keywords: intl, Whiteboard: [DelSP=yes is currently for iso-2022-jp only, so please don't open excess bug for other charset at B.M.O])
Attachments
(1 file, 15 obsolete files)
7.71 KB,
text/plain
|
Details |
** Observed with 2/5/2000 Win32 build **
In Bug 16398, an enhancement feature for format=flowed for plain/text
mail is implemented. This seems to be a good thing and in principle
I agree with its desires. But the RFC 2646 which defines it,
http://www.cis.ohio-state.edu/htbin/rfc/rfc2646.html
does not seem to provide for languages which do not contain
spaces as word-breaks. For example, there is no space between words in
pure Japanese, Chinese, and Korean text generally.
Thus it is possible to have a "word" as defined in RFC 2646
which could be 15 more lines long. In such languages, there really
are often no differences between a word break and a paragraph break.
Bug 16398 seems to have some problem left still, but I'm filing this
bug to make sure that for Beta 1, either format=flowed is compliant with
CJK line breaking conventions, or should not be the default setting
for these languages.
With the above M14 build, I see 2 problems in Japanese plain/text mail.
1. When we create a Japanese paragraph consiting of many characters, e.g.
700-1200 characters without a space break, it goes out without any
CJK line breaking, which should occur every 70 bytes or so whether or
not a space exists. Not being able to follow the CJK line-breaking
rules leads to a 1-line message containing several hundred characters
for these paragraphs, which then cannot be displayed well at all in
mailers which do not have line-wrapping functions. This in effect violates
the spirit of RFC 2646.
2. When we create these long paragraphs in Japanese (equivalent of
word in the current RFC 2616 implementation unfortuantely),
we seem to have a problem in converting to ISO-2022-JP
message in send. Thus this type of paragraph ends in data
corruption for Japanese mail send. I attach a JPG view of such
a problem.
To reproduce this problem of data corrution, view the following
Japanese line under Japanese (Shift_JIS) encoding setting
and copy/paste into Mozilla Plain/Text Composer. Paste it
many times without any space break so it will be become
a fairly long (10 or more lines) but not unusual Japanese paragraph.
Make a few more paragraphs like that and vary the length of each
paragraph and if you have enough characters in it, the data corruption a
s you see in the attached image will occur.
In general, we need to review other issues of RFC 2616 for CJK and
other languages which may not fit well within the definition.
These issues should be somehow dealt in Beta1.
Reporter | ||
Comment 1•25 years ago
|
||
The Japanese data line to copy/paste:
これは長い日本語のテキストですので、行が折れると思います。
Reporter | ||
Comment 2•25 years ago
|
||
Reporter | ||
Comment 3•25 years ago
|
||
Assigning myself as QA contact.
Keywords: beta1
QA Contact: lchiang → momoi
Comment 4•25 years ago
|
||
Assigning to Daniel since he wrote this support.
- rhp
Assignee: rhp → bratell
Comment 5•25 years ago
|
||
This is interesting. Is there a simple way to distinguish western languages from
the asian (or other similar) when looking at a mail header?
I can't really "fix" this until I have gotten bug 16398 back on track and I
think that a couple of the problems mentioned depends on bug 16398 being broken.
I don't know when I can do since I will present my master thesis in only a
couple a weeks. I will look at both bugs today but no progress is promised.
Depends on: 16398
Reporter | ||
Comment 6•25 years ago
|
||
nhotta should be able to help you with the mail charset and
what line beraking rules to apply. I would think that we already
have CJK line-breaking rules implemented in format=fixed plain/text.
We need to have CJK line-breakign rules to be working in M14,
be it format=fixed or format=flowed. If not, I suggest that
we set the default to be the one which can do CJK line-breaking
rules, i.e. format=fixed.
Comment 7•25 years ago
|
||
My initial idea was to only use format=flowed when knowing that the text was of
a western, alternatively set format=fixed when seeing a message that needed
special line breaking. That would be a start until someone (maybe me) figured
out how format=flowed and those line breaks could work together. To set the
default to format=fixed would be a pity since very few would know how to turn it
on, but if it's necessary to unbreak mail, sure it should be done. When is the
M14 deadline?
Reporter | ||
Comment 8•25 years ago
|
||
Daniel, I think you should work with nhotta. He should be able to
look at this on Monday. Then we should be able to arrive at
what would be the best way to preserve CJK line breaking in plain
text mail. Please discuss the schedule with him.
I would be also curious to see what would happen to these problems
when Bug 16398 is fixed.
Reporter | ||
Comment 9•25 years ago
|
||
I tried a pref line to turn off format=flowed and make plain text
into format=fixed, i.e.
user_pref("mailnews.send_plaintext_flowed", false);
But this did not work. It did not change the plain text
mail to regular plain text mail or perform line-wrapping
on CJK mail as mail was generated. It was still marked as
format=flowed.
Comment 10•25 years ago
|
||
Could you file a seperate bug about the pref not turning format=flowed off and
assign it to me?
What the format=flowed does when composing mails is to space stuff some lines
and add a space to the end of some others as well as stripping trailing spaces
from a line. If that could interfere, it's certainly a problem with the
standard. If it's just the line wrapping that's wrong it's in the same file but
not actually a part of the format=flowed code.
Comment 11•25 years ago
|
||
>My initial idea was to only use format=flowed when knowing that the text was of
>a western
I agree, until we figure out how to do this for other languages.
You can check a charset then generate format=flowed only if it's "us-ascii" or
"ISO-8859-1".
Comment 12•25 years ago
|
||
I found an additional issue.
The following is from RFC2546 page 5,
[Quoted-Printable] encoding SHOULD NOT be used with Format=Flowed
unless absolutely necessary (for example, non-US-ASCII (8-bit)
characters over a strictly 7-bit transport such as unextended SMTP).
In particular, a message SHOULD NOT be encoded in Quoted-Printable
for the sole purpose of protecting the trailing space on flowed lines
unless the body part is cryptographically signed or encrypted (see
Section 4.6).
The intent of Format=Flowed is to allow user agents to generate
flowed text which is non-obnoxious when viewed as pure, raw
Text/Plain (without any decoding); use of Quoted-Printable hinders
this and may cause Format=Flowed to be rejected by end users.
We are putting Format=Flowed with bodies containing quoted-printable.
That needs to be fixed.
Comment 13•25 years ago
|
||
Because of the QP issue, we want to use format=flowed only for "us-ascii" for
beta (sending 7bit with QP is our default).
Daniel, could you write a patch to do that?
Comment 14•25 years ago
|
||
I've seen that and noone would be happier than me if quoted printable was
disabled by default for 8 bit mail. I think the world is ready for that now.
Note that the RFC says that you shouldn't QP-encode a mail when it has been
format=flowed, not the other way around. Furthermore, it leaves it open to use
qp anyway with the phrase "unless absolutely necessary (for example,
non-US-ASCII (8-bit) characters over a strictly 7-bit transport such as
unextended SMTP)". If we use qp because we thinks that phrase apply, we could
also use format=flowed. If we don't think that phrase apply, we shouldn't
qp-encode 8 bit mail in the first place.
What i want to say is that I'm reluctant to disable format=flowed for 8 bit
e-mails for that reason. For one thing I could never more use format=flowed in
mozilla. It's very difficult to write a mail in swedish without using an 8 bit
character.
I would rather disable the qp thing in that case but I'm not sure how to do
that. When looking at the source ot looks like it has something to do with some
strict_mime-prefs.
So what do you think? In any case, I think it's harmless for 8 bit e-mails.
Comment 15•25 years ago
|
||
There is a correction in my last comment. Sending 7bit with QP is NOT our
default (it is for headers but not for bodies).
The strict_mime pref has UI, so the user can enable/disable. I think what we can
do for now is to check the strict_mime pref value in addition to check charset
names ("us-ascii" and "ISO-8859-1") to decide whether to generate format=flowed.
Does that sound good?
Reporter | ||
Comment 16•25 years ago
|
||
Added John M. to CC.
Comment 17•25 years ago
|
||
Is this bug about CJK line breaking or QP? If the former, could the QP issue be
moved to another bug?
Comment 18•25 years ago
|
||
Okay, I'll file a separate bug for QP.
Comment 19•25 years ago
|
||
Filed a separate bug for QP issue 26904. Please add comments to that bug for QP
issue.
Comment 20•25 years ago
|
||
My take:
A client that has better definition of "word" than RFC 2646 should use that
better definition. If Mozilla knows that text is Japanese, it should use the
Japanese line-breaking rules for deciding when to insert soft line breaks for
both generation and layout.
Technically, format=fixed does not allow for *any* line breaking during display.
Comment 21•25 years ago
|
||
Ok, the problem is that RFC 2646 provides no way to generate a soft line break
without also generating a space.
Comment 22•25 years ago
|
||
So for CJK, lines are all fixed. As long as the lines are wrapped properly, it
should work okay for CJK. But do we want to put format=flowed when the lines are
actually not flowed for CJK (and languages which line breaks without a space)?
Also we need to check if layouts (both ours and other mailers) work with CJK
when format=flowed is used (e.g., check if kinsoku works properly).
Comment 23•25 years ago
|
||
I have made a patch to nsMsgCompUtils.cpp and nsMsgCompoese.cpp that only allows
format=flowed for us-ascii and iso-8859-?. If someone wants a charset added to
the list of charsets that surely can cope with format=flowed, it's just to say.
I will add the patch as an attachment to this bug so if someone (rhp?)can review
and check it in, this bug should be "fixed".
Status: NEW → ASSIGNED
Whiteboard: Fix waiting for someone to check it in.
Comment 24•25 years ago
|
||
Comment 25•25 years ago
|
||
Restricting format=flowed to a set of inclusive charsets is not the correct fix.
Reporter | ||
Comment 26•25 years ago
|
||
Daniel, jgmyers, nhotta and I had a chat yesterday about format=flowed
and CJK after the MozillaZine i18n session
I'm going to attach a log of that discussion to this bug report.
Please take a look.
Comment 27•25 years ago
|
||
Should we provide a set of exclusive charsets instead?
Or can we provide non space soft break for CJK?
Reporter | ||
Comment 28•25 years ago
|
||
Comment 29•25 years ago
|
||
Naoki,
Could you check this in today. I'm swamped with some other tree changes at the
moment.
- rhp
Comment 30•25 years ago
|
||
I see, but I don't quite understand what the bug is and even less how to fix it.
A comment to the log is that when format=flowed messages are recieved, they are
not wrapped in a <PRE>...</PRE> but instead <blockquote> and <br> are used. And
<tt> if the mail should be displayed with a fixed width font.
If format=flowed is the thing breaking CJK mail, then the patch I attached could
at least be seen as temporary workaround but that would make continued debugging
hard. So what do you think? I'm sorry but since I don't know how to test this
myself I have to rely on what you say.
Comment 31•25 years ago
|
||
I think the key points from the chat which relate to this bug are:
1) For layout, Mozilla can and should insert soft line breaks using Japanese
line breaking rules.
2) For generation, format=flowed provides no method of inserting soft line
breaks which does not also insert a space, so Japanese cannot use format=flowed
soft line breaks.
3) For format=fixed, generation wraps lines using hard line breaks. Having
format=flowed generation wrap lines using hard line breaks is no worse.
4) The image attached to this bug looks like the result of a CRLF inserted while
ISO 2022 was shifted into a non-ASCII mode. Per the iso-2022-jp charset, the
generator must shift back to us-ascii before inserting the CRLF.
Comment 32•25 years ago
|
||
It would help to have an attachment of the message corresponding to the attached
image. That way, we can determine whether it is the generation of format=flowed
that is broken.
Comment 33•25 years ago
|
||
So if I understand things correctly:
1) has nothing to do with mail nor format=flowed but is a layout issue.
2) says that we have to disable format=flowed for Japanese.
3) and 4) the line wrapping may damage mail contents. This has nothing to do
with format=flowed really.
Disabling format=flowed for Japanese is simple. The crude fix I attached could
easily be changed to that.
Line wrapping must be changed to change mode(?) before inserting line breaks and
than back again before letting text continue. Does the same apply to quote signs
(>), ordered lists, (1., 2., etc) and unordered lists (*)? Is this only with a
certain charset or with several? How do one know which mode(?) the text is in?
Comment 34•25 years ago
|
||
(2) does not say we have to disable format=flowed for Japanese.
(4) is an issue for any stateful charset where CRLF is only allowed in a strict
subset of the possible states.
Comment 35•25 years ago
|
||
Well, (2) could be implemented by never using soft line breaks in format=flowed,
but without soft line breaks there are no reason at all to use format=flowed so
I think it's honest to disable the whole thing until the RFC hava come up with
an alternative solution.
For (4), how is that solved in for instance Netscape 4? It sounds like it must
be a really hard problem and way out of anything I can do in a week on my spare
time.
Comment 36•25 years ago
|
||
For (2), the points mentioned in my last comments are,
Do we want to put format/flowed even all lines are fixed for Japanese?
Are we sure that putting format/flowed for Japanese do not break layouts
(mozilla and others)?
I am not arguing about applying format/flowed to CJK should work or not.
For (4), we can investigate and estimate how much work needed for fixing it.
Comment 37•25 years ago
|
||
Without format=flowed, layout is technically not allowed to wrap lines.
There's no reason not to use format=flowed; the reason to use it is to remove
the need for a format=fixed generator.
(4) is an issue for both format=flowed and format=fixed. Whatever solution
exists for format=fixed can be directly applied to format=flowed.
Comment 38•25 years ago
|
||
> Whatever solution exists for format=fixed can be directly applied to
> format=flowed.
That's a theory. It depends on the implementation (e.g., when is the generation
applied before/after convert to ISO-2022-JP, if the generation code does not
understand ISO-2022-JP then that also need changes, etc...).
Comment 39•25 years ago
|
||
I thought of a reason why we might not want to use format=flowed for Japanese
text. With format=fixed and the "wrap incoming text" preference enabled, layout
treats all lines as if they are flowed. With format=flowed, all lines not
ending with a space should be treated as fixed, regardless of the preference
setting. So emitting a hard line break in format=flowed could end up being
worse than emitting it in format=fixed.
I still would like to see the message corresponding to the attached image.
Reporter | ||
Comment 40•25 years ago
|
||
John, I'll send it you later this afternoon -- I'm involved in something else currently.
Reporter | ||
Comment 41•25 years ago
|
||
I sent you the msgs using 2/7 build but the problem did not occur.
My original problem was reporte dwith 2/4 build. Someone migh fixed the corruption problem.
I'll try out the new build again and report back.
Comment 42•25 years ago
|
||
The message you sent me had a hard line break (CRLF without preceeding SP) in
the middle of a JIS X 0208-1983 sequence. This is not legal ISO-2022-JP.
Reporter | ||
Comment 43•25 years ago
|
||
John, I created 3 paragraphs in that message. It's not all one paragraph.
Comment 44•25 years ago
|
||
Yes, the third paragraph had the error.
Comment 45•25 years ago
|
||
Oops, never mind, that CRLF was added by the 4.7x client in the process of
saving the message to a file. The message itself appears to have no errors,
other than having lines longer than 996 characters.
Comment 46•25 years ago
|
||
...plus each paragraph should end with a single SP, in order to mark the line as
flowed.
Comment 47•25 years ago
|
||
Subject: Bugzilla bug 26734
Date: Mon, 07 Feb 2000 17:18:28 -0800
From: leger@netscape.com (Jan Leger)
To: bobj@netscape.com
Bob-
Can you PDT+ or PDT- this one, we're not sure. -Jan
======================================================
With the powers vested in me by the PDT, I declare this PDT-.
This is a great feature, but not a Beta1 stopper.
If it does not work for CJK, then disable it by default for now and
resolve it post-Beta1.
Whiteboard: Fix waiting for someone to check it in. → [PDT-] Fix waiting for someone to check it in.
Comment 48•25 years ago
|
||
So the primary problem is line breaking/wrapping. There is no difference when
using format=flowed and normal mails in the algorithm that finds the place to
put a CRLF. Right now, when a line gets a little longer than the wrap length,
the algoritm searches for a space using nsString::IsSpace(...), and finding a
space it is replaced by CRLF.
Whiteboard: [PDT-] Fix waiting for someone to check it in. → [PDT-] Patch to disable format=flowed for certain charsets waiting for someone to check it in.
Comment 49•25 years ago
|
||
I've read through the history and comments and also the IRC log, and I agree
with John's comments.
When generating Format=Flowed, soft breaks are only inserted after an existing
space. The generating agent SHOULD NOT insert a space, because that changes the
text. (However, if you are generating very long runs of octets with no spaces,
you get into trouble, for example, with lines longer than 998 octets.) If you
already have guidelines for inserting CRLFs into non-Western charsets, you
should use them.
It is the intent of Format=Flowed that non-flowed lines shouldn't be wrapped
on display (they might be ASCII art or source code or a spreadsheet or
something). One of the advantages of F=F is that you can now tell flowable
lines from non-flowable.
Comment 50•25 years ago
|
||
In discussing this with client engineers, the suggestion is that when dealing
with non-Western charsets:
1. When generating, insert soft breaks between characters, not between spaces.
2. When receiving, delete the entire soft break (the SP CRLF).
Comment 51•25 years ago
|
||
Randy, is "utf-8" a "Western" or "non-Western" charset?
Comment 52•25 years ago
|
||
How about "when receiving, if the character before the SP of a soft break is
CJK, delete the SP as well".
We'd need a good definition of "CJK" I'm thinking something like U3000-UD7FF,
UF900-UFAFF. I don't know what todo with UFF00-UFFEF.
Comment 53•25 years ago
|
||
I think Korean uses spaces for word break.
Comment 54•25 years ago
|
||
Man, we really need a ZERO WIDTH SOFT HYPHEN character.
For utf-8 and other unicode charsets, format=flowed could be updated to say that
a CRLF after ZERO WIDTH SPACE is a soft break, delete both the CRLF and ZERO
WIDTH SPACE.
iso-2022-jp doesn't have an encoding of ZERO WIDTH SPACE, so we would seem to be
stuck with having the receiver be charset dependent, deleting the space only for
a specific list of charsets.
Another possibility is to revise format=flowed to say that TAB CRLF is a soft
break, delete the TAB as well as the CRLF.
Comment 55•25 years ago
|
||
Don't forget that you cannot make any assumptions regarding what the receiving
UA will do. The mail must be fine even if the receiving sida doesn't remove any
inserted characters.
Updated•25 years ago
|
Target Milestone: M17
Comment 56•25 years ago
|
||
Katsuhiko Momoi wrote
k>oes not seem to provide for languages which do not contain
k>spaces as word-breaks. For example, there is no space between words in
k>pure Japanese, Chinese, and Korean text generally.
This is simply not true for Korean.
Korean text IS different from Chinese and Japanese in that
we DO use space as word delimeter. Of course, that does NOT
mean that lines can be broken ONLY at spaces as is currently
implemented in Mozilla 5.x(as of 20000321) and Netscape 3.x/4.x. Lines can be
broken at ANY syllable boundaries including space.
Reporter | ||
Comment 57•25 years ago
|
||
Thanks, jung-shik for the correction. I did realize my mistake but have forgotten to
correct it. Now you have. So the problem I describe here applies mainly to Japanese
and Chinese.
Comment 58•25 years ago
|
||
Are there many non-MIME UAs in Japan and China? How do they deal with line
wrapping?
Comment 59•25 years ago
|
||
Is this completely fixed by nhotta's switch to the use of the linebreaker
module? If noone objects, I'll mark this bug fixed in a couple of days.
Comment 60•25 years ago
|
||
I don't think this is completely fixed; the existing linebreaker module
generates hard line breaks
causing paragraphs to occasionally format with short lines.
Does anyone have a feel for the prevalance of non-MIME UAs in this space? Would
it be reasonable to instead break lines with quoted-printable soft breaks?
Comment 61•25 years ago
|
||
We're discussing this within Eudora. Comments that have come up include: using
TAB CRLF will break a lot of stuff. Using Q-P soft line breaks will cause
problems because lots of clients mess that stuff up. Having a rule to delete the
SP CRLF if the preceding character is CJK means that when sending to older
clients, a space will be left between characters, which looks bad. Another
suggestion is to update the F=F spec to say that the SP CRLF sequence should
always be deleted, which means that SP SP CRLF needs to be used for soft breaks
in Latin charsets. This of course causes an extra space to be left between words
in current F=F clients, and in non-F=F clients.
Comment 62•25 years ago
|
||
Just to clarify somewhat my comment above: the suggestion to add a rule to
delete the SP CRLF if the preceding character is CJK means that an older client
will leave the space, causing a space to be left between CJ characters, which
looks bad. The suggestion to update the F=F spec to say that SP SP CRLF should
always be deleted would require an additional parameter, e.g.,
Content-Type=text/plain; format=flowed; flowed-version=2
Or something equally ugly; otherwise old text sent to new clients would have no
spaces between words. Even with the extra parameter, new text sent to old
clients will have an extra space between words.
Comment 63•25 years ago
|
||
Randy, what do you recommend we do now? I like the idea of a spec revision, but
we're running up against a 5/16 deadline.
You could revise the F=F spec to specify that TAB CRLF always get deleted. That
way you could get away without a flowed-version parameter.
Comment 64•25 years ago
|
||
(Actually, the two spaces idea doesn't help -- when sending to old clients, an extra space will always be there.)
In addition to the problems that some existing clients would have with tab, it would still leave a tab between CJK characters. Would that be better than a space?
It seems that an extra, unwanted character will be present between CJK characters when receivied by older clients.
Comment 65•25 years ago
|
||
Our Japanese team says tab is worse than space.
Not using F=F causes just as bad problems with poor line wrap and quotes in Japanese as in English, I'm told.
The proposed spec update is:
1. On generation, add SP CRLF where soft line break is desired following or preceding a CJK character.
2. If a space is desired at the point where a soft line break is being added (i.e., the user typed a space) add an
extra space: SP SP CRLF.
3. On receiving, if you see 1*SP CRLF, and the character before the space(s) is CJK, or the character
following the CRLF is CJK, delete the SP as well as the CRLF.
Comment 66•25 years ago
|
||
Old F=F clients and non-F=F clients will treat TAB CRLF as whitespace at the end
of the line followed by a hard line break. This is no worse than the current
alternative, which is to insert a hard line break. Since TAB is whitespace, it
will be invisible at the end of the line.
Comment 67•25 years ago
|
||
Sorry for the lateness of this comment: Ending lines with TAB isn't going to fly. I know too many cases where
TABs will do weird things, even if they are at the ends of lines (like ending up with double spacing because
the TAB wraps).
Randy's statement of the way to proceed seems to be the best; i.e., if you've got a CJK character on either side of the CRLF, remove one SP and the CRLF.
Comment 68•25 years ago
|
||
How do you define "a CJK character"?
Updated•24 years ago
|
Target Milestone: M17 → M20
Reporter | ||
Comment 69•24 years ago
|
||
I think John asked an important question.
How do we define CJK characters? For native encodings,
for sure, we know. But CJK characters are also written
in Uncode. If we have mixed script of ASCII & Japanese,
it could be hard to draw a clear distinction.
Some of the people on this bug report met at IMC MailConnect 6 in
San Jose recently and talked briefly about this issue.
Randy re-iterated his 5/11 proposal and John proposed
the "tab" solution as seen above.
So there is discussion going on.
In the meantime, we have a practical problem to solve for
Beta 2. If I'm not mistaken, we are inserting space+CR
into Japanese plain text mail. So far, I have seen only
Mozilla being able to understand this. Presumably, a new
version of Eudora will or is able to uderstand format-flowed.
Given that we probably will inconvenience many other mailers,
I have a proposal for Beta 2.
A. Let's consider a workaround -- Daniel, do you have a patch to
make it format=fixed rather than format=flowed when the send
charset is one of Japanese or Chinese? (We'll put aside
Unicode issue unles we know a reasonable approach to John's
question.) All other charsets can remain format=flowed as the
default.
B. We can put in this workaround until the dust settles and then
we can evaluate post Beta 2.
C. One way to go in future is to have an option to turn on/off
format=flowd in UI. For Japan, we might have to ship
Beta 2 with the pref set to OFF without the UI.
Daniel, do you have a safe workaround fix? If so, please tell us. I want
this bug to move forward to a Beta 2 workaround solution. We can keep
it open and see how we progress on revision of RFC 2646.
My poitn again is that we should not be sending out Japanese or
Plain text message which does not conform to an established
standard.
Nominatinf ro nsbeta2 and removing [-PDT].
Comment 70•24 years ago
|
||
The attachment from 02/08/00 11:32 was a safe workaround but since then things
have changed. The function where the decision about format=flowed or
format=fixed is made (ConvertBufToPlainText in nsMsgCompUtils) no longer knows
the target charset which it used to know.
Rich removed it in version 1.67 of the file (it wasn't used then) but maybe we
can put it back. Rich?
If we do it, it's easy to check the charset against a list of "good" (or "bad"
depending on our approach) charsets when turning format=flowed on or off.
Comment 71•24 years ago
|
||
If you want to make the changes, thats fine. Just let me know what you want me
to review.
- rhp
Comment 72•24 years ago
|
||
Putting on [NEED INFO] radar. Sending to intl folks to get help on PDT call.
Whiteboard: Patch to disable format=flowed for certain charsets waiting for someone to check it in. → [NEED INFO]Patch to disable format=flowed for certain charsets waiting for someone to check it in.
Comment 73•24 years ago
|
||
What I need to know is for which charsets we want to disable format=flowed and
their exact name (for instance if they uses underscores or hyphens).
Shift_JIS?
ISO-2022-JP?
EUC_JP?
Big5?
GBK?
HZ?
EUC_TW?
EUC_KR?
GB2312?
Comment 74•24 years ago
|
||
I think you can use nsICharsetConverterManager2 interface.
A sample code is here.
http://lxr.mozilla.org/seamonkey/source/mailnews/base/util/nsMsgI18N.cpp#440
Comment 75•24 years ago
|
||
Putting on [nsbeta2+] radar.
Whiteboard: [NEED INFO]Patch to disable format=flowed for certain charsets waiting for someone to check it in. → [nsbeta2+]Patch to disable format=flowed for certain charsets waiting for someone to check it in.
Reporter | ||
Comment 76•24 years ago
|
||
Daniel, please let us have an option to turn on
format-flowed to these language via prefs.js item
so that in the near future when the RFC issue is settled,
we can begin to use format=flowed for testing purposes.
Comment 77•24 years ago
|
||
So. I've reimplemented the ability to switch off format=flowed depending on
charset. I had to change a little more than I would have wanted, but I think
it's a safe fix. A say "think" because this is unchartered land for me.
For those daring ones who think format=fixed are for cowards, I added a pref
mailnews.disable_format_flowed_for_cjk. Set it to false and there will be no
check at charsets when deciding if format=flowed is good or not.
The check right now is to disable format=flowed if
nsMsgI18Nmultibyte_charset(charset) is true.
I will attach the patch for review.
Comment 78•24 years ago
|
||
Comment 79•24 years ago
|
||
I'll put this on M17, not because the issue will be resolved then, but as a
reminder to check in the bandage.
Target Milestone: M20 → M17
Comment 80•24 years ago
|
||
Comment 81•24 years ago
|
||
I rewrote some parts of the patch after some suggestions and good advises by
nhotta. This new patch is much smaller too. Rich, could you review the last
patch I attached?
Comment 82•24 years ago
|
||
Hi Naoki,
The patch looks good to me.
- rhp
Comment 83•24 years ago
|
||
Feel free to check it in. I'm still an unresponsible person without checkin
privileges. :-)
Comment 84•24 years ago
|
||
Let me test the last patch and I will check in.
Comment 85•24 years ago
|
||
I checked in the patch yesterday.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 86•24 years ago
|
||
I will reopen this. We have a workaround, but the fix (to be able to use
format=flowed with all charsets) doesn't exist yet. This has no longer any
target milestone but waits for an updated RFC. Hopefully.
Clearing nsbeta2 keyword and status.
Status: RESOLVED → REOPENED
Keywords: nsbeta2
Resolution: FIXED → ---
Whiteboard: [nsbeta2+]Patch to disable format=flowed for certain charsets waiting for someone to check it in.
Target Milestone: M17 → Future
Updated•24 years ago
|
Status: REOPENED → ASSIGNED
Reporter | ||
Comment 87•24 years ago
|
||
** Checked with 6/12/2000 late night build for Win32 **
The above build includes the patch authorde by Daniel and
checked in by nhotta.
Sfter looking at Plain text mail send for JPN (ISO-2022-JP),
Chinese (Big5 & GB2312), Korean (Unicode), and Unicode (UTF-8),
I am satisfied that these on the defaut menu are the only
ones which actually send out format=fixed msgs for plain text.
For other items on the menu like ISO-8859-x, KOI8-R, etc.
format=flowed msgs go out.
This confirms the workaround fix on an M17 build.
We will levae thi bug open for when the RFC has an
agreed upon method to do format=flowed for these languages.
We could have left out EUC-KR but hopefully that would not
inconvenience a lot of people at this stage. Other unresolved
issue is Thai which uses dictionary-based line breaking and
has no discernible line breaks either. Don't know what we should do
about it. Perhaps, we can leave it out of the list later also.
Comment 88•21 years ago
|
||
Maybe Bug 231701 is helpful to solve your problems: A _suggestion_ for a new
standard defines "format=flowed delsp=yes" which means that a space before the
CRLF linebreak is logically removed before the message is shown. So, Mozilla can
add a space and CRLF (format=flowed) where the new linebreak should be without
having a space in the final message that is shown.
The only open question would be: Where can/should such linebreaks be added?
Updated•20 years ago
|
Product: MailNews → Core
Comment 89•20 years ago
|
||
Is RFC 3676 already changed to "Standard" from "Draft"?
http://www.ietf.org/rfc/rfc3676.txt
Updated•20 years ago
|
Comment 90•18 years ago
|
||
WADA in comment #89:
> Is RFC 3676 already changed to "Standard" from "Draft"?
> http://www.ietf.org/rfc/rfc3676.txt
dunno, but daniel writes:
"It's probably still a problem since it was something the specification authors had forgotten about, though for such messages format=flowed has been disabled to avoid destroying people's mail.
I won't look at it more since I've upgraded myself to Opera after my years with Mozilla."
reassigning to default
Assignee: bratell → nobody
Status: ASSIGNED → NEW
QA Contact: momoi → mime
Updated•18 years ago
|
OS: Windows NT → All
Comment 91•17 years ago
|
||
FYI, RFC 3676 is IETF standards track. You can check the status for an RFC with
a URL like this:
http://tools.ietf.org/html/rfc3676
You will see RFC 3676 is "PROPOSED STANDARD" (the same standard status as IMAP).
Updated•16 years ago
|
Product: Core → MailNews Core
Updated•16 years ago
|
Priority: P3 → --
Target Milestone: Future → ---
Comment 92•15 years ago
|
||
depend on 553526. Before passing body text to serializer, unexpected space is inserted to by wrap setting.
Depends on: 553526
Updated•15 years ago
|
Updated•15 years ago
|
Assignee: nobody → m_kato
Comment 93•15 years ago
|
||
Comment 94•13 years ago
|
||
The problem still exists, see the screen shot and detailed report on:
http://forums.mozillazine.org/viewtopic.php?f=39&t=2315453
Also, see the bug report:
https://bugzilla.mozilla.org/show_bug.cgi?id=653342
Thanks.
Comment 95•12 years ago
|
||
Attachment #448471 -
Attachment is obsolete: true
Comment 96•12 years ago
|
||
Because rfc3676 is already changed to official RFC, "format=flowed,delsp=yes in text/plain mail composition or text/plain part generation" is sufficient as solution of this bug, isn't it?
Comment 97•12 years ago
|
||
This tests depends on the test for bug 653342.
I hope this test will work as expected but I am not 100% sure because there is no fix for this issue yet. ;-p
Comment 98•10 years ago
|
||
This utility method can be also used for unit test for this issue.
Attachment #648253 -
Attachment is obsolete: true
Attachment #8531547 -
Flags: review?(Pidgeot18)
Comment 99•10 years ago
|
||
https://treeherder.mozilla.org/ui/#/jobs?repo=try-comm-central&revision=34c80ed7d9b3
I forgot to add https://bugzilla.mozilla.org/attachment.cgi?id=8530843 in bug 553526 so some mozmill tests failed there.
I will post a new patch which will pass all mozmill tests without the fix for bug 553526.
Attachment #646002 -
Attachment is obsolete: true
Comment 100•10 years ago
|
||
https://treeherder.mozilla.org/ui/#/jobs?repo=try-comm-central&revision=f7d8285a17cf
This try seems good.
Attachment #8531548 -
Attachment is obsolete: true
Attachment #8531573 -
Flags: review?(Pidgeot18)
Comment 101•10 years ago
|
||
Note about mozmill tests for this issue.
After bug 1069518, we can not implement nsIEditor in JS so we can not write any xpcshell tests for this issue.
Writing mozmill test that 'delsp=yes' message can send correctly needs mailnews/test/fakeserver/smtpd.js. And smtpd.js imports auth.js from resource://testing-common/mailnews. But testing-common path can not be used in mozmill tests now. I've open a new bug for it. Bug 1107045. I will write more tests for this issue after bug 1107045 is ficed.
Comment 102•10 years ago
|
||
Comment on attachment 8531547 [details] [diff] [review]
Part 1: Move getMsgHeaders into test_message-helpers and rename it to get_message_contents
Review of attachment 8531547 [details] [diff] [review]:
-----------------------------------------------------------------
::: mail/test/mozmill/shared-modules/test-message-helpers.js
@@ +49,5 @@
> +/**
> + * Helper to get the full message content.
> + *
> + * @param aMsgHdr: nsIMsgDBHdr object whose text body will be read
> + * @return {Map(partnum -> message headers), Map(partnum -> message text)}
This documentation is wrong--you're not mentioning the object property names.
Attachment #8531547 -
Flags: review?(Pidgeot18) → feedback+
Comment 103•10 years ago
|
||
(In reply to Hiroyuki Ikezoe (:hiro) from comment #101)
> Writing mozmill test that 'delsp=yes' message can send correctly needs
> mailnews/test/fakeserver/smtpd.js. And smtpd.js imports auth.js from
> resource://testing-common/mailnews. But testing-common path can not be used
> in mozmill tests now. I've open a new bug for it. Bug 1107045. I will write
> more tests for this issue after bug 1107045 is ficed.
If you just need a body for the message, you can steal the draft-based approach I used in the test-multipart-related.js approach (which was split off from my large xpcshell test testing all of these conditions precisely because it needed an editor). There is very little done in SMTP- or NNTP-specific code that is not done when saving via draft, so I would be surprised if a draft-based approach were insufficient to test the code.
Comment 104•10 years ago
|
||
I've noticed that between this bug and bug 553526, as well as a few others in the tangled web of related bugs, you seem to be attempting to work on the problems surrounding the mess that is line wrapping in MIME messages. However, the patch in this bug and the patch in bug 553526 do not appear to fit together into one coherent story. This patch implies that delSp=true will be in effect whenever format=flowed is in effect, while bug 553526 implies they are two separate, independently configurable options. Could you please explain your overarching approach to these issues, and how all of these patches fit together? This is unfortunately necessary before I can try to fully review the remaining patches.
Comment 105•10 years ago
|
||
(In reply to Joshua Cranmer [:jcranmer] from comment #103)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #101)
> > Writing mozmill test that 'delsp=yes' message can send correctly needs
> > mailnews/test/fakeserver/smtpd.js. And smtpd.js imports auth.js from
> > resource://testing-common/mailnews. But testing-common path can not be used
> > in mozmill tests now. I've open a new bug for it. Bug 1107045. I will write
> > more tests for this issue after bug 1107045 is ficed.
>
> If you just need a body for the message, you can steal the draft-based
> approach I used in the test-multipart-related.js approach (which was split
> off from my large xpcshell test testing all of these conditions precisely
> because it needed an editor). There is very little done in SMTP- or
> NNTP-specific code that is not done when saving via draft, so I would be
> surprised if a draft-based approach were insufficient to test the code.
Actually I do not know it's sufficient or not because code paths of draft and sending message were hard to read for me. So I just thought we should write sending case for the safety.
Comment 106•10 years ago
|
||
(In reply to Joshua Cranmer [:jcranmer] from comment #104)
> I've noticed that between this bug and bug 553526, as well as a few others
> in the tangled web of related bugs, you seem to be attempting to work on the
> problems surrounding the mess that is line wrapping in MIME messages.
> However, the patch in this bug and the patch in bug 553526 do not appear to
> fit together into one coherent story. This patch implies that delSp=true
> will be in effect whenever format=flowed is in effect, while bug 553526
> implies they are two separate, independently configurable options. Could you
> please explain your overarching approach to these issues, and how all of
> these patches fit together? This is unfortunately necessary before I can try
> to fully review the remaining patches.
In short:
* As a message sender Thunderbird should always use "delsp=yes". (this bug)
* As a message receiver Thunderbird should care both of "delsp=yes" and no "delsp". (bug 553526)
Comment 107•10 years ago
|
||
Comment on attachment 8531573 [details] [diff] [review]
Part 2: Use delsp=yes for text/plain message
Review of attachment 8531573 [details] [diff] [review]:
-----------------------------------------------------------------
I have not fully played with this patch yet (there's one change in nsMsgSend.cpp that scares me, noted below). At the very least, I think this patch needs some tests in HTML scenarios as well, since some of the changes you are poking at affect HTML logic.
::: mail/test/mozmill/composition/test-delsp.js
@@ +48,5 @@
> +}
> +
> +const kMultibyteWord = "いろはにほへと";
> +const kRepeatCount = 1000;
> +let longMultibyteLine = create_long_multibyte_line(kRepeatCount);
const kLongMultibyteLine = kMultibyteWord.repeat(kRepeatCount);
(It's awfully nice of JS to give us a very easy way to make very long strings. ☺)
::: mailnews/compose/src/nsMsgSend.cpp
@@ +1525,5 @@
>
> //
> // Query the editor, get the body of HTML!
> //
> + uint32_t flags = nsIDocumentEncoder::OutputRaw | nsIDocumentEncoder::OutputNoFormattingInPre;
This change is especially scary to me. As I understand it, this means we are now relying entirely on EnsureLineBreaks to keep the content tight enough, and that method has a pretty lousy line break mechanism. I will not accept a change here without some tests that specifically test overly-long lines in HTML.
@@ +3152,5 @@
> body.Trim(" ", false, true);
>
> if (body.Length() > 0)
> {
> + if (!PL_strcmp(attachment1 [details] [diff] [review]_type, TEXT_HTML)) {
This condition should probably be inverted--if the body is text/plain, don't run the EnsureLineBreaks, otherwise run EnsureLineBreaks.
I don't think we get this far with bodies that aren't text/html or text/plain, but it never hurts to be safe.
Attachment #8531573 -
Flags: review?(Pidgeot18) → feedback+
Comment 108•10 years ago
|
||
Attachment #8537112 -
Flags: review+
Comment 109•10 years ago
|
||
Fixed the comment.
Attachment #8531547 -
Attachment is obsolete: true
Attachment #8537113 -
Flags: review?(Pidgeot18)
Comment 110•10 years ago
|
||
(In reply to Joshua Cranmer [:jcranmer] from comment #107)
> ::: mail/test/mozmill/composition/test-delsp.js
> @@ +48,5 @@
> > +}
> > +
> > +const kMultibyteWord = "いろはにほへと";
> > +const kRepeatCount = 1000;
> > +let longMultibyteLine = create_long_multibyte_line(kRepeatCount);
>
> const kLongMultibyteLine = kMultibyteWord.repeat(kRepeatCount);
Done.
> ::: mailnews/compose/src/nsMsgSend.cpp
> @@ +1525,5 @@
> >
> > //
> > // Query the editor, get the body of HTML!
> > //
> > + uint32_t flags = nsIDocumentEncoder::OutputRaw | nsIDocumentEncoder::OutputNoFormattingInPre;
>
> This change is especially scary to me. As I understand it, this means we are
> now relying entirely on EnsureLineBreaks to keep the content tight enough,
> and that method has a pretty lousy line break mechanism. I will not accept a
> change here without some tests that specifically test overly-long lines in
> HTML.
This was a mistake. Removed the change in this patch.
> @@ +3152,5 @@
> > body.Trim(" ", false, true);
> >
> > if (body.Length() > 0)
> > {
> > + if (!PL_strcmp(attachment1 [details] [diff] [review] [diff] [review]_type, TEXT_HTML)) {
>
> This condition should probably be inverted--if the body is text/plain, don't
> run the EnsureLineBreaks, otherwise run EnsureLineBreaks.
Done.
Attachment #8537115 -
Flags: review?(Pidgeot18)
Comment 111•10 years ago
|
||
This patch aims to parse format=flowed message on editing the message 'as new'.
Attachment #8537118 -
Flags: review?(Pidgeot18)
Comment 112•10 years ago
|
||
I will post those patch series to try server once bug 1111304 is fixed.
Updated•10 years ago
|
Attachment #8531573 -
Attachment is obsolete: true
Comment 113•10 years ago
|
||
https://treeherder.mozilla.org/ui/#/jobs?repo=try-comm-central&revision=dab45275104a
Unfortunately there are some failures but all failures are not related to this issue.
Updated•10 years ago
|
Attachment #8537113 -
Flags: review?(Pidgeot18) → review+
Comment 114•10 years ago
|
||
Comment on attachment 8537115 [details] [diff] [review]
Part 2 use_delsp.patch
Review of attachment 8537115 [details] [diff] [review]:
-----------------------------------------------------------------
To convince myself that these changes were okay, I started doing some deep dives of nsMsgSend-related code. It's all ugly done there. :-(
::: mail/test/mozmill/composition/test-delsp.js
@@ +48,5 @@
> +
> +function create_delsp_message(aMessage) {
> + let msgCompose = Cc["@mozilla.org/messengercompose/compose;1"]
> + .createInstance(Ci.nsIMsgCompose);
> + // 2: character width of each characaters in kMultibyteWord
Nit: characters
Attachment #8537115 -
Flags: review?(Pidgeot18) → review+
Comment 115•10 years ago
|
||
Comment on attachment 8537118 [details] [diff] [review]
Part 3 parse_flowed_draft_message.patch
Review of attachment 8537118 [details] [diff] [review]:
-----------------------------------------------------------------
This is some code where I'd really like to know the history since it seems so brain-damaged, but that history has alas gone to the great bit bucket in the sky.
Excepting the unnecessary parts of the condition in mimetpfl.cpp and the nits, this patch looks good and shouldn't cause problems.
::: mail/test/mozmill/composition/test-delsp.js
@@ +105,5 @@
> + let delspBody = create_delsp_message(longMultibyteLine);
> + let converter = Cc["@mozilla.org/intl/scriptableunicodeconverter"].
> + createInstance(Ci.nsIScriptableUnicodeConverter);
> + converter.charset = "UTF-8";
> + let convertedBody = converter.ConvertFromUnicode(delspBody);
I'm wary of using nsIScriptableUnicodeConverter (I'm not convinced it won't get deleted on us before long), but the message generator code only works on binary strings and not typed arrays, so there's not really a good alternative.
::: mailnews/mime/src/mimemsg.cpp
@@ +180,3 @@
> {
> + // if we are processing a flowed plain text line, we need to parse the
> + // line in mimeInlineTextPlainFlowedClass
Nit: capitalize the If and add punctuation.
::: mailnews/mime/src/mimetpfl.cpp
@@ +325,5 @@
> }
>
> + if (obj->options &&
> + obj->options->decompose_file_p &&
> + !obj->options->is_multipart_msg &&
This part of the condition is not necessary.
@@ +327,5 @@
> + if (obj->options &&
> + obj->options->decompose_file_p &&
> + !obj->options->is_multipart_msg &&
> + obj->options->decompose_file_output_fn &&
> + !obj->options->decrypt_p)
Nor is this one. (I think).
@@ +330,5 @@
> + obj->options->decompose_file_output_fn &&
> + !obj->options->decrypt_p)
> + {
> + return obj->options->decompose_file_output_fn(line,
> + length,
Nit: a tad too many spaces.
Updated•10 years ago
|
Attachment #8537118 -
Flags: review?(Pidgeot18) → review+
Comment 116•9 years ago
|
||
Removing myslef on all the bugs I'm cced on. Please NI me if you need something on MailNews Core bugs from me.
Assignee | ||
Comment 117•9 years ago
|
||
Just curious:
A bug which is 15 years old, is classified 'critical', has 11 votes, has 4 patches with r+.
Is this going to land one day?
Kent, do you think we can find someone to finish this off?
BTW, how does this relate to bug 553526?
I believe we have a strong user base in Japan, so perhaps it's about time we fix this.
Flags: needinfo?(rkent)
Flags: needinfo?(Pidgeot18)
Comment 118•9 years ago
|
||
(In reply to Jorg K (GMT+1) from comment #117)
> Just curious:
> A bug which is 15 years old, is classified 'critical', has 11 votes, has 4
> patches with r+.
> Is this going to land one day?
Good catch Jorg.
It won't land from hiro - he had to leave. He was very prolific developer for us, and so there also many other patches of his that need attention. I've just marked them up so that other developers can pick from them http://mzl.la/1HRuaJs
> Kent, do you think we can find someone to finish this off?
>
> BTW, how does this relate to bug 553526?
>
> I believe we have a strong user base in Japan, so perhaps it's about time we
> fix this.
Yes we do. Unfortunately have almost no developer representation from there.
Flags: needinfo?(Pidgeot18)
Assignee | ||
Comment 119•9 years ago
|
||
As far as I can see, this bug is assigned to Makoto Kato, not Hiro.
Wayne, please leave the NI as it is, since I'd like to draw attention to this bug. Since we have r+ on all four patches, I could just fix the nits of the last review and then have it all (crash-)landed.
But I prefer not to do that. Perhaps Joshua has a few spare cycles to take another look at this and get it landed. Or even Magnus, who has worked in international stuff before.
Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(Pidgeot18)
Comment 120•9 years ago
|
||
(In reply to Jorg K (GMT+1) from comment #119)
> As far as I can see, this bug is assigned to Makoto Kato, not Hiro.
>
> Wayne, please leave the NI as it is, since I'd like to draw attention to
> this bug. Since we have r+ on all four patches, I could just fix the nits of
> the last review and then have it all (crash-)landed.
I don't see any problem in doing that, do you?
> But I prefer not to do that. Perhaps Joshua has a few spare cycles to take
> another look at this and get it landed. Or even Magnus, who has worked in
> international stuff before.
frankly IMO joshua's time is better spent on other issues. he barely has time to do his own reviews, which is one reason I took him off.
Assignee | ||
Comment 121•9 years ago
|
||
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #120)
> frankly IMO joshua's time is better spent on other issues. he barely has
> time to do his own reviews, which is one reason I took him off.
Frankly, Joshua has already poured hours into this (see for example comment #114).
So I see two options: We let everyone's work on this go to waste, or we confirm that the overall solution will hold, and then I can do the leg-work of fixing nits and un-rotting it where required. Can we please take the "resource" discussion elsewhere.
Assignee | ||
Comment 122•9 years ago
|
||
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #120)
> > I could just fix the nits of
> > the last review and then have it all (crash-)landed.
> I don't see any problem in doing that, do you?
I do. Having things landed means that you need to have at least a basic understanding of what's going on.
I think the first step is always to reproduce the problem. So here is what I've done:
1) Checked that mailnews.send_plaintext_flowed is set to 'true'
2) Created a plain-text message and pasted the text from comment #1 a good lot of times.
Yes, annoyingly, I get one long line which doesn't wrap.
3) <CTRL><SHIFT><ENTER> so place the message into the outbox. This is what I get:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: base64
44GT44KM44Gv6ZW344GE5pel5pys6Kqe44Gu44OG44Kt44K544OI44Gn44GZ44Gu44Gn44CB
6KGM44GM5oqY44KM44KL44Go5oCd44GE44G+44GZ44CC44GT44KM44Gv6ZW344GE5pel5pys
...
Since the message is base64 encoded, there is no problem. I tried another encoding and get:
MIME-Version: 1.0
Content-Type: text/plain; charset=Shift_JIS
Content-Transfer-Encoding: base64
grGC6oLNkreCopP6lnuM6oLMg2WDTINYg2eCxYK3gsyCxYFBjXOCqpDcguqC6YLGjnaCooLc
greBQoKxguqCzZK3gqKT+pZ7jOqCzINlg0yDWINngsWCt4LMgsWBQY1zgqqQ3ILqgumCxo52
So whilst the long line is still a problem, I can't reproduce the data corruption issue.
Next I pasted the text from comment #1 into Notepad++ first, then copied it from there and pasted it into a TB message.
Result: Wrapping happened as desired, so apparently another clipboard "flavour" is pasted.
Sending this I got:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
or
MIME-Version: 1.0
Content-Type: text/plain; charset=Shift_JIS
Content-Transfer-Encoding: 8bit
or
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-2022-JP
Content-Transfer-Encoding: 7bit
Shift_JIS and ISO-2022-JP show the message wrapped, but without the artefacts seen in attachment 5000 [details].
In other words:
Nothing will land here if there is not a problem that can be reproduced and proven fixed with the patches.
Maybe another Japanese developer can enlighten us here.
Flags: needinfo?(rkent)
Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(ishikawa)
Comment 123•9 years ago
|
||
My sense is that we should land the bugs after making sure they apply cleanly, and any nits are fixed. There was sufficient work done earlier to decide they were worth landing, so any new cleanup work does not have to reproduce that.
Hi,
I noticed I was "needinfo"ed for this bug.
Let me digest what I can glean for this.
(I won't clear the needinfo request until I get to repeat the error thingy.)
However, I do notice that there seem to be strange white space character inserted
in the latest TB when I type in longish line (or pasting longish lines from another buffer) in HTML mail (!). Maybe unrelated bug.
Stay tuned. I am right now trying to fix OS X build issue (which seems to be due to duplicate entries in jar.mn for theme/osx and it seems to be finally solved by a patch in M-C tree. I created the similar patch to remove the duplication and FINALLY os x build proceeds to compile a binary for me (after six months or so).
Thank you for your patience.
Flags: needinfo?(ishikawa)
Flags: needinfo?(ishikawa)
Assignee | ||
Comment 125•9 years ago
|
||
(In reply to Kent James (:rkent) from comment #123)
> My sense is that we should land the bugs after making sure they apply
> cleanly, and any nits are fixed. There was sufficient work done earlier to
> decide they were worth landing, so any new cleanup work does not have to
> reproduce that.
Boy, I need to contradict Kent for a third time within twenty minutes:
Facts are: There are four patches here, which may or may not apply cleanly. One has some issues and nits.
The patches are about one year old. A lot can have happened in one year.
The people who knew what they are doing (Hiro, perhaps Makoto) have disappeared. Perhaps Joshua can enlighten us.
The person who will carry this to landing needs to
a) understand partly what this is about and why it was abandoned
b) do the most basic test: Doesn't work without the patch, works with the patch.
I tried doing b) but couldn't even reproduce the (alleged) problem.
If this gets landed, there needs to be someone who can at least be partly responsible.
Assignee | ||
Comment 126•9 years ago
|
||
Well, there seem to be many bugs around the area of CJK encoding, for example bug 653342. I also found http://forums.mozillazine.org/viewtopic.php?f=39&t=2315453.
That delivers a reproducible case:
Copy this into a plaintext e-mail a few times. Send the e-mail.
哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈哈
I get this result:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
The content is pretty much broken. I'll see whether the patches fix anything.
Assignee | ||
Comment 127•9 years ago
|
||
OK, my mistake, even with the original text
これは長い日本語のテキストですので、行が折れると思います。
I can reproduce the problem now. Extra spaces are very had to spot for the European eye in a long string of Japanese characters.
I also looked at the four patches. There are 16 hunks, 10 that fix tests, 6 to change code. Those six still apply without a problem. I will do some testing.
(Sadly, right now on a Daily I get a crash upon sending a message:
Assertion failure: IsOuterWindow(), at nsGlobalWindow.cpp:7678)
Assignee | ||
Comment 128•9 years ago
|
||
Now that I fixed the assertion failure, here the test result. Summary: No good, I won't land that.
Firstly, I never ever reproduced the problem that the patches here are supposed to fix: Some corrupted CJK characters, see attachment 5000 [details].
What I can reproduce it the problem reported in bug 653342, that is, that spurious spaces are inserted.
Applying the code part of the patches (6 hunks) makes no difference. This is what I get:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed; delsp=yes
Content-Transfer-Encoding: 8bit
Note the delsp=yes, which is new.
Maybe I'm missing something in viewing the message, but it's not included in the bug. The sent message is sprinkled with additional spaces.
Unless someone has some enlightenment, this is not going anywhere.
Note that bug 653342 is quite severe and Asian users are complaining heavily.
Comment 129•9 years ago
|
||
Bug 653342 is always in my list of top 5 bugs that need attention, but no active developer seems to understand it. If you have any ideas, feel free to pursue them.
Assignee | ||
Comment 130•9 years ago
|
||
I tested with this patch and don't see any difference other than "delsp=yes" being added.
Attachment #5068 -
Attachment is obsolete: true
Attachment #9947 -
Attachment is obsolete: true
Attachment #9982 -
Attachment is obsolete: true
Assignee | ||
Comment 131•9 years ago
|
||
I would like to stop working on this and instead focus on bug 653342 comment #158.
Flags: needinfo?(ishikawa)
Flags: needinfo?(Pidgeot18)
(In reply to Jorg K (GMT+1) from comment #131)
> I would like to stop working on this and instead focus on bug 653342 comment
> #158.
(In reply to ISHIKAWA, Chiaki from comment #124)
> However, I do notice that there seem to be strange white space character
> inserted
> in the latest TB when I type in longish line (or pasting longish lines from
> another buffer) in HTML mail (!). Maybe unrelated bug.
This behavior I think I have noticed is bug 653342 (!)
It *IS* annoying.
The only reason I have noticed it
- about 1/3 of the e-mail I compose is in English, and
- most of the time my Japanese sentences are short for intra-division communication
which are composed in TB and/or Emacs and copy&pasted into TB mail creation text buffer., and usually, longer stuff is put in the ATTACHMENT.
So I didn't realize this bug: However, of late, I have exchanged longish e-mails with my colleagues and noticed that when I cc'ed myself I get the strange blank as mentioned in bug 653342. I was wondering what caused it, but now I know TB does it.
I believe bug 653342 has much higher priority as of now.
TIA
Assignee | ||
Updated•9 years ago
|
Summary: format=flowed message generation must follow CJK line-breaking conventions → Implement RFC 3676 delsp=yes for format=flowed plaintext messages.
Assignee | ||
Comment 133•9 years ago
|
||
After looking at this for a few days now I have come to the conclusion that implementing "delsp=yes" is the desperate attempt to get rid of spaces which were added to *incorrect* wrapping of CJK strings.
"delsp=yes" is only a plaintext option, so it wouldn't fix the full problem anyway since we also get erroneous spaces in HTML e-mail. The problem needs to be treated at the root in bug 1225864. Refer to bug 1225864 comment #4 for details.
If at a later stage we decide that "delsp=yes" is a useful option, we can do this in a new bug.
Also note: The corruption shown in attachment 5000 [details] cannot be reproduced. It also has nothing to do with this bug which is about adding a flag to a message and deleting a space. I have no idea how this corruption was caused back in 2000. Since then, "formatting" of messages has changed in bug 83381 and bug 84261. It is about to change again in bug 1225904 which will guarantee that we don't destroy multi-byte characters by injecting a space in the middle.
Status: NEW → RESOLVED
Closed: 24 years ago → 9 years ago
Resolution: --- → WONTFIX
Assignee | ||
Comment 134•9 years ago
|
||
Note that if we want to go ahead and implement delsp=yes one day, here is another snippet:
Attachment 8530844 [details] [diff] of bug 553526.
Also see bug 168420 for more on format=flowed.
No longer blocks: 653342
Assignee | ||
Updated•9 years ago
|
Summary: Implement RFC 3676 delsp=yes for format=flowed plaintext messages. → Implement RFC 3676 delsp=yes for format=flowed plaintext messages for sending. Already implemented for reading in bug 231701.
Assignee | ||
Comment 135•9 years ago
|
||
Note that in bug 1225904 comment #15 we discuss the connection between "delsp=yes" and ISO-2022-JP.
I quote parts:
For ISO-2022-JP (https://www.ietf.org/rfc/rfc1468.txt) a Content-Transfer-Encoding of 7bit is suggested:
===
The ISO-2022-JP encoding is already in 7-bit form, so it is not
necessary to use a Content-Transfer-Encoding header. It should be
noted that applying the Base64 or Quoted-Printable encoding will
render the message unreadable in current JUNET software.
===
As it stands, ISO-2022-JP can be used for HTML encoding although it's only designed for plain text. In HTML we don't have "format=flowed; delsp=yes;" so we must encode base64 if there are long lines.
To fully adhere to the *suggestion* of the RFC 1468 (which is from 1993!), we'd have to do the following:
1) implement the "delsp=yes;" option and cut long CJK strings for plain text only.
2) not ever use ISO-2022-JP for HTML parts.
Comment 136•9 years ago
|
||
(In reply to Jorg K (GMT+1) [currently frustrated by waiting for reviews/feedback] from comment #133)
> If at a later stage we decide that "delsp=yes" is a useful option, we can do
> this in a new bug.
It is, in principle, a useful option to include still.
Assignee | ||
Comment 137•9 years ago
|
||
Yes.
But if we implement the "delsp=yes" option, we will also have to define the cases in which to use it, for example for ISO-2022-JP plaintext.
However, in the condensed patch I created from Hiro's four patches (attachment 8688319 [details] [diff] [review]) I don't see any decision taking. "delsp=yes" is *always* appended to "format=flowed", and that ain't desirable.
Plus the patch still maneuvers around EnsureLineBreaks() which is about to be eliminated in bug 1225904.
Apparently the whole sending logic is currently being rewritten, so let's leave this bug for later.
Comment 138•9 years ago
|
||
I don't understand why WONTFIX is the appropriate status for "If at a later stage we decide that "delsp=yes" is a useful option, we can do this in a new bug." Why do it in a new bug instead of this one? Usually WONTFIX represents an explicit decision to not implement something, ever.
Assignee | ||
Comment 139•9 years ago
|
||
I prefer not to start implementing something at comment #140. The bug has become way too confusing. The patches are not correct and would have to change once I finish the CJK cleanup elsewhere.
Joshua and I have this option on the radar, but I prefer not to add this option to the system until we get his rewrite. So well and truly, this bug here won't be fixed in its current form, which doesn't mean we will never implement the feature.
Please note that I changed the summary from:
format=flowed message generation must follow CJK line-breaking conventions
to
Implement RFC 3676 delsp=yes for format=flowed plaintext messages for sending.
Already implemented for reading in bug 231701
to make at least some sense of it.
If you want, I can carry it over to a new bug. If you insist I can also reopen it, but it won't receive any action until we get Joshua's rewrite. Joshua was already against meddling with the send logic in nsMsgSend.cpp and friends, but minor changes there are necessary to fix the CJK mess. Please refer to Joshua's statement in bug 1225904 comment #17. Forcing delsp=yes into the system now would just add more "magic EOL conversions".
Comment 140•9 years ago
|
||
I agree this should stay open if it's still a future want. (Though maybe the patches should be marked obsolete?)
This bug still has all the context, so no need to move it off to another bug.
Assignee: m_kato → nobody
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Updated•9 years ago
|
Status: REOPENED → NEW
Comment 141•9 years ago
|
||
(In reply to Jorg K (GMT+1) [currently frustrated by waiting for reviews/feedback] from comment #139)
> I prefer not to start implementing something at comment #140. The bug has
> become way too confusing. The patches are not correct and would have to
> change once I finish the CJK cleanup elsewhere.
>
> Joshua and I have this option on the radar, but I prefer not to add this
> option to the system until we get his rewrite. So well and truly, this bug
> here won't be fixed in its current form, which doesn't mean we will never
> implement the feature.
There are lots of bugs and RFEs that are open that are unlikely to get fixed until post-compose-rewrite, but that doesn't entail closing a bug as WONTFIX.
Assignee | ||
Comment 142•9 years ago
|
||
Looks like I got overruled. OK, we hang on to the bug from the year 2000 when neither Firefox nor Thunderbird existed ;-)
Assignee | ||
Comment 143•9 years ago
|
||
Comment on attachment 8688319 [details] [diff] [review]
Here are the 6 hunks of code changes extracted from the four patches.
This is just not right. Without the patch I get this e-mail:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu
huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu
There is a space at the end of every line, since this is flowed.
With the patch I get:
Content-Type: text/plain; charset=windows-1252; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu
huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu huhu
There are *two* spaces at the end of every line, one to delete later and one to make it flowed.
This confirms one of the two reasons stated in comment #137 for rejecting this solution.
Attachment #8688319 -
Flags: review-
Assignee | ||
Comment 144•9 years ago
|
||
Comment on attachment 8537115 [details] [diff] [review]
Part 2 use_delsp.patch
This patch contains most of the code to which I object. See comment #137.
Attachment #8537115 -
Flags: review-
Assignee | ||
Comment 145•9 years ago
|
||
Sorry to repeat myself from comment #135:
The only use for "delsp=yes" I can see is for ISO-2022-JP encoded plaintext messages that are sent with CTE 7bit to comply with the "should" from RFC 1468, *if* they contain otherwise unbreakable long lines.
Frankly, I don't think that is a good enough reason to implement this, since these messages can also use CTE base64. Base64 might have upset someone in 1993 when the RFC was created, but not twenty years later.
It most certainly was a really bad idea to suggest to use "delsp=yes" for all flowed plaintext messages.
Summary: Implement RFC 3676 delsp=yes for format=flowed plaintext messages for sending. Already implemented for reading in bug 231701. → Implement RFC 3676 delsp=yes for format=flowed ISO-2022-JP CTE 7bit plaintext messages for sending (comment #145). Already implemented for reading in bug 231701.
Assignee | ||
Updated•9 years ago
|
Attachment #5000 -
Attachment description: JPG image of a corrupted Japanese message with a long long word/paragraph when sent with format=flowed → NOT REPRODUCIBLE in 2015 and most likely due to a different problem: JPG image of a corrupted Japanese message with a long long word/paragraph when sent with format=flowed
Comment 146•9 years ago
|
||
I'm not really sure why you are putting so much energy into a bug that in your opinion is either don't implement at all, or implement later. My earlier complaint about WONTFIX was a purely administrative comment, that it was an inappropriate status for "If at a later stage we decide that "delsp=yes" is a useful option, we can do this in a new bug." without any examination of the technical merits.
Are you really trying to get some of us to examine this issue more closely to decide if WONTFIX or "fix much later" is the best status? That does not seem to be important enough to be worthy of much attention.
Assignee | ||
Comment 147•9 years ago
|
||
We have a mess of CJK related problems and we need to define very clearly what will help to cut though this mess and what won't. Fixing this here will help in one very specific (unimportant) case which is now even mentioned in the summary. I am not pleased that we got close to landing "delsp=yes" for all flowed e-mail, even in plain English. Anyway, I agree, let's stop commenting on this further.
Assignee | ||
Comment 148•9 years ago
|
||
OK, the Japanese users have convinced me that delsp=yes is the right way to flow ISO-2022-JP text.
In bug 553540 the nsIDocumentEncoder::OutputFormatDelSp was already introduced and it works.
Makoto Kato implemented this, so after five years we finally come to use this option.
The reshuffling in bug 653342 made it an extra two lines to do delsp=yes.
So I decided to do it. Here you do: Attachment 8693229 [details] [diff].
So once bug 653342 lands, this will be fixed, too, and we can finally close a 15 year old bug.
Assignee | ||
Updated•9 years ago
|
Attachment #5000 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8537112 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8537113 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8537115 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8537118 -
Attachment is obsolete: true
Assignee | ||
Comment 149•9 years ago
|
||
Comment on attachment 8688319 [details] [diff] [review]
Here are the 6 hunks of code changes extracted from the four patches.
Everything is obsolete here.
This will be implemented by bug 653342.
Attachment #8688319 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Summary: Implement RFC 3676 delsp=yes for format=flowed ISO-2022-JP CTE 7bit plaintext messages for sending (comment #145). Already implemented for reading in bug 231701. → Implement RFC 3676 delsp=yes for format=flowed ISO-2022-JP CTE 7bit plaintext messages for sending (comment #148). Already implemented for reading in bug 231701.
Assignee | ||
Comment 150•9 years ago
|
||
Joshua: I've implemented delsp=yes over in bug 653342 and it seems to work when testing it manually and with the test I wrote.
However, Hiro in his patches had changes to mimetpfl.cpp and and mimemsg.cpp which I don't understand and haven't carried forward. You can best see them here:
https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=26734&attachment=8688319
Since you reviewed and approved this, can you please tell me what they are about and whether they are required.
Flags: needinfo?(Pidgeot18)
Assignee | ||
Comment 151•9 years ago
|
||
Or maybe it's better viewed in the original patch which also shows review comments:
https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=26734&attachment=8537118
Was this stuff added for testing??
Assignee | ||
Comment 152•9 years ago
|
||
Fixed by bug 653342.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 45.0
Assignee | ||
Comment 153•9 years ago
|
||
Looks like the changes to mimetpfl.cpp and mimemsg.cpp were necessary.
At least they fix some of bug 1230968. Let's move the discussion over there.
Flags: needinfo?(Pidgeot18)
Comment 154•8 years ago
|
||
Checked with Daily 53.0a1 (2016-12-18) (32-bit) on Win10.
1-1. mailnews.send_plaintext_flowed=true
1-2. compose text mail in iso-2022-jp, type lo------ng Japanese Kanji text without entering space or linebreak.
1-3. Send Later -> charset=iso-2022-jp; format=flowed; delsp=yes
2-1. mailnews.send_plaintext_flowed=false
2-2. compose text mail in iso-2022-jp, type lo------ng Japanese Kanji text without space or linebreak.
compose text mail in utf-8, type lo---ng SBCS chars including euro-sign without space or linebreak.
2-3. Send Later -> charset=iso-2022-jp or utf-8 without format=flowed.
Content-Transfer-Encoding: base64
Content-Language: en-US was seen in source. Why en-US even though I wrote Japanese Kanji characters?
Because I like en-US build?
Content-Languge shouldn't be language what Thunderbird uses.
Content-Languge should be language what I read, write, speak :-)
By sending in base64, "broken data when line break is inserted due to 1000bytes line length limitation" problem was resolved.
By this bug, pretty lo-----ng Japanese text without space/linebreak is sent in format=flowed;delsp=yes.
So, we'll not see bug report from "pretty lo-----ng Japanese text without space/linebreak" lovers in Japan.
Verified.
By the way, does how many mailers already support delsp=yes?
IIRC, Apple and/or Opera generated delsp=yes mail despite that many mailers didn't support delsp=yes in the past.
When did MS support delsp=yes? Can we already use format=flowed;delsp=yes freely?
Status: RESOLVED → VERIFIED
Comment 155•8 years ago
|
||
An "Asian crisis" has disappeared :-)
Comment 156•8 years ago
|
||
By the way, why you know so well/deeply about Character set/code, JavaScript, Thunderbird code and so on?
Assignee | ||
Comment 157•8 years ago
|
||
(In reply to WADA:World Anti-bad-Duping Agency from comment #154)
Thunderbird has supported displaying e-mail with delsp=yes for a log time, I believe since bug 231701.
TB has never produced delsp=yes e-mail until the Asian crisis was fixed and this bug is part of this fix. Normally we transmit long lines base64-encoded, bug RFC 1468 mandates CTE 7bit when ISO-2022-JP is used (https://www.ietf.org/rfc/rfc1468.txt). So only for ISO-2022-JP we use delsp=yes:
https://dxr.mozilla.org/comm-central/rev/b020cffa188fac078bff74c9e19c98b044ea4173/mailnews/compose/src/nsMsgCompUtils.cpp#1794
Content-Language: en-US comes from the dictionary you were using. Use a Japanese dictionary and you'll get something else.
(In reply to WADA:World Anti-bad-Duping Agency from comment #156)
> By the way, why you know so well/deeply about Character set/code,
> JavaScript, Thunderbird code and so on?
I am one of the core developers here, Thunderbird and Compose peer (https://wiki.mozilla.org/Modules/Thunderbird) and Mailnews MIME and Editor peer (https://wiki.mozilla.org/Modules/MailNews_Core).
More questions?
Comment 158•8 years ago
|
||
More questions.
Why you can reply to my useless questions in this bug despite that you are pretty busy to resolve important bugs? :-)
> core developers here
Hired? If so, by whom? (I can't imagine big profit from Thunderbird...)
Or professor of University or people of Government, Army, Research Center, Laboratory and so on?
Or people of corporate who supports free software, co-operates with free software(in order to get big profits finally :-) )
Are you Netscape people? Do you know David Bienvenu and MScott? (but you seem younger than them...)
Assignee | ||
Comment 159•8 years ago
|
||
As I said, every question/answer takes away time from more important problems. So don't ask ;-)
We are all volunteers, however, the Thunderbird Council is considering to hire a paid maintainer:
https://mail.mozilla.org/pipermail/tb-planning/attachments/20161119/7b31f41f/attachment.html
You should follow tb-planning.
None of the current volunteers is from the Netscape days. I don't know the people you mentioned, but I've had a short exchange with David once.
Updated•8 years ago
|
Whiteboard: [DelSP=yes is currently for iso-2022-jp only, so please don't open excess bug for other charset at B.M.O]
You need to log in
before you can comment on or make changes to this bug.
Description
•