Closed Bug 1689804 Opened 4 years ago Closed 4 years ago

Problem with format=flowed during forward when CTE is QP: Spaces at beginning of the line are ignored.

Categories

(MailNews Core :: Composition, defect, P2)

Tracking

(thunderbird_esr78 wontfix, thunderbird86 affected)

RESOLVED FIXED
87 Branch
Tracking Status
thunderbird_esr78 --- wontfix
thunderbird86 --- affected

People

(Reporter: klaus.bartosch, Assigned: klaus.bartosch)

References

(Regression)

Details

(Whiteboard: [TM:78.9.0)

Attachments

(6 files, 6 obsolete files)

1.32 KB, text/plain
Details
543 bytes, text/plain
Details
2.07 KB, patch
klaus.bartosch
: review+
Details | Diff | Splinter Review
405 bytes, text/x-go
Details
2.92 KB, patch
infofrommozilla
: review+
Details | Diff | Splinter Review
1021 bytes, patch
Details | Diff | Splinter Review
Attached file text-block.txt

STR:
Compose a message in UTF-8 (default) containing the text from the attachment, one long line.
Press Ctrl+Shift+Enter to send the message to the Outbox.
In the outbox, select the message, press Ctrl+F for forward or Ctrl+E for "Edit as new".

Observe that you now see "einterpretaciones" in the first line, however, in the original text there where two words "e interpretaciones".

That comes from the message source looking like:

A lo largo de mi vida he sido testigo de muchas historias, experiencias e=
 interpretaciones de la vida. Y me doy cuenta de que lo que m=C3=A1s apre=

So the space at the end for the flowing is glommed together with the space at the front of the line.

I've noticed words being glued together on forward/edit as new quite a bit recently, but until now, I haven't pinned it down. This was working in TB 68, but then TB 68 used Content-Transfer-Encoding: 8bit and not Content-Transfer-Encoding: quoted-printable.

Alice, could you please find the regression.

Flags: needinfo?(alice0775)

I see what happened. Somehow the encoding was switched from 8bit to QP (quoted-printable) and that has always been broken :-(

Setting mail.strictly_mime to true in TB 68 forces QP (quoted printable). We see this in TB 68:

A lo largo de mi vida he sido testigo de muchas historias, experiencias=20
e interpretaciones de la vida. Y me doy cuenta de que lo que m=C3=A1s apr=
ecio=20
es la honestidad. En estos =C3=BAltimos meses siento que se ha polarizado=
 a=C3=BAn=20

The original text is "polarizado aún", yet, when forwarding in TB 68, we get "polarizadoaún" with the space lost.

To my knowledge, QP wasn't widely used before, but it's used now, so that exposes the fault more frequently.

EDIT: I caused it to be used by setting mailnews.wraplength to 0.

Summary: Problem with format=flowed during forward → Problem with format=flowed during forward when CTE is QP: Spaces at beginning of the line are ignored.

Forget all about "strictly_mime" and "Send Later" 😉

STR:

  • Import this example to TB (It has a qp softlinebreak followed by a space in the next line between is and QP)
  • When you view it, it is correctly displayed as is QP
  • Do "Edit As New Message" (or forward it I guess - not tested) with it

=> In the editor the space character has vanished => isQP

Forget all about "strictly_mime" and "Send Later"

That was just to produce the message. You're right, I'm sorry, all we need is your example.
Totally off topic: I've only noticed this recently and right now I can't work out why this particular profile suddenly prefers QP over 8bit and exposes the bug. The code that picks the CTE is here:
https://searchfox.org/comm-central/rev/d0e849e295a3acdd98d3201ecb3bbff3b15007c6/mailnews/compose/src/nsMsgAttachmentHandler.cpp#298
EDIT: Found it: mailnews.wraplength set to 0 will trigger QP. I feel like a fool now :-( - Let me obsolete everything irrelevant here.

Attachment #9200194 - Attachment is obsolete: true

(In reply to Klaus B. from comment #10)

Totally off topic: I've only noticed this recently and right now I can't work out why this particular profile suddenly prefers QP over 8bit and exposes the bug.

I think you must have set mail.strictly_mime to TRUE yourself.
Why, I don't know. Possibly because of Bug 1435903?

I think you must have set mail.strictly_mime to TRUE yourself.

You missed my edit: mailnews.wraplength was 0 since I tried to help out in bug 1688333. I'm not sure why that triggers QP, but it does.

Alice, please accept my sincere apologies for sending you on a wild-goose-chase. I'm really sorry.

Alfred, are you looking into this or should I? I know you've been working on similar bugs (that's why I CC'ed you in the first place).

It's a little surprising that it's displayed correctly and then forward/edit-as-new (same code path) messes up.

No longer blocks: tb78found

(In reply to Klaus B. from comment #14)

Alfred, are you looking into this or should I?

Please take over.

No longer blocks: tb78found

OK, I've looked into it a bit. The space is removed here:
https://searchfox.org/comm-central/rev/d0e849e295a3acdd98d3201ecb3bbff3b15007c6/mailnews/mime/src/mimemsg.cpp#158

      if (mime_typep(kid, (MimeObjectClass*)&mimeInlineTextPlainFlowedClass)) {
        // Remove any stuffed space.
        if (length > 0 && ' ' == *line) {
          line++;
          length--;
        }
        return kid->clazz->parse_line(line, length, kid);

Typically in f=f leading spaces, so-called "stuffing", are removed.

I'm really confused here since the message text doesn't even appear to be flowed. Flowed should have a space at the end, no? =20?
If I change your example to

Bug 1689804 - Problem with format=flowed during forward when CTE is=20
 QP: Spaces at beginning of the line are ignored.

then everything is fine. If you look at comment #1, TB 68 did end the line in =20, only to mess up below at polarizado=.

So what's the issue here? That the QP we produce is wrong or that it's right and we don't process it correctly?

EDIT: I really don't have much of a clue about QP :-(

Flags: needinfo?(infofrommozilla)
Attached patch Debug patch (obsolete) — Splinter Review

Here's my debug patch. When displaying the message, I see this:

=== MimeMessage_parse_line | QP: Spaces at beginning of the line are ignored.
|
=== 1
=== 1
=== 0
=== 1
=== 0
=== 1
=== decode QP | QP: Spaces at beginning of the line are ignored.

When forwarding the message, I see this:

=== MimeMessage_parse_line | QP: Spaces at beginning of the line are ignored.
|
=== 1
=== 1
=== 1
=== 1
=== 1
=== 1
== removing leading space on | QP: Spaces at beginning of the line are ignored.
|
=== MimeInlineTextPlainFlowed_parse_line 1 |QP: Spaces at beginning of the line
are ignored.
|
=== MimeInlineTextPlainFlowed_parse_line 2 |QP: Spaces at beginning of the line
are ignored.
|
=== decode QP |QP: Spaces at beginning of the line are ignored.

So for displaying the message, the code block that removes the space isn't triggered since the condition isn't met. obj->options->decompose_file_p and obj->options->decompose_file_output_fn aren't set for display.

Who knows what's going on here? Anyone left from the dinosaur times of the project? Ben? You tried to get a grip on this in bug 1463289.

Assignee: nobody → klaus.bartosch
Flags: needinfo?(ben.bucksch)

(In reply to Klaus B. from comment #16)

OK, I've looked into it a bit. The space is removed here:

  if (mime_typep(kid, (MimeObjectClass*)&mimeInlineTextPlainFlowedClass)) {
    // Remove any stuffed space.

Typically in f=f leading spaces, so-called "stuffing", are removed.

RFC 3676 - 4.1. Interpreting Format=Flowed:
| If the first character of a line is a space, the line has been
| space-stuffed (see Section 4.4). Logically, this leading space is
| deleted before examining the line further (that is, before checking
| for flowed).

So it is correct to remove the space character.
This in turn means that:
a) the generation is wrong when sending.
b) we shouldn't show the stuffed space in the view.

I'm really confused here since the message text doesn't even appear to be flowed. Flowed should have a space at the end, no? =20?

This is indeed tricky.
As I already said, there is a 'soft line break' ('SLB') at the end of the line. But it is a 'SLB' in the context of QP ("="+CRLF)

RFC 2045 - 6.7. Quoted-Printable Content-Transfer-Encoding
| (5) (Soft Line Breaks) The Quoted-Printable encoding
| REQUIRES that encoded lines be no more than 76
| characters long. If longer lines are to be encoded
| with the Quoted-Printable encoding, "soft" line breaks
| must be used. An equal sign as the last character on a
| encoded line indicates such a non-significant ("soft")
| line break in the encoded text.

So what's the issue here? That the QP we produce is wrong or that it's right and we don't process it correctly?

This 'QP-SLB' is added here:
https://searchfox.org/comm-central/source/mailnews/mime/src/mimeenc.cpp#966

This doesn't care about the following space. But because the space is now in the firs column of the next line, it has to be space-stuffed (doubled) when format-flowed is used.
This would fix Problem a). But because of b) now I see 2 spaces in the view of such an article.


Just to note:
RFC 3676 - 4.2. Generating Format=Flowed
| [Quoted-Printable] encoding SHOULD NOT be used with Format=Flowed
| unless absolutely necessary (for example, non-US-ASCII (8-bit)
| characters over a strictly 7-bit transport such as unextended
| [SMTP]).

Thanks for all the explanation. So what needs to be fixed here is:

a) the generation is wrong when sending.
b) we shouldn't show the stuffed space in the view.

b) seems easy, we just move the code that does it before the block. Any suggestion on a)? Looks like your stating one above. Could that be made into a patch?

https://searchfox.org/comm-central/source/mailnews/mime/src/mimeenc.cpp#966

Please get into the habit of hitting the "Permalink" feature on Searchfox if you want your link to enter eternity:
https://searchfox.org/comm-central/rev/93c19c069f2865b2a9f994c55f2673553a928069/mailnews/mime/src/mimeenc.cpp#966

Comment on attachment 9200194 [details]
text-block.txt

If this is a generation issue, then this attachment will come in handy for testing.

As for part a), wouldn't it be as "easy" as adding space stuffing after the QPEncoder::Write() has been called in the MIME code for f=f? Likely in mimetpfl.cpp (text plain flowed). A breakpoint in the debugger should show you were that is. Rant: Given all the MIME pointer magic that seems to come from a different universe, it's hard to tell by just reading code. Likely that the QPEncoder::Write() will be assigned to some function pointer field. Alfred, you should have a go ;-)

Attachment #9200194 - Attachment is obsolete: false
Attached patch 1689804-fix-QP-forward.patch (obsolete) — Splinter Review

OK, this is part b), I now see "isQP" displayed. Looking into part a) "generation" later today unless someone beats me to it.

I looked at this a little further. I had a bit of trouble debugging this, looks like sending might go through the new JS backend. I was more successful when setting mailnews.send.jsmodule to false.

Overall, this is very unfortunate. Looks like we get the body from the composer DOM and send it to the Mozilla serialiser with the "flowed" flag. What we get back is a string containing the long line from attachment 9200194 [details] already broken into single lines. That is handed to QPEncoder::Write(). That function calls the callback it was given here:
https://searchfox.org/comm-central/rev/6e5375ffde39267f9c17db22ccbdb8bc1cc9030e/mailnews/mime/src/mimeenc.cpp#926
and two more times down the line. The callback is mime_write_message_body() via mime_encoder_output_fn(), both in nsMsgSend.cpp.

IOW, the only chance we have to get the stuffing right is in the QP encoder itself. But that has no knowledge of the fact that it's dealing with format=flowed. The encoder is also set in nsMsgSend.cpp:
https://searchfox.org/comm-central/search?q=SetEncoder%28&path=nsMsgSend.cpp&case=false&regexp=false

So I can offer these ideas:
The QPEncoder could check the mailnews.send_plaintext_flowed preference and/or check heuristically whether the input appears flowed, so "mostly" a space before the end of the line.
The other option would be a hack. Since to my knowledge TB is the only mail client that produces format=flowed, we could infringe the spec as we already do for display and not remove the space.

Any more ideas, Alfred? I can't NI you since you never cleared the NI. Was that a trick ;-)

This would be my approach.
Without patch:

[...]
es la honestidad. En estos =C3=BAltimos meses siento que se ha polarizado=
 a=C3=BAn=20

With patch:

[...]
es la honestidad. En estos =C3=BAltimos meses siento que se ha polarizado=20
a=C3=BAn=20
Flags: needinfo?(infofrommozilla)

Nice, can we fix this for TB 78 as well? Can you please add a version for mailnews.send.jsmodule to false, so the old C++ backend. And what about the "part b)" patch: 1689804-fix-QP-forward.patch. We want that or we don't?

What if there is more then one space?
Do we need to maintain mimeenc.cpp as well?
We certainly need a test!?
And probably one or the other test is also broken by this.

(In reply to Klaus B. from comment #22)

Any more ideas, Alfred? I can't NI you since you never cleared the NI. Was that a trick ;-)

Sure, you would have set the flag again anyway. No, seriously, that was a mid-air collision.

(In reply to Klaus B. from comment #24)

Nice, can we fix this for TB 78 as well?

That would be C++ only. Is the bug that significant? It only occurs under very rare conditions.

Can you please add a version for mailnews.send.jsmodule to false, so the old C++ backend.

Sure, that would be mimeenc.cpp.

And what about the "part b)" patch: 1689804-fix-QP-forward.patch. We want that or we don't?

I would like that.

Who does the review?

Posting this as a mid-air after comment #25:

(In reply to Alfred Peters from comment #25)

What if there is more then one space?
Do we need to maintain mimeenc.cpp as well?
We certainly need a test!?
And probably one or the other test is also broken by this.

  1. I don't understand the first question. The existing code removes one space only, there is no loop.
  2. I don't know, but it needs to be patched for TB 78.
  3. Feel free.
  4. Remains to be seen, I doubt it, depends on a try run. There is this https://searchfox.org/comm-central/source/mail/test/browser/composition/browser_drafts.js#313, but it's not using QP (I think).

If we took my "part b)" patch, we'd be more RFC compliant, but we'd break the display of previously badly encoded messages, so perhaps there isn't much point. What do you think?

(In reply to Alfred Peters from comment #26)

That would be C++ only. Is the bug that significant? It only occurs under very rare conditions.

It hit me many times until I worked it out, so yes. It will be the same fix in C++, the JS code was copied from it.

I would like that.

I'm not sure I would. But at least you'd see the bad state during display already.

Review, hmm, if you do the C++ part, then Ben C, Ping for the JS part.

Flags: needinfo?(ben.bucksch)
Comment on attachment 9200271 [details] [diff] [review] 1689804-fix-QP-forward.patch Review of attachment 9200271 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/mime/src/mimemsg.cpp @@ +160,1 @@ > #ifdef MIME_DRAFTS Shouldn't your change be inside the #ifdef?
Comment on attachment 9200271 [details] [diff] [review] 1689804-fix-QP-forward.patch Review of attachment 9200271 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/mime/src/mimemsg.cpp @@ +160,1 @@ > #ifdef MIME_DRAFTS MIME_DRAFTS is defined, hell will possibly break loose if you undefine it. Personally, I'd rip it out completely. As for the question: I'd say "no" since this is about display and not drafts.

Comment on attachment 9200271 [details] [diff] [review]
1689804-fix-QP-forward.patch

Alright, let's get some more opinions on this. I'll change the commit message to:
Bug 1689804 - Always remove space stuffing for format=flowed, not only when forwarding/edit-as-new. r=benc

Attachment #9200271 - Flags: review?(benc)
Attachment #9200271 - Flags: feedback?(infofrommozilla)

Added the changes to mimeenc.cpp.
Before I ask for review, I'll check the tests. But that has to wait till tomorrow.
nDv.

Attachment #9200503 - Attachment is obsolete: true
Attachment #9200264 - Attachment is obsolete: true

Ben, if you get there before "tomorrow", please look at both patches and send them to try.

Sorry, I cannot do that today.

Comment #34 referred to Ben C of whom I've already requested review.

Comment on attachment 9200271 [details] [diff] [review] 1689804-fix-QP-forward.patch Review of attachment 9200271 [details] [diff] [review]: ----------------------------------------------------------------- Looks good for the b) part in comment 19. try run here (with both patches): https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=f20d71a44a38a7cb06f2a6ef10feaba4950886df
Attachment #9200271 - Flags: review?(benc) → review+

Comment on attachment 9200519 [details] [diff] [review]
Save a space to the previous line on a quoted printable soft line break.

I don't see that these patches caused any of the test failures.

Attachment #9200519 - Flags: review?(benc)

I don't see that these patches caused any of the test failures.

FYI, we don't have a lot of test coverage, and particularly the MIME converters are not covered with a lot of problematic emails.

As promised with a test included.

(In reply to Klaus B. from comment #38)

Comment on attachment 9200519 [details] [diff] [review]
I don't see that these patches caused any of the test failures.

I don't see anything like that either.

Attachment #9200519 - Attachment is obsolete: true
Attachment #9200519 - Flags: review?(benc)
Flags: needinfo?(benc)
Attachment #9200951 - Flags: review?(benc)
Attachment #9200271 - Flags: feedback?(infofrommozilla) → feedback+
Flags: needinfo?(benc)

Same patch, but with this commit message:
Bug 1689804 - Always remove space stuffing for format=flowed, not only when forwarding/edit-as-new. r=benc

Attachment #9200271 - Attachment is obsolete: true
Attachment #9200957 - Flags: review+

Sorry, forgot to lint the test.

Attachment #9200951 - Attachment is obsolete: true
Attachment #9200951 - Flags: review?(benc)
Attachment #9200976 - Flags: review?(benc)
Comment on attachment 9200976 [details] [diff] [review] Save a space to the previous line on a quoted printable soft line break. Review of attachment 9200976 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/compose/test/unit/test_messageBody.js @@ +76,5 @@ > + msgData = mailTestUtils.loadMessageToString( > + gDraftFolder, > + mailTestUtils.firstMsgHdr(gDraftFolder) > + ); > + let endOfHeaders = msgData.indexOf("\r\n\r\n"); Are these line endings platform dependent? It's just disappointing if a patch gets backed out due to CRLF vs. LF issues. I've seen this happen :-( https://searchfox.org/comm-central/rev/f5f081ba007d398dace13ad30cd571f5132ad4a1/mailnews/compose/test/unit/test_longLines.js#105
Comment on attachment 9200976 [details] [diff] [review] Save a space to the previous line on a quoted printable soft line break. Review of attachment 9200976 [details] [diff] [review]: ----------------------------------------------------------------- Looks pretty good, but I think the rule is also needed for tabs as well as spaces, right? ::: mailnews/compose/src/MimeEncoder.jsm @@ +407,5 @@ > } > > if (currentColumn >= 73) { > // Soft line break for readability > + if (i + 1 < this._bodySize && this._body[i + 1] == " ") { I think [QP](https://tools.ietf.org/html/rfc2045#section-6.7) says this rule needs to apply for tabs as well as spaces? ::: mailnews/compose/test/unit/test_messageBody.js @@ +68,5 @@ > + > + fields = new CompFields(); > + fields.forceMsgEncoding = true; > + fields.to = "Nobody <nobody@tinderbox.invalid>"; > + fields.subject = "Test QP encoding for non-ascii and trailing tab"; "tab"? should be "space"? Actually, I'd ditch the non-ascii part of the string (the "\u00e1", below ) for this test and just concentrate on the trailing space. @@ +76,5 @@ > + msgData = mailTestUtils.loadMessageToString( > + gDraftFolder, > + mailTestUtils.firstMsgHdr(gDraftFolder) > + ); > + let endOfHeaders = msgData.indexOf("\r\n\r\n"); (EDIT: not clear here, but this comment was replying to the question from Klaus about platform-dependent line endings) [rfc 2045](https://tools.ietf.org/html/rfc2045#section-6.7) says CRLF is canonical, so I think it's reasonable for the test to expect it. ::: mailnews/mime/src/mimeenc.cpp @@ +965,5 @@ > > if (mCurrentColumn >= 73) // Soft line break for readability > { > *out++ = '='; > + if (in + 1 < end && in[1] == ' ') { Again, tabs too?
Attached file qp.go

I hacked up a little program so I could easily try out some edge cases. It's in go, and just encodes stdin with QP encoding, limiting the lines to 76 chars.
Posting it here just in case it's useful to anyone else.

(self-deleted)

this rule needs to apply for tabs as well as spaces?

Well, we're basically here because of the code in the patch I provided that removes a leading space. We just want to avoid that QP produces a leading space.

(In reply to Klaus B. from comment #43)

Comment on attachment 9200976 [details] [diff] [review]

  • let endOfHeaders = msgData.indexOf("\r\n\r\n");

Are these line endings platform dependent? It's just disappointing if a
patch gets backed out due to CRLF vs. LF issues. I've seen this happen :-(
https://searchfox.org/comm-central/rev/
f5f081ba007d398dace13ad30cd571f5132ad4a1/mailnews/compose/test/unit/
test_longLines.js#105

I have read that too. The code comes from the exact file:
https://searchfox.org/comm-central/rev/f5f081ba007d398dace13ad30cd571f5132ad4a1/mailnews/compose/test/unit/test_longLines.js#35
If the comment were true, the test shouldn't work either.

(In reply to Ben Campbell from comment #44)

Comment on attachment 9200976 [details] [diff] [review]
Save a space to the previous line on a quoted printable soft line break.

Review of attachment 9200976 [details] [diff] [review]:

Looks pretty good, but I think the rule is also needed for tabs as well as
spaces, right?

I think QP says this rule
needs to apply for tabs as well as spaces?

For QP, this is true. The QP code is actually correct. The bug only occurs in connection with format-flowed. And there only the space disturbs.

  • fields.subject = "Test QP encoding for non-ascii and trailing tab";

"tab"? should be "space"? Actually, I'd ditch the non-ascii part of the
string (the "\u00e1", below ) for this test and just concentrate on the
trailing space.

I took the subject from the previous test. Something like:
"Bug 1689804 - Save a space to the previous line on a quoted printable soft line break."
would be more suitable.

(In reply to Alfred Peters from comment #48)

(In reply to Klaus B. from comment #43)

Comment on attachment 9200976 [details] [diff] [review]

  • let endOfHeaders = msgData.indexOf("\r\n\r\n");

https://searchfox.org/comm-central/rev/f5f081ba007d398dace13ad30cd571f5132ad4a1/mailnews/compose/test/unit/test_longLines.js#35
If the comment were true, the test shouldn't work either.

The comment could refer to the output of the BASE64 decoder.

So what's the next step here? Address the review comments and request anther review?

(In reply to Klaus B. from comment #51)

So what's the next step here? Address the review comments and request anther review?

You are so frenetic. 😉

@Ben Campbell
Did my explanations regarding the tabs make you agree, or should we handle the tabs as well?
I would then upload a new version with the changed subject of the test mail in any case.

Flags: needinfo?(benc)

You are so frenetic.

https://dict.leo.org/englisch-deutsch/frenetic - only in the best possible sense. Thunderbird used to be riddled with "little" issues, like bad encodings, mojibake, etc. To my knowledge, all those issue that made the product look unprofessional have been resolved. I was quite surprised that even at version 87 there are still little annoyances like this one here looming, especially in view of the fact that many people might have switched to QP due to the incompetency of Yahoo and friends. So I find this issue a little urgent. That said, you need to keep the geographic distribution of the project in mind. You might get Ben's answer while you're sleeping. So the more expedient way would be to fix the review issues, not treat tabs, since it's not necessary, and if Ben still doesn't like it, you can always fix it later. An r? and a NI at the same time doesn't make much sense IMHO, I'd provide the next rev. of the patch without asking how the reviewer would like it. Anyway, just personal style.

Comment on attachment 9200976 [details] [diff] [review] Save a space to the previous line on a quoted printable soft line break. Review of attachment 9200976 [details] [diff] [review]: ----------------------------------------------------------------- Well, it sounds like the patch works and is urgent, so I'm happy for it to go in. r+ But I'd still update the subject line in the test, and get rid of that unicode in the body to make it clearer exactly what the test is testing. Regarding the tabs question: I don't quite understand the bigger picture, so I really don't know if the tabs are an issue or not. I'll work up some examples to satisfy myself that it's all QP-ing as intended (and probably add some more tests). If I find any issues I'll let you know! Thanks!
Attachment #9200976 - Flags: review?(benc) → review+

(In reply to Ben Campbell from comment #54)

Comment on attachment 9200976 [details] [diff] [review]

and get rid of that
unicode in the body to make it clearer exactly what the test is testing.

That doesn't work. We need a 8-bit char to force quoted-printable. Otherwise we'll end with a 7bit article.

(In reply to Alfred Peters from comment #55)

(In reply to Ben Campbell from comment #54)

Comment on attachment 9200976 [details] [diff] [review]
That doesn't work. We need a 8-bit char to force quoted-printable. Otherwise we'll end with a 7bit article.

Ahh, I see! Probably worth adding a little comment to that effect in that test - I was really scratching my head over it.

Flags: needinfo?(benc)

(In reply to Ben Campbell from comment #56)

(In reply to Alfred Peters from comment #55)

(In reply to Ben Campbell from comment #54)

Comment on attachment 9200976 [details] [diff] [review]
That doesn't work. We need a 8-bit char to force quoted-printable. Otherwise we'll end with a 7bit article.

Ahh, I see! Probably worth adding a little comment to that effect in that test - I was really scratching my head over it.

I take it all back and claim the opposite.
fields.forceMsgEncoding = true; does indeed force the encoding, How surprising.

Test message adjusted.

  • Subject corrected.
  • Body reduced to ASCII.
Attachment #9200976 - Attachment is obsolete: true
Attachment #9201248 - Flags: review+
Component: General → Message Compose Window

Checkin needed for two patches.

Component: Message Compose Window → Composition
Product: Thunderbird → MailNews Core
Target Milestone: --- → 87 Branch

For ESR78, only the C++ hunk applies. MimeEncoder.jsm and the test are new and not in ESR78. I'll request uplift when the time comes.

Status: NEW → ASSIGNED

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/63087f70a26a
Always remove space stuffing for format=flowed, not only when forwarding/edit-as-new. r=benc
https://hg.mozilla.org/comm-central/rev/289ac419affc
Save a space to the previous line on a quoted printable soft line break. r=benc

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED

Comment on attachment 9201248 [details] [diff] [review]
Save a space to the previous line on a quoted printable soft line break.

[Approval Request Comment]
Regression caused by (bug #): It was always wrong :-(
User impact if declined: QP format=flowed messages with leading spaces which are removed in an inconsistent way under some circumstances leading to surprising results, that is, words glued together.
Testing completed (on c-c, etc.): Yes, on C-C.
Risk to taking this patch (and alternatives if risky):
This part is not risky. It fixes QP encoding to put a space at the end of a line instead of the beginning of the next line.

Attachment #9201248 - Flags: approval-comm-beta?

Comment on attachment 9201366 [details] [diff] [review]
Save a space to the previous line on a QP soft line break - ESR78 version

See previous comment.

Attachment #9201366 - Flags: approval-comm-esr78?

Comment on attachment 9200957 [details] [diff] [review]
1689804-fix-QP-forward.patch

[Approval Request Comment]
Regression caused by (bug #): It's away been wrong.
User impact if declined: Leading space in QP format=flowed messages treated inconsistently leading to surprising results.
Testing completed (on c-c, etc.): Yes, C-C.
Risk to taking this patch (and alternatives if risky):
The point is that the other patch fixes bad QP encoding. This patch here unifies the treatment of leading spaces in format=flowed messages, they are now consistently removed, so displaying a message and forwarding it will give the same result. Before, displaying a badly encoded message would display the space, but forwarding would remove it, which is surprising. Now the behaviour is consistent with the down-side that messages already in users' mailboxes are now displayed without the space. You could argue that you don't want such a change to go into the current stable release, but delaying it won't make the situation any better. If it goes out in TB 91 ESR, then users will notice the "missing" spaces then. And TB 91 ESR will come with a whole heap of other changes (and regressions), so IMHO it wouldn't be wise to deal with it then.

Something should go into the release notes, like:
Corrected quoted-printable encode if format=flowed messages. That can lead to seemingly incorrect display of existing message, that is, some words may not be separated by a space any more.

Attachment #9200957 - Flags: approval-comm-esr78?
Attachment #9200957 - Flags: approval-comm-beta?

Comment on attachment 9200957 [details] [diff] [review]
1689804-fix-QP-forward.patch

[Triage Comment]
Thanks for the patch and test.

It feels like this should have a full beta before going to esr, so it may as well just ride the train to the next beta, 87, rather than uplift. And then potentially take for 78.9.0

Attachment #9200957 - Flags: approval-comm-beta? → approval-comm-beta-

Comment on attachment 9201248 [details] [diff] [review]
Save a space to the previous line on a quoted printable soft line break.

[Triage Comment]

Attachment #9201248 - Flags: approval-comm-beta? → approval-comm-beta-

Planned target at 78.9.0, or late 78.8.x

Whiteboard: [TM:78.9.0

(In reply to Klaus B. from comment #64)

Something should go into the release notes, like:
Corrected quoted-printable encode if format=flowed messages. That can lead to seemingly incorrect display of existing message, that is, some words may not be separated by a space any more.

The docs say "past tense, describe the problem that was fixed, not the solution."

"When encoding with Quoted-Printable, spaces adjacent to soft line breaks were not encoded properly, resulting in missing spaces after decoding."

Hopefully I interpreted that correctly.

You didn't. When the fix lands, there will be missing spaces now since we're now following the RFC and ignoring them in all cases when before they were only ignored upon forward/edit as new, but not display. The original text is correct including the tenses. I see no reason to uppercase quoted-printable, the format=flowed is also essential. You could rephrase it towards what you had:

When encoding with format=flowed messages with quoted-printable, spaces following soft line breaks were not encoded properly. The correction can now lead to seemingly incorrect display of existing messages, that is, some words may not be separated by a space any more.

The second sentence is the important one, that why I kept the first sentence as short as possible.

Apologies for continuing to massage it, but our release note format has a pretty limited amount of space, so I'm trying to make it fit. How's this?

When encoding messages with quoted-printable and format=flowed, spaces following soft line breaks were not encoded properly; existing messages so encoded may display with some words not separated by a space

Nice :-) - How's this?

FIXED: Encoding of spaces following soft line breaks in messages using quoted-printable and format=flowed; existing messages which were previously incorrectly encoded may now display with some words not separated by a space.

I'll take it.

Comment on attachment 9200957 [details] [diff] [review]
1689804-fix-QP-forward.patch

[Triage Comment]
Approved for esr78

Attachment #9200957 - Flags: approval-comm-esr78? → approval-comm-esr78+

Comment on attachment 9201366 [details] [diff] [review]
Save a space to the previous line on a QP soft line break - ESR78 version

[Triage Comment]
Approved for esr78

Attachment #9201366 - Flags: approval-comm-esr78? → approval-comm-esr78+

I would like to request some clarification regarding the way QP and format=flowed interact. It is my impression that given that QP is a transfer encoding and not part of the content type, it would make sense to first decode the QP encoding, before handling format=flowed.

That would mean that one of the examples from this bug report

A lo largo de mi vida he sido testigo de muchas historias, experiencias=20
e interpretaciones de la vida. Y me doy cuenta de que lo que m=C3=A1s apr=
ecio=20
es la honestidad. En estos =C3=BAltimos meses siento que se ha polarizado=
 a=C3=BAn=20

is first decode as (note the spaces at the end of the lines)

A lo largo de mi vida he sido testigo de muchas historias, experiencias 
e interpretaciones de la vida. Y me doy cuenta de que lo que más aprecio 
es la honestidad. En estos últimos meses siento que se ha polarizado aún 

which is then finally decoded as

A lo largo de mi vida he sido testigo de muchas historias, experiencias e interpretaciones de la vida. Y me doy cuenta de que lo que más aprecio es la honestidad. En estos últimos meses siento que se ha polarizado aún

Should this assumption be correct, then I find it strange that the QP encoder needs special handling for the spaces required by format=flowed. After all, encoding and decoding something using QP should not affect the content. I believe that this bug is caused by incorrect ordering of the format=flowed and QP encoders/decoders. However, as I am not familiar with the thunderbird code that does this (nor with how other clients handle this) and would like to know whether this analysis is correct or not.

Some context: I came across this bug when suddenly an email I replied to had some spaces between words removed (as indicated in the release notes) and felt that there is a solution that might not break older emails.

So we come back here to look at this again, always on a weekend ;-) - Thanks for your comment. I don't know of any mail client except TB that would handle format=flowed, let alone produce it.

To summarise, the problem reported here was addressed in two parts: First the QP encoder converts a leading space to a trailing space, so this

this is text=
 and here is more

becomes

this is text=20
and here is more

In a second part we now treat message display the same as message forward, so the "missing space" is treated consistently and always visible in pre-existing "badly" encoded messages.

However, none of what I said so far answers your question, which is in short why the original mail content can't be interpreted as

this is text and here is more

As you pointed out, if TB first decoded the QP and then interpreted the format=flowed, it would work better. I'd have to dig into the code to check what happens exactly.

Any more comments from you, Alfred?

I've dusted off the debug patch in attachment 9200264 [details] [diff] [review]. With that I see the following, slightly edited:

=== MimeMessage_parse_line |A lo largo de mi vida he sido testigo de muchas historias, experiencias=20|
=== decode QP |A lo largo de mi vida he sido testigo de muchas historias, experiencias=20|
=== MimeMessage_parse_line |e interpretaciones de la vida. Y me doy cuenta de que lo que m=C3=A1s apr=|
=== decode QP |e interpretaciones de la vida. Y me doy cuenta de que lo que m=C3=A1s apr=|
=== MimeInlineTextPlainFlowed_parse_line 1 |A lo largo de mi vida he sido testigo de muchas historias, experiencias |
=== MimeMessage_parse_line |ecio=20|
=== decode QP |ecio=20|
=== MimeMessage_parse_line |es la honestidad. En estos =C3=BAltimos meses siento que se ha polarizado=|
=== decode QP |es la honestidad. En estos =C3=BAltimos meses siento que se ha polarizado=|
=== MimeInlineTextPlainFlowed_parse_line 1 |e interpretaciones de la vida. Y me doy cuenta de que lo que más aprecio |
=== MimeMessage_parse_line | a=C3=BAn=20|
=== decode QP |a=C3=BAn=20|
=== MimeInlineTextPlainFlowed_parse_line 1 |es la honestidad. En estos últimos meses siento que se ha polarizadoaún|

So the leading space removal happens in MimeMessage_parse_line() before we QP-decode and treat "soft line breaks". That function is agnostic of the CTE used.

Yes, this architecture is not ideal, but the MIME code goes back more than 20 years and no one like to touch it. You can see that it would need turning it inside-out, or upside-down, and I don't think you'll find a developer who's prepared to do this. So what we've implemented here is, well, the best solution given the restrictions.

Regressed by: 1287126

Vincent Vanlear is right that quoted-printable (QP) should be decoded before format=flowed. The debug output that Klaus B. posted above suggests that QP is decoded before format=flowed runs on it, so it seems that is working as expected. That would imply that this is not what is causing the bug here. Vincent, if you have more specific comments on what is going wrong where exactly in the code, please feel free to post it.

the leading space removal happens in MimeMessage_parse_line() before we QP-decode [and before format=flowed]

That seems to be the wrong place to do it, and it seems to be the root cause here. I digged and it seems this was introduced in bug 1287126 and then again modified here.

I think the right fix would be to remove this code from this class completely and put it into the flowed class. The fix would need to consider both bug 1287126 (but choose a different implementation to fix that bug) and this bug here.

the MIME code goes back more than 20 years

True, but the bug here goes back only 5 years, and the cause is that it was fixed in a way that is not correct. It's also merely moving these few code lines, and instead of moving them up, to move them into the flowed class instead. So, no worries, you're not rocking the MIME castle :).

Sorry for coming late into the game after this bug was already fixed, reviewed, landed, and backported. I don't want to say that this was all wrong. I'm just posting my observations here, in response to Vincent.

Sorry for coming late into the game ...

You were invited two months ago, after comment #17 :-(

RIght, the now offending piece of code was added in bug 1287126 here: https://hg.mozilla.org/comm-central/rev/f99d81cd24ed#l2.12

The debug output that Klaus B. posted above suggests that QP is decoded before format=flowed runs on it ...

Hmm, not quire sure whether your statement is right, check the last three lines of the debug:

=== MimeMessage_parse_line | a=C3=BAn=20|
=== decode QP |a=C3=BAn=20|
=== MimeInlineTextPlainFlowed_parse_line 1 |es la honestidad. En estos últimos meses siento que se ha polarizadoaún|

First the space is removed, and then the QP decoding happens. What you you mean by "format=flowed runs on it"? The TextPlainFlowed processing that comes later and prints the last debug line?

On the positive side, the space is only removed for f=f, see:
https://hg.mozilla.org/comm-central/rev/63087f70a26a#l1.14 - mime_typep(kid, (MimeObjectClass*)&mimeInlineTextPlainFlowedClass)).
If you take TB's MIME as a somewhat derailed OO approach, then the super class knows something about the derived classes, yuk, but that was always like that, see: https://hg.mozilla.org/comm-central/rev/f99d81cd24ed#l2.10

So would you like to file a new bug and correct the situation? Removing the code is easy, but where exactly does it need to go? The area is now covered by tests, so it's not totally risky.

There is already space stuffing removal here in MimeInlineTextPlainFlowed_parse_line():
https://searchfox.org/comm-central/rev/c96bef7839f29e11313cad9d7a3cb15f83034f90/mailnews/mime/src/mimetpfl.cpp#278
However, apparently that didn't completely do the trick, hence bug 1287126 and the maybe incorrect attempt to do it elsewhere.

Actually, bug 1287126 isn't fully to blame here, it was just trying to correct a an issue introduced here:
https://hg.mozilla.org/comm-central/rev/99eb42a1fd3d
That changeset is described in bug 1287126 comment #6. So that "stuffed space removal code" was only restored, after bug 1230968 removed it. Take a look at attachment 8688319 [details] [diff] [review] from bug 26734 where the whole story began.

Now we really need an expert to fix the mess caused in four bugs :-(

EDIT: Since bug 1230968/bug 26734 was about DelSp processing for ISO-2022-JP which is no longer used for composition (all UTF-8 now), perhaps we can revert those hunks now.

Hi, I was investigating bug 1702197 and found this one. In my opinion, this bug is a rendering issue, the sending code should not be changed. I mean I can still send the following message from other mail client, and the rendering in TB is still wrong.

Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable

A lo largo de mi vida he sido testigo de muchas historias, experiencias e=
 interpretaciones de la vida. Y me doy cuenta de que lo que m=C3=A1s apre=

I'm trying to find out where a message is parsed and rendered into the messagepane <browser>, and fix from there. But haven't found it.

I've come here from bug 1702197. Looking at comment #79 down to comment #82 it appears that the fix here was incorrect or at least not optimal. Maybe it would be best to revert this bug here completely since the fix for a minor problem has now created much more severe issues, many plain text e-mails are broken. Surely a backout can reach the end-users quicker than another fix.

I'm trying to find out where a message is parsed and rendered into the messagepane <browser>, and fix from there. But haven't found it.

You will be dealing with 20 y/o MIME code that is extremely brittle as this bug here shows. You will see (some of) the relevant code paths in attachment 9200264 [details] [diff] [review]. You also need to look at format=flowed and not format=flowed cases, like in bug 1702197. Only TB supports format=flowed.

I.e. we have:

I think we keep having bugs, because the original fixes were not in the right place.

@rnons: The right fix would be somewhere in MimeInlineTextPlainFlowed_parse_line.

Just for convenience:

Bug 1230968 comment #9
https://hg.mozilla.org/comm-central/rev/99eb42a1fd3dba3a2cfaec4bb7a84bf36ebe3c1d
Change in MimeMessage_parse_line() and others.

Bug 1287126 comment #20
https://hg.mozilla.org/comm-central/rev/f99d81cd24ed
Change in MimeMessage_parse_line() and others, restoring what was removed.

Bug 1689804 comment #61 (this bug here)
https://hg.mozilla.org/comm-central/rev/63087f70a26a
https://hg.mozilla.org/comm-central/rev/289ac419affc
First changeset in MimeMessage_parse_line(), second changeset changes the generation.

So all those changes should be seen in context.

I sent a patch to bug 1702197. For 78, I suggest backing out the two commits in comment 75.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Flags: needinfo?(rob)

This didn't need to be reopened since it's not backed out from trunk (or beta). Ping is working on a separate fix for trunk in bug 1702197.

Yeah, closing again. We'll just leave 78 as it was before.

Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Attachment #9200957 - Flags: approval-comm-esr78+
Attachment #9201366 - Flags: approval-comm-esr78+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: