Closed Bug 27991 Opened 25 years ago Closed 25 years ago

Plaintext quotation is inserted as html source

Categories

(MailNews Core :: Composition, defect, P3)

x86
Linux

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: BenB, Assigned: rhp)

Details

(Keywords: regression)

Attachments

(4 files)

Reproduce: 1. Create a plain text msg (with any mailer) with an HTML tag, e.g. bla <strong>bla bla 2. Reply to it in Mozilla Actual result: The HTML tag disappears, but is evaluated, e.g. the last 2 "bla"s are displayed "strong". Expected result: The "<strong>" appears just like any other text in the composer. The "bla"s are not displayed "strong".
cc Akkana
Ben didn't say whether he was using html or plaintext compose, but when I try to reply to a plaintext message using html compose, the result is all screwed up and linebreaks are lost. Neither InsertText nor InsertHTML in the editor are ever called for the message body, though InsertHTML is called with the incorrect string: "Joe Schmoe wrote:<br><html>" (why is that <html> tag there?)
akk, yes, I was speaking about the HTML editor.
Raising severity to CRITICAL due to data loss.
Severity: normal → critical
reassign to rhp
Assignee: ducarroz → rhp
Severity: critical → major
Ok, I think I see what is going on here. Now, there are a couple of things that should change. First, for doing HTML to TXT conversion, I don't think I should be applying the nsIDocumentEncoder::OutputFormatted on the converterFlags, but that is the minor problem. The big issue is that if we pass in the following text to the HTML - TXT converter: <PRE>this is some test <strong> text and we are <strong> checking </PRE> it gets turned into: this is some test text and we are checking I think any <PRE> that gets hit should just write out the original text. - rhp
Assignee: rhp → akkana
Summary: Quoted plain text not escaped → NS_New_HTMLToTXT_SinkStream ignoring <PRE> tag
rhp, PRE is defined as inline, not CDATA, i.e. you have to escape "<>&" like everywhere else. I don't understand, why the HTML->TXT converter is used (not the opposite).
Rich: you definitely want OutputFormatted on the output flags. That's the whole point of OutputFormatted -- to make plaintext mail that looks like the html you saw in the html compose window. The flag was created specifically for that purpose. As for the bug, it has nothing whatsoever to do with the sink stream or html->plaintext conversion. The problem is that nsHTMLEditor::InsertAsQuotation assumes its argument is text if we're in a text editor, html if we're in an html editor. But mail has its own idea about whether the message was plaintext, which has nothing to do with which kind of editor we're going to use for the reply. So in this case, mail sees that the message is plaintext, gets the plaintext (which of course does not have entities escaped, though for some bizarre reason it has all spaces converted to nbsp's -- separate bug), and calls the editor's InsertAsQuotation method. The editor gets the string (not knowing what type it is), sees that it's an html editor, says "Oh, this string must be html", and inserts it as such. The best solution, IMO, is to change InsertAsCitedQuotation to take a mime type argument (e.g. text/plain or text/html), which it will check to see which type of insertion to do. (InsertAsPlaintextQuotation can only insert text, and so does not need this argument.) This means that nsMsgCompose also has to change to call InsertAsPlaintextQuotation or InsertAsCitedQuotation (which it should do anyway, to pass in the cite string, if we're ever going to be compatible with the W3C mail quoting spec); but in addition, it will have to find out somehow what the type is of the string being quoted, and I don't know whether nsMsgCompose has that information. Cc'ing ducarroz, since he'll have to do the nsMsgCompose part of this. I will add the API in the editor to make this possible, as soon as the tree opens for M15, then will pass the bug to ducarroz for the mail portion.
Status: NEW → ASSIGNED
Summary: NS_New_HTMLToTXT_SinkStream ignoring <PRE> tag → plaintext quotation is inserted as html source
Target Milestone: M15
My mozTXTToHTMLConv::ScanTXT escapes plain text, and I know, it was called for this type of quotation at some point in time. Now, it /looks/ to me, as if this function were never called for these quotations: I neither see recognized URLs nor debug output on the console. If I'm right and it was a hotfix for my "favorite" bug #21203, the responsible person will die. As for akk's proposal: Looks good to me. These (new?) functions would have to call ScanTXT then (cc me for the right flags). BTW: The caller should know about the content type, because it currently has to convert the text, at least it did so in the past.
If mail goes back to calling ScanTXT and escaping characters which need to be escaped, then we don't need this editor change at all -- then we'd be passing legal html in to InsertAsCitedQuotation and everything would work fine. I'll probably add the API anyway, since it's unclear from the current API whether the passed-in string is expected to be text or html. Then mail can decide whether to tell InsertAsCitedQuotation that it's passing in plaintext, or to call ScanTXT to convert to html and then pass in html like it was doing before.
Akk, I suggest you call ScanTXT inside the Insert*Quotation functions, because this "feature" is not bound to Mailnews - it more "converts" the formatting in the plain text to HTML. E.g. for quoting a RFC in a HTML page, usage of this function would make sense as well.
I'll give that some thought. I'm not entirely comfortable with it, since I think in some types of quoting (particularly Paste As Quotation) people wouldn't expect that behavior. And the prefs concerning smart formatting (emoticons and such) are currently mail/news prefs; if the functionality is moved into the editor, then the prefs probably should be as well. What do the mail people on the cc list think about the idea?
akk, the problem I see is the following: If I understood you right, the idea behind your new API is to make the caller independant from the type of editor, which is used. If Mailnews has to call ScanTXT, it has to know, which type of editor is used, because ScanTXT converts from TXT to HTML. If Mailnews doesn't special-case, a txt mail quoted in a txt mail will run through ScanTXT *and* nsHTMLToTXTSinkStream, which will result in double formatting (e.g. "**mail** b@b.com <b@b.com>" for "*mail* b@b.com"). You can tell ScanTXT exactly, what to recognize, down to doing nothing else than escaping. Does this make it less bad?
I wrote: > If I'm right and it was a hotfix for my "favorite" bug #21203, the responsible > person will die. It looks like this was introduced by fixing bug #18718, so you don't have to die, rhp :-).
Mail compose already does lots of special casing based on the type of editor. In fact, InsertAsQuotation is the only insertion it does that is NOT special-cased, and that one may eventually need to be anyway, because of the W3C mail-citing standard. And it needs to know because if we're going to an html editor, it needs to insert html, whereas if we're going to a plaintext editor, it needs to insert text. I don't see where it makes sense for the editor to call ScanTXT. With the new API as proposed, InsertAsPlaintextQuotation takes a text string (so no escaping should be needed), InsertAsCitedQuotation takes either a text string (so no escaping should be needed) or an html string (so escaping should have already happened already, else it isn't a valid html string). In which of these cases are you arguing that ScanTXT should be called, and with what flags?
> In which of these cases are you arguing that ScanTXT should be called When a plain text mail is quoted in a HTML mail. I guess, that is the case, when InsertAsCitedQuotation is called with a plain text string for a HTML editor. > with what flags? I suggest, ScanTXT is called in this case with the same flags as for mail send, i.e. kURLs (URL recognition) always on, kGlyphSubstitution (emoticon etc. conversion) always off and kStructPhrase (structured phrases recognition) on based on a pref.
> When a plain text mail is quoted in a HTML mail. d/mail d/a
It just came to me, that we may be a double-formatting issue when running ScanTXT for the quote and ScanHTML at send for the whole msg. I'll have to chat with akk about that.
Keywords: regression
I have a fix for this. Patch follows. Rich, can you review? I neither know, what the difference between |nsMimeOutput::nsMimeMessageQuoting| and |nsMimeOutput::nsMimeMessageBodyQuoting| is, nor, what |force_user_charset| means. Move them around as needed. Whenever you have plain text as input and need HTML as output, you should go through the converter. If you don't want the recognition etc., just pass 0 (zero) as |whattodo| param to ScanTXT (maybe I should change the name?).
Summary: plaintext quotation is inserted as html source → Plaintext quotation is inserted as html source
Improving Phil's statistic a bit... :-)
Assignee: akkana → rhp
Status: ASSIGNED → NEW
Summary: Plaintext quotation is inserted as html source → [FIXED] Plaintext quotation is inserted as html source
Ben, Can you try a patch command like this: cvs diff -c -p mime....cpp The patch I got here didn't work. By the way: |nsMimeOutput::nsMimeMessageQuoting| and |nsMimeOutput::nsMimeMessageBodyQuoting| are both quoting operations. We shouldn't be mucking with this recognition/conversion on quoting operations. I try to look at the changes when I get a happy diff. force_user_charset is very important in determining what charset conversion needs to be done. - rhp
Status: NEW → ASSIGNED
Can't call this fixed until I can verify the patch. - rhp
Summary: [FIXED] Plaintext quotation is inserted as html source → Plaintext quotation is inserted as html source
> The patch I got here didn't work. As always my fault :-). Forgot to update. ScanTXT isn't "mucking with this recognition/conversion", if you pass 0 as whattodo param to it, but it does escape the string (i.e. turns it into a valid HTML part), what was missing and caused this bug. I asked about |force_user_charset|, because ScanTXT is skipped in this case. As you know this best, you should know, if escaping is needed in this case. I missed mimetpfl.cpp, changes added to the new patch.
For the record: Filed bug #30311 about Insert*Quotation. BTW: I agree now with akk, that no recognition should be done at inserting, we already do this at sending. rhp, sorry for the many bad patches.
So the last patch is a valid one that I should apply right? - rhp
lol, yes, I hope so.
Ok, its in my tree and looks reasonable. - rhp
Summary: Plaintext quotation is inserted as html source → [FIXED] Plaintext quotation is inserted as html source
After doing some testing, I realize this breaks several I18N test cases. Looking at the logic again. - rhp
Summary: [FIXED] Plaintext quotation is inserted as html source → Plaintext quotation is inserted as html source
Ok, I found the problem. The conditional that you rewrote like this: PRBool skipConversion = !conv || obj->options && ( obj->options->force_user_charset || obj->options->format_out == nsMimeOutput::nsMimeMessageSaveAs ); Needs to be written like the following to preserve the correct logic: PRBool skipConversion = !conv || ( obj->options && ( obj->options->force_user_charset || obj->options->format_out == nsMimeOutput::nsMimeMessageSaveAs ) ); There are times that I leave conditional statements written in a manner that may look less than efficient to be VERY clear about its intent. Modern compilers will optimize these types of conditionals so I would rather make the source code look totally obvious and easy to understand than clever and difficult to figure out. - rhp
Summary: Plaintext quotation is inserted as html source → [FIXED] Plaintext quotation is inserted as html source
Ugh...nope...still broken... still digging :-(
Summary: [FIXED] Plaintext quotation is inserted as html source → Plaintext quotation is inserted as html source
Sorry Ben...I think this is ok after all....there was a change/regression in the webshell...I think....still investigating. - rhp
Summary: Plaintext quotation is inserted as html source → [FIXED] Plaintext quotation is inserted as html source
rhp, dunno, if that helps you, but I can't see problems with the I18N tests you sent me.
This was fixed as the side affect of PDT bug fix. - rhp
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Summary: [FIXED] Plaintext quotation is inserted as html source → Plaintext quotation is inserted as html source
changing qa
QA Contact: lchiang → esther
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: