Open Bug 913768 Opened 11 years ago Updated 4 months ago

[mime] structs not recognized across internal line wraps in plaintext messages, f=f (e.g. _/*foo bar*/_ with line break in the middle not displayed as bold)

Categories

(MailNews Core :: MIME, enhancement)

Tracking

(Not tracked)

People

(Reporter: davidbourguignon.net, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: ux-consistency, ux-implementation-level)

Attachments

(2 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0 (Beta/Release)
Build ID: 20130814063812

Steps to reproduce:

Type the text below in a plain text email, then send it to yourself.

---- TEST ----

Post-scriptum : tu devrais, sous, l'original de l'attestation /de minimis/ signée.

Post-scriptum : devrais, sous, l'original de l'attestation /de minimis/ signée.

Post-scriptum : tu devrais, sous, /de minimis/ signée. 


Actual results:

Only in the last two sentences is /de minimis/ correctly rendered in italic font (see screen capture).


Expected results:

In all three sentences, /de minimis/ should have been rendered in italic font... Thanks in advance for your help!
David, can you report your setting for mailnews.wraplength pref?
Tools > options > advanced > General > config editor
Confirming as an issue.

The [mozTXTToHTMLConv] algorithm fails even for the most basic multi-word structs when they straddle the line wrap point defined by mailnews.wraplength (TB default value=72 characters), i.e. the first word stays on line 1 and the second word is pushed to line 2 as the user types.

The effect is strange and unpredictable for a user who is just typing along in structured plain text: some structs work while others fail, for no apparent reason.


Tentative analysis:

I'll indicate with | position between 72 and 73:

      123456789X123456789X123456789X123456789X123456789X123456789X123456789X72|73
> (1) Post-scriptum : tu devrais, sous, l'original de l'attestation /de minimi|s/ signée.
> (2) Post-scriptum : devrais, sous, l'original de l'attestation /de minimis/ |signée.
> (3) Post-scriptum : tu devrais, sous, /de minimis/ signée.                  |

Only (1) fails (this bug). This is what happens:
- User types in real plaintext composition, which has automatic line wrap after 72 chars.
- starts structured text before line wrap, "/de minimi"
- when typing "s", line is longer than 72 chars, so the whole second word, "minimis", is now moved to the next line (but there's no hard line break as such).
- when sending as format=flowed, an end of line (eol) character (line break) is inserted before the word minimis, i.e. after "/de ".
- [mozTXTToHTMLConv] then fails to parse the struct, probably just because of the eol character in format=flowed, or maybe the combination space+eol.
- it shouldn't fail because in format=flowed, single eol characters have no meaning except to cut the text into slices for transportation (and accordingly, such line breaks are not rendered when viewing with TB). So structs parsing needs to happen *after* the "meaningless" eol characters from f-f have been removed, or otherwise somehow ignore them.
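
The steps above can be illustrated with a minimal Python sketch. It is not the libmime or mozTXTToHTMLConv code: the regex is a hypothetical stand-in for the struct recognizer, and unflow() only handles the basic format=flowed rule that a line ending in a space is a soft break (space-stuffing, quote depth, the "-- " signature separator and DelSp are ignored).

```python
import re

# Hypothetical stand-in for the struct recognizer: /word(s)/ -> <i>...</i>.
# The real recognition rules are more involved; this is only illustrative.
ITALIC = re.compile(r"(?<!\w)/(\w[^/\n]*\w)/(?!\w)")

def mark_italics(text: str) -> str:
    return ITALIC.sub(r"<i>\1</i>", text)

def unflow(lines: list[str]) -> list[str]:
    """Join soft breaks: a line ending in a space continues on the next line."""
    paragraphs, current = [], ""
    for line in lines:
        current += line
        if not line.endswith(" "):  # fixed line: the paragraph ends here
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs

# Sentence (1) as it is split for transport (soft break after "/de ").
flowed = [
    "Post-scriptum : tu devrais, sous, l'original de l'attestation /de ",
    "minimis/ signée.",
]

per_line = [mark_italics(line) for line in flowed]      # struct is missed
per_paragraph = [mark_italics(p) for p in unflow(flowed)]  # struct is found
```

Run line by line, neither half contains a complete /.../ pair, so nothing is marked; run on the rejoined paragraph, the same recognizer marks "de minimis".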

Ben, could you comment?
Do we have an existing bug for this?
Flags: needinfo?(ben.bucksch)
Summary: Strange side effect in italic emphasis using / / markup in plain text emails → [mozTXTToHTMLConv] multiword structs in format=flowed not recognized near line wrap / mailnews.wraplength (e.g. _/*foo bar*/_ going beyond end of line in structured plaintext messages not displayed as bold, underlined, italics)
This is the source snippet as sent correctly and then parsed inconsistently by [mozTXTToHTMLConv]:

I'll use [eol] to indicate [cr][lf] or whatever character is there at the end of line on your OS (I don't know all of these details...)

Post-scriptum : tu devrais, sous, l'original de l'attestation /de [eol]
minimis/ signée.[eol]
[eol]
Post-scriptum : devrais, sous, l'original de l'attestation /de minimis/ [eol]
signée.[eol]
[eol]
Post-scriptum : tu devrais, sous, /de minimis/ signée.[eol]
[eol]
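
As a rough cross-check of where the breaks fall, here is a small Python sketch that wraps the three test sentences at 72 columns with the standard textwrap module. This is only an approximation of what the TB composer does: textwrap drops the trailing space that format=flowed keeps before [eol], and struct_is_split() is a helper name made up for this example.

```python
import textwrap

WRAP = 72  # mailnews.wraplength default reported in this bug

sentences = [
    "Post-scriptum : tu devrais, sous, l'original de l'attestation /de minimis/ signée.",
    "Post-scriptum : devrais, sous, l'original de l'attestation /de minimis/ signée.",
    "Post-scriptum : tu devrais, sous, /de minimis/ signée.",
]

def struct_is_split(sentence: str, width: int = WRAP) -> bool:
    """True if no single wrapped line contains the whole '/de minimis/' struct."""
    lines = textwrap.wrap(sentence, width=width)
    return not any("/de minimis/" in line for line in lines)

results = [struct_is_split(s) for s in sentences]
print(results)  # [True, False, False]
```

Only the first sentence ends up with the struct split across two lines, which matches the snippet above.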
To ease Ben's analysis, I've prepared a testcase.eml from comment 0.
Attachment #801110 - Attachment description: Testcase1.eml (format=flowed, inconsistent structs recognition when displayed in TB) → Testcase1.eml (format=flowed, with multi-word structs near eol)
Attachment #801089 - Attachment description: bug.png → Screenshot 1: Testcase1.eml displayed in TB with incomplete structs recognition
Actually, the screenshot in attachment 801089 [details] contains an unnecessary distraction: coincidentally (?), the message reader window size causes a display line wrap which isn't really in the message.

So here's a better screenshot of testcase1.eml (attachment 801110 [details]) viewed with TB, showing that while we're actually parsing format=flowed correctly by ignoring single [eol] characters found in msg source, structured plaintext recognition doesn't handle that case correctly.
Attachment #801089 - Attachment is obsolete: true
(In reply to Thomas D. from comment #1)
> David, can you report your setting for mailnews.wraplength pref?
> Tools > options > advanced > General > config editor

Thanks Thomas for the feedback. My settings for mailnews.wraplength are: status=default;type=integer;value=72

Does this help? Thanks in advance!
Yes, none of the recognitions can cross linebreaks. This is not due to mozTXTToHTMLConv, it's written to support that. But libmime works line-based, i.e. processes one line after the other, and thus never sees a whole paragraph. This is true even for format=flowed. We output one line at a time, and the HTML renderer only shows linebreaks where we explicitly enter them, but libmime never has a whole paragraph in memory. That's unfortunate, but by design of libmime. Please talk to jwz :).
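
To make the line-based limitation concrete, here is a hedged Python sketch of the two processing styles. It does not reflect the actual libmime code; both function names and the toy bold() recognizer are assumptions made for illustration. The second generator keeps the streaming design but buffers soft-wrapped lines (lines ending in a space) until the paragraph ends, so only the current paragraph is held in memory.

```python
import re
from typing import Callable, Iterable, Iterator

def recognize_per_line(lines: Iterable[str],
                       recognize: Callable[[str], str]) -> Iterator[str]:
    """Line-based pipeline: the recognizer only ever sees one line."""
    for line in lines:
        yield recognize(line)

def recognize_per_paragraph(lines: Iterable[str],
                            recognize: Callable[[str], str]) -> Iterator[str]:
    """Buffer soft-wrapped lines and run the recognizer once per paragraph."""
    buffer: list[str] = []
    for line in lines:
        buffer.append(line)
        if not line.endswith(" "):  # fixed line closes the paragraph
            yield recognize("".join(buffer))
            buffer.clear()
    if buffer:  # input ended on a soft break
        yield recognize("".join(buffer))

# Toy recognizer (hypothetical): *foo bar* -> <b>foo bar</b>
BOLD = re.compile(r"(?<!\w)\*(\w[^*\n]*\w)\*(?!\w)")

def bold(text: str) -> str:
    return BOLD.sub(r"<b>\1</b>", text)

wrapped = ["some *foo ", "bar* text"]
line_based = list(recognize_per_line(wrapped, bold))
paragraph_based = list(recognize_per_paragraph(wrapped, bold))
```

With the line-based version the two halves of *foo bar* are never seen together and pass through unchanged; the buffered version yields a single paragraph with the struct marked.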

Thus, this is effectively the same as bug 5351. Marking dependency.

Please note that this is not a bug, because the recognizer is fuzzy by nature and can never recognize everything correctly.
Severity: normal → enhancement
Component: Untriaged → MIME
Depends on: 5351
Flags: needinfo?(ben.bucksch)
OS: Windows 8 → All
Product: Thunderbird → MailNews Core
Hardware: x86_64 → All
Summary: [mozTXTToHTMLConv] multiword structs in format=flowed not recognized near line wrap / mailnews.wraplength (e.g. _/*foo bar*/_ going beyond end of line in structured plaintext messages not displayed as bold, underlined, italics) → [mime] structs not recognized across line wrap (e.g. _/*foo bar*/_ with line break in the middle not displayed as bold)
Version: 17 → unspecified
(In reply to Ben Bucksch (:BenB) from comment #7)
> Yes, none of the recognitions can cross linebreaks. This is not due to
> mozTXTToHTMLConv, it's written to support that. But libmime works
> line-based, i.e. processes one line after the other, and thus never sees a
> whole paragraph. This is true even for format=flowed. We output one line at
> a time, and the HTML renderer only shows linebreaks where we explicitly
> enter them, but libmime never has a whole paragraph in memory. That's
> unfortunate, but by design of libmime. Please talk to jwz :).

Thanks Ben, that's helpful and constructive information. I'm ready to talk to jwz at any time, because in pursuit of our common goal of improving Thunderbird, I regularly try to move things by clarifying their nature and intentions, identifying next steps, and trying to involve the right people. But I see that you've assigned yourself to work on bug 5351, so I understand you'll do it yourself. If you'd still want me to talk to jwz (whoever he/she is), pls let me know. Btw, I greatly appreciate that you are willing to fix bug 5351, which really needs a skilled programmer such as yourself. We should communicate with other involved parties as much as possible for inputs, ideas, and cooperative success.

> Thus, this is effectively the same as bug 5351. Marking dependency.

Dependency looks good to me, let's keep it that way. I realize the structural similarity, even identity, but I suggest that we keep some of the fallout symptoms on separate records (Bug 5351 focus on URLs, this bug focus on structs recognition). That will also make it easier to verify that symptoms have really been fixed when the root causes have been addressed.

> Please note that this is not a bug, because the recognizer is fuzzy by
> nature and can never recognize everything correctly.

Well, yeah, I'd agree that recognizing everything correctly is pretty hard, even impossible, but in this case, as you said, we do have a design problem in the current implementation of libmime (at least). However, that's a problem (aka bug) on our side, and we could fix it if there was enough will and manpower. So notwithstanding the technical correctness of your evaluation, implementation problems don't count for evaluation of user-facing bugs, which makes this a case of ux-implementation-level (1):

> User experience principle: interfaces should not be organized around the underlying implementation
> and technology in ways that are illogical, or require the user to have access to additional
> information that is not found in the interface itself.

So from a QA perspective, this IS a BUG because it's a problem which users experience because of problems in our code. More precisely, we're violating ux-consistency as some multi-word structs will be recognized, but not others of exactly the same type (user is just typing along and knows nothing about the technicalities of line breaks inserted for transport in format=flowed). So someone should mark this as a bug.

Ben, thanks for working to improve the summary. For the uninitiated, I think we should make it clear that this is not about user-inserted line breaks, but internal line breaks which result from sending as mime / format=flowed. That's why I had added these keywords in the summary. Also, having "plaintext" in the summary is absolutely crucial for descriptive adequacy, finding this bug and avoiding unnecessary duplicates. So as a compromise, I'll preserve your shorter summary and just add a small hint to clarify that matter, and I hope it won't offend your strong personal preference for short summaries. We need to strike a balance between the legitimate and helpful desire for brevity (which is not a goal in itself), and QA's legitimate and helpful needs for efficient bug management.

(1) https://bugzilla.mozilla.org/describekeywords.cgi
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: [mime] structs not recognized across line wrap (e.g. _/*foo bar*/_ with line break in the middle not displayed as bold) → [mime] structs not recognized across internal line wraps in plaintext msgs, f=f (e.g. _/*foo bar*/_ with line break in the middle not displayed as bold)
Severity: normal → S3
Summary: [mime] structs not recognized across internal line wraps in plaintext msgs, f=f (e.g. _/*foo bar*/_ with line break in the middle not displayed as bold) → [mime] structs not recognized across internal line wraps in plaintext messages, f=f (e.g. _/*foo bar*/_ with line break in the middle not displayed as bold)