Open Bug 117009 Opened 23 years ago Updated 4 years ago

nsPlaintextSerializer: Remove trailing spaces around structured text

Categories

(Core :: DOM: Serializers, enhancement, P5)

enhancement

Tracking

()

People

(Reporter: BenB, Unassigned)

References

Details

Reproduction: 1. Open Compose windows 2. Enter text "hello you there" 3. Mark " you " (not the spaces) and make it bold 4. Send it as plaintext 5. Recieve Actual result: "hello* you *there" Expected result: "hello *you* there" Comment #4 in bug 116842 from bienvenu@netscape.com: <quote src="http://bugzilla.mozilla.org/show_bug.cgi?id=116842#c4"> Afte thrashing around for a while trying to figure out if I've broken anything, I've found out that converting html to plain text is broken in mozilla, and has nothing to do with my changes. If you use the html compose window to make some text bold, and send it as plain text, it can get sent incorrectly, or displayed incorrectly (not sure which - if you make " test " bold, it will get sent and displayed as "* test *", which is broken. Perhaps the struct code doesn't handle this case correctly, because of the leading and trailing spaces. Is there a bug on this? If that's the bug, it seems like this code is pretty seriously flawed. </quote> Why do you think so? The actual behaviour is not optimal, but why "seriously flawed"? I thought about it, and IIRC, it would've been quite hard to implement (don't remember why), and I didn't consider it worth it.
Blocks: 116842
Nonsnense, not mozTXTToHTMLConv, but nsPlaintextSerializer (TXT->HTML). This bug is in the generator, not the recognizer.
Assignee: neeti → harishd
Component: Networking → DOM to Text Conversion
QA Contact: benc → sujay
Summary: mozTXTToHTMLConv: Remove trailing spaces around structured text → nsPlaintextSerializer: Remove trailing spaces around structured text
> (not the spaces) s/not/note/ Sorry. I don't think that this code is "seriously flawed", because we (er, I) merely fail to fixup the suboptimal formatting. I now remember the reason why I didn't fix it when I wrote the code: It was next to impossible to fix it in the bold-processing code, because I didn't have a DON and at the time </b> was processed, the last trailing space was already written out to the result. We'd have had to cache the whole content between <b> and </b> and process it differently - and it could contain more formatting/structure. Since we moved to the DOM conversion (in contrast to the old stream conversion), this bug is probably easier to fix, because we could modify the DOM first and "fixup" all markup before we convert. I'd remove DoOpen/CloseContainer and move their content into Open/CloseContainer <http://lxr.mozilla.org/seamonkey/source/content/base/src/nsPlainTextSerializer.cpp#385>. Then, we'd have the DOM node in the tag-specific processing code, which would (presumably) enable us to modify it. We could just take the first #text childnode and remove lealding spaces and then take the last #text subnode and remove the trailing spaces and be done.
If I double click on a word in the html compose window and turn it bold, incorrect plain text will get generated. It seems like a serious flaw that we generate structured text that we can't display with just three clicks of the mouse. This isn't some pathological edge case, unless I'm misunderstanding something. I know that you'll violently resist this suggestion, but I don't think we should do this structure stuff on conversion from html to plain text if we can't handle such a simple case. To the user, it just looks like we're throwing asterix around.
Well, in that case, composer generates HTML "hello<b> you </b>there!". You can argue that this is just as flawed as "hello* you *there". Anyways, this RFE doesn't seem hard to fix.
How is this an RFE? Taking html text from the editor and converting it to structured text is just plain broken. If this isn't supported, we should just turn it off until the RFE is done.
Because we insert the stars exactly where the <b> tags are. It might be braindead, but it's not a bug either. FWIW, it's turned off in Netscape 6 anyways, I think.
I can't look into this bug atm, because my hd broke and is off for exchange.
I guess I'm confused. I thought the whole point of inserting the '*''s is so that we would display the text as bold when displaying the message as plain text, not to tell the user where the original html had its <b> tags. Is there some sort of spec/internet standard for the way "structured text" should be formatted/displayed? And I don't understand why it's OK to mess up mozilla user's messages but not Netscape users.
> I thought the whole point of inserting the '*''s is so > that we would display the text as bold when displaying the message as plain > text, not to tell the user where the original html had its <b> tags. The reason for the code is both of that. > Is there some sort of spec/internet standard for the way "structured text" > should be formatted/displayed? Yes <http://info.astrian.net/jargon/How_Jargon_Works/Hacker_Writing_Style.html>, but admittedly, it requires the stars to be around words. I'll maybe look into this bug when I have a chance.
--> tanu
Assignee: harishd → tmutreja
Unable to reproduce the bug. Tried with the latest build(Jan16) and don't see any additional space around structured text.
Tried this and it works for me too.
I can reproduce it with a relatively current trunk build. Make sure to mark the spaces around "you", i.e. mark " you ", not just "you". You are getting starts at all, don't you? However, the bug is less obvious by now, because now, Composer marks only "you" (without spaces) when doubleclicking on a word.
Ben, your comments look confusing to me. I think this is not what you mentioned in the original bug ("not the spaces"). Anyway, while marking, if we include spaces around "you" the expected behaviour should be the same as we are actually getting here i.e. "* you *" should be the right thing.
> I think this is not what you mentioned in the original bug ("not the spaces") Please read comment 3. > "* you *" should be the right thing. This doesn't follow the convention, which is to add the stars around the word. (The fact that our TXT->HTML recognizer and probably that of other Unix software doesn't recognize it is a result of that.)
Ben, I understand the problem now. I need to confirm the exact expected behaviour before starting working on it. If I select " you " and make it bold (including spaces around), should it be sent as "*you*" (which in turn would mean that "hello you there" would be sent as "hello*you*there" or should it be sent as " *you* "(spaces around stars instead of spaces). Second option looks better but that would actually mean that we are not displying the correct DOM Node, because in the actual DOM Node spaces are part of <b> tag. Is there any specification for this? I understand your comment#2 except for "I'd remove DoOpen/CloseContainer and move their content into Open/CloseContainer <http://lxr.mozilla.org/seamonkey/source/content/base/src/nsPlainTextSerializer. cpp#385>. Then, we'd have the DOM node in the tag-specific processing code, " I feel even in Open/CloseContainer of nsPlainTextSerilaizer we do not have access to the DOM Node. The solution that I can think atm is to set a flag with the opening <b> and handle the Text children of <B> based on that flag.
Status: NEW → ASSIGNED
> Second option looks better but that would actually mean that we are not > displying the correct DOM Node, because in the actual DOM Node spaces are part > of <b> tag. The second option is correct, even though it doesn't exactly match the DOM. > Is there any specification for this? see comment 9. > I feel even in Open/CloseContainer of nsPlainTextSerilaizer we do not have > access to the DOM Node. Bummer. I think you have access to it in AppendElement*(), but not sure, if/when these are called.
Assignee: t_mutreja → nobody
QA Contact: sujay → dom-to-text
This is a mass change. Every comment has "assigned-to-new" in it. I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW

Bulk-downgrade of unassigned, untouched DOM/Storage bug's priority.

If you have reason to believe, this is wrong, please write a comment and ni :jstutte.

Severity: normal → S4
Priority: -- → P5
You need to log in before you can comment on or make changes to this bug.