Open Bug 117009 Opened 23 years ago Updated 3 years ago

nsPlaintextSerializer: Remove trailing spaces around structured text

Categories

(Core :: DOM: Serializers, enhancement, P5)

enhancement

Tracking

()

People

(Reporter: BenB, Unassigned)

References

Details

Reproduction:
1. Open Compose windows
2. Enter text "hello you there"
3. Mark " you " (not the spaces) and make it bold
4. Send it as plaintext
5. Recieve

Actual result:

"hello* you *there"

Expected result:

"hello *you* there"


Comment #4 in bug 116842 from bienvenu@netscape.com:
<quote src="http://bugzilla.mozilla.org/show_bug.cgi?id=116842#c4">
Afte thrashing around for a while trying to figure out if I've broken anything,
I've found out that converting html to plain text is broken in mozilla, and has
nothing to do with my changes. If you use the html compose window to make some
text bold, and send it as plain text, it can get sent incorrectly, or displayed
incorrectly (not sure which - if you make " test " bold, it will get sent and
displayed as "* test *", which is broken. Perhaps the struct code doesn't handle
this case correctly, because of the leading and trailing spaces. Is there a bug
on this? If that's the bug, it seems like this code is pretty seriously flawed.
</quote>

Why do you think so? The actual behaviour is not optimal, but why "seriously
flawed"?

I thought about it, and IIRC, it would've been quite hard to implement (don't
remember why), and I didn't consider it worth it.
Blocks: 116842
Nonsnense, not mozTXTToHTMLConv, but nsPlaintextSerializer (TXT->HTML). This bug
is in the generator, not the recognizer.
Assignee: neeti → harishd
Component: Networking → DOM to Text Conversion
QA Contact: benc → sujay
Summary: mozTXTToHTMLConv: Remove trailing spaces around structured text → nsPlaintextSerializer: Remove trailing spaces around structured text
> (not the spaces)

s/not/note/
Sorry.

I don't think that this code is "seriously flawed", because we (er, I) merely
fail to fixup the suboptimal formatting.

I now remember the reason why I didn't fix it when I wrote the code: It was next
to impossible to fix it in the bold-processing code, because I didn't have a DON
and at the time </b> was processed, the last trailing space was already written
out to the result. We'd have had to cache the whole content between <b> and </b>
and process it differently - and it could contain more formatting/structure.

Since we moved to the DOM conversion (in contrast to the old stream conversion),
this bug is probably easier to fix, because we could modify the DOM first and
"fixup" all markup before we convert.

I'd remove DoOpen/CloseContainer and move their content into Open/CloseContainer
<http://lxr.mozilla.org/seamonkey/source/content/base/src/nsPlainTextSerializer.cpp#385>.
Then, we'd have the DOM node in the tag-specific processing code, which would
(presumably) enable us to modify it. We could just take the first #text
childnode and remove lealding spaces and then take the last #text subnode and
remove the trailing spaces and be done.
If I double click on a word in the html compose window and turn it bold,
incorrect plain text will get generated. It seems like a serious flaw that we
generate structured text that we can't display with just three clicks of the
mouse. This isn't some pathological edge case, unless I'm misunderstanding
something. I know that you'll violently resist this suggestion, but I don't
think we should do this structure stuff on conversion from html to plain text if
we can't handle such a simple case. To the user, it just looks like we're
throwing asterix around.
Well, in that case, composer generates HTML "hello<b> you </b>there!". You can
argue that this is just as flawed as "hello* you *there". Anyways, this RFE
doesn't seem hard to fix.
How is this an RFE? Taking html text from the editor and converting it to
structured text is just plain broken. If this isn't supported, we should just
turn it off until the RFE is done.
Because we insert the stars exactly where the <b> tags are. It might be
braindead, but it's not a bug either. FWIW, it's turned off in Netscape 6
anyways, I think.
I can't look into this bug atm, because my hd broke and is off for exchange.
I guess I'm confused. I thought the whole point of inserting the '*''s is so
that we would display the text as bold when displaying the message as plain
text, not to tell the user where the original html had its <b> tags. Is there
some sort of spec/internet standard for the way "structured text" should be
formatted/displayed? And I don't understand why it's OK to mess up mozilla
user's messages but not Netscape users.
> I thought the whole point of inserting the '*''s is so
> that we would display the text as bold when displaying the message as plain
> text, not to tell the user where the original html had its <b> tags.

The reason for the code is both of that.

> Is there some sort of spec/internet standard for the way "structured text"
> should be formatted/displayed?

Yes <http://info.astrian.net/jargon/How_Jargon_Works/Hacker_Writing_Style.html>,
but admittedly, it requires the stars to be around words.


I'll maybe look into this bug when I have a chance.
--> tanu
Assignee: harishd → tmutreja
Unable to reproduce the bug. Tried with the latest build(Jan16) and don't see 
any additional space around structured text.
Tried this and it works for me too.
I can reproduce it with a relatively current trunk build.
Make sure to mark the spaces around "you", i.e. mark " you ", not just "you".
You are getting starts at all, don't you?

However, the bug is less obvious by now, because now, Composer marks only "you"
(without spaces) when doubleclicking on a word.
Ben, your comments look confusing to me. I think this is not what you mentioned 
in the original bug ("not the spaces"). Anyway, while marking, if we include 
spaces around "you" the expected behaviour should be the same as we are actually 
getting here i.e. "* you *" should be the right thing.
> I think this is not what you mentioned in the original bug ("not the spaces")

Please read comment 3.

> "* you *" should be the right thing.

This doesn't follow the convention, which is to add the stars around the word.
(The fact that our TXT->HTML recognizer and probably that of other Unix software
doesn't recognize it is a result of that.)
Ben, I understand the problem now. I need to confirm the exact expected 
behaviour before starting working on it. 

If I select " you " and make it bold (including spaces around), should it be 
sent as "*you*" (which in turn would mean that "hello you there" would be sent 
as "hello*you*there" or should it be sent as " *you* "(spaces around stars 
instead of spaces). 
Second option looks better but that would actually mean that we are not 
displying the correct DOM Node, because in the actual DOM Node spaces are part 
of <b> tag. Is there any specification for this?

I understand your comment#2 except for 
"I'd remove DoOpen/CloseContainer and move their content into 
Open/CloseContainer 
<http://lxr.mozilla.org/seamonkey/source/content/base/src/nsPlainTextSerializer.
cpp#385>. Then, we'd have the DOM node in the tag-specific processing code, "

I feel even in Open/CloseContainer of nsPlainTextSerilaizer we do not have 
access to the DOM Node. The solution that I can think atm is to set a flag with 
the opening <b> and handle the Text children of <B> based on that flag.  
Status: NEW → ASSIGNED
> Second option looks better but that would actually mean that we are not 
> displying the correct DOM Node, because in the actual DOM Node spaces are part 
> of <b> tag.

The second option is correct, even though it doesn't exactly match the DOM.

> Is there any specification for this?

see comment 9.

> I feel even in Open/CloseContainer of nsPlainTextSerilaizer we do not have 
> access to the DOM Node.

Bummer. I think you have access to it in AppendElement*(), but not sure, if/when
these are called.
Assignee: t_mutreja → nobody
QA Contact: sujay → dom-to-text
This is a mass change. Every comment has "assigned-to-new" in it.

I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW

Bulk-downgrade of unassigned, untouched DOM/Storage bug's priority.

If you have reason to believe, this is wrong, please write a comment and ni :jstutte.

Severity: normal → S4
Priority: -- → P5
You need to log in before you can comment on or make changes to this bug.