Closed Bug 46227 Opened 22 years ago Closed 20 years ago

switching between "normal" and "html source" adds spaces

Categories

(Core :: DOM: Serializers, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.0.1

People

(Reporter: spam, Assigned: akkzilla)

References

Details

(Keywords: helpwanted, Whiteboard: EDITORBASE)

Attachments

(4 files, 1 obsolete file)

Linux build ID 2000-072113

Was trying to find what kind of html that triggers bug 46001 (only text and <br>
doesn't) and stumbled across another oddity:

When toggling back and forth between "Normal" and "HTML Source" in Composer, one
additional linefeed is erronously added in the HTML Source view for each time
it's clicked.

To reproduce:

Open composer
-Click the twisty to see the "toggle view" tabs
-Click "HTML Source"
-Click "Normal"

-Repeat back and forth between the last two clicks 9 times each:
 the HTML Source view now show 10 linefeeds before the </body> tag.

-Now go back to "Normal" view and try to mouse down with arrows:
 the extra linefeeds appear not to be there: One can not maneuvre downwards. The
linefeeds are LF (or CRLF) and not interpreted as html <br> tags.

-Save the file: The extra linefeeds are saved.

It seems that each time HTML Source view is clicked, one additional linefeed is
inserted. Worse yes: If HTML <br> exists in the file, one extra LF will be added
to EACH of these tags for each time the view is toggled.

After only a short while, a page with seemingly 4 lines of text gets so tall in
HTML Source view that it can't be viewed within a full heigth sized composer-window.
looking at it.. it seems an extra LF is inserted for EVERY tag found.
Tested an anchor too, and a headline.
AND: it seems to NOT happen when triggering "View source" but actually inserts
when "Normal" is clicked.

The indication of this:
Saved the file after having gone back to "normal" view, and it had one more
linefeed than what i had just counted in "View Source" mode.
setting keywords and assigning to jfrancis
Assignee: beppe → jfrancis
Keywords: correctness, nsbeta3
Target Milestone: --- → M18
hmm isn't "correctness" a little weak?
Would have thought "data corruption" was more it..
akkana?  This looks like output system stuff.  
Assignee: jfrancis → akkana
clearing milestone to get this back on beppe's radar
Target Milestone: M18 → ---
akkana, setting to m18
Target Milestone: --- → M18
Accepting.
Status: NEW → ASSIGNED
setting to nsbeta3+
Whiteboard: nsbeta3+
setting priority in status whiteboard
Priority: P3 → P4
Whiteboard: nsbeta3+ → [nsbeta3+][p:4]
Per PDT triage, making [nsbeta3-].  Other more critical work for PR3 and RTM.  
Beppe, if you feel a release note might be needed, please add relnote3 keyword.
Whiteboard: [nsbeta3+][p:4] → [nsbeta3-][p:4][minus]
this does affect how the user sees the document, but the lf does not hinder the 
view in the browser, when it affects the final output, then yes it would be 
severe. Moving this out to future and adding helpwanted
Keywords: helpwanted
Target Milestone: M18 → Future
right now it's impossible to reproduce this bug:
The code that was in "normal view" before view was changed (to for instance
"html source") will re-appear when changing back to "Normal" view.
Not so sure anything is really fixed though, or if it is a new bug, masquerading
this one:

Adding code in source-view seems to be deliberatly blocked now?
(I seem to remember i could add handwritten code there, and it "imported" to
other views.)
yes the issue about not being able to edit in HTML mode is already a bug -- 
50034
This is probably easy, should probably be considered for rtm.
Keywords: rtm
if this doesn't affect the rendering of the page, then this can wait till after
rtm, if this does affect the layout, then remove the rtm- for reconsideration.
Whiteboard: [nsbeta3-][p:4][minus] → [nsbeta3-][p:4][rtm-]
Anthony is taking over Output bugs.
Assignee: akkana → anthonyd
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
removing nsbeta
Keywords: nsbeta3, rtm
Whiteboard: [nsbeta3-][p:4][rtm-]
--> kin
Assignee: anthonyd → kin
Status: ASSIGNED → NEW
*** Bug 105449 has been marked as a duplicate of this bug. ***
It seems to me that :

1. the extra lines get added when saving the file. This includes when switching
from html view to nornal view, but not when going from normal to html view.

2. Open a file in normal view add a character, save and check the file with vi.
One blank line is added for each non-blank line, in the header. Note therefore
that you do not need to switch views, and maybe it is the save operation,
whether invoked as part of switch html to normal view or just a straight save
that is triggering the behaviour.

The above is the most concise description of the problem I can come up with at
this time.
Minor clarification: editing in html view followed by a switch from html to
normal does not seem to force a save at that point, as I inferred before.
However the rest of me earlier msg stands.
I have been seeing the "extra" blank lines on windows too. Therefore, OS = ALL.
This annoying bug persists in 200202021 on win2k. I observe it occuring when I
save (after editing), leave, and then reload the HTML code into Composer. As far
as I can tell the blank lines are inserted *ONLY* into the header section of the
code, i.e.  <head>  ....  </head> and not in other sections.
removing myself from the cc list
I used to be able to reproduce this, but on Windows build 2002030703, the
original problem appears to be mostly fixed.
One blank line appears between <title> and <meta>, but only one.  If you delete
the line, it comes back, but if you leave it alone, no more new linefeeds appear.
It no longer happens anywhere else.

I do notice a new and similar problem now, though.
Every time you modify the source, then switch to normal view and back, a space
is added before and after each line in the html source.
Also, that blank line left between <title> and <meta> gets 7 new spaces in it
each time.  6 if it had been deleted before.
I also observed that several spaces appeared in each blank line.

This is not the place to say too much about this, but it is a concern of mine
that a bug like this: essentially inserting unwanted characters by something as
old in concept as a text editor, could go on for so long (18 months and
counting!!), with no systematic process occurring to find it and fix it. Text
editors have been around for many decades now, and yet under this GNU
environment it seems the wheel got reinvented, and it is carved out a bit
offround in the process.
*** Bug 138651 has been marked as a duplicate of this bug. ***
Mozilla is nearing 1.0 and this bug still haunts us.  After speaking with a few
of the developers it appears that pretty print's implementation is causing this
error.  

Sever blank lines are added to various parts of the code.  

I have voted for this bug in order to escalate it's visibility to the
hardworking bug-squashers.  Please try to handle this one before 1.1.
akkana, do you want to take this bug back? I ended up with this again because 
you gave it to tony, and tony gave me all his bugs when he switched groups.

I think this may be the same problem we discussed over IRC the other day.
Assignee: kin → akkana
Priority: P4 → --
Target Milestone: Future → ---
It's a serializer bug, which is why it was given to Tony when he was
theoretically the serializer owner.

I'm willing to own it since it affects the editor.
Target Milestone: --- → Future
Keyword "Data Corruption" should be added.

That error is not "normal", but "critical", because this error makes the
composer absoutely unusable for any page. Even if yo only want to add some
keywords or similar (not to talk about inserted javascript ...) you have to go
to "HTML-SOURCE", which caused the data-corruption.

It is hard to believe that this Problem Opened: 2000-07-23 17:48
- still not is assigned
- still has target "future"

Does anyone else want to vote for this bug?
I agree that this is dataloss, and I would consider this along with the similar
bug 51318 (which may be fixed with bug 85505) to be the most important composer
bugs.

I just voted for this bug (all my other votes in the browser product were on
non-composer components), and now with 6 votes, it places number 8 out of all
Editor: Core and Editor: Composer bugs.
*** Bug 143264 has been marked as a duplicate of this bug. ***
OS: Linux → All
To be more clear (and findable in bugzilla), please *change subject to*:

"Switching between "Normal" and "Source" views adds extra blank lines (LF or
CRLF) to HTML code"

This bug is highly annoying and *makes Composer unusable* because after a bit of
editing (and switching between "Normal" and "Source") there are so *many* blank
lines that you end up with only one line of code visible per screen; thus
forcing the user to endlessly scroll-scroll-scroll just to go down a *few* lines
of code. This bug should block the next release!

Please add *keywords*: 4xp, adt1.0.0, mozilla1.0, nsbeta1, nsbranch

Please make this bug *block* bug 114527 (Meta Bug: Composer Work for MachV)

Also, this is a "Major loss of function". Therefore, Severity: "Major"
*** Bug 143446 has been marked as a duplicate of this bug. ***
nominating...starting to get DUPS.
Keywords: nsbeta1
Whiteboard: EDITORBASE
Could someone please post an updated description of how to reproduce this bug?

I tried: open a blank editor page, type something, toggle back and forth between
html and normal view.  No newlines added.  In normal view, I hit return and
typed another line.  Still no newlines added.  Noticed someone said "a newline
is added for each tag seen", so I tried making something bold; still no newlines
added.  In case they had to be block tags, I tried adding a list; still no
newlines added.  Tried saving to a file, then toggling back to source mode; no
newlines (either in source mode or in the saved file).

Can someone please either describe what to do starting with a blank editor
window to see this bug, or attach an html file to this bug which, when edited,
will show the bug?

I presume it's not a platform linebreak issue, since the bug was initially
reported on linux and that's where I'm trying to reproduce it.
It happens when you make changes to the html source, then switch to normal view,
then back to source.
But I'm not getting new lines anymore.  Instead, I'm now seeing a space inserted
at the beginning and end of each line in the body, which keep accumulating.

There is also a blank line between <title> and <meta> which it insists on
recreating if deleted, and that line gains about 6 space chars each time.
I don't see these spaces, and I don't see any blank lines at all.  You're
evidently doing something differently from what I'm doing.
Whiteboard: EDITORBASE → EDITORBASE, NEED INFO
I just tested both the latest trunk and branch builds, and tested with a new
profile.  It does sound like we're doing two different things.  Here's one way
to see the bug:
1.  Open a blank composer page. (File -> New -> Composer Page)
2.  Click the "html source" tab along the bottom. (Or do View -> HTML Source)
3.  The line of text between the line containing "<body>" and the line
containing "</body>" should contain only "<br>"
4.  Replace that "<br>" line with "test"
5.  Click the "normal" tab along the bottom. (Or do View -> Normal Edit Mode)
6.  Without making any changes in the view, switch back to HTML source mode.
7.  The line containing "test" is now " test "
I do see that.  But I don't see anything growing.  If I switch back and forth,
no additional spaces or newlines get added.

If instead of test I type three lines, like
1
2
3
then if I go to normal then back to source, I see 
 1 2 3
(we lose source formatting between text nodes -- that's no surprise and is
covered by other bugs on various parser and serializer issues) and if I switch
back and forth, I continue to see the same thing -- again, nothing increasing. 

It's a single space at the beginning of the line, and another one at the end. 
This can't be the terrible "whitespace keeps growing forever and makes composer
unusable" bug that everybody's been describing here, can it?
Every time you make an edit in HTML source view, another space is inserted at
the beginning and end of each line of text.  In the very simple example I gave,
editing in a normal view eliminates the spaces at the beginning of the line, but
this only happens on the line after <body>, so in any real situation with actual
content, it's much more annoying.

An entire document, as you edit it, slowly gets padded out one space at a time.
 For example, after one edit and save of the mozilla.org front page, the html
file becomes 2k bigger.  It's a 20k file, so that's 10%.  It consistantly gains
2k each time.  If it's your only copy and you want to undo the damage, it
requires deleting the spaces manually, line-by-line in a text editor.  This
takes quite a while.

The HTML source editing feature of Composer is unusable until this bug is fixed,
at least for me.

I'm fairly certain I figured out what's wrong, and I'll post my theory in a moment.
I'm pretty sure I know what is happening, and now I just need to find the
relevant code to check if my theory is correct.  I took a look at what was going
on with the DOM Inspector, by using it on a Composer window, and examining the
document tree, especially the #text nodes, as this goes on.  (Keep in mind that
if you do this, you have to collapse the editor node and expand it again to
update.  Just deselecting and reselecting the #text node won't work)

I noticed that the #text node contained a newline before and after the word
"test".  If you try this, make sure to do "select all", then paste into a text
editor to make sure you catch any newlines after the text which would otherwise
be invisible (selection doesn't show it).

So where were the newlines coming from?  They were in the source!  After it
reaches the <body> tag, it starts copying everything from that point to the next
tag, and since there is a newline after <body> and a newline before </body>,
those are included.  Is anything wrong with this behavior?  No, that is the
normal (probably correct) parser behavior, and you can see this outside of composer.

But when the HTML source is regenerated, it wants to make sure there's a newline
after certain tags (such as <br>).  The problem is that it always inserts the
newline in addition to the ones that were already there in the original source.
 This is the cause of the original problem.

Something has changed since then.  Now it seems the newlines in the #text nodes
are being changed to spaces.  This was probably an attempt to minimize the
visibility of this bug.  If you have some text, a newline, and some text again,
then a space is visible where the newline was.  So if you change the newlines to
spaces, the rendered HTML doesn't look different in any case, and the bug in the
HTML source view is less visible.

I think the correct solution to this bug would be to first stop it from
converting newlines to spaces, so we have the old behavior, and then modify it
so that when converting the document tree to source, it first checks if the
following (text) node starts with a newline.  If so, don't insert one after the
current tag.
Peter: Is that attachment intended to illustrate what composer does to a file
(or has has done at some point in the past but no longer does)?  That file has
lots of lines containing only whitespace, but when I feed it to composer I don't
notice anything getting worse.  Please clarify.

burpmaster: I think the spaces are coming from nsHTMLContentSerializer.cpp line
700.  Just removing it isn't an option (then words will sometimes run together),
but perhaps the code should be smarter and e.g. not add the space when already
at end of line.
Clarifying the summary, since the current bug has to do with adding spaces, not
linebreaks.  If there's still a case where linebreaks get added, someone please
post instructions on how to reproduce.
Summary: switching between "normal" and "html source" adds LF (or CRLF) → switching between "normal" and "html source" adds spaces
Whiteboard: EDITORBASE, NEED INFO → EDITORBASE
Here are the step tested on:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.0rc2) Gecko/20020510
Alias, WinXP Pro and Mozilla 1.0RC2
In Oliver's steps, the newline before each meta tag comes from the tag list in
nsHTMLContentSerializer::LineBreakBeforeOpen -- I wonder if we should remove
meta from that list?  (That's a trivial part of the bug, though, and probably
not what he was calling attention to.)

The linebreak after br presumably comes from
nsHTMLContentSerializer::LineBreakAfterOpen, and it makes sense to have it there
-- I would expect a prettyprinter to put a br on a line by itself.  I don't see
the number of line breaks growing, though; there's one line break added after
the br, and no more line breaks get added after that when I add the "test2" and
then click normal/source, still just one after the br, which seems correct.  I'm
still not seeing the "constantly increasing number of linebreaks" bug.

I spent a while tracing around in nsHTMLContentSerializer.cpp to see why the
extra spaces got added at the beginning and end of the line.  The one at the
beginning of the line comes from line 700: the first time through on the "test"
text node, lineLength is 0 yet the "if (indx == oldLineEnd)" clause is always
done, so a space is always added at the beginning of lines.  This change:
700c700,701
<         AppendToString(NS_LITERAL_STRING(" "), aOutputStr);
---
>         if (lineLength > 0)
>           AppendToString(NS_LITERAL_STRING(" "), aOutputStr);
stops the initial space from being put in.
However, it doesn't stop the space after the word, which happens on the next
pass through.  The code needs to do something like what
nsPlaintextSerializer.cpp does -- mark that a space is needed, but not put it in
until the next non-space word is added on the line.

This is all tied in with the rewrite of the wrapping code in this file which
happened back in the noxif landing.  See bug 56921 for another discussion of
problems with the current wrapping code -- it always makes lines too long, and
it doesn't use the I18n-approved line breakers.  Someone needs to rewrite this
class to wrap correctly, or resurrect the old code from
nsHTMLContentSinkStream.cpp (which this code replaced, creating several
regressions).  I don't understand this code well enough to be able to fix
localized problems like this in a safe way -- someone needs to take ownership of
this code and make it work right.

Reassigning to the serializer owner.
Assignee: akkana → harishd
Component: Editor: Core → DOM to Text Conversion
Attached patch patch (obsolete) — Splinter Review
The patch is a fix that I came up with last night for spaces at the end of a
line, plus Akkana's suggestion to fix spaces at the beginning of lines.
It fixes the problem completely, except for the blank lines before each <meta>
which must be a seperate issue, and I see no other issues.
Awesome!  Your fix looks good to me.  In that case, I'll take the bug back to
get the fix checked in (unless you want to own it, burpmaster).

I'm tempted to remove the meta tag listing from the LineBreakBeforeOpen list at
the same time -- what does everyone else think about that?
Assignee: harishd → akkana
Target Milestone: Future → mozilla1.0.1
It's not ready yet, I found one issue.
The patch deals with the newlines at the beginning of a line, where they can be
deleted without being converted to a space because the previous newline
character is already acting as whitespace.
The patch deals with the newlines at the end of the line and deletes those
without making a space, too, for the same reason.

However, when text is wrapped, a newline is inserted and the original space at
the wrapping point is preserved.  These accumulate like before.  I will post an
updated patch to fix this as well, and I should probably also add comments.

And as for the meta tag issue, if you look at the blank line before each meta
tag, you'll find there are spaces there which keep accumulating, even with my
patch applied.  I tried removing meta (and link) from the LineBreakBeforeOpen
list, and these spaces are still generated, although now on the same line.

Then I tried putting <table><tr><td><meta></td></tr></table> in the head tag,
and found that so many more spaces were being created.  That means that in
addition to the newline problem, new indentation is created each time, and the
old indentation is being preserved.  A seperate bug should be probably filed.
Comment on attachment 83621 [details] [diff] [review]
patch

r=akkana on burpmaster's patch.
Attachment #83621 - Flags: review+
Some of that indentation is intentional: for instance, 
<table>
  <tbody>
    <tr>
      <td>item one

Indentation level will be increased for any of the tags listed in
StartIndentation/EndIndentation.  "head" is one of these tags, so the meta tags
inside the head should be indented two spaces.  But if there are spaces which
are growing in number each time you toggle between normal and source mode, that
of course is not intended.
Attached patch updated patchSplinter Review
It turned out I just needed to set addSpace to false if addLineBreak was true.
I also cleaned it up a bit.

Can someone give me the permission to obsolete files?
Attachment #83621 - Attachment is obsolete: true
Attachment #83621 - Flags: review+
Comment on attachment 83661 [details] [diff] [review]
updated patch

r=akkana on the new patch.  Thanks!
Attachment #83661 - Flags: review+
jst, can you sr this, since this is in your code?  Thanks!
Keywords: patch
Whiteboard: EDITORBASE → EDITORBASE, NEED SR
Comment on attachment 83661 [details] [diff] [review]
updated patch

sr=jst
Attachment #83661 - Flags: superreview+
The fix is in on the trunk, credit to burpmaster in the checkin comment.  Thanks
again!
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Whiteboard: EDITORBASE, NEED SR → EDITORBASE
Filed bug 145196 for remaining issue with <meta>.
On nigthly 2002051708/Linux I still see extra lines:

How to reproduce: ( [enter] means pressing that key )
Window/Composer
type a[enter]b[enter]c[enter]
select <html>source
place coursor after "b<br>" and type [enter]<br>
select normal
select <html>source

I see blank line after every <br> tag. I expect to not see any of them.
kaspar: I don't see any extra blank lines with a trunk Linux build from this
morning.  What platform?  Does anyone else see what kaspar describes?

If so, someone should probably file a new bug since the issue in this one was fixed.
Akkana,

ok, seams my .mozilla directory is somehow corrupted. When I started with no
.mozilla directory in my home directory composer did not added extra lines.

I used to install CaScadeS in the past.
cc Daniel so he can read comment 62 and any other relevant comments (possible
CaScadeS interaction?)
Hardware: PC → All
no, it is totally impossible that CaScadeS is the root of the problem here.
It is an add-on that does absolutely nothing unless you select of its menu items,
and these menu items only modify the contents of HEAD and/or STYLE attributes.
Nothing less, but nothing more.
I just want to say that I confirmed this BUG as fixed using 
the reproduce steps in the attach in comment #47:
http://bugzilla.mozilla.org/show_bug.cgi?id=46227#c47 
I used the trunk nightly build of 2002-05-19
The fix works great! Thanks
kaspar: I don't suppose by any chance you saved a copy of the bad profile
directory?  The profile folks may be interested in hearing about any profile
corruption that may have happened recently, so if you still have the files it
might be worth filing a bug.
I cannot confirm that the bug is FIXED
I tried with "latest TRUNK"
Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.0.0+) Gecko/20020521
today, and there still are inserted new lines. May be only in special
source-files? I can create att. with html.file if required.
Rainer: can you confirm with this attach(comment #17):
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=83576
I use this attach to confirm it fixed with the 2002-05-17 trunk build
Akkana/Dark/someone, can you verify this bug and mark verified-fixed?

There is conflicing remarks in this bug indicating that it is still not fixed.
i sat and tested this the other day with the same html file i formetly saw grow
like mad both with spaces and linefeeds. I was unable to reproduce the bug. WFM.
Hi oliver,

I can confirm results of att. 83576 only for "pretty print" mode. In "retain
original source"-mode the problem still is actual. Because "pp"-mode caused
several other Problems (for example bug 127559, bug 126259)it must be checked,
whether these problems are solved.

Because the "pretty print"- mode causes several risks it would be fine, if the
"retain original source code"-mode could be modified, too. The problem should be
solved in both modes.
Bug 126259 is _not_ fixed, so that the "pretty print"-mode still causes
problems, so that this bug is _not_ fixed
Depends on: 126259
marking verified per RKA's comments...

REOPEN if anyone feels this is not fixed.
Status: RESOLVED → VERIFIED
Hi Bielefeld:
could you please supply the steps to reproduce this bug?
Also, which version(date ID) of the Mozilla trunk are you using
for testing? It seems that this is solved in the trunk, but not
in the 1.0 tree (ie RC3).
comment to #74

Hi 
Oliver, I belive everything concerning  Bug 126259  is described very well in
http://gair.firstpr.com.au/public-files/moz-bug-space-in-linked-image/test-1.html
on that bug-page.

There is a bugfix announced by
http://bugzilla.mozilla.org/show_bug.cgi?id=126259#c10
but I do not know when this bugfix will be available.
comment to #75
Hi Rainer:
I think bug http://bugzilla.mozilla.org/show_bug.cgi?id=126259 is diferent from
this bug, because this bug is about adding space caracters everywhere in the 
source when changing the display mode.
Bug 126259 is about the "pretty print" feature and the IMG tag. I more like:
"the source look good but some browsers don't like it"

Can you please tell me if you can still reproduce this bug with the instructions
described in comment #47? (Use the current trunk build)
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=83576

If you can't reproduce the bug using comment #47, then this BUG is fixed
and further discusions should be made in bug 126259.

I read your page and verified that a CR is still added between <a> and </a>. 
I used 1.0RC3.
Reply to #76
If I hve selected "Reformat HTML-Source" for the composer, this "insert line
problem" Is no longer actual for me: not in TRUNK and not in RC3.
But the final solution still is blocked by bug 126259
Any estimate of when 1.01 will be released? I'd like a stable build which 
incorporates the fix for this Bug 46227.
*** Bug 155750 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.