Closed Bug 359195 Opened 18 years ago Closed 14 years ago

html parser erroneously aggregates between-tags text nodes

Categories

(Core :: DOM: HTML Parser, defect)

defect
Not set
critical

RESOLVED FIXED

People

(Reporter: glazou, Unassigned)

Details

(Keywords: dataloss, Whiteboard: [fixed by the HTML5 parser])

This bug is critical to Composer and the editor because it has deep
side effects when the editor serializes a document, generating unwanted
line feeds and indentation.

1. launch the browser
2. open the URL

  http://glazman.org/htmlparsertest.html

3. view source of the page
4. click on the link in the page to start the test. Each alert represents
   a node (element or text) in traversal order. Text nodes show the number
   of chars in the data, the data itself, and all the charcodes of the data.
   Compare with the source view...
5. see that the text node between the title element and the meta element is
   totally horked. It aggregates all the line feeds and indentation
   whitespace found between the root of the document and the current point.
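
For readers who cannot reach the test page, here is a minimal sketch of the kind of node dump described in step 4. It is not the script from the test page: the names `describeNode` and `dumpNodes` are hypothetical, and the sketch collects strings instead of raising alerts. It only assumes the standard DOM properties `nodeType`, `nodeName`, `data` and `childNodes`.

```javascript
// Node type values as defined by the DOM (Node.ELEMENT_NODE, Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Describe one node: elements by tag name, text nodes by the number of
// chars in the data, the data itself, and the char code of every char.
function describeNode(node) {
  if (node.nodeType === TEXT_NODE) {
    const codes = [];
    for (let i = 0; i < node.data.length; i++) {
      codes.push(node.data.charCodeAt(i));
    }
    return `#text length=${node.data.length} ` +
           `data=${JSON.stringify(node.data)} codes=[${codes.join(",")}]`;
  }
  return `<${node.nodeName.toLowerCase()}>`;
}

// Walk the tree depth-first in document order and collect one line per
// element or text node. Other node types (document, comment, ...) are not
// reported, but their children are still visited.
function dumpNodes(root, out = []) {
  if (root.nodeType === ELEMENT_NODE || root.nodeType === TEXT_NODE) {
    out.push(describeNode(root));
  }
  for (const child of Array.from(root.childNodes || [])) {
    dumpNodes(child, out);
  }
  return out;
}
```

In a browser console, `dumpNodes(document).forEach((line) => console.log(line))` would print the traversal. A text node between title and meta that holds more line feeds (char code 10) than the source shows at that position would be the symptom reported here.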
Keywords: dataloss
Is this a recent regression? Does it happen with trunk builds from two weeks ago? I know that we've been working on the content sinks lately.

We can definitely test this with JS, so it should go in the test suite at mozilla/testing/mochitest.
Flags: in-testsuite?
(In reply to comment #1)
> Is this a recent regression? Does it happen with trunk builds from two weeks
> ago? I know that we've been working on the content sinks lately.

I don't think it's a regression, though I can't test at the moment. In fact, I think the problem here is the same one as in bug 190955.
I don't think it's a regression, and I think the ooooold bug that has been
on our top-bugs list for Mozilla Composer and Nvu for ages, inserting
unwanted empty lines in serialized document trees, is caused by the current bug.
We were looking for a serializer issue, but it now looks more likely to be an
htmlparser issue!
Assignee: mrbkap → nobody
The HTML5 parser now discards early space characters (per spec) instead of moving them forward in the DOM.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by the HTML5 parser]