Leading and trailing newlines of a text node should be skipped. Isn't it as simple as that? If I write <tag1> </tag1> I obviously want a newline there. Not so for <tag1> </tag1>. If I write: <pre> Some text here. Some on another line. </pre> I want the data of the text node to be == " Some text here.\n\n Some on another line."

Alex Vincent [:WeirdAl]

Comment 44

•

23 years ago

That's not how it works. The <pre>...</pre> element means "preformatted text". That means the browser must copy the text to the screen, character for character, including the newlines immediately adjoining the tag. The exception is additional markup tags, such as <em>...</em>, which still apply inside the preformatted text tags. Whitespace inside <pre>...</pre> tags is always considered "significant", unconditionally. The preformatted text element is a block-level element, like the <p>...</p> element. Typically people do not add newline characters immediately following the opening tag of either one, but if they do for the preformatted text element, the browser must assume that's intentional.

Jonathan Watt [:jwatt]

Comment 45

•

23 years ago

As Alexey points out, the W3C recommendation is very clear on how white space should be treated.

vidur (gone)

Comment 46

•

23 years ago

Moving to jst@netscape.com's bug list. Apologies for letting it languish on mine.

Assignee: vidur → jst

Status: REOPENED → NEW

basic

Comment 47

•

23 years ago

From jscript@pacbell.net aka WeirdAl
> Maybe we should do some sort of check for "white-space: pre" before removing
> superfluous white space.

Not so easy: what if I dynamically set an element's CSS white-space property to pre? Parsing should not depend on the stylesheet, I think.

From alexey@ihug.com.au
> David, latest spec very clearly defines how white space characters should be
> treated:
> http://www.w3.org/TR/2001/WD-xhtml1-20011004/#uaconf

This will only apply to XHTML served as XML when the xml:space attribute of the element is not set to preserve. I assume that <pre> in XHTML has the xml:space attribute set to preserve. If the white space is removed in HTML based on what that doc mentions, but without a way to revert to the current behavior, we have effectively removed support for the CSS property white-space: pre; in HTML.

From jscript@pacbell.net aka WeirdAl
> What if we did a process late in the game, before any event handlers or
> scripting take over but after styling, of cleaning up the nodes?

Again, not so easy: what if I dynamically set an element's CSS white-space property to pre?

Zack Weinberg (:zwol)

•

23 years ago

Guess I should have said what the point of that little demonstration was... My point was: how is IE doing it? I'm sorry I can't propose a solution, since I don't know how any of the code works, but it seems like IE must be following rules. That is, "whitespace is allowed here, but not here" or some such. I would guess that one of the rules in play in the example above is that whitespace between tags that are within a paragraph is significant and must be preserved as text nodes. Others might be: Whitespace between block-level tags is not significant, and therefore need not be preserved as text nodes. Text nodes between TR and TD tags are not allowed, and therefore whitespace between TR and TD tags is not preserved as text nodes. Maybe this is just really hard, or not the way you've tackled this thus far? As an engineer on another product, I can't stand it when people who've never seen the codebase say, "this should be so easy to implement!", so I won't. ;)

Alex Vincent [:WeirdAl]

Comment 52

•

23 years ago

You know, I've looked at this, and I've changed my mind. I said:

>My opinion is, if it renders, include it. If it doesn't render or isn't meant
>to render, exclude it.

But that's a cheap way to approach a fundamental question. A document's rendering does not necessarily correspond to how the DOM views it. Does the DOM see a <!DOCTYPE > tag? Yes. Does the user? No.

We're dancing around the issue. Should there or should there not be whitespace text nodes in the DOM? We haven't yet figured that out. Frankly, when I restrict my perspective to that specific question, I am forced to give those whitespace text nodes the benefit of the doubt. What reason do we really have to take them out?

IE's behavior, as we are all well aware, is not necessarily the correct behavior. Nor should convenience always dictate what we require of Mozilla 1.0. Sloppy coding by users is what brought people to condemn Netscape 6.0, because Mozilla and Netscape stopped supporting layers.

I recommend WONTFIX or INVALID.

Alexey Chernyak

Comment 53

•

23 years ago

There is no question about which nodes to delete and where. The spec is very clear on that. No point in discussing it. The question at hand is how to get the CSS property 'white-space: pre' to work after white space has been removed from the DOM.

>We're dancing around the issue. Should there or should there not be whitespace
>text nodes in the DOM? We haven't yet figured that out.

The way I see it, the DOM is a mechanism for describing a document structure (HTML), and the HTML specification defines the rules for the structure of HTML documents. The structure described by our DOM violates those rules, so it basically can't be called an HTML DOM.

We need a different solution for handling 'white-space: pre' from the one we have now. We also need a different solution for ViewSource, which right now relies on whitespace in the DOM and doesn't show the actual source of the document!

Two possible approaches are:

1. Storing discarded white space information so it can be used by "pre" or ViewSource. This will keep ViewSource working, but will not help to show the *actual* source.
2. Ability to retrieve and use raw portions of code. This is preferable for ViewSource. However, coordinates would have to be preserved for each element for ViewSource colouring to work. This would also involve re-parsing for "pre", for it may contain other tags inside of it.

The second approach is more favourable, but looks harder. So, is this feasible? Fixing this bug and ViewSource before the 1.0 release would be really awesome.
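For illustration only, approach 1 above might be sketched as follows. This uses Python's standard xml.dom.minidom rather than anything in Mozilla's content code, and the function names strip_whitespace_nodes and restore_whitespace_nodes are made up for this example. It removes whitespace-only text nodes but keeps a log, so that a consumer that needs them (for instance "pre" handling) could put them back.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"


def strip_whitespace_nodes(root):
    """Remove whitespace-only text nodes under root; return a log to undo it.

    Hypothetical helper: each log entry records the parent, the node that
    followed the removed text node (or None), and the removed node itself.
    """
    removed = []
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in list(parent.childNodes):
            if child.nodeType == child.TEXT_NODE and child.data.strip(XML_WHITESPACE) == "":
                # Capture the following sibling before removal clears it.
                removed.append((parent, child.nextSibling, child))
                parent.removeChild(child)
            elif child.nodeType == child.ELEMENT_NODE:
                stack.append(child)
    return removed


def restore_whitespace_nodes(removed):
    """Reinsert the nodes recorded by strip_whitespace_nodes."""
    for parent, next_sibling, node in reversed(removed):
        # insertBefore with a reference of None appends at the end.
        parent.insertBefore(node, next_sibling)


doc = minidom.parseString("<table>\n <tr>\n  <td>x</td>\n </tr>\n</table>")
log = strip_whitespace_nodes(doc.documentElement)
stripped = doc.documentElement.toxml()
restore_whitespace_nodes(log)
restored = doc.documentElement.toxml()
```

As the comment itself notes, this kind of bookkeeping keeps the discarded whitespace available but says nothing about the original source text, which is why it does not help ViewSource show the actual source.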

Keywords: mozilla0.9.9, mozilla1.0

Boris Zbarsky [:bzbarsky]

Comment 54

•

23 years ago

A few comments on the XHTML spec link Alexey posted:

1) The line in question ("All white space surrounding block elements should be removed.") is a "should" not a "must".
2) By "block" I presume it means things that are declared to be blocks in the DTD? That is, at http://www.zbarsky.org:8000/~bzbarsky/domTest.html the first two "Some text" occurrences should be on one line with no space between them while the second two should be on two separate lines?

Hixie (not reading bugmail)

Comment 55

•

23 years ago

In reply to comment 41: The XHTML spec is on crack. See my post in www-talk: http://lists.w3.org/Archives/Public/www-talk/2001MayJun/0141.html

I was asked to make a "definitive standards statement". My opinions are my own, and are thus not normative or anything, but: I would say this bug is a WONTFIX. In fact I put a comment to that effect in the status whiteboard last July.

Hixie (not reading bugmail)

Comment 56

•

23 years ago

Web authors should read: http://www.mozilla.org/docs/dom/technote/whitespace/ I'm going to mark this WONTFIX because bz gave me the go-ahead to do so. :-)
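As a rough idea of what living with whitespace text nodes looks like for a script author, here is a minimal sketch in Python using the standard xml.dom.minidom module. It is not taken from the technote linked above; the helper names is_ignorable and significant_children are invented for this example. The idea is simply to skip whitespace-only text nodes when walking children, instead of expecting the parser to drop them.

```python
from xml.dom import minidom


def is_ignorable(node):
    """True for a text node that contains nothing but XML whitespace."""
    return node.nodeType == node.TEXT_NODE and node.data.strip(" \t\r\n") == ""


def significant_children(parent):
    """Children of parent, skipping whitespace-only text nodes."""
    return [child for child in parent.childNodes if not is_ignorable(child)]


doc = minidom.parseString("<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>")
all_children = doc.documentElement.childNodes
items = significant_children(doc.documentElement)
```

Here all_children contains the two <li> elements plus the three whitespace text nodes around them, while items contains only the elements.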

Comment 84

•

23 years ago

Recommend removing the nsbeta1 keyword as well, as long as we're not certain we should do this even in Mozilla, much less Netscape.

Keywords: mozilla0.9.9, mozilla1.0

lhylan

Reporter

Comment 91

•

23 years ago

Re: comment #89 -- if you do file a separate bug, can you make sure to note the table tags case (that text nodes are not allowed between them)? I'm cautiously thrilled to think that this discussion could move off of whether to preserve whitespace at all and on to whether to create text nodes where they are explicitly forbidden, since that was my original argument for reopening this bug (comment #3). Having said that, thanks much to everyone who's thought long and hard about all the issues involved here.

Hixie (not reading bugmail)

Comment 92

•

23 years ago

The last comments on this bug are INVALID. While arbitrary text is not allowed between elements in HTML <head> blocks, text consisting of exclusively white-space characters _is_ allowed, and no spec that I know of says that this should not be represented in the DOM.

Status: REOPENED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → INVALID

Christopher Hoess (gone)

Comment 93

•

23 years ago

In fact, looking carefully at the spec, I'd say our behavior is mandated for XML. (I don't think much of it for HTML, but I generally think that representing HTML4 with DOM is as bozotic as Appendix C, a view that's unlikely to gain political traction...) DOM1, Interface Text: "The Text interface represents the textual content (termed character data in XML) of an Element or Attr." XML 1.0, section 2.4: "All text that is not markup constitutes the character data of the document." (Would it have been that painful for XML to distinguish non-significant whitespace from character data?!)
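The reading of the two specs quoted above can be seen in any DOM built from XML. The following is a small sketch using Python's standard xml.dom.minidom (chosen only because it is easy to run; the sample document and the child_summary helper are made up for this example). The whitespace between elements is not markup, so it is character data, and it shows up as Text nodes.

```python
from xml.dom import minidom

SOURCE = "<root>\n  <item>a</item>\n  <item>b</item>\n</root>"


def child_summary(xml_text):
    """List the direct children of the document element as (kind, value)."""
    root = minidom.parseString(xml_text).documentElement
    summary = []
    for node in root.childNodes:
        if node.nodeType == node.TEXT_NODE:
            summary.append(("text", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            summary.append(("element", node.tagName))
    return summary


children = child_summary(SOURCE)
```

With this input, children alternates between whitespace-only text entries and the two item elements, five entries in all.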

Sivakiran Tummala

Comment 94

•

23 years ago

This specific bug (whitespace nodes existing in the DOM), as described above, is invalid. Marking as such.

Status: REOPENED → RESOLVED

testcase (Alexey Chernyak, 24 years ago; 1.76 KB, text/html)
dom inspector view of a popular website (andreww, 23 years ago; 58.06 KB, image/png)