If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

textContent ignores whitespace

RESOLVED FIXED in mozilla1.8beta3

Status

()

Core
Serializers
RESOLVED FIXED
13 years ago
12 years ago

People

(Reporter: Jason Barnabe (np), Unassigned)

Tracking

({testcase})

Trunk
mozilla1.8beta3
testcase
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

13 years ago
Per bug 172817 comment 72...

W3 spec on textContent:
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Node3-textContent

Testcase coming up.
(Reporter)

Comment 1

13 years ago
Created attachment 177210 [details]
testcase
To be precise, the plaintext serializer in "raw" mode is what munges the
whitespace...  See
http://lxr.mozilla.org/seamonkey/source/content/base/src/nsGenericElement.cpp#233

That shouldn't be happening, should it?  Quite apart from whatever the behavior
of GetTextContent() is supposed to be?
Assignee: general → dom-to-text
Component: DOM → DOM to Text Conversion
OS: Windows XP → All
QA Contact: ian
Hardware: PC → All
Yeah, I think we're backward here.  Seems the only other thing that uses
OutputRaw is .innerHTML (on the HTMLSerializer) and IE strips out all whitespace
there, but we leave it in for that.... We need to fix textContent to keep
whitespace, and innerHTML to strip it.
The HTML and plaintext serializers do quite different things with OutputRaw...

I can get the spaces to not be stripped if I use OutputPreformatted.  But then
things like the newline in:

<body>
<pre>

are included in the textContent.  Should they be?  Again, the spec is a little
vague...

Comment 5

13 years ago
Created attachment 183898 [details] [diff] [review]
.textContent rewrite

I'd already rewritten it using content iterators before discovering this bug.
(I was considering writing nsNode3Tearoff::AppendTextContentTo instead).
It has the advantage of working on nodes without documents ;-)
Both this method and nsIDOMRange::ToString preserve whitespace nodes.
Comment on attachment 183898 [details] [diff] [review]
.textContent rewrite

>Index: nsGenericElement.cpp
...
>+  while (!iter->IsDone()) {
>+    nsIContent* content = iter->GetCurrentNode();
>+    if (content->IsContentOfType(nsIContent::eTEXT)) {
>+      nsITextContent* textContent = NS_STATIC_CAST(nsITextContent*, content);

Eep! Use a real QI here. There's no guarentee that the above will work.

r=me with that
Attachment #183898 - Flags: review+
We still want to fix the serializer brokenness, though...
True, but there's no reason to take the detour through a serializer when doing
.textContent, so that is IMHO a different bug.

Comment 9

13 years ago
From the DOM level 3 core spec:

On getting, no serialization is performed, the returned string does not contain
any markup.
No whitespace normalization is performed and the returned string does not
contain the
white spaces in element content (see the attribute
Text.isElementContentWhitespace [p.96] ).

Does that mean that getting textContent should NOT get the whitespace and we
have the correct behavior?
No, becuase we stripped the whitespace too often, not just when it was marked as
ElementContentWhitespace. We currently don't support ElementContentWhitespace,
and technically aren't required to since we're not a validating parser.

Comment 11

13 years ago
Created attachment 184581 [details] [diff] [review]
QI version

Transferring sicking's r=.
Attachment #183898 - Attachment is obsolete: true
Attachment #184581 - Flags: superreview?(jst)
Attachment #184581 - Flags: review+
Comment on attachment 184581 [details] [diff] [review]
QI version

sr=jst
Attachment #184581 - Flags: superreview?(jst) → superreview+

Updated

13 years ago
Attachment #184581 - Flags: approval1.8b3?
Comment on attachment 184581 [details] [diff] [review]
QI version

a=shaver
Attachment #184581 - Flags: approval1.8b3? → approval1.8b3+

Comment 14

13 years ago
Fix checked in.
Status: NEW → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → FIXED
(Reporter)

Comment 15

13 years ago
*** Bug 295771 has been marked as a duplicate of this bug. ***
(Reporter)

Comment 16

13 years ago
The contents of the pre in the testcase are

"there is whitespace here

and here     and here
	!<!--hey, why not some comments too?-->"


and it outputs like 

"there is whitespace here

and here and here
!hey, why not some comments too?"

Is it the expected behaviour of textContent to strip the whitespace between the
two "and here"s and between the start of the last line and the exclamation mark?

Also, I've filed bug 296657 on how it's outputting the comment node. 
Neil: I didn't tell you to remove the IsContentOfType check ;)

The whitespace between the two "and here"s isn't stripped, it's just that the
alert function won't show it. Try doing just alert("hello      hello")
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Which brings the obvious question. Why do we have nsITextContent? Couldn't we
just move those functions to nsIContent? Right now QI-ing to nsITextContent is
very randomly implemented, textnodes and comments QI to it, but not PIs.
Depends on: 296657
Uh, guys, comment nodes and PIs shouldn't be appearing in textContent. Please
look again at the table at:
   http://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-textContent
(In reply to comment #18)
> Right now QI-ing to nsITextContent is
> very randomly implemented, textnodes and comments QI to it, but not PIs.

Only PIs don't QI to it, because that wouldn't make sense. PIs have a
target/data pair, whereas the others have just some text data.
True, though I still don't see a reason to have to QI to nsITextContent at all.
I can't think of any case where a QI to nsITextContent gives you any additional
information. In most cases you're only interested in textnodes, and in no case I
can think of you're interested in just comments or textnodes. If you're
interested in nodevalues you still have to check for PIs separatly.
Neil, is this last patch in?  Is the bug fixed?

Comment 23

12 years ago
This bug was fixed on 1st of June and bug 296657 was fixed on the 8th...
So is there a reason this is still open?
(Reporter)

Updated

12 years ago
Status: REOPENED → RESOLVED
Last Resolved: 13 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.8beta3
You need to log in before you can comment on or make changes to this bug.