Bug 534071 - CR (&#13;) is rendered as a new line
Status: RESOLVED FIXED
Product: Core
Classification: Components
Component: Layout
Version: Trunk
Platform: All All
Importance: P2 normal
Assigned To: Henri Sivonen (:hsivonen)
http://enux.pl/download/tmp/testxml/t...
Blocks: 557197
Reported: 2009-12-10 15:17 PST by Maciej Jaros
Modified: 2010-06-24 10:40 PDT (History)
hsivonen: in‑testsuite?
final+


Attachments
Testcase (240 bytes, application/xhtml+xml)
2010-04-05 08:49 PDT, :Ms2ger (⌚ UTC+1/+2)
no flags
Testcase with escaped CR-LF (412 bytes, application/xhtml+xml)
2010-04-06 09:44 PDT, :Ms2ger (⌚ UTC+1/+2)
no flags
Potential fix (4.70 KB, patch)
2010-05-11 05:39 PDT, Henri Sivonen (:hsivonen)
roc: review+

Description Maciej Jaros 2009-12-10 15:17:49 PST
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)

There is a bug in rendering of HTML documents made from XML through XSL transform. This seems to happen because of new lines added as entities. I add those entities to have nicely formatted HTML output.

Reproducible: Always

Steps to Reproduce:
1. Create an XSL stylesheet for an XML document.
2. Add <xsl:text>&#13;</xsl:text> inside a table in the output.
3. View the XML in the browser.
Actual Results:  
The new line character adds extra vertical space.

Expected Results:  
The new line character shouldn't change the rendering (it should only change the output XHTML).

Saved output renders correctly.
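
One way to check whether a transform result really carries CR characters is to parse it and inspect the text nodes. The following is only a sketch: the sample markup and the helper name `find_cr_text_nodes` are assumptions made for illustration (this is not the test file from the URL), and it relies on Python's standard `xml.dom.minidom` rather than on a browser.

```python
from xml.dom import minidom

# Hypothetical sample (not the reporter's file): a table with a CR character
# reference between two rows, similar to what <xsl:text>&#13;</xsl:text> emits.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><table>'
    '<tr><td>a</td></tr>&#13;<tr><td>b</td></tr>'
    '</table></body></html>'
)


def find_cr_text_nodes(node):
    """Return the text nodes under `node` whose data contains U+000D."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE and "\r" in child.data:
            found.append(child)
        else:
            # Recurse into elements; text nodes have no children.
            found.extend(find_cr_text_nodes(child))
    return found


doc = minidom.parseString(SAMPLE)
for text in find_cr_text_nodes(doc):
    print(repr(text.data), "inside", text.parentNode.tagName)
```

If the helper reports a text node directly inside the table, the CR has reached the DOM and its rendering is up to the layout engine.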
Comment 1 Tim (fmdeveloper) 2009-12-10 23:41:59 PST
Is it adding a full extra new line as in Bug 197075?
Comment 2 Maciej Jaros 2009-12-10 23:52:54 PST
(In reply to comment #1)
> Is it adding a full extra new line as in Bug 197075?

It does add an extra new line, but this is not the same. PREs should preserve whitespace characters, TABLEs should not.

The url shows you an example:
http://enux.pl/download/tmp/testxml/test.xml
Correct rendering would be:
http://enux.pl/download/tmp/testxml/test.html
(note test.html is exactly what is produced with XSL transform from test.xml)
Comment 3 Tim (fmdeveloper) 2009-12-18 22:33:26 PST
Adding xhtml keyword. 

The only difference I see is that the XML version loads in standards mode and the HTML version loads in quirks mode.
Comment 4 Maciej Jaros 2009-12-27 16:32:25 PST
Standards mode is irrelevant. See:
http://enux.pl/download/tmp/testxml/test.xhtml
Comment 5 :Ms2ger (⌚ UTC+1/+2) 2010-04-05 08:46:35 PDT
This isn't limited to XHTML; the HTML5 parser exhibits the same behavior.
Comment 6 :Ms2ger (⌚ UTC+1/+2) 2010-04-05 08:49:18 PDT
Created attachment 437040 [details]
Testcase
Comment 7 Boris Zbarsky [:bz] 2010-04-06 08:17:35 PDT
This seems like correct behavior per http://www.w3.org/TR/CSS21/text.html#white-space-model (in particular, nothing in there says anything about transforming CR into space, and then the CR should be rendered).

If you think the behavior should be different in spite of you sticking technically invalid characters into the DOM (which is supposed to use LF throughout for newlines), please post a spec change suggestion describing your proposed change to www-style@w3.org.
Comment 8 :Ms2ger (⌚ UTC+1/+2) 2010-04-06 09:44:50 PDT
Created attachment 437319 [details]
Testcase with escaped CR-LF

And what about this one? I think the rule

> 1. Each tab (U+0009), carriage return (U+000D), or space (U+0020) character
>    surrounding a linefeed (U+000A) character is removed if 'white-space' is
>    set to 'normal', 'nowrap', or 'pre-line'.

applies here.
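
A rough sketch of what the quoted rule does to a text run, in Python. It covers only this first step of the white-space processing model, and the function name `strip_around_linefeeds` is made up for illustration; it is not taken from Gecko or from the patch.

```python
import re

# Tabs, carriage returns and spaces on either side of a line feed.
_AROUND_LF = re.compile(r"[\t\r ]*\n[\t\r ]*")

# 'white-space' values for which the quoted rule applies.
COLLAPSING_VALUES = {"normal", "nowrap", "pre-line"}


def strip_around_linefeeds(text, white_space="normal"):
    """Apply step 1 of the CSS 2.1 white-space model to `text`."""
    if white_space not in COLLAPSING_VALUES:
        return text
    return _AROUND_LF.sub("\n", text)


print(repr(strip_around_linefeeds("a \r\n\t b")))  # CR next to LF is dropped
print(repr(strip_around_linefeeds("a\rb")))        # lone CR is left alone
```

Note that a CR with no adjacent LF is not touched by this step, which is the case the first testcase exercises.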

The spec also claims

> The CSS 'white-space' processing model assumes all newlines have been
> normalized to line feeds.

but I'm not sure if that's supposed to be normative.

(I do agree that it's silly to stick CRs in the DOM, though.)
Comment 9 Boris Zbarsky [:bz] 2010-04-06 10:19:01 PDT
> And what about this one? I think the rule

That applies for your second testcase, not the first one.

> but I'm not sure if that's supposed to be normative.

It is, yes.  Generally, the only way to stick a CR into the DOM is to try really hard to put it in there.
Comment 10 Maciej Jaros 2010-04-10 07:54:46 PDT
(In reply to comment #7)
> This seems like correct behavior per
> http://www.w3.org/TR/CSS21/text.html#white-space-model (in particular, nothing
> in there says anything about transforming CR into space, and then the CR should
> be rendered.

First of all, this is a CSS spec; the XHTML spec refers to the XML spec, which says:
"To simplify the tasks of applications, the XML processor MUST behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character." (see http://www.w3.org/TR/xml/#sec-line-ends)

This should make clear that both characters MUST be treated identically, so Gecko seems to be the only engine that doesn't conform to the specification in this area.
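
The quoted normalization can be sketched in a few lines of Python. This is an illustration of the rule as worded above, applied to the raw characters of an entity before parsing; the function name `normalize_line_ends` is an assumption, not part of any XML library.

```python
import re

# The two-character sequence CR LF, or any CR not followed by LF.
# "\r\n" is listed first so a CR LF pair is replaced as a unit.
_LINE_END = re.compile(r"\r\n|\r")


def normalize_line_ends(raw):
    """Translate CR LF and lone CR to a single LF, as in XML 1.0 section 2.11."""
    return _LINE_END.sub("\n", raw)


print(repr(normalize_line_ends("a\r\nb\rc\nd")))
print(repr(normalize_line_ends("a&#13;b")))  # a character reference is plain text here
```

Because the step runs on the input before parsing, a character reference such as `&#13;` is still just six ordinary characters at this point and passes through unchanged.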
Comment 11 Maciej Jaros 2010-04-10 07:59:12 PDT
To add to the above see white space definition and a note here:
http://www.w3.org/TR/xml/#sec-common-syn
Comment 12 Boris Zbarsky [:bz] 2010-04-10 11:37:35 PDT
We do perform newline normalization in various cases.  I seem to recall that it's very purposefully NOT done for &#13;, but I'm not enough of an XML spec-lawyer to tell you why.  I bet Henri is, though!
Comment 13 Henri Sivonen (:hsivonen) 2010-04-12 00:18:54 PDT
(In reply to comment #12)
> We do perform newline normalization in various cases.  I seem to recall that
> it's very purposefully NOT done for &#13;, but I'm not enough of an XML
> spec-lawyer to tell you why.  I bet Henri is, though!

http://www.w3.org/TR/xml/#AVNormalize step 3, first bullet point, adds character references to the normalized value without further processing. Note that the usual XML line break normalization (http://www.w3.org/TR/xml/#sec-line-ends) has taken place before character references are expanded.

So it is correct behavior that &#13; in XML (even in attributes) puts a CR in the DOM.
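
The same ordering can be observed outside the browser. The sketch below uses Python's standard `xml.etree.ElementTree` (backed by expat) purely as an example of a conforming XML processor; it says nothing about Gecko's parser internals, and the sample documents are made up.

```python
import xml.etree.ElementTree as ET

# Literal CR LF in the source: normalized to LF before parsing.
literal = ET.fromstring("<p>a\r\nb</p>")

# CR and LF written as character references: expanded after normalization,
# so the CR reaches the parsed tree untouched.
reference = ET.fromstring("<p>a&#13;&#10;b</p>")

# The same holds for a character reference inside an attribute value.
attribute = ET.fromstring('<p title="a&#13;b"/>')

print(repr(literal.text))
print(repr(reference.text))
print(repr(attribute.get("title")))
```

Only the literal line break is normalized; both character-reference cases keep U+000D in the resulting tree.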

(In reply to comment #9)
> > but I'm not sure if that's supposed to be normative.
> 
> It is, yes.  Generally, the only way to stick a CR into the DOM is to try
> really hard to put it in there.

Surely CSS needs to define the rendering of CR if it is possible to put a CR in the DOM even if it is hard.

Considering that http://www.w3.org/TR/CSS21/text.html#white-space-model step #1 mentions CR, I find it very odd that neither step #3 nor step #4 substep #1 mention CR.
Comment 14 Henri Sivonen (:hsivonen) 2010-04-12 00:31:40 PDT
See also attachment 438430 [details] from bug 557197.
Comment 15 Henri Sivonen (:hsivonen) 2010-05-11 05:39:32 PDT
Created attachment 444639 [details] [diff] [review]
Potential fix

I'll post to www-style about this.
Comment 16 Henri Sivonen (:hsivonen) 2010-05-19 08:17:34 PDT
Comment on attachment 444639 [details] [diff] [review]
Potential fix

In case the CSS WG's response doesn't make it in time for beta, does this approach look like something that could be landed? Or should CR be made more LF-like? Or something yet different?
Comment 17 Johnny Stenback (:jst, jst@mozilla.com) 2010-05-26 16:15:45 PDT
Blocking since this blocks blocker bug 557197.
Comment 18 Henri Sivonen (:hsivonen) 2010-06-04 07:17:03 PDT
fantasai, can you guess what the probable outcome on this issue might be in the CSS WG or when the CSS WG might address this?
Comment 19 Boris Zbarsky [:bz] 2010-06-22 22:36:33 PDT
Comment on attachment 444639 [details] [diff] [review]
Potential fix

This is more roc's kettle of fish.
Comment 20 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2010-06-23 19:55:50 PDT
Comment on attachment 444639 [details] [diff] [review]
Potential fix

code looks OK. But it depends on what the WG says.
Comment 21 fantasai 2010-06-24 01:40:10 PDT
The probable outcome, if there is consistency among implementations, is to make the spec match what implementations do. If there is no consistency among implementations, then... I don't know. Btw, please tag subject lines with the spec (as described in the spec's Status section) when posting to www-style in the future.
Comment 22 Henri Sivonen (:hsivonen) 2010-06-24 02:18:25 PDT
Landed as http://hg.mozilla.org/mozilla-central/rev/3c4932e0058b to get a 1.9.3 blocker addressed. 

Bug 565035 is needed as a follow-up. I'll also follow up to the WG.
Comment 23 Maciej Jaros 2010-06-24 10:40:56 PDT
(In reply to comment #21)
> The probable outcome, if there is consistency among implementations, is to make
> the spec match what implementations do. If there is no consistency among
> implementations, then... I don't know...

The latest Opera and Chrome/Safari display this page as expected:
http://enux.pl/download/tmp/testxml/test.xml

I can check IE9 preview tomorrow if you are interested.
