Bug 566280
Opened 14 years ago
Closed 14 years ago
[HTML5] Plain text prefixed by U+0000 displays only U+FFFD
Categories
(Core :: DOM: HTML Parser, defect, P2)
Tracking
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | final+ |
People
(Reporter: hsivonen, Assigned: hsivonen)
References
Details
(Keywords: regression)
Attachments
(2 files, 3 obsolete files)
13 bytes, text/html
5.15 KB, patch (sicking: review+)
Steps to reproduce:
1) Load the attachment.

Expected results:
�hello world

Actual results:
�
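For illustration, a test case of this shape can be generated with a few lines of Python. This is only a sketch, not the attachment itself: the exact bytes and the file name `nul-prefix.html` are assumptions, since the report lists the attachment only as 13 bytes of text/html.

```python
import os
import tempfile

# Assumed content: one zero byte, then plain text and a newline.
# The report does not show the attachment's actual bytes.
testcase = b"\x00hello world\n"

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "nul-prefix.html")
with open(path, "wb") as f:
    f.write(testcase)

# Size on disk of the generated test case
size = os.path.getsize(path)
```

Loading the resulting file in a browser should then exercise the same code path as the steps above, assuming the attachment has a similar layout.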
Updated•14 years ago
blocking2.0: --- → ?
Keywords: regression
Updated•14 years ago
Assignee: nobody → hsivonen
Comment 2•14 years ago (Assignee)
Comment 3•14 years ago (Assignee)
Attachment #447722 - Attachment is obsolete: true
Comment 4•14 years ago (Assignee)
zwol, HTML5 invalidates http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/layout/reftests/bugs/228856-2.html?force=1 because the HTML5 parsing algorithm turns U+0000 into U+FFFD before it reaches the CSS parser. The test has been passing only by accident, due to this bug. What should be done to 228856-2.html when landing this fix?
Comment 5•14 years ago
Is there a specification that explicitly calls for U+0000 to be replaced by U+FFFD? That seems odd to me; if anything, I'd have expected to see a hexbox rather than a Unicode REPLACEMENT CHARACTER. U+FFFD would normally indicate an encoding error (e.g. an invalid UTF-8 sequence or unpaired UTF-16 surrogate, or an invalid code in a legacy codepage that cannot be transcoded to Unicode), not merely a correctly-encoded character that we can't display.
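The distinction drawn in this comment can be seen with any UTF-8 decoder. As a rough sketch in Python (not Gecko code): an invalid byte sequence decodes to U+FFFD when errors are replaced, whereas a zero byte is a validly encoded U+0000 and passes through the decoder unchanged.

```python
# An invalid UTF-8 byte is an encoding error; with errors="replace"
# the decoder substitutes U+FFFD for it.
invalid = b"\xffhello".decode("utf-8", errors="replace")

# A zero byte is a correctly encoded U+0000, so the decoder keeps it.
nul = b"\x00hello".decode("utf-8", errors="replace")
```

Any replacement of U+0000 therefore has to come from a later stage, such as the HTML parser, rather than from character decoding.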
Comment 6•14 years ago
The test is really about what U+0000 does to the CSS parser, so you should definitely pull the contents of the <style> tag out to a separate sheet. I'm not sure what to do with the divs, though. Does <div something="..&#0;.."> still generate an attribute with a literal NUL in its value? If so, we could probably just delete the subtests with literal NULs in the input, and rely on the &#0;s. If not, we need to convert this to a mochitest that uses JS to examine the parsed style sheet, which is a thing I can do if you don't know how.
Comment 7•14 years ago (Assignee)
(In reply to comment #6)
> The test is really about what U+0000 does to the CSS parser, so you should
> definitely pull the contents of the <style> tag out to a separate sheet.

OK.

> I'm not sure what to do with the divs, though. Does <div something="..&#0;..">
> still generate an attribute with a literal NUL in its value?

&#0; generates U+FFFD per HTML5.

> If so, we could
> probably just delete the subtests with literal NULs in the input, and rely on
> the &#0;s. If not, we need to convert this to a mochitest that uses JS to
> examine the parsed style sheet, which is a thing I can do if you don't know
> how.

I don't, so it would be nice if you'd do it to make sure the test still tests what you intended.
Comment 8•14 years ago (Assignee)
(In reply to comment #5)
> Is there a specification that explicitly calls for U+0000 to be replaced by
> U+FFFD?

Yes, the HTML5 spec.

The zero byte:
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#preprocessing-the-input-stream

The numeric reference:
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenizing-character-references
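A minimal sketch of the two rules as this comment describes them, written in Python rather than taken from the parser. `replace_nul` is a hypothetical helper introduced for illustration; `html.unescape` is the Python standard-library function, which follows the HTML5 character reference rules and maps a numeric reference to code point 0 to U+FFFD.

```python
import html

REPLACEMENT_CHARACTER = "\ufffd"


def replace_nul(text: str) -> str:
    """Hypothetical helper: replace each literal U+0000 with U+FFFD."""
    return text.replace("\x00", REPLACEMENT_CHARACTER)


# Literal zero character in the input
from_literal = replace_nul("\x00hello world")

# Numeric character reference to code point 0
from_charref = html.unescape("&#0;hello world")
```

Under these assumptions both inputs end up as U+FFFD followed by "hello world", which matches the expected result in the bug description.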
Comment 9•14 years ago
(In reply to comment #7)
> > we need to convert this to a mochitest that uses JS to
> > examine the parsed style sheet, which is a thing I can do if you don't know
> > how.
>
> I don't, so it would be nice if you'd do it to make sure the test still tests
> what you intended.

I will try to find time for this next week. Note that Monday is a holiday in the USA.

(In reply to comment #8)
> (In reply to comment #5)
> > Is there a specification that explicitly calls for U+0000 to be replaced by
> > U+FFFD?
>
> Yes, the HTML5 spec.

CSS presently doesn't define the behavior of U+0000 either as a literal character or as a \-escape. It is tempting to propose that CSS change to match HTML5 - it's not like there's any cost to doing so, and we'd gain predictability. dbaron, fantasai, what do you think?
Comment 10•14 years ago (Assignee)
Splitting out the first <style> into a <link rel=stylesheet> was enough to make 228856-2.html not fail. The binary patch that adds a reftest for this bug is like the reference for the test except there's a zero byte where the reference has �.
Attachment #447960 - Attachment is obsolete: true
Attachment #448355 - Flags: review?(jonas)
Comment 11•14 years ago (Assignee)
Forgot to update a copyright year.
Attachment #448355 - Attachment is obsolete: true
Attachment #448371 - Flags: review?(jonas)
Attachment #448355 - Flags: review?(jonas)
Comment 12•14 years ago (Assignee)
Comment on attachment 448355
Fix bad copypasta, make the reftest reference work on the tinderbox, make an older reftest not fail

(In reply to comment #11)
> Forgot to update a copyright year.

Sorry. Wrong bug.
Attachment #448355 - Attachment is obsolete: false
Attachment #448355 - Flags: review?(jonas)
Updated•14 years ago (Assignee)
Attachment #448371 - Attachment is obsolete: true
Attachment #448371 - Flags: review?(jonas)
Comment 13•14 years ago
Henri, when exactly were these rules for U+0000 and &#0; added to HTML5? If there was public discussion of this change, a pointer to that would also be useful.
Comment 14•14 years ago (Assignee)
(In reply to comment #13)
> Henri, when exactly were these rules for U+0000 and &#0; added to HTML5?

http://html5.org/tools/web-apps-tracker?from=13&to=14

> If
> there was public discussion of this change, a pointer to that would also be
> useful.

I can't find a public discussion of this change. I can find some emails where I whined about U+0000 getting dropped without a parse error, but I don't see email from me or Hixie about mapping it to U+FFFD.
Attachment #448355 - Flags: review?(jonas) → review+
Comment 17•14 years ago (Assignee)
http://hg.mozilla.org/mozilla-central/rev/14bb99ed59c8
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED