Closed
Bug 437915
Opened 15 years ago
Closed 15 years ago
acid3 test 33 fails: whitespace error in class processing
Categories
(Core :: CSS Parsing and Computation, defect)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
mozilla1.9.1a1
People
(Reporter: dbaron, Assigned: dbaron)
References
()
Details
Attachments
(5 files)
251 bytes,
text/html; charset=UTF-8
|
Details | |
2.30 KB,
text/html; charset=UTF-8
|
Details | |
8.84 KB,
patch
|
bzbarsky
:
review+
sicking
:
superreview+
|
Details | Diff | Splinter Review |
2.41 KB,
text/html; charset=UTF-8
|
Details | |
3.03 KB,
text/html; charset=UTF-8
|
Details |
Test 33 in http://acid3.acidtests.org/ fails with the message "whitespace error in class processing". I'll attach a simplified testcase (which Webkit passes). I still need to figure out what spec says our behavior is wrong and what it says we should do instead.
Assignee | ||
Comment 1•15 years ago
|
||
Assignee | ||
Comment 2•15 years ago
|
||
CSS 2.1 says: [att~=val] Match when the element's "att" attribute value is a space-separated list of "words", one of which is exactly "val". If this selector is used, the words in the value must not contain spaces (since they are separated by spaces). ... that was a change from earlier drafts, I think. I think Acid3 is testing something that's since been changed in the specs.
Assignee | ||
Comment 3•15 years ago
|
||
It's actually always been that way in CSS2 / CSS 2.1. I do recall the WG discussing the discrepancy at some point, though.
Assignee | ||
Comment 4•15 years ago
|
||
It's also possible the discussion was about the internal discrepancy in css3-selectors between the summary table in section 2 and the text in 6.3.1, but that looks like it was fixed by Hixie in http://lists.w3.org/Archives/Member/w3c-css-wg/2006JanMar/0073.html without it being discussed elsewhere. Then again, I really do remember a discussion, but I can't find any record of it. Maybe the discussion I remember is this one, though: http://lists.w3.org/Archives/Public/www-style/2005Mar/0107.html http://lists.w3.org/Archives/Public/www-style/2005Mar/0111.html http://lists.w3.org/Archives/Public/www-style/2005Mar/0120.html http://lists.w3.org/Archives/Public/www-style/2007Mar/0000.html http://lists.w3.org/Archives/Public/www-style/2007Mar/0001.html
Assignee | ||
Comment 5•15 years ago
|
||
Ah, the discussion I was remembering was this one, I think: http://lists.w3.org/Archives/Member/w3c-css-wg/2005OctDec/0093.html http://lists.w3.org/Archives/Member/w3c-css-wg/2005OctDec/0158.html http://lists.w3.org/Archives/Member/w3c-css-wg/2005OctDec/0160.html ... which resulted in the change being made to css3-selectors between: http://www.w3.org/TR/2001/CR-css3-selectors-20011113/#attribute-selectors and http://www.w3.org/TR/2005/WD-css3-selectors-20051215/#attribute-representation So in this case, the newer change was to css3-selectors, and this probably is a bug. But we should probably get the change backported to CSS 2.1.
Assignee | ||
Comment 6•15 years ago
|
||
So, it turns out we need to fix this bug separately for ~= and for class. For ~= we need to patch ValueIncludes in nsCSSRuleProcessor.cpp. class attribute parsing, however, lives somewhere in content code, perhaps multiple places.
Comment 7•15 years ago
|
||
(In reply to comment #0) > Test 33 in http://acid3.acidtests.org/ fails with the message "whitespace error > in class processing". I'll attach a simplified testcase (which Webkit passes). > > I still need to figure out what spec says our behavior is wrong and what it > says we should do instead. Based exclusively upon looking at the test case that you uploaded in comment #2, the relevant specification seems to be that of HTML 4.01. Section 7.5.2 [1] says that class names specified in |class| attribute values need to be separated by white space characters. White space characters are defined in section 9.1 [2] to be the following: U+0009 <control> (HORIZONTAL TABULATION) U+000B <control> (VERTICAL TABULATION) U+000C <control> (FORM FEED) U+0020 SPACE Also included are the line‐breaking characters [3]: U+000A <control> (LINE FEED) U+000D <control> (CARRIAGE RETURN) U+000A U+000D <control> (LINE FEED) <control> (CARRIAGE RETURN) Therefore, the |class| attribute in your test should be equivalent to |a b c d e f g h|. [1] http://www.w3.org/TR/html401/struct/global.html#adef-class [2] http://www.w3.org/TR/html401/struct/text.html#h-9.1 [3] http://www.w3.org/TR/html401/struct/text.html#line-breaks
Assignee | ||
Comment 8•15 years ago
|
||
But unless HTML defines normalization, what CSS matches is defined by CSS, which is different.
Assignee | ||
Comment 9•15 years ago
|
||
HTML 5, at http://www.whatwg.org/specs/web-apps/current-work/#space , defines white space as: "U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR)". css3-selectors says: "Only the characters "space" (U+0020), "tab" (U+0009), "line feed" (U+000A), "carriage return" (U+000D), and "form feed" (U+000C) can occur in white space." What we currently implement (in both places) uses nsCRT::IsAsciiSpace, which NS_IsAsciiWhitespace, which checks for ' ', '\r', '\n', and '\t'. So it seems like the following characters count as whitespace: U+0020: HTML 4.01, HTML 5, css3-selectors, current implementation U+000D: HTML 4.01, HTML 5, css3-selectors, current implementation U+000A: HTML 4.01, HTML 5, css3-selectors, current implementation U+0009: HTML 4.01, HTML 5, css3-selectors, current implementation U+000C: HTML 4.01, HTML 5, css3-selectors U+000B: HTML 4.01, HTML 5 I'd note that Hixie's test tests U+000C, but not U+000B. I'm curious what other implementations do here.
Assignee | ||
Comment 10•15 years ago
|
||
Assignee | ||
Comment 11•15 years ago
|
||
Webkit trunk from a few days/weeks ago follows css3-selectors. IE7 follows HTML. Opera 9.27 and 9.5b2 consider all but two of the characters as whitespace.
Assignee | ||
Comment 12•15 years ago
|
||
I think your summary of HTML4 was actually incorrect, though. It makes U+200B (zero-width space) a whitespace character, not U+000B (vertical tab). So what IE7 matches is HTML5.
Comment 13•15 years ago
|
||
(In reply to comment #8) > But unless HTML defines normalization, what CSS matches is defined by CSS, > which is different. I figured that the HTML parser would split the names based on its own white space rules then pass the separated names to the CSS parser for use with class selectors. I guess not. (In reply to comment #12) > I think your summary of HTML4 was actually incorrect, though. It makes U+200B > (zero-width space) a whitespace character, not U+000B (vertical tab). Yes, I noticed this myself and was going to correct it before you replied. Somehow I turned a two into a zero. > So what IE7 matches is HTML5. Using your comment #10 test case as a measure, Win. Internet Explorer 8 Beta 1 (8.0.6001.17184) treats all of the following characters as white space for the purposes of class selector matching: U+0009, U+000A, U+000B, U+000C, U+000D, and U+0020. (Interestingly enough, this includes U+000B.) The “Emulate IE7” mode behaves identically. Based on comment #9, IE8 and its IE7 emulation mode match the rules of css3-selectors and do not match HTML 4.01/5 rules (U+200B is missing).
Comment 14•15 years ago
|
||
(In reply to comment #13) > Based on comment #9, IE8 and its IE7 emulation mode match the rules of > css3-selectors and do not match HTML 4.01/5 rules (U+200B is missing). Sorry, it matches HTML 5 and not HTML 4.01.
Assignee | ||
Comment 15•15 years ago
|
||
So, for now, I'm going to patch this to match WebKit and css3-selectors, since that's the smaller change and the obviously correct part. I think HTML's definition of the class attribute needs to match css3-selectors's definition of attribute selectors (but that need not match other whitespace handling in either language), but I don't have the energy for the spec fight right now.
Assignee: nobody → dbaron
Assignee | ||
Comment 16•15 years ago
|
||
This does as I described in my previous comment -- adding only U+000C (form feed) to our set of whitespace characters.
Attachment #324389 -
Flags: superreview?(jonas)
Attachment #324389 -
Flags: review?(bzbarsky)
![]() |
||
Comment 17•15 years ago
|
||
Comment on attachment 324389 [details] [diff] [review] patch r=bzbarsky
Attachment #324389 -
Flags: review?(bzbarsky) → review+
Comment 18•15 years ago
|
||
I'd like to make HTML5 consistent with CSS and with the most widely deployed browser. My impression is that changing CSS will be easier. Any chance we can push back on the CSSWG and have CSS include the same whitespace characters?
Assignee | ||
Comment 19•15 years ago
|
||
Sure, I'm ok with going that direction; I almost did that yesterday, actually.
Assignee | ||
Comment 20•15 years ago
|
||
Comment 21•15 years ago
|
||
k, mailed the csswg
Assignee | ||
Comment 22•15 years ago
|
||
Here's a similar one that works in IE. IE's CSS parsing is really hard to test since it recovers from errors in ridiculous ways (e.g., it accepts "1pxA2pxA3pxA4px" as a value for margin, meaning "1px 2px 3px 4px" -- this makes it hard to tell what is whitespace versus what's "not part of identifiers", though this test tries two approaches). But it seems that IE uses the same or very similar lists of characters for whitespace in HTML and CSS, and I'd really rather not change CSS's overall definition of whitespace here, since that's been stable for over a decade and is implemented correctly by everyone other than IE. So, actually, I think HTML5 should change.
Assignee | ||
Updated•15 years ago
|
Attachment #324515 -
Attachment is patch: false
Attachment #324515 -
Attachment mime type: text/plain → text/html; charset=UTF-8
Comment on attachment 324389 [details] [diff] [review] patch _Personally_ I would prefer if what's considered whitespace was consistent across CSS, HTML and XML. And even better if that list was as short as possible. But I'll rely on dbaron here and he says this is what we want to do at least for now :)
Attachment #324389 -
Flags: superreview?(jonas) → superreview+
Comment 24•15 years ago
|
||
I removed U+000B from HTML5.
Assignee | ||
Comment 25•15 years ago
|
||
Fixed: http://hg.mozilla.org/mozilla-central/index.cgi/rev/b220fe3a520b
Status: NEW → RESOLVED
Closed: 15 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.1
Assignee | ||
Updated•15 years ago
|
Target Milestone: mozilla1.9.1 → mozilla1.9.1a1
You need to log in
before you can comment on or make changes to this bug.
Description
•