[FIX] end-of-file handling for @import is wrong

RESOLVED FIXED

Status

()

P3
normal
RESOLVED FIXED
13 years ago
10 years ago

People

(Reporter: annevk, Assigned: zwol)

Tracking

({css2, testcase})

Trunk
css2, testcase
Points:
---
Bug Flags:
wanted1.9.1 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 13 obsolete attachments)

15.25 KB, patch
dbaron
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
560 bytes, text/html; charset=UTF-8
Details
(Reporter)

Description

13 years ago
See <http://www.w3.org/TR/CSS21/syndata.html#parsing-errors>

For:

# <style> @import "test</style>

... a file "test" should be imported.
(Reporter)

Comment 1

13 years ago
Created attachment 209983 [details]
testcase
OS: Windows XP → All
Hardware: PC → All
Created attachment 210074 [details]
Testcase #2

I assume this should also be green?
Created attachment 210075 [details]
Testcase #3

How about this?
Created attachment 210076 [details] [diff] [review]
Patch rev. 1

This makes Testcase 1 & 2 green, but not 3 (it's easy to fix that one too if
you think we should).
Attachment #210076 - Flags: superreview?(dbaron)
Attachment #210076 - Flags: review?(dbaron)
Testcase 2: yes.
Testcase 3: no.

EOF handling should just end all strings, close all parens and braces, and fill in a ';' if needed, basically.
Ok, the patch should fix @import then and other where a string is terminated
by EOF. I suspect there are more cases we do not handle correctly...
An audit of all the place where we have
  if (!ExpectSymbol(aErrorCode, ';', PR_TRUE))
in
http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/layout/style/nsCSSParser.cpp

turns up @namespace and @charset, the other @-rules could have similar
problems (@media @-moz-document @font-face @page).

Testcases appreciated ;-)
Comment on attachment 210076 [details] [diff] [review]
Patch rev. 1

I will do a more complete investigation...
Attachment #210076 - Flags: superreview?(dbaron)
Attachment #210076 - Flags: review?(dbaron)
Assignee: dbaron → mats.palmgren
Created attachment 210096 [details]
Testcase #4, url("...")
Created attachment 210102 [details]
Testcase #5, @-moz-document
Created attachment 210103 [details]
Testcase #6, @namespace
Created attachment 210104 [details]
Testcase #7, @charset
Created attachment 210105 [details]
Testcase #8, @media
Created attachment 210106 [details]
Testcase #9, counter()
Created attachment 210107 [details] [diff] [review]
Patch rev. 2

This fixes all the cases I can find. Haven't run regression testing yet though...
Attachment #210076 - Attachment is obsolete: true
Layout regression tests passed.
Created attachment 224327 [details] [diff] [review]
Patch rev. 3

Same as rev.2 but since the last reference to the PEGroupRuleEOF message
was removed I should remove it from the .properties file too.
Attachment #210107 - Attachment is obsolete: true
Attachment #224327 - Flags: superreview?(dbaron)
Attachment #224327 - Flags: review?(dbaron)
Summary: end-of-file handling for @import is wrong → [FIX] end-of-file handling for @import is wrong
Comment on attachment 224327 [details] [diff] [review]
Patch rev. 3

I'm not happy with the changes you made to suppress error propagation at both callers of SkipDeclaration.  Can you ensure SkipDeclaration sets aErrorCode to NS_OK if it's just a normal EOF and not some other error?  (If that's not the case already.)  And maybe also document that SkipDeclaration's return value is false on EOF and true otherwise?

In GatherURL, ParseMozDocumentRule, and ParseNamespaceRule, and ParseURL, I'm not sure whether the ")" in a url() should be considered an "open expression", but I suppose it's ok.

I'm also a little unsure about removing some of the error messages:  I fear that the removal of error messages will cause an accidental opening { or ( to be unreported.  Have you made sure that all callers of SkipUntil have reported an error?

Could you add a comment in nsCSSScanner::ParseString saying that EOF terminates strings?

Given the differences in platform newline conventions, it also seems like newline followed by EOF should terminate strings without error, although this is probably something we should get clarified in the spec.

With those issues fixed, this looks fine, and you'll have r+sr=dbaron.  Sorry for the huge delay in getting to this.
Attachment #224327 - Flags: superreview?(dbaron)
Attachment #224327 - Flags: superreview-
Attachment #224327 - Flags: review?(dbaron)
Attachment #224327 - Flags: review-
(Assignee)

Comment 18

11 years ago
Created attachment 321685 [details] [diff] [review]
alternative minimal patch, with extensive tests

Looking at Mats' most recent patch, it occurred to me that there was a much simpler way to implement most of the rule: have ExpectSymbol() return true if it encounters EOF when scanning for punctuation that's only ever used to close constructs.  That's ) ] } and ;.  The attached patch implements just this, with a test case that is as comprehensive as I could make it.  With one possible exception, I believe Gecko + this patch is fully in compliance with the CSS2.1 EOF rules (and thus bug 311616 could be closed too).  Note that strings and comments are already treated correctly per the CSS2.1 rules (despite the initial impression one might get reading nsCSSScanner.cpp) and thus no changes were required for that.

The one possible exception is constructs such as

  <style>td[rowspan="3"</style>
  <style>td:lang(en</style>

where the unclosed construct is inside a selector.  These correspond to the TODOs in the test case.  CSS2.1 is clear that these are equivalent to

  <style>td[rowspan="3"]</style>
  <style>td:lang(en)</style>

respectively, but I'm not sure if those are themselves well-formed rules.  The current parser says no (you have to have, at least, an empty { } block afterward).  If the current parser is correct on that count, the TODOs should be dropped and the code changes applied as is; otherwise I'll augment the patch to handle the TODOs.  It's not hard, but I was trying to change as little as possible.

I did not worry about the quality of the parse diagnostics dumped to the error console.  Abstractly, it would be good to issue diagnostics for unclosed constructs, but I had to give up on that because the diagnostic logic is just too fragile.  If quality diagnostics are desired that should be a separate bug (and my proposed solution will be "rewrite nsCSS{Parser,Scanner} from the ground up with that as a primary design goal").
Assignee: mats.palmgren → zweinberg
Attachment #209983 - Attachment is obsolete: true
Attachment #210074 - Attachment is obsolete: true
Attachment #210075 - Attachment is obsolete: true
Attachment #210096 - Attachment is obsolete: true
Attachment #210102 - Attachment is obsolete: true
Attachment #210103 - Attachment is obsolete: true
Attachment #210104 - Attachment is obsolete: true
Attachment #210105 - Attachment is obsolete: true
Attachment #210106 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #321685 - Flags: superreview?(dbaron)
Attachment #321685 - Flags: review?(dbaron)
Comment on attachment 321685 [details] [diff] [review]
alternative minimal patch, with extensive tests

>+    /* CSS2.1 specifies that all "open constructs" are to be closed at
>+       EOF.  It simplifies higher layers if we claim to have found an
>+       ), ], }, or ; if we encounter EOF while looking for one of them.
>+
>+       XXXzw do REPORT_UNEXPECTED_EOF here? */

Hrm.  I have mixed feelings about this question.  I tend to think that the style sheet is invalid and therefore we should.  Also, I tend to think it's probably a useful diagnostic because it often meens something was unclosed from much earlier.

Also, could you follow local comment style -- either C++ "//" comments, or C comments with a * at the start of each line?

>     if (!GetToken(aErrorCode, PR_TRUE)) {
>-      if (aStopSymbol == PRUnichar(0))
>-        return PR_TRUE;
>       REPORT_UNEXPECTED_EOF(PEGatherMediaEOF);
>+      if (aStopSymbol == PRUnichar(0) || aStopSymbol == PRUnichar(';'))
>+        return PR_TRUE;
>       break;

This is wrong, since we use this code for parsing:
 * the media attribute on link and style elements
 * the input to the setter of nsIDOMMediaList::mediaText
through nsICSSParser::ParseMediaList.  That's when it would be called with PRUnichar(0) as aStopSymbol.  In those cases, the input we're given is just a media list, so we should expect the media list to end with EOF.

I haven't looked at the tests yet.
Comment on attachment 321685 [details] [diff] [review]
alternative minimal patch, with extensive tests

>diff --git a/layout/style/test/test_parse_rule.html b/layout/style/test/test_parse_rule.html
>+<link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css" />

What was the rationale behind this change?  I suspect the original author of the test may have left it out to avoid test.css potentially interfering with the tests.  Are you confident that it won't?

>+// CSS2.1 section 4.2
>+"@import 'data:text/css,#a{color:green}'",
>+"@import 'data:text/css,#a{color:green}",

>-todo: {"@import 'data:text/css,#a{color:green}'" : 1,
>-       "@import 'data:text/css,#a{color:green}" : 1,
>+todo: {

The first of these changes is duplicating tests that are already there -- the "todo" sections repeat the contents of the test.  (That's hit a bunch of people in this test.  Maybe we should reorganize it at some point...)

That said, the comment above those tests should perhaps reference 4.2 instead of or in addition to 4.1.5.
Comment on attachment 321685 [details] [diff] [review]
alternative minimal patch, with extensive tests

>diff --git a/layout/style/test/Makefile.in b/layout/style/test/Makefile.in
>@@ -106,6 +106,7 @@ _TEST_FILES =	test_bug73586.html \
> 		test_style_struct_copy_constructors.html \
> 		test_value_storage.html \
> 		test_value_computation.html \
>+		test_css_eof_handling.html \
> 		css_properties.js \
> 		property_database.js \
> 		unstyled.xml \

The current pattern here was (1) all tests, in alphabetical order (2) support files, in some random order.  Would you mind alphabetizing the test?

>diff --git a/layout/style/test/test_css_eof_handling.html b/layout/style/test/test_css_eof_handling.html
>+const tests = [

Any chance you could rename "tag" to "name" or "testname" or "description" or something like that in this array?  I just don't like "tag" because I think tags are things like "<H1>".

>+  {
>+      tag: "basic rule",
>+      ref: "#r {background-color : orange}",
>+      tst: "#t {background-color : orange",
>+      prop: "backgroundColor", pseudo: ""

If you want, you could use "background-color" instead of "backgroundColor", and then replace refStyle[prop] with refStyle.getPropertyValue(prop).  That might be clearer / less confusing to those adding tests here later.

>+      todo: 1,

Are there bugs filed on the todo items?  Either way, could you make sure they're filed, and add the bug number next to the 'todo'?

>+      tag: "selector 1",
>+      tst: 'td[colspan="3',
>+      rule: 'td[colspan="3"] { }'

I'd prefer not making the test so dependent on our current canonicalization, since that's very much subject to change in the future (especially when specs get written to say how it should work).  (I somewhat prefer having tests that we don't think somebody's going to have to change in the future, since the person who has to change it might get things wrong and change it so it no longer tests what it was originally supposed to test.)

A way to avoid that dependency on our current canonicalization would be to canonicalize *both* test.tst and test.rule, rather than just canonicalizing test.tst and comparing that to test.rule.

Since we're also likely to improve the fidelity of reserialization, I'd also suggest that you make the following changes to "rule" so that future improvements to reserialization fidelity don't break the test:

>+      tst: "@charset 'utf-8",
>+      rule: "@charset \"utf-8\";"

>+      tst: "@charset 'utf-8'",
>+      rule: "@charset \"utf-8\";"

Use the same quote style (' rather than ") for these two.

>+      tst: "@media all {",
>+      rule: "@media all {\n}"

No need for the \n here if you take the canonicalize-both approach.

>+      tst: "@namespace foo url('http://foo.example.com/')",
>+      rule: "@namespace foo url(http://foo.example.com/);"

Same quote style in rule here as well.

>+      tag: "@namespace 1c",
>+      tst: "@namespace foo url('http://foo.example.com/",
>+      rule: "@namespace foo url(http://foo.example.com/);"

... and here

>+      tag: "@namespace 1d",
>+      tst: "@namespace foo 'http://foo.example.com/'",
>+      rule: "@namespace foo url(http://foo.example.com/);"

...and here

>+      tag: "@namespace 1e",
>+      tst: "@namespace foo 'http://foo.example.com/",
>+      rule: "@namespace foo url(http://foo.example.com/);" 

... and here

... and all the namespace 2* tests...

>+      tag: "@-moz-document 1",
>+      tst: '@-moz-document domain(example.com) {',
>+      rule: '@-moz-document domain("example.com")  {\n}'
>+  },
>+  {
>+      tag: "@-moz-document 2",
>+      tst: '@-moz-document domain(example.com) { p {',
>+      rule: '@-moz-document domain("example.com")  {\n  p {  }\n}'

And both same quoting style and no \n here.

>+const basestyle = ("table {\n"+
>+		   "    border-collapse: collapse;\n"+
>+		   "}\n"+
>+		   "td {\n"+
>+		   "    width: 1.5em;\n"+
>+		   "    height: 1.5em;\n"+
>+		   "    border: 1px solid black;\n"+
>+		   "    text-align: center;\n"+
>+		   "    margin: 0;\n"+
>+		   "}\n"+
>+		   "tr { counter-increment: tr }\n");

What's this for?

>+/* This is more complicated than it might look like it needs to be,
>+   because for each subtest we have to splat stuff into the iframe,
>+   allow the renderer to run, and only then interrogate the computed
>+   styles.  */

For what it's worth, I think there's a slightly cleaner approach you could have taken:
 * make the iframe a separate document that you link to
 * give the style element inside it an id attribute, so you can get it from the script
 * set iframe.contentDocument.getElementById("styleelement").firstChild.data to the contents of the style sheet you want for each part of the test
 * for the canonicalization of test.rule, you could have a second style element.

However, a bunch of other tests already use this approach, and it's probably a good idea to be testing a bunch of different things, so I'm actually in favor of keeping what you've done.

However, for the canonicalization of test.rule within what you've already done, you could probably do something like dynamically add a second style element to the iframe (document.createElement("style"), document.getElementsByTagName("head")[0].appendChild(style), etc.).


I think I'm done reviewing at this point.  (Sorry it took as long as it did.)  Could you post a new patch that addresses the comments I've made?

(I'm marking it as review denied, since my threshhold for whether to mark it as granted or denied is whether I want to look at the revised patch; that's not only a function of patch quality but also a function of how much experience I have working with the person whose code I'm reviewing.)
Attachment #321685 - Flags: superreview?(dbaron)
Attachment #321685 - Flags: superreview-
Attachment #321685 - Flags: review?(dbaron)
Attachment #321685 - Flags: review-
(In reply to comment #21)
> I think I'm done reviewing at this point.  (Sorry it took as long as it did.) 
> Could you post a new patch that addresses the comments I've made?

That said, I usually manage to notice something else the second time around...
(Assignee)

Comment 23

11 years ago
(In reply to comment #21)
>>+      todo: 1,
>
> Are there bugs filed on the todo items?  Either way, could you make
> sure they're filed, and add the bug number next to the 'todo'?

I've pulled this one out to respond to separately from the rest of
your comments because I don't want it to get lost in a sea of mostly
"yep, will make that change".

All the todo items are of the form I was talking about in comment #18:
the unclosed construct is inside a selector.  CSS2.1 is clear that

 <style>td[rowspan="3"</style>

should be parsed as if it read

 <style>td[rowspan="3"]</style>

but is not clear whether the parser should go on and fill in an empty
property list,

 <style>td[rowspan="3"] {}</style>

The present parser (even after my patch) does not fill in the empty
property list, and discards the rule altogether.  If it is correct to
do that, the todo items should just be dropped.  If it is not correct,
then I'll file that as a separate bug.
(Assignee)

Comment 24

11 years ago
Created attachment 322687 [details] [diff] [review]
alternative minimal patch, revised

Here's a revised patch which addresses nearly all of your comments.  Please see comment #23 for a critical remaining question, though.

>>+       XXXzw do REPORT_UNEXPECTED_EOF here? */
>
> Hrm.  I have mixed feelings about this question.  I tend to think
> that the style sheet is invalid and therefore we should.  Also, I
> tend to think it's probably a useful diagnostic because it often
> meens something was unclosed from much earlier.

That was my feeling as well.  The catch is that I have to introduce a
new variant on REPORT_UNEXPECTED_EOF or else a new translatable.  In
the revised patch I went for the former.

> Also, could you follow local comment style -- either C++ "//"
> comments, or C comments with a * at the start of each line?

Fixed.

>>-      if (aStopSymbol == PRUnichar(0))
>>-        return PR_TRUE;
>>       REPORT_UNEXPECTED_EOF(PEGatherMediaEOF);
>>+      if (aStopSymbol == PRUnichar(0) || aStopSymbol == PRUnichar(';'))
>>+        return PR_TRUE;
>
> This is wrong, since [...] in [x,y,z] cases, the input we're given is
> just a media list, so we should expect the media list to end with
> EOF.

So what you're saying is, when aStopSymbol==PRUnichar(0), we should
return PR_TRUE and not issue a diagnostic?  I made that change in the
revised patch.

>>+<link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css" />
>
> What was the rationale behind this change?  I suspect the original
> author of the test may have left it out to avoid test.css potentially
> interfering with the tests.  Are you confident that it won't?

I changed it to be consistent with the other tests, and to make it
easier to read the detailed results of that test.  I didn't do a
detailed audit, but test.css applies styles only to a short list of
classes: test_ok, test_not_ok, all_pass, some_fail, and test_report.
None of those strings appear anywhere in test_parse_rule.html.  All of
the actual tests in test_parse_rule appear to select the same DIV tag,
which never has any of those classes applied to it.  Also,
empirically, it didn't break anything.

>>+// CSS2.1 section 4.2
>>+"@import 'data:text/css,#a{color:green}'",
>>+"@import 'data:text/css,#a{color:green}",
>
> [This] is duplicating tests that are already there -- the "todo"
> sections repeat the contents of the test.  (That's hit a bunch of
> people in this test.  Maybe we should reorganize it at some point...)
>
> That said, the comment above those tests should perhaps reference
> 4.2 instead of or in addition to 4.1.5.

I see the duplicates now.  I took the addition back out and put "and
4.2" at the end of the comment referencing 4.1.5.

> The current pattern [in the Makefile] was (1) all tests, in
> alphabetical order (2) support files, in some random order.  Would
> you mind alphabetizing the test?

Done.

> Any chance you could rename "tag" to "name" or "testname" or
> "description" or something like that in this array?  I just don't
> like "tag" because I think tags are things like "<H1>".

Also done (I used "name").

>>+      tst: "#t {background-color : orange",
>>+      prop: "backgroundColor", pseudo: ""
>
> If you want, you could use "background-color" instead of
> "backgroundColor", and then replace refStyle[prop] with
> refStyle.getPropertyValue(prop).

Good idea; I didn't know about getPropertyValue.  Changed.

>>+      tag: "selector 1",
>>+      tst: 'td[colspan="3',
>>+      rule: 'td[colspan="3"] { }'
>
>I'd prefer not making the test so dependent on our current
>canonicalization, since that's very much subject to change in the
>future (especially when specs get written to say how it should work).

Sure.  I think this will actually simplify the code a bit, for the
property-matching test cases I already have a second <style>
element...  and yes also to the quoting consistency and so on.  I
almost did something like that myself to get around whitespace
significance, but I was concerned that it might be legitimate to
canonicalize a rule with no properties into no rule at all (since it
has no effect).  If we want to go in the direction of reserialization
fidelity, though, that obviously won't be happening.

>>+const basestyle = ("table {\n"+
>>+		   "    border-collapse: collapse;\n"+
>>+		   "}\n"+
>>+		   "td {\n"+
>>+		   "    width: 1.5em;\n"+
>>+		   "    height: 1.5em;\n"+
>>+		   "    border: 1px solid black;\n"+
>>+		   "    text-align: center;\n"+
>>+		   "    margin: 0;\n"+
>>+		   "}\n"+
>>+		   "tr { counter-increment: tr }\n");
>
>What's this for?

It is largely a remnant of the original non-automated version of this
test case.  I had a table with a whole bunch of rows, each row with
two cells, test style on the left and reference style on the right,
and the above was to make the overall appearance consistent.  The
pairs of table cells still do appear briefly in the iframe, which is
visible, if you load the page separately from the rest of the
mochitests.  I could take most of it out but I don't think it's doing
any harm.

The counter-increment declaration is necessary, though, to make the
test that uses counter() work.
Attachment #224327 - Attachment is obsolete: true
Attachment #321685 - Attachment is obsolete: true
Attachment #322687 - Flags: superreview?
Attachment #322687 - Flags: review?
(Assignee)

Comment 25

11 years ago
Comment on attachment 322687 [details] [diff] [review]
alternative minimal patch, revised

r+sr? dbaron (for real this time)
Attachment #322687 - Flags: superreview?(dbaron)
Attachment #322687 - Flags: superreview?
Attachment #322687 - Flags: review?(dbaron)
Attachment #322687 - Flags: review?
One other thing I need to look into is whether we need to add OUTPUT_ERROR lines; we normally suppress errors when the failure isn't propagated (and we need to, since in a bunch of cases we try parsing things in a bunch of different ways).
(Assignee)

Updated

11 years ago
Flags: wanted1.9.1?
Priority: -- → P3
(In reply to comment #23)
> The present parser (even after my patch) does not fill in the empty
> property list, and discards the rule altogether.  If it is correct to
> do that, the todo items should just be dropped.  If it is not correct,
> then I'll file that as a separate bug.

I think it needs to be filed as a separate bug.
Created attachment 325366 [details]
testcase for whether OUTPUT_ERROR is needed

It looks like it's not needed, although I'd like to look at the code a little more to understand why not.
Flags: wanted1.9.1? → wanted1.9.1+
Comment on attachment 322687 [details] [diff] [review]
alternative minimal patch, revised

Well, I went through the existing CLEAR_ERROR calls and I don't any that scare me.

r+sr=dbaron

I have this prepared to land, so I'll land it later today.
Attachment #322687 - Flags: superreview?(dbaron)
Attachment #322687 - Flags: superreview+
Attachment #322687 - Flags: review?(dbaron)
Attachment #322687 - Flags: review+
(Assignee)

Comment 33

11 years ago
I believe this landing allows us to close bug 311616 as well.  I'm not going to do it just yet, in case someone knows different.
You need to log in before you can comment on or make changes to this bug.