url() tokenization should not be contextual

RESOLVED FIXED in mozilla5

Status

()

defect
P3
normal
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: dbaron, Assigned: dbaron)

Tracking

(Blocks 1 bug, {css2})

Trunk
mozilla5
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(7 attachments)

Right now the CSS scanner has a separate entry point for tokenizing URIs.  This is incorrect according to the spec; it means that our tokenization is incorrect when we encounter a url() at an unexpected location.

This makes us fail:
http://test.csswg.org/suites/css2.1/20100917/xhtml1/uri-016.xht
http://test.csswg.org/suites/css2.1/20100917/html4/uri-016.htm
I have a seven patch series in my patch queue for this:

share-gather-url
rename-tokens
move-eating-whitespace
fix-indent-next-url
disallow-control-in-url
url-include-close-paren
url-noncontextual

in http://hg.mozilla.org/users/dbaron_mozilla.com/patches/file/fdc0e62eada4

though it still needs a little more work, and more tests.
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
This fixes another bug I noticed while looking at the spec (basically independent of this bug, really).
Attachment #518526 - Flags: review?(bzbarsky)
Attachment #518524 - Attachment description: patch 3: move consumption of leading whitespace inside url() into scanner → patch 3: move consumption of leading whitespace (for all forms) and trailing whitespace (for string forms) inside url() into scanner
Also, I'm aware that these changes might cause an incorrect leading url( on error messages while parsing domain() or url-prefix(), but I don't think it's a serious problem.
Comment on attachment 518522 [details] [diff] [review]
patch 1: use GatherURL more, and rename it

>+++ b/layout/style/nsCSSParser.cpp
>+CSSParserImpl::ParseURLOrString(nsString& aURL)
>+  UngetToken();

This will change the behavior of this testcase:

  <style>div { color: red }</style>
  <style>@import ; div { color: green; }</style>
  <div>This should be green</div>

to make the div green, not red.  I think that's good, and will make us match Opera and Webkit here, but we should add a test for this.  This looks like the @import version of bug 589672.

(Incidentally, if I use "media" there, not "import", then we match Webkit but not Opera; I'm not sure which is correct.)

r=me with that.
Attachment #518522 - Flags: review?(bzbarsky) → review+
Comment on attachment 518523 [details] [diff] [review]
patch 2: rename tokens to match spec

r=me
Attachment #518523 - Flags: review?(bzbarsky) → review+
Comment on attachment 518524 [details] [diff] [review]
patch 3: move consumption of leading whitespace (for all forms) and trailing whitespace (for string forms) inside url() into scanner

r=me
Attachment #518524 - Flags: review?(bzbarsky) → review+
Comment on attachment 518525 [details] [diff] [review]
patch 4: fix indent

r=me fwiw
Attachment #518525 - Flags: review+
Comment on attachment 518526 [details] [diff] [review]
patch 5: disallow control characters in unquoted url()

r=me
Attachment #518526 - Flags: review?(bzbarsky) → review+
Comment on attachment 518529 [details] [diff] [review]
patch 6: include the close paren in the URL token, and make quoted URLs produce the URL/Bad_URL tokens

>+++ b/layout/style/nsCSSParser.cpp
>+  if (eCSSToken_URL != mToken.mType) {
>     // in the failure case, we do not have to match parentheses, since
>     // this is now an invalid URL token.

This comment doesn't really make sense, imo.

>+++ b/layout/style/nsCSSScanner.cpp
>+    case eCSSToken_URL:
>+    case eCSSToken_Bad_URL:
>+      if (mSymbol != PRUnichar(0)) {
>+        aBuffer.Append(mSymbol);
>+      }
>+      aBuffer.Append(mIdent);
>+      if (mSymbol != PRUnichar(0)) {
>+        aBuffer.Append(mSymbol);
>+      }
>+      if (mType == eCSSToken_URL) {
>+        aBuffer.Append(PRUnichar('('));
>+      }
>+      break;

I don't follow this code.  There are four cases here, I think:

1) eCSSToken_URL with quotes; eg: url('foo').  Then mIdent is the thing between 
   the quotes and mSymbol is the quote mark.  So we'd output: 'foo'(
2) eCSSToken_Bad_URL with quotes.  Then we'd output something like 'foo' even
   though the string was not terminated?
3) eCSSTokenURL without quotes; eg: url(foo).  Then we output: foo(
4) eCSSToken_Bad_URL without quotes.  Then we output: foo

Am I just missing something?  It looks like the old code didn't do a very good job here either...
Comment on attachment 518529 [details] [diff] [review]
patch 6: include the close paren in the URL token, and make quoted URLs produce the URL/Bad_URL tokens

Oh, I see.  You just typed '(' when you meant ')' in ToString.

r=me with that fixed and the comment about failure case made sane.
Attachment #518529 - Flags: review?(bzbarsky) → review+
Comment on attachment 518530 [details] [diff] [review]
patch 7: include the opening "url(" in the URL token, and make tokenization of url() (but not url-prefix() or domain()) noncontextual

r=me.  Very nice!
Attachment #518530 - Flags: review?(bzbarsky) → review+
You need to log in before you can comment on or make changes to this bug.