url() tokenization should not be contextual

RESOLVED FIXED in mozilla5

Status

()

Core
CSS Parsing and Computation
P3
normal
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: dbaron, Assigned: dbaron)

Tracking

(Blocks: 1 bug, {css2})

Trunk
mozilla5
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(7 attachments)

(Assignee)

Description

7 years ago
Right now the CSS scanner has a separate entry point for tokenizing URIs.  This is incorrect according to the spec; it means that our tokenization is incorrect when we encounter a url() at an unexpected location.

This makes us fail:
http://test.csswg.org/suites/css2.1/20100917/xhtml1/uri-016.xht
http://test.csswg.org/suites/css2.1/20100917/html4/uri-016.htm
(Assignee)

Updated

7 years ago
Blocks: 605520
(Assignee)

Comment 1

7 years ago
I have a seven patch series in my patch queue for this:

share-gather-url
rename-tokens
move-eating-whitespace
fix-indent-next-url
disallow-control-in-url
url-include-close-paren
url-noncontextual

in http://hg.mozilla.org/users/dbaron_mozilla.com/patches/file/fdc0e62eada4

though it still needs a little more work, and more tests.
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
(Assignee)

Comment 2

7 years ago
Working links to test:
http://test.csswg.org/suites/css2.1/latest/xhtml1/uri-016.xht
http://test.csswg.org/suites/css2.1/latest/html4/uri-016.htm
http://test.csswg.org/suites/css2.1/nightly-unstable/xhtml1/uri-016.xht
http://test.csswg.org/suites/css2.1/nightly-unstable/html4/uri-016.htm
(Assignee)

Comment 3

7 years ago
Created attachment 518522 [details] [diff] [review]
patch 1: use GatherURL more, and rename it
Attachment #518522 - Flags: review?(bzbarsky)
(Assignee)

Comment 4

7 years ago
Created attachment 518523 [details] [diff] [review]
patch 2: rename tokens to match spec
Attachment #518523 - Flags: review?(bzbarsky)
(Assignee)

Comment 5

7 years ago
Created attachment 518524 [details] [diff] [review]
patch 3: move consumption of leading whitespace (for all forms) and trailing whitespace (for string forms) inside url() into scanner
Attachment #518524 - Flags: review?(bzbarsky)
(Assignee)

Comment 6

7 years ago
Created attachment 518525 [details] [diff] [review]
patch 4: fix indent
(Assignee)

Comment 7

7 years ago
Created attachment 518526 [details] [diff] [review]
patch 5: disallow control characters in unquoted url()

This fixes another bug I noticed while looking at the spec (basically independent of this bug, really).
Attachment #518526 - Flags: review?(bzbarsky)
(Assignee)

Comment 8

7 years ago
Created attachment 518529 [details] [diff] [review]
patch 6: include the close paren in the URL token, and make quoted URLs produce the URL/Bad_URL tokens
Attachment #518529 - Flags: review?(bzbarsky)
(Assignee)

Comment 9

7 years ago
Created attachment 518530 [details] [diff] [review]
patch 7: include the opening "url(" in the URL token, and make tokenization of url() (but not url-prefix() or domain()) noncontextual
Attachment #518530 - Flags: review?(bzbarsky)
(Assignee)

Updated

7 years ago
Attachment #518524 - Attachment description: patch 3: move consumption of leading whitespace inside url() into scanner → patch 3: move consumption of leading whitespace (for all forms) and trailing whitespace (for string forms) inside url() into scanner
(Assignee)

Comment 10

7 years ago
Also, I'm aware that these changes might cause an incorrect leading url( on error messages while parsing domain() or url-prefix(), but I don't think it's a serious problem.
(Assignee)

Comment 11

7 years ago
Passed try on both debug and opt mac-64:
http://tbpl.mozilla.org/?tree=MozillaTry&rev=e31cd6d5e01c
Comment on attachment 518522 [details] [diff] [review]
patch 1: use GatherURL more, and rename it

>+++ b/layout/style/nsCSSParser.cpp
>+CSSParserImpl::ParseURLOrString(nsString& aURL)
>+  UngetToken();

This will change the behavior of this testcase:

  <style>div { color: red }</style>
  <style>@import ; div { color: green; }</style>
  <div>This should be green</div>

to make the div green, not red.  I think that's good, and will make us match Opera and Webkit here, but we should add a test for this.  This looks like the @import version of bug 589672.

(Incidentally, if I use "media" there, not "import", then we match Webkit but not Opera; I'm not sure which is correct.)

r=me with that.
Attachment #518522 - Flags: review?(bzbarsky) → review+
Comment on attachment 518523 [details] [diff] [review]
patch 2: rename tokens to match spec

r=me
Attachment #518523 - Flags: review?(bzbarsky) → review+
Comment on attachment 518524 [details] [diff] [review]
patch 3: move consumption of leading whitespace (for all forms) and trailing whitespace (for string forms) inside url() into scanner

r=me
Attachment #518524 - Flags: review?(bzbarsky) → review+
Comment on attachment 518525 [details] [diff] [review]
patch 4: fix indent

r=me fwiw
Attachment #518525 - Flags: review+
Comment on attachment 518526 [details] [diff] [review]
patch 5: disallow control characters in unquoted url()

r=me
Attachment #518526 - Flags: review?(bzbarsky) → review+
Comment on attachment 518529 [details] [diff] [review]
patch 6: include the close paren in the URL token, and make quoted URLs produce the URL/Bad_URL tokens

>+++ b/layout/style/nsCSSParser.cpp
>+  if (eCSSToken_URL != mToken.mType) {
>     // in the failure case, we do not have to match parentheses, since
>     // this is now an invalid URL token.

This comment doesn't really make sense, imo.

>+++ b/layout/style/nsCSSScanner.cpp
>+    case eCSSToken_URL:
>+    case eCSSToken_Bad_URL:
>+      if (mSymbol != PRUnichar(0)) {
>+        aBuffer.Append(mSymbol);
>+      }
>+      aBuffer.Append(mIdent);
>+      if (mSymbol != PRUnichar(0)) {
>+        aBuffer.Append(mSymbol);
>+      }
>+      if (mType == eCSSToken_URL) {
>+        aBuffer.Append(PRUnichar('('));
>+      }
>+      break;

I don't follow this code.  There are four cases here, I think:

1) eCSSToken_URL with quotes; eg: url('foo').  Then mIdent is the thing between 
   the quotes and mSymbol is the quote mark.  So we'd output: 'foo'(
2) eCSSToken_Bad_URL with quotes.  Then we'd output something like 'foo' even
   though the string was not terminated?
3) eCSSTokenURL without quotes; eg: url(foo).  Then we output: foo(
4) eCSSToken_Bad_URL without quotes.  Then we output: foo

Am I just missing something?  It looks like the old code didn't do a very good job here either...
Comment on attachment 518529 [details] [diff] [review]
patch 6: include the close paren in the URL token, and make quoted URLs produce the URL/Bad_URL tokens

Oh, I see.  You just typed '(' when you meant ')' in ToString.

r=me with that fixed and the comment about failure case made sane.
Attachment #518529 - Flags: review?(bzbarsky) → review+
Comment on attachment 518530 [details] [diff] [review]
patch 7: include the opening "url(" in the URL token, and make tokenization of url() (but not url-prefix() or domain()) noncontextual

r=me.  Very nice!
Attachment #518530 - Flags: review?(bzbarsky) → review+
(Assignee)

Comment 20

7 years ago
https://hg.mozilla.org/projects/birch/rev/ee00084cc421
https://hg.mozilla.org/projects/birch/rev/f8f925f12210
https://hg.mozilla.org/projects/birch/rev/205c84e0227e
https://hg.mozilla.org/projects/birch/rev/2ccb53c18fce
https://hg.mozilla.org/projects/birch/rev/905d7597d602
https://hg.mozilla.org/projects/birch/rev/deff78db28b6
https://hg.mozilla.org/projects/birch/rev/8431275e6a49
Whiteboard: fixed-in-birch
(Assignee)

Comment 21

7 years ago
https://hg.mozilla.org/mozilla-central/rev/ee00084cc421
https://hg.mozilla.org/mozilla-central/rev/f8f925f12210
https://hg.mozilla.org/mozilla-central/rev/205c84e0227e
https://hg.mozilla.org/mozilla-central/rev/2ccb53c18fce
https://hg.mozilla.org/mozilla-central/rev/905d7597d602
https://hg.mozilla.org/mozilla-central/rev/deff78db28b6
https://hg.mozilla.org/mozilla-central/rev/8431275e6a49
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Priority: -- → P3
Resolution: --- → FIXED
Whiteboard: fixed-in-birch
Target Milestone: --- → mozilla2.2
You need to log in before you can comment on or make changes to this bug.