control characters above 0x7e should not be allowed in unquoted url()

RESOLVED FIXED in mozilla15

Status

()

Core
CSS Parsing and Computation
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: kennyluck, Assigned: kennyluck)

Tracking

({dev-doc-complete})

Trunk
mozilla15
dev-doc-complete
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 2 obsolete attachments)

(Assignee)

Description

5 years ago
Created attachment 621307 [details]
test case

CSS2.1 has

URI 	url\({w}{string}{w}\)
        |url\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}\)

nonascii	[^\0-\237]

where \237 is the octal representation of 0x9f and the last control character of all the u+??

This is a leftout case of bug 604179 patch 5. I have a patch already.
(Assignee)

Comment 1

5 years ago
Created attachment 621308 [details] [diff] [review]
proposed fix

I was originally trying to improve performance of url() tokenization by expanding the lexing table, but I don't think I am experienced enough to say this is really gaining performance. Fixing this is just an accident. :p
Assignee: nobody → kennyluck
Attachment #621308 - Flags: review?(dbaron)
Comment on attachment 621308 [details] [diff] [review]
proposed fix

>+#define SI   IS_IDENT|START_IDENT
>+#define XI   IS_IDENT            |IS_HEX_DIGIT

>+#define XSI  IS_IDENT|START_IDENT|IS_HEX_DIGIT

These three don't look like they're needed anymore (though maybe I missed something), so remove them.


> static const PRUint8 gLexTable[] = {
>-//                                     TAB LF      FF  CR
>-   0,  0,  0,  0,  0,  0,  0,  0,  0,  W,  W,  0,  W,  W,  0,  0,
>+//                                              TAB  LF        FF   CR
>+   0,   0,   0,   0,   0,   0,   0,   0,   0,   W,   W,   0,   W,   W,   0,   0,

I'd rather you not expand the table width, since it pushes it over 80 characters.  Just make the line with ABCDEF break the alignment instead.

>+static inline bool
>+IsURLChar(PRInt32 ch) {
>+  return ch >= 256 || (gLexTable[ch] & IS_URL_CHAR) != 0;
>+}

This needs to check ch >= 0, just like IsIdent does.  In fact, the function should look exactly like IsIdent except for checking a different bit.


r=dbaron with those changes
Attachment #621308 - Flags: review?(dbaron) → review+
Keywords: dev-doc-needed
Comment on attachment 621308 [details] [diff] [review]
proposed fix

Review of attachment 621308 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/style/nsCSSScanner.cpp
@@ +947,4 @@
>      }
> +
> +    ch = Read();
> +    if (ch < 0) break;

if (ch < 0) {
  break;
}
(Assignee)

Comment 4

5 years ago
Created attachment 622259 [details] [diff] [review]
comments addressed

but I identified a flaw so I am asking for a check.
Attachment #621308 - Attachment is obsolete: true
Attachment #622259 - Flags: review?(dbaron)
(Assignee)

Comment 5

5 years ago
(In reply to David Baron [:dbaron] from comment #2)
> Comment on attachment 621308 [details] [diff] [review]
> proposed fix
> 
> >+#define SI   IS_IDENT|START_IDENT
> >+#define XI   IS_IDENT            |IS_HEX_DIGIT
> 
> >+#define XSI  IS_IDENT|START_IDENT|IS_HEX_DIGIT
> 
> These three don't look like they're needed anymore (though maybe I missed
> something), so remove them.

Ah, yeah, I missed that. And I realized I didn't change SI to USI for U+A0-U+FF. This is fixed in the updated patch (with a test). I am asking for a check because of this.

Also, I forgot the #undef's.
 
> > static const PRUint8 gLexTable[] = {
> >-//                                     TAB LF      FF  CR
> >-   0,  0,  0,  0,  0,  0,  0,  0,  0,  W,  W,  0,  W,  W,  0,  0,
> >+//                                              TAB  LF        FF   CR
> >+   0,   0,   0,   0,   0,   0,   0,   0,   0,   W,   W,   0,   W,   W,   0,   0,
> 
> I'd rather you not expand the table width, since it pushes it over 80
> characters.  Just make the line with ABCDEF break the alignment instead.

Done.

> >+static inline bool
> >+IsURLChar(PRInt32 ch) {
> >+  return ch >= 256 || (gLexTable[ch] & IS_URL_CHAR) != 0;
> >+}
> 
> This needs to check ch >= 0, just like IsIdent does.  In fact, the function
> should look exactly like IsIdent except for checking a different bit.

I actually examined this pretty clearly and IsURLChar is the only function where the ch >= 0 check isn't necessary. But anyway I am adding this check back for now and hopefully someone in the future can refactor the whole file somehow.


(In reply to Ms2ger from comment #3)
> Comment on attachment 621308 [details] [diff] [review]
> proposed fix
> 
> Review of attachment 621308 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: layout/style/nsCSSScanner.cpp
> @@ +947,4 @@
> >      }
> > +
> > +    ch = Read();
> > +    if (ch < 0) break;
> 
> if (ch < 0) {
>   break;
> }

Done (although various places in this file still have the former style). I should spend some time reading the style guideline.
Status: NEW → ASSIGNED
(Assignee)

Comment 6

5 years ago
Created attachment 622260 [details]
interdiff
Comment on attachment 622259 [details] [diff] [review]
comments addressed

r=dbaron

(Ah, the mistake was missing the U bit for U+00A0 to U+00FF.)

> // @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
>-   0,  XSI,XSI,XSI,XSI,XSI,XSI,SI, SI, SI, SI, SI, SI, SI, SI, SI,
>+U,UXSI,UXSI,UXSI,UXSI,UXSI,UXSI,USI,USI,USI,USI,USI,USI,USI,USI,USI,

Do re-align the comment to match the values for this line, though.

Sorry for the delay.
Attachment #622259 - Flags: review?(dbaron) → review+
(Assignee)

Comment 8

5 years ago
Created attachment 623620 [details] [diff] [review]
final patch

Carry over r=dbaron
Attachment #622259 - Attachment is obsolete: true
Attachment #623620 - Flags: review+
(Assignee)

Comment 9

5 years ago
(In reply to David Baron [:dbaron] from comment #7)
> > // @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
> >-   0,  XSI,XSI,XSI,XSI,XSI,XSI,SI, SI, SI, SI, SI, SI, SI, SI, SI,
> >+U,UXSI,UXSI,UXSI,UXSI,UXSI,UXSI,USI,USI,USI,USI,USI,USI,USI,USI,USI,
> 
> Do re-align the comment to match the values for this line, though.

Ah, right. Done. (I indented those two lines again and they have 71 characters. I think this is in the acceptable range.)

> Sorry for the delay.

That was no delay as compared to my experience sending patch to WebKit! I am pretty sure most people on #whatwg share similar experience interacting with Mozilla folks on Bugzilla.
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/14abca4e7378
Flags: in-testsuite+
Keywords: checkin-needed
Target Milestone: --- → mozilla15
https://hg.mozilla.org/mozilla-central/rev/14abca4e7378
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
The documentation has been updated with a note on the syntax as can be read here: https://developer.mozilla.org/en-US/docs/CSS/uri#Syntax

If that's enough, then this bug can be marked as dev-doc-complete.
(In reply to Frédéric Bourgeon [:FredB] from comment #12)
> The documentation has been updated with a note on the syntax as can be read
> here: https://developer.mozilla.org/en-US/docs/CSS/uri#Syntax
> 
> If that's enough, then this bug can be marked as dev-doc-complete.

I don't think this will need more documentation. Marking as complete.
Keywords: dev-doc-needed → dev-doc-complete
You need to log in before you can comment on or make changes to this bug.