Last Comment Bug 543428 - selectorText and cssText strip backslash from "\:" in type element selectors
: selectorText and cssText strip backslash from "\:" in type element selectors
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: CSS Parsing and Computation (show other bugs)
: Trunk
: All All
: P3 normal with 1 vote (vote)
: mozilla1.9.3a1
Assigned To: David Baron :dbaron: ⌚️UTC-7 (busy September 14-25)
:
Mentors:
Depends on:
Blocks: 475216
  Show dependency treegraph
 
Reported: 2010-02-01 02:15 PST by Daniel Glazman (:glazou)
Modified: 2010-02-04 12:57 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (25.80 KB, patch)
2010-02-03 14:01 PST, David Baron :dbaron: ⌚️UTC-7 (busy September 14-25)
no flags Details | Diff | Splinter Review
patch (28.70 KB, patch)
2010-02-03 14:23 PST, David Baron :dbaron: ⌚️UTC-7 (busy September 14-25)
bzbarsky: review+
Details | Diff | Splinter Review

Description Daniel Glazman (:glazou) 2010-02-01 02:15:12 PST
In a selector contains a type element selector itself containing backslash+colon because the element name contains a colon, the output of that selector through the CSS OM (cssText or selectorText) returns a version of the selector without the backslash. It's then impossible to reuse the selector or the associated rule.
nsCSSSelector::AppendToStringWithoutCombinatorsOrNegations is faulty here, it
should not append tag names through toString() but through another method dealing
with characters inside tag names that can exist in tag names and must be escaped
in CSS idents.
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2010-02-03 08:18:28 PST
Note that the same thing needs to happen for namespace prefixes.  And IDs. And classnames.  And pseudoclass names.

What else should this apply to?  Which chars need to be escaped?
Comment 2 Daniel Glazman (:glazou) 2010-02-03 08:32:06 PST
(In reply to comment #1)
> Note that the same thing needs to happen for namespace prefixes.  And IDs. And
> classnames.  And pseudoclass names.
> 
> What else should this apply to?  Which chars need to be escaped?

I did not have enough to dig, but I suspect that escaped sequences representing unicode chars are replaced by the char itself... An escape sequence representing a acute-e é would be serialized into a é making the serialization invalid again :-( I think we probably have to escape all chars outside of - _ a-Z and A-z, and if they're outside of the ascii range turn the result into an escaped hex sequence.
Comment 3 Daniel Glazman (:glazou) 2010-02-03 08:35:08 PST
Please note that Opera seems to escape all chars outside of [\-_a-zA-Z] with an hex sequence.
Comment 4 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-03 09:55:02 PST
We need a function for serializing an identifier.  I thought we had a bug on this somewhere, but I guess not.  It could probably live right next to nsStyleUtil::AppendEscapedCSSString.

(In reply to comment #1)
> Note that the same thing needs to happen for namespace prefixes.  And IDs. And
> classnames.  And pseudoclass names.
> 
> What else should this apply to?  Which chars need to be escaped?

Also values of a whole bunch of properties that take arbitrary identifiers, such as counter-*, font-family (sort of), etc.
Comment 5 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-03 09:57:55 PST
Oh, and all characters from U+80 and higher are all valid in identifiers; we don't need to escape them.
Comment 6 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-03 14:01:07 PST
Created attachment 425068 [details] [diff] [review]
patch

This should address the bulk of the problems.  The only thing I know I've missed is 'font-family', although it wouldn't surprise me if there's more.
Comment 7 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-03 14:02:20 PST
And yes, I know the computed style code is ugly, but that's what we get for shoehorning all the computed style values into a broken API that doesn't do things like distinguish string vs. identifier.
Comment 8 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-03 14:23:10 PST
Created attachment 425077 [details] [diff] [review]
patch

The previous patch unnecessarily escaped non-ascii characters.  This fixes it, and adds some tests that we don't over-escape.
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2010-02-03 15:29:36 PST
Note to self; the "escape digits when first == true" code needs to escape '-' as well.  Will get back to this after dinner.
Comment 10 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-03 15:51:04 PST
(In reply to comment #9)
> Note to self; the "escape digits when first == true" code needs to escape '-'
> as well.  Will get back to this after dinner.

No, I don't think it does.  '-' gets escaped in the code below as '\-'; it doesn't need *numerical* escaping, which is what that clause is for.
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2010-02-03 20:50:04 PST
> it doesn't need *numerical* escaping

Ah, indeed.
Comment 12 Boris Zbarsky [:bz] (still a bit busy) 2010-02-03 20:53:47 PST
Comment on attachment 425077 [details] [diff] [review]
patch

>+++ b/layout/style/nsStyleUtil.cpp
>+    } else {
>+      if (!((*in == PRUnichar('_')) ||

Might be worth it to stick *in in a local at the top of that else block; may be clearer and might optimize better.

r=bzbarsky with that.  I have no problem with us adding a SetIdent API for computed style in a different bug, btw.
Comment 13 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-04 12:54:02 PST
http://hg.mozilla.org/mozilla-central/rev/41394e846c1b
Comment 14 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-02-04 12:57:43 PST
I added a note to bug 280443 about the "XXX" comment introduced in this patch.

Note You need to log in before you can comment on or make changes to this bug.