Last Comment Bug 499655 - Selectors should have dual atoms: HTML and other
: Selectors should have dual atoms: HTML and other
Status: RESOLVED FIXED
: html5
Product: Core
Classification: Components
Component: CSS Parsing and Computation (show other bugs)
: Trunk
: All All
: -- normal (vote)
: ---
Assigned To: David Zbarsky (:dzbarsky)
:
Mentors:
http://livedom.validator.nu/?%3C!DOCT...
Depends on: 507487
Blocks:
  Show dependency treegraph
 
Reported: 2009-06-22 02:52 PDT by Henri Sivonen (:hsivonen)
Modified: 2009-08-01 11:07 PDT (History)
7 users (show)
bzbarsky: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (6.85 KB, patch)
2009-07-27 08:03 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Review
Patch (7.27 KB, patch)
2009-07-27 08:28 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Review
Patch with tests (22.56 KB, patch)
2009-07-28 09:21 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Review
Patch with tests, fixed previous error(didn't compile) (23.62 KB, application/octet-stream)
2009-07-29 15:33 PDT, David Zbarsky (:dzbarsky)
no flags Details
Patch with tests, fixed previous error(didn't compile) (23.62 KB, patch)
2009-07-29 15:35 PDT, David Zbarsky (:dzbarsky)
dbaron: review-
Details | Diff | Review
Patch adressing comments (25.44 KB, patch)
2009-07-29 19:46 PDT, David Zbarsky (:dzbarsky)
dbaron: review+
Details | Diff | Review
Patch for check-in (25.54 KB, patch)
2009-07-30 10:21 PDT, David Zbarsky (:dzbarsky)
no flags Details | Diff | Review

Description Henri Sivonen (:hsivonen) 2009-06-22 02:52:14 PDT
Steps to reproduce:
 1) Load http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Csvg%3E%0A%3CfeBlend%2F%3E%0A%3C%2Fsvg%3E%0A%3Cscript%3Ew%28document.querySelectorAll%28%27svg%27%29.length%29%3C%2Fscript%3E%0A%3Cscript%3Ew%28document.querySelectorAll%28%27feBlend%27%29.length%29%3C%2Fscript%3E

Expected results:
Expected log to say
1
1

Actual results:
Log says
1
0

See http://lists.w3.org/Archives/Public/public-html/2009Apr/0081.html

As of HTML5, SVG names with uppercase ASCII letters can enter into a text/html DOM. Even without an HTML5 parser, such elements can be introduced with createElementNS().

Currently, selectors associated with text/html DOMs are lowercased. This makes it impossible to match SVG camelCase names. Going forward, element and attribute selectors should be compiled with dual atoms: one atom for matching against elements in the http://www.w3.org/1999/xhtml namespace and another for matching against all other elements.

The atom for matching against elements in the http://www.w3.org/1999/xhtml namespace should be ASCII-lowercased if the document is a text/html document. The other atom should always be in the original case.
Comment 1 Henri Sivonen (:hsivonen) 2009-06-22 02:55:56 PDT
See also bug 499656.
Comment 2 David Zbarsky (:dzbarsky) 2009-07-27 08:03:16 PDT
Created attachment 390816 [details] [diff] [review]
patch
Comment 3 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-07-27 08:27:05 PDT
Two comments here:  mNewTag doesn't seem like a good name for a variable; it should have a name that describes what it is, not when it was introduced.

But more importantly, my understanding of what we ought to be doing here is making style sheet parsing not use any notion of case sensitiveness; in other words, removing mCaseSensitive.  But that's a slightly bigger task, and there may already be another bug report on that.  I'd need to think about it a little longer whether the goal of this bug can be accomplished without doing that (right now I'm not sure).
Comment 4 David Zbarsky (:dzbarsky) 2009-07-27 08:28:16 PDT
Created attachment 390821 [details] [diff] [review]
Patch
Comment 5 Boris Zbarsky [:bz] 2009-07-27 09:32:54 PDT
So if I understood our desired behavior correctly, we want something like this:

1)  In HTML documents, HTML tagnames are matched "case-insensitively" (in
    practice, by lowercasing tagnames during parsing and using a lowercase
    version of the tag selector in the matching).  This means that if someone
    inserts a node in the XHTML namespace with a non-lowercase localName into
    the document using createElementNS, it will be impossibly to match it with
    a tag selector.  This is no worse than what we have now.
2)  In HTML documents, non-HTML tagnames are matched case-sensitively.
3)  In XHTML documents and other XML documents, all tagnames are case-sensitive.

This involves having a notion of "HTML document" in either the parser or the matching code.  Either one works for tag names, which is what this bug was primarily about, as long as a single stylesheet which tries to match on non-lowercase tagnames for HTML nodes is not used for both HTML and XHTML documents.

To solve bug 282328, we'd indeed need to make sure the "HTML document" check happened during matching, not during parsing.
Comment 6 David Zbarsky (:dzbarsky) 2009-07-28 09:21:36 PDT
Created attachment 391106 [details] [diff] [review]
Patch with tests
Comment 7 David Zbarsky (:dzbarsky) 2009-07-29 15:33:48 PDT
Created attachment 391462 [details]
Patch with tests, fixed previous error(didn't compile)
Comment 8 David Zbarsky (:dzbarsky) 2009-07-29 15:35:26 PDT
Created attachment 391463 [details] [diff] [review]
Patch with tests, fixed previous error(didn't compile)
Comment 9 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-07-29 17:25:57 PDT
Comment on attachment 391463 [details] [diff] [review]
Patch with tests, fixed previous error(didn't compile)

There are a bunch of places where you're missing the space after "if"
(which is local style throughout the code you're touching).

In nsICSSStyleRule.h, you should add a comment that for pseudo-elements,
mCasedTag is null but mLowercaseTag contains the name of the
pseudo-element.  (This distinction will actually be useful for
distinguishing pseudo-element selectors from elements that have an
escaped-colon in them, which will help with bug 475216 comment 1.)

A few comments on the change in SelectorMatches:

+       if(data.mContentTag != aSelector->mCasedTag)

This line is over-indented.

+  if(aSelector->mLowercaseTag) {

For this check, I think you should actually add a inline method to
nsCSSSelector called HasTagSelector() that returns whether you have a
tage selector that's *not* a pseudo-element, and that's what you should
test here.  (You don't need to test pseudo-elements at all here if you
change PseudoEnumFunc as I describe below.)

+    if(data.mIsHTMLContent) {

You should add a comment that if we had a stricter test here (i.e., that
it was an HTML node in a text/html document) and perhaps some tweaks to
RuleHash, we could completely remove the notion of case sensitivity from
style sheets.  (I think this would fix some existing bugs related to the
handling of our user-agent style sheets.)  Maybe I could convince you to
do that in a followup patch?

+          !aSelector->mLowercaseTag && !aSelector->mIDList && !aSelector->mAttrList &&

This should:
 1) call HasTagSelector rather than look at mLowercaseTag directly
 2) be wrapped to a width less than 80 characters

+  NS_ASSERTION(aSelector->mLowercaseTag == data->mPseudoTag, "RuleHash failure");

I think you add, *before* this assertion, a test that calls
aSelector->IsPseudoElement() and returns early if it's not.  This would
be a second inline function that you'd need to add to nsCSSSelector.

There's another badly-indented } in nsCSSSelector::SetTag.

In nsCSSStyleRule.cpp you can remove the IsPseudoElement static function
and replace it with calls to your new function (added per above), which
is more correct.

\ No newline at end of file

You should have a newline.

nsTreeBodyFrame::PseudoMatches doesn't actually need to check if the
tag matches; its only caller is PseudoEnumFunc, and PseudoEnumFunc's
caller (a hash table enumerator) has already done that, so you should
actually change PseudoMatches to begin with:
  NS_ABORT_IF_FALSE(aSelector->mLowercaseTag == aTag,
                    "should not have been called");
and remove the runtime test.


I'm a bit puzzled by your tests.  I don't see any tests for behavior
that differs between HTML and XHTML, and I'd expect some.  Shouldn't you
get three elements for each query in the HTML version of the test?  Or
is createElementNS supposed to create a case-sensitive element, even
when used in an HTML document?

(I'd appreciate if Henri could clarify the behavior that's actually
expected here, preferably with pointers to HTML5.)

Also:

+htmldiv.id = 1;

These assignments don't seem to be doing anything.  Take them out?  Or,
if they're needed, put the value being assigned in quotes?

+ok(list[0] == htmldiv, "First element didn't match");

this is probably better tested using |is|.

Did you check that the tests fail without the patch but pass with the
patch?

This looks good, but I'd like to have another look at a revised patch.
Comment 10 David Zbarsky (:dzbarsky) 2009-07-29 19:45:39 PDT
(In reply to comment #9)
> (From update of attachment 391463 [details] [diff] [review])
> There are a bunch of places where you're missing the space after "if"
> (which is local style throughout the code you're touching)...
> and remove the runtime test.
> 
All done.

> 
> I'm a bit puzzled by your tests.  I don't see any tests for behavior
> that differs between HTML and XHTML, and I'd expect some.  

When matching on 'TEst' the HTML document matches the html test and test:TEst while the XHTML matches the html TEst and test:TEst

>Shouldn't you
> get three elements for each query in the HTML version of the test?  Or
> is createElementNS supposed to create a case-sensitive element, even
> when used in an HTML document?

I think it should create a case-sensitive element, as I understand.
> (I'd appreciate if Henri could clarify the behavior that's actually
> expected here, preferably with pointers to HTML5.)
> 
> Also:
> 
> +htmldiv.id = 1;
> 
> These assignments don't seem to be doing anything.  Take them out?  Or,
> if they're needed, put the value being assigned in quotes?

Done, I used them for testing.
> +ok(list[0] == htmldiv, "First element didn't match");
> 
> this is probably better tested using |is|.

Done.

> Did you check that the tests fail without the patch but pass with the
> patch?
> 
> This looks good, but I'd like to have another look at a revised patch.

Yes, the HTML test fails without the patch.
Comment 11 David Zbarsky (:dzbarsky) 2009-07-29 19:46:22 PDT
Created attachment 391519 [details] [diff] [review]
Patch adressing comments
Comment 12 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-07-29 21:59:06 PDT
Comment on attachment 391519 [details] [diff] [review]
Patch adressing comments

Requesting an additional review from Henri

>+    if (data.mIsHTMLContent) {
>+      if (data.mContentTag != aSelector->mLowercaseTag)
>+        return PR_FALSE;
>+    }
>+    else if (data.mContentTag != aSelector->mCasedTag) {
>+      return PR_FALSE;
>+    }

I actually preferred the symmetry of the way you had it before.

> static void PseudoEnumFunc(nsICSSStyleRule* aRule, nsCSSSelector* aSelector,
>                            void* aData)
> {
>+  if(!aSelector->IsPseudoElement())
>+    return;
>+
>   PseudoRuleProcessorData* data = (PseudoRuleProcessorData*)aData;

You need a space after "if".

Also probably better to declare |data| the very first thing, if only because it tends to be the style for callback functions to have the casts to the "correct" types at the very start.

>+  inline PRBool HasTagSelector() const {
>+    return mLowercaseTag && mCasedTag;
>+  }

This can be just !!mCasedTag.


I want to come back to this in the morning because I'm not awake enough right now to grant review...
Comment 13 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-07-29 21:59:44 PDT
(In reply to comment #12)
> Requesting an additional review from Henri

Never mind this bit; forgot to delete it.
Comment 14 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-07-30 07:03:42 PDT
Comment on attachment 391519 [details] [diff] [review]
Patch adressing comments

>+  nsCOMPtr<nsIAtom> mLowercaseTag; //For pseudo-elements, mCasedTag will be null
>+  nsCOMPtr<nsIAtom> mCasedTag;     //but mLowercaseTag contains their name.

Could you also add a comment here that mLowercaseTag is the same as mCasedTag in case-sensitive (non-text/html) documents, but is lowercase in case-insensitive (text/html) ones?

(And with those two comments, probably better to put them above the variables rather than squeeze them in the margin.)


r=dbaron with that and the above comments
Comment 15 David Zbarsky (:dzbarsky) 2009-07-30 10:21:41 PDT
Created attachment 391639 [details] [diff] [review]
Patch for check-in
Comment 16 Boris Zbarsky [:bz] 2009-07-30 12:45:46 PDT
Pushed http://hg.mozilla.org/mozilla-central/rev/358af1196dc2
Comment 17 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-07-31 09:41:59 PDT
I backed this out temporarily:
http://hg.mozilla.org/mozilla-central/rev/c6cab4ce7379
http://hg.mozilla.org/mozilla-central/rev/1de986ad2611
since we're about to ship an alpha, and I don't want to ship this in a weird half-fixed state.  Once bug 507487 gets sorted out, we can reland this along with that fix.
Comment 18 Boris Zbarsky [:bz] 2009-07-31 19:22:06 PDT
So do we want to try landing this (and bug 507487) before alpha (whenever the tree reopens), or just wait till after?
Comment 19 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-08-01 11:07:53 PDT
http://hg.mozilla.org/mozilla-central/rev/bf66fe37b733

Note You need to log in before you can comment on or make changes to this bug.