Implement short-circuiting of some sort for descendant selectors

RESOLVED FIXED in mozilla13

Status

()

Core
CSS Parsing and Computation
P1
normal
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: bz, Assigned: bz)

Tracking

(Depends on: 1 bug, Blocks: 2 bugs)

9 Branch
mozilla13
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

Some sites have tons of gratuitous descendant selectors (e.g. github; see bug 704911).

WebKit apparently implemented an optimization for this at some point that works as follows:

1) Each rightmost selector (though in our case we'll want this to be each
   sequence of simple selectors that can be preceded by a combinator) stores
   information about what its ancestors need to match.  Specifically, if all
   combinators before it are " " or ">" it stores the set of tag names, class
   names, and ids from the preceding sequences of simple selectors.  The
   storage used to be a hashtable; it was converted to a Bloom filter in
   https://bugs.webkit.org/show_bug.cgi?id=53880
2) When doing selector matching on a bunch of nodes (frame construction, style
   reresolution), maintain a parent element stack of some sort that stores tags,
   ids, classes.  I haven't looked carefully into exactly how WebKit does that.
3) When matching a sequence of simple selectors, if the node stack is live and
   the sequence of simple selectors has an initialized Bloom filter, check all
   the things on the node stack against the filter.  If any fail, the selector
   can't match so bail out.

We should consider doing something like this or that has a similar effect....
Oh, an open question: should we try to do this ASAP (somewhat bitrotting bug 631527 in the process) or should we wait for that to land, which depends on a Windows compiler update at least.  That last certainly won't happen for gecko 11...
I guess I should take this for now, but if someone else has the bandwidth and desire to work on it, that would be _very_ much appreciated.  At least pending an answer to comment 1.
Assignee: nobody → bzbarsky
Priority: -- → P1
(In reply to Boris Zbarsky (:bz) from comment #1)
> Oh, an open question: should we try to do this ASAP (somewhat bitrotting bug
> 631527 in the process) or should we wait for that to land, which depends on
> a Windows compiler update at least.  That last certainly won't happen for
> gecko 11...

Does *landing* it depend on a Windows compiler update or does *enabling* it depend on a Windows compiler update?
Landing, as far as I can tell, since the code doesn't actually compile.  Unless you want to land it in ifdefs?
OS: Mac OS X → All
Hardware: x86 → All
Blocks: 708594
Blocks: 718864

Updated

6 years ago
Blocks: 633733
Some updates, because parts of comment 0 were wrong, obviously so in retrospect:

The rightmost selector just stores hashes of the ids/classes/tags used up its parent selector chain.  In WebKit's case up to a configurable number (currently set to 4) of them, stored in an array.

The selector matcher object in WebKit is what stores the Bloom filter based on what tags/ids/classes are present in the DOM.  The actual Bloom filter implementation allows addition and removal, looks like; that's what happens as nodes are pushed and popped while traversing the DOM.  The impl lives in JavaScriptCore/wtf/BloomFilter.h for those who want to read it.

In terms of how the hash computation is done, it looks like wtf strings tend to store a hash in them at all times (and compute it on string creation).

For us, would it make sense to store a hashcode (32-bit int) inside nsIAtom?  We have to hash those anyway, so have to compute hashcodes for them no matter what....
Yes, storing the hash in the nsIAtom makes a lot of sense to me.
Depends on: 729952
Depends on: 730100
Depends on: 730414
Created attachment 600501 [details] [diff] [review]
part 1.  Store the hashes of the atoms occurring as IDs, classes, and tag names in selectors that would need to match one of our ancestors.   store a maximum of 4 hashes to save space.  An interesting question is whether 4 is the right number,
Attachment #600501 - Flags: review?(dbaron)
Created attachment 600502 [details] [diff] [review]
part 2.  Hang an optional Bloom filter and some methods for managing it off the TreeMatchContext.
Attachment #600502 - Flags: review?(dbaron)
Created attachment 600503 [details] [diff] [review]
part 3.  Use the TreeMatchContext's ancestor filter, if any, in EnumerateAllRules.
Attachment #600503 - Flags: review?(dbaron)
Created attachment 600507 [details] [diff] [review]
part 4.  Make style reresolution actually use ancestor filters.
Attachment #600507 - Flags: review?(dbaron)
Created attachment 600508 [details] [diff] [review]
part 5.  Make frame construction use ancestor filters when resolving style.
Attachment #600508 - Flags: review?(dbaron)
Oh, and these patches pass full try (with Talos and the works) without dying, which suggests that we're at least managing to push the right element sets into the filter.
Blocks: 732667
Comment on attachment 600501 [details] [diff] [review]
part 1.  Store the hashes of the atoms occurring as IDs, classes, and tag names in selectors that would need to match one of our ancestors.   store a maximum of 4 hashes to save space.  An interesting question is whether 4 is the right number,

># HG changeset patch
># User Boris Zbarsky <bzbarsky@mit.edu>
># Date 1330116591 18000
># Node ID 83a97c36a598480da23725b14999b9ef31f2909b
># Parent  b33f839df6c39de4c2594f81d6e6b3f06be4ca17
>Bug 705877 part 1.  Store the hashes of the atoms occurring as IDs, classes, and tag names in selectors that would need to match one of our ancestors.  r=dbaron
>
>We store a maximum of 4 hashes to save space.  An interesting question is whether 4 is the right number,
>as well as whether we should prioritize things from earlier in the selector and whether we should,
>prioritize ids and classes over tags.  In practice, this seems to be good enough at least performance-wise,
>though it's possible we could use less memory and still do OK on the performance.
>
>diff --git a/layout/style/nsCSSRuleProcessor.cpp b/layout/style/nsCSSRuleProcessor.cpp
>--- a/layout/style/nsCSSRuleProcessor.cpp
>+++ b/layout/style/nsCSSRuleProcessor.cpp
>@@ -120,27 +120,88 @@ struct RuleSelectorPair {
>     : mRule(aRule), mSelector(aSelector) {}
>   // If this class ever grows a destructor, deal with
>   // PerWeightDataListItem appropriately.
> 
>   css::StyleRule*   mRule;
>   nsCSSSelector*    mSelector; // which of |mRule|'s selectors
> };
> 
>+#define NS_IS_ANCESTOR_OPERATOR(ch)                      \
>+  ((ch) == PRUnichar(' ') || (ch) == PRUnichar('>'))

Don't put the random run of spaces before the \.

>+  static const size_t sMaxAncestorHashes = 4;

Maybe use an enum rather than a static const?  I'm actually a little surprised it works this way, but if it works maybe it's ok.


(Also, at some point we should rename both _OPERATOR macros and nsCSSSelector::mOperator to "combinator", now that the spec has a term for it.)

>+      // Only put in the tag name if it's all-lowercase.  Otherwise we run into
>+      // trouble because we may test the wrong one of mLowercaseTag and
>+      // mCasedTag against the filter.
>+      if (sel->mLowercaseTag && sel->mCasedTag == sel->mLowercaseTag) {
>+        mAncestorSelectorHashes[hashIndex++] = sel->mLowercaseTag->getHash();
>+        if (hashIndex == sMaxAncestorHashes) {
>+          return;
>+        }

Do you think including the tag is actually worthwhile when it might displace classes or ids?


Also, I think including the highest 4 is probably a bad idea because it's not unusual for pages to have multiple classes on <body>.

r=dbaron
Attachment #600501 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] from comment #13)
> Do you think including the tag is actually worthwhile when it might displace
> classes or ids?

Might be worth experimenting later, anyway, but probably shouldn't block landing.

> Also, I think including the highest 4 is probably a bad idea because it's
> not unusual for pages to have multiple classes on <body>.

Er, never mind my logic there.  But I still don't see any reason to want the last
four rather than the first four.
> Don't put the random run of spaces before the \.

Fixed.

> Maybe use an enum rather than a static const?

Fixed.

I'll try to do some experimentation on the tag thing.
Comment on attachment 600502 [details] [diff] [review]
part 2.  Hang an optional Bloom filter and some methods for managing it off the TreeMatchContext.

nsRuleProcessorData.h
=====================

>+  template<size_t hashListLength>
>+    bool MightHaveMatchingAncestor(const uint32_t* aHashes) const
>+  {
>+    MOZ_ASSERT(mFilter);
>+    for (size_t i = 0; i < hashListLength && aHashes[i]; ++i) {
>+      if (!mFilter->mayContain(aHashes[i])) {
>+        return false;
>+      }
>+    }
>+
>+    return true;
>+  }

I actually like the "Might" here better than the "may" in the
BloomFilter API.  Maybe BloomFilter should use "mightContain" rather
than "mayContain"?

>+  // Stack of indices to pop to
>+  nsTArray<PRUint32> mPopTargets;

Mention that these are indices into mHashes?


nsCSSRuleProcessor.cpp
======================

In AncestorFilter::Init, why do we check aElement->GetNodeParent()?
It seems like we shouldn't ever hit the no-node-parent case -- so
it's just a check for being the *root* of a disconnected fragment,
which seems like an odd thing to be checking for.  (Why care about
only being the root and not being within?)  Seems better not to
have this check (or maybe to have an assert -- or a stronger
assert -- if you want to say something about being connected to a
document.)

>+    nsAutoTArray<Element*, 20> ancestors;

Seems low?

>+    for (PRUint32 i = ancestors.Length(); i > 0; ) {
>+      PushAncestor(ancestors[--i]);
>+    }

I actually prefer keeping the decrement in the for() rather than
in the body by using the pattern:
  for (PRUint32 i = ancestors.Length(); i-- != 0; ) {
    PushAncestor(ancestors[i]);
  }

r=dbaron
Attachment #600502 - Flags: review?(dbaron) → review+
Comment on attachment 600503 [details] [diff] [review]
part 3.  Use the TreeMatchContext's ancestor filter, if any, in EnumerateAllRules.

>+  if (ancestorFilter &&
>+      !ancestorFilter->MightHaveMatchingAncestor<RuleValue::sMaxAncestorHashes>(
>+          value.mAncestorSelectorHashes)) {

Can you make the template parameter be
mozilla::ArrayLength(value.mAncestorSelectorHashes), or does that not
compile?  It seems clearer if it works.

r=dbaron
Attachment #600503 - Flags: review?(dbaron) → review+
> Mention that these are indices into mHashes?

Done.

> In AncestorFilter::Init, why do we check aElement->GetNodeParent()?

I replaced this with a MOZ_ASSERT that IsInDoc().

> Seems low?

Per discussion offline, bumped to 50.

> I actually prefer keeping the decrement in the for()

OK, done.

> mozilla::ArrayLength(value.mAncestorSelectorHashes)

That doesn't compile, but NS_ARRAY_LENGTH(value.mAncestorSelectorHashes) does.  Using that.
Comment on attachment 600507 [details] [diff] [review]
part 4.  Make style reresolution actually use ancestor filters.

Maybe add both of (or at least one of):

 (a) a comment in nsIFrame.h above nsIFrame::GetParentStyleContextFrame explaining that when the resulting frame is a child, you depend on that child having the same content node

 (b) an assertion in the provider-is-child case in ReResolveStyleContext, mentioning that if the assert were violated we'd need to add the current element to the ancestor filter

r=dbaron
Attachment #600507 - Flags: review?(dbaron) → review+
Comment on attachment 600508 [details] [diff] [review]
part 5.  Make frame construction use ancestor filters when resolving style.

>@@ -2309,16 +2309,17 @@ nsCSSFrameConstructor::ConstructDocEleme
>     PropagateScrollToViewport();
> 
>   SetUpDocElementContainingBlock(aDocElement);
> 
>   NS_ASSERTION(mDocElementContainingBlock, "Should have parent by now");
> 
>   nsFrameConstructorState state(mPresShell, mFixedContainingBlock, nsnull,
>                                 nsnull, aFrameState);
>+  state.mTreeMatchContext.mAncestorFilter.Init(aDocElement);

As we discussed, add a comment here explaining that this could init with null and then push the doc element 20 lines below.  (Or maybe even do that?  Though constructing the doc element is quite rare, so maybe it's not worth the extra code that avoids having the ancestors array.)

>@@ -3585,31 +3586,37 @@ nsCSSFrameConstructor::ConstructFrameFro
>   if (aState.mCreatingExtraFrames && aItem.mContent->IsHTML() &&
>       aItem.mContent->Tag() == nsGkAtoms::iframe)
>   {
>     return NS_OK;
>   }
> 
>   nsStyleContext* const styleContext = aItem.mStyleContext;
>   const nsStyleDisplay* display = styleContext->GetStyleDisplay();
>+  nsIContent* const content = aItem.mContent;
>+
>+  // Push the content as a style ancestor now, so we don't have to do
>+  // it in our various full-constructor functions and whatnot.

And could you expand on this a bit as we discussed?

r=dbaron with that
Attachment #600508 - Flags: review?(dbaron) → review+
> add a comment here explaining that this could init with null and then push the doc
> element 20 lines below.

Made this init with null and pushed the doc element after we finish resolving style for it.

> And could you expand on this a bit as we discussed?

Done.
I inserted a patch renaming mayContain to mightContain:

https://hg.mozilla.org/integration/mozilla-inbound/rev/6e9330f5a9bd
https://hg.mozilla.org/integration/mozilla-inbound/rev/490429be8812
https://hg.mozilla.org/integration/mozilla-inbound/rev/beb5d8473216
https://hg.mozilla.org/integration/mozilla-inbound/rev/f17f42bbe181
https://hg.mozilla.org/integration/mozilla-inbound/rev/c13394ac5b2a
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4d3bc1351bd
Flags: in-testsuite-
Target Milestone: --- → mozilla13
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/f925f2f8d1fd for Windows build bustage
The compiler error is:

e:/builds/moz2_slave/m-in-w32/build/layout/style/nsCSSRuleProcessor.cpp(2353) : error C2232: '->AncestorFilter::MightHaveMatchingAncestor' : left operand has 'struct' type, use '.'

where the file looks like this:

  if (ancestorFilter &&
      !ancestorFilter->MightHaveMatchingAncestor<
        NS_ARRAY_LENGTH(value.mAncestorSelectorHashes)>(
          value.mAncestorSelectorHashes)) {
    // We won't match; nothing else to do here
    return;
  }

and line 2353 is the line NS_ARRAY_LENGTH is on.  I have no idea what MSVC is doing here, but I'm just going to switch this back to RuleValue::eMaxAncestorHashes, I think.
With that:

https://hg.mozilla.org/integration/mozilla-inbound/rev/621b16542cd5
https://hg.mozilla.org/integration/mozilla-inbound/rev/361616736d23
https://hg.mozilla.org/integration/mozilla-inbound/rev/d2140528affe
https://hg.mozilla.org/integration/mozilla-inbound/rev/12b71ef500a0
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f22a2dc76d6
https://hg.mozilla.org/integration/mozilla-inbound/rev/4180d628d872
https://hg.mozilla.org/mozilla-central/rev/621b16542cd5
https://hg.mozilla.org/mozilla-central/rev/361616736d23
https://hg.mozilla.org/mozilla-central/rev/d2140528affe
https://hg.mozilla.org/mozilla-central/rev/12b71ef500a0
https://hg.mozilla.org/mozilla-central/rev/8f22a2dc76d6
https://hg.mozilla.org/mozilla-central/rev/4180d628d872
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Comment 27

5 years ago
I can't be sure but I think this may have caused bug 735481 which I just entered.
Depends on: 735481
Yep, this definitely caused that bug.

Updated

5 years ago
Depends on: 736389

Updated

5 years ago
Depends on: 736924
You need to log in before you can comment on or make changes to this bug.