Closed Bug 1355343 Opened 3 years ago Closed 3 years ago

stylo: Snapshots don't look at the changed state of the other elements, which is wrong.

Categories

(Core :: CSS Parsing and Computation, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: emilio, Assigned: emilio)

References

Details

Attachments

(2 files, 1 obsolete file)

bz and dbaron pointed this out this morning. We should make sure to look at other snapshots in the DOM in order to handle correctly some dynamic changes.
Blocks: stylo
Comment on attachment 8856816 [details]
Bug 1355343: Move node restyle bits to Element, and add bits for snapshot handling.

https://reviewboard.mozilla.org/r/128734/#review131834

Per discussion, the borrow here isn't going to work. We can solve this by separating out an immutable part of the the ElementData, and storing the snapshot in the immutable part. Maybe something like:

struct ElementDataContainer {
  shared: SharedElementData,
  local: AtomicRefCell<ElementData>,
}

That will allow us to keep the naming we have for ElementData now, which will reduce the churn of this patch.
Attachment #8856816 - Flags: review?(bobbyholley) → review-
Assignee: nobody → emilio+bugs
Priority: -- → P1
Ok, so cool part is I have a patch for this. I've chosen an hybrid approach among the discussed ones which I'll justify when uploading.

Not-so-cool part is I've tried to assert some basic invariants about the state of the tree in multiple places, and it's specially annoying.

In particular, all the callers of PresShell::RecreateFramesFor dig into the frame constructor, requesting styles that aren't really up-to-date all the time. This is basically the problem that bug 1351535 tries to solve, but now we have the extra annoyance of having to keep the all snapshots alive at different points.

I would like to ensure no changes enter in the frame constructor without having done a style flush before. Cameron, you looked at this lately, do you think it's reasonable to flush styles on PresShell::RecreateFramesFor and similar callsites?
Flags: needinfo?(cam)
Depends on: 1357142
(In reply to Emilio Cobos Álvarez [:emilio] from comment #3)
> I would like to ensure no changes enter in the frame constructor without
> having done a style flush before. Cameron, you looked at this lately, do you
> think it's reasonable to flush styles on PresShell::RecreateFramesFor and
> similar callsites?

I don't think that's really tractable, which is why we did bug 1351535 the way we did.

Can you explain more about your hybrid approach?
If I recall correctly, I got some assertions or test failures when flushing the whole document's styles in PresShell::RecreateFramesFor.  I'm not sure what those issues were, now, although I think I erred on the side of matching Gecko's existing behaviour of just restyling the subtree itself and relying on the old ancestor styles, rather than digging too deep into the failures.  If you tried this recently and got a green try run, you might like to do one on top of the state of the tree before the bug 1351535 patches landed, to see what the issues were.

Though looking back to the IRC discussion http://logs.glob.uno/?c=mozilla%23layout&s=29+Mar+2017&e=29+Mar+2017&h=RecreateFramesFor#c28248 I can't see a mention of trying to flush styles, so maybe I am misremembering...
Flags: needinfo?(cam)
In general I'm kind of wary about trying to introduce this invariant - it might be a pretty deep rabbit hole to fix all the issues, and fixing them might cause different performance characteristics from Gecko (where we end up having to do larger style flushes than Gecko did). Given that we've already sunk the work into bug 1351535, I'd prefer to avoid introducing design constraints that require this invariant (if we can avoid doing so).
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #6)
> In general I'm kind of wary about trying to introduce this invariant - it
> might be a pretty deep rabbit hole to fix all the issues, and fixing them
> might cause different performance characteristics from Gecko (where we end
> up having to do larger style flushes than Gecko did). Given that we've
> already sunk the work into bug 1351535, I'd prefer to avoid introducing
> design constraints that require this invariant (if we can avoid doing so).

Note that these calls:
 * Are relatively uncommon.
 * Are usually followed by a flush right after (e.g., a few of editor callers call into GetOffsetHeight right after, which flushes layout). That's not the case in the XBL case though.

In any case, the work in bug 1357142 makes these calls asynchronous, and we'd need a layout/style flush in most cases anyway (if only for reflowing the page?), so I think it shouldn't be a performance loss issue at all (if only, it'd be the opposite).

We can keep it, but the price to pay for keeping it is more complexity (a lot more, actually, that's why I started doing it) and weaker assertions (a lot weaker, too) on the style system side, so I think it's definitely worth it.
I'll defer to your and heycam's judgement on this one. :-)
Depends on: 1361766
(In reply to Emilio Cobos Álvarez [:emilio] from comment #3)
> Ok, so cool part is I have a patch for this. I've chosen an hybrid approach
> among the discussed ones which I'll justify when uploading.

Does your hybrid approach allow us to expand snapshots in parallel? I'm getting the sense that this work is somewhat expensive and that we should try to parallelize it.
Flags: needinfo?(emilio+bugs)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #9)
> (In reply to Emilio Cobos Álvarez [:emilio] from comment #3)
> > Ok, so cool part is I have a patch for this. I've chosen an hybrid approach
> > among the discussed ones which I'll justify when uploading.
> 
> Does your hybrid approach allow us to expand snapshots in parallel? I'm
> getting the sense that this work is somewhat expensive and that we should
> try to parallelize it.

It does
Flags: needinfo?(emilio+bugs)
Oh, and the bloom filter as well, right? Just trying to figure out what 'hybrid' means and whether the dependency matching still happens during the traversal.
Flags: needinfo?(emilio+bugs)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #11)
> Oh, and the bloom filter as well, right? Just trying to figure out what
> 'hybrid' means and whether the dependency matching still happens during the
> traversal.

Yeah, it does. The approach is basically passing down a &'a SelectorMap down the shared style context. It has various optimizations to avoid hashmap lookups (flags, plus caching the lookup result).

We can still change it to use ElementData, but I tried that first, and it was very ugly in a lot of different senses, and we'd still need to care about dropping snapshots off-main-thread and all that.

This also saves the pointer in ElementData, which may be relevant.
Flags: needinfo?(emilio+bugs)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #12)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #11)
> > Oh, and the bloom filter as well, right? Just trying to figure out what
> > 'hybrid' means and whether the dependency matching still happens during the
> > traversal.
> 
> Yeah, it does. The approach is basically passing down a &'a SelectorMap down
> the shared style context.

If it's a SelectorMap, I don't grok the design. If it was a SnapshotMap then it would make sense to me.
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #13)
> If it's a SelectorMap, I don't grok the design. If it was a SnapshotMap then
> it would make sense to me.

Err, yeah, a SnapshotMap, sorry :)
So today I took a look at this again, and here's a try run with the current state of affairs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c08c4179bdbd86b9139f65a18e5bd7afa1758b38&selectedJob=97190104

That is, almost green. I need to dig into those assertions, but all of those seem to come from PresShell::CreateFramesFor, which is the last entrypoint for sync frame construction... bypassing the restyle manager. I can probably try to work around it... but I can also try to kill it, or at least flush styles there...
Being able to reproduce them locally would help too, I guess :(
(Just in case, this is green now)
Comment on attachment 8856816 [details]
Bug 1355343: Move node restyle bits to Element, and add bits for snapshot handling.

https://reviewboard.mozilla.org/r/128734/#review140196

::: dom/base/Element.h:106
(Diff revision 2)
> +  // Whether the element has already handled its relevant snapshot.
> +  //
> +  // Used by the servo restyle process in order to accurately track whether the
> +  // style of an element is up-to-date, even during the same restyle process.
> +  ELEMENT_HANDLED_SNAPSHOT = ELEMENT_SHARED_RESTYLE_BIT_4,

I'm curious as to why we need this, but presumably that will become clear when I review the patch that uses it. :-)
Attachment #8856816 - Flags: review?(bobbyholley) → review+
Comment on attachment 8856816 [details]
Bug 1355343: Move node restyle bits to Element, and add bits for snapshot handling.

https://reviewboard.mozilla.org/r/128734/#review140196

> I'm curious as to why we need this, but presumably that will become clear when I review the patch that uses it. :-)

Basically, in order to:

 * Avoid processing snapshots twice (that's because of the calls into CreateFramesFor, etc, and because we do two calls to `Servo_TraverseSubtree` if there are animations, etc, and we don't want to process snapshots for those). The second one is workaround-able via traversal flags, probably.
 * Add strong assertions that guarantee that we've handled a given snapshot, and that our style is up-to-date. That helps catch bugs, and keep sanity.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #22)
> Comment on attachment 8856816 [details]
> Bug 1355343: Move node restyle bits to Element, and add bits for snapshot
> handling.
> 
> https://reviewboard.mozilla.org/r/128734/#review140196
> 
> > I'm curious as to why we need this, but presumably that will become clear when I review the patch that uses it. :-)
> 
> Basically, in order to:
> 
>  * Avoid processing snapshots twice (that's because of the calls into
> CreateFramesFor, etc,

I think we could probably fix this if we needed to by using SequentialTasks to drop snapshots when doing a partial traversal.

> and because we do two calls to `Servo_TraverseSubtree`
> if there are animations, etc, and we don't want to process snapshots for
> those). The second one is workaround-able via traversal flags, probably.
>  * Add strong assertions that guarantee that we've handled a given snapshot,
> and that our style is up-to-date. That helps catch bugs, and keep sanity.

This all makes sense, but bits are valuable, so it's worth documenting the work that would need to be done to free up the bit. Please write all of this up in an extensive comment above the bit declaration so that somebody in the future can free it up if they need it.
Also, even with the explanations above, I don't understand why we can't just use a single bit for HAS/HANDLED. Why can't we just unset HAS when we handle?
Comment on attachment 8865374 [details]
Bug 1355343: Take all the snapshots into account.

https://reviewboard.mozilla.org/r/137036/#review140198

::: dom/base/Element.cpp:3908
(Diff revision 1)
>  
>  void
>  Element::ClearServoData() {
>  #ifdef MOZ_STYLO
>    Servo_Element_ClearData(this);
> +  UnsetFlags(ELEMENT_HAS_SNAPSHOT | ELEMENT_HANDLED_SNAPSHOT);

This should move to GeckoElement::clear_data, otherwise we'll miss it for servo-triggered clears.

::: layout/base/ServoRestyleManager.h:160
(Diff revision 1)
>    // update, which triggers a normal restyle, and so there might be any new
>    // transition created later. Therefore, if this flag is true, we need to
>    // increase mAnimationGeneration before creating new transitions, so their
>    // creation sequence will be correct.
>    bool mHaveNonAnimationRestyles = false;
> +  SnapshotTable mSnapshots;

Needs a comment.

::: layout/base/ServoRestyleManager.cpp:331
(Diff revision 1)
> +  nsIPresShell* presShell = mPresContext->PresShell();
> +  presShell->SetNeedStyleFlush();
> +  presShell->ObserveStyleFlushes();

We have this code in 3 places now (this, Gecko_SetOwnerDocumentNeedsStyleFlush, NoteDirtyDescendantsForServo) - 4 if you count the gecko call. Please make this a helper.

::: layout/base/ServoRestyleManager.cpp:397
(Diff revision 1)
>    while (styleSet->StyleDocument()) {
> +    ClearSnapshots();

Can you make this a do/while loop with a break so that we only need one callsite for ClearSnapshots?

::: layout/style/ServoBindings.cpp:354
(Diff revision 1)
>  {
> -  MOZ_ASSERT(NS_IsMainThread());
> -  return new ServoElementSnapshot(aElement);
> +  MOZ_ASSERT(aTable);
> +  MOZ_ASSERT(aElement);
> -}
>  
> -void
> +  return aTable->Get(const_cast<Element*>(aElement));

Can you assert that the return value is non-null?

::: layout/style/ServoElementSnapshotTable.h:17
(Diff revision 1)
> +class ServoElementSnapshotTable
> +  : public nsClassHashtable<nsRefPtrHashKey<dom::Element>, ServoElementSnapshot>
> +{
> +};

Does this need to be a subclass, or can it just be a typedef? If there's a reason please document it.

::: layout/style/ServoStyleSet.cpp:328
(Diff revision 1)
> -    if (Servo_TraverseSubtree(aRoot, mRawSet.get(),
> -                              aRootBehavior, aRestyleBehavior)) {
> +    if (Servo_TraverseSubtree(
> +          aRoot, mRawSet.get(), &snapshots, aRootBehavior, aRestyleBehavior)) {

Hm, does this really conform to Gecko style?

::: servo/components/style/restyle_hints.rs:630
(Diff revision 1)
> +        let snapshot = match snapshot_el.snapshot() {
> +            None => return RestyleHint::empty(),
> +            Some(s) => s,
> +        };

Hm, can't we assert that the snapshot exists if we get here?

::: servo/components/style/traversal.rs:784
(Diff revision 1)
> +        // Handle element snapshots and sibling restyle hints.
> +        //
> +        // NB: This will be a no-op if there's no restyle data and no snapshot.
> +        let later_siblings =
> +            child_data.compute_final_hint(child, traversal.shared_context());

Why did this chunk need to move?

::: servo/ports/geckolib/glue.rs:200
(Diff revision 1)
> +    let token = RecalcStyleOnly::pre_traverse(element,
> +                                              &shared_style_context,
> -                                                     traversal_flags);
> +                                              traversal_flags);

So, the intention of doing this before create_shared_context was to avoid some of the dumb stuff we do in that function if we don't actually need it. Can you at least file a followup bug to remove the obvious stuff (like the animation senders we don't use, and the whole creationdata business, and the boxing)?
Attachment #8865374 - Flags: review?(bobbyholley) → review+
Comment on attachment 8865375 [details]
Bug 1355343: Cache snapshot lookups.

https://reviewboard.mozilla.org/r/137038/#review140232
Attachment #8865375 - Flags: review?(bobbyholley) → review+
Comment on attachment 8865374 [details]
Bug 1355343: Take all the snapshots into account.

https://reviewboard.mozilla.org/r/137036/#review140198

> This should move to GeckoElement::clear_data, otherwise we'll miss it for servo-triggered clears.

Ouch, good catch.

> We have this code in 3 places now (this, Gecko_SetOwnerDocumentNeedsStyleFlush, NoteDirtyDescendantsForServo) - 4 if you count the gecko call. Please make this a helper.

Yup!

> Can you make this a do/while loop with a break so that we only need one callsite for ClearSnapshots?

I can try, not sure it'll be clearer.

> Can you assert that the return value is non-null?

The caller asserts that.

> Does this need to be a subclass, or can it just be a typedef? If there's a reason please document it.

It's easier in order to do bindgen stuff. I eventually added a few methods to it, but then ended up not needing those. I can get it back to a typedef, I guess.

> Hm, does this really conform to Gecko style?

That specific part is clang-formatted, since I wasn't sure what kind of formatting would suit better, so I assume the answer is "yes". I can probably split it into a variable declaration + a function call.

> Hm, can't we assert that the snapshot exists if we get here?

Yes, I initially had code that didn't wrap the call to compute_final_hint in an `if` condition, and thus this was the place to reject it. Will update it.

> Why did this chunk need to move?

This way we can optimize out the allocation of the `RestyleData` if there was a snapshot and the snapshot produced no hints.

> So, the intention of doing this before create_shared_context was to avoid some of the dumb stuff we do in that function if we don't actually need it. Can you at least file a followup bug to remove the obvious stuff (like the animation senders we don't use, and the whole creationdata business, and the boxing)?

Sounds good, will do.
Oh, also - do any of our existing tests fail without these patches? If not, can you add some that will fail if we don't do transitive snapshots?
Flags: needinfo?(emilio+bugs)
I think we can't until we implement some of the restyle optimizations that heycam is working on, otherwise the later siblings and subtree hints save us in all the test cases I can think of.
Flags: needinfo?(emilio+bugs)
Blocks: 1363245
Attachment #8865375 - Attachment is obsolete: true
Pushed by ecoal95@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/eaf4e461619e
Move node restyle bits to Element, and add bits for snapshot handling. r=bholley
https://hg.mozilla.org/integration/autoland/rev/2150351429b5
Take all the snapshots into account. r=bholley
https://hg.mozilla.org/mozilla-central/rev/eaf4e461619e
https://hg.mozilla.org/mozilla-central/rev/2150351429b5
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Duplicate of this bug: 1352308
You need to log in before you can comment on or make changes to this bug.