NoteDirtyElement is either called too often or is too slow or both

RESOLVED FIXED in Firefox 59

Status

()

defect
P2
normal
RESOLVED FIXED
2 years ago
3 months ago

People

(Reporter: smaug, Assigned: bholley)

Tracking

({perf})

unspecified
mozilla59
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox59 fixed)

Details

(Whiteboard: [qf:f60][qf:p1])

Attachments

(4 attachments, 3 obsolete attachments)

Reporter

Description

2 years ago
NoteDirtyElement is 4.6% of innerHTML set time in a profile I'm looking at.
Reporter

Comment 2

2 years ago
innerHTML speed is rather important for Speedometer.
Depends on: 1383332
Whiteboard: [qf]
Assignee

Comment 3

2 years ago
So 3/4 samples are in MaybeConstructLazily, so the work involved here is basically propagating up the lazy frame construction bits, which Gecko does too. Stylo's code here is slightly more complex than Gecko's, but would be faster in a lot of cases (the first time we set the bit we don't need to walk up the tree, whereas Gecko always walks up to the root).

How does the time under MaybeConstructLazily compare between stylo and non-stylo?
Assignee

Updated

2 years ago
Flags: needinfo?(bugs)
I took a look, and we're mostly taking the fast paths there, so unless the `parent` checks are hurting us (in which case we could at least move the aElement == existingRoot up), I'm not sure there's much we can do here.

In my profile NoteDirtyElement was also incorrectly blamed with some refcount traffic, that really cam from BindToTree / UnbindFromTree, for some reason.

The other case where we may pay more than Gecko is in propagating the HAS_DIRTY_DESCENDANTS bits, which we could try to coalesce in MaybeConstructLazily.

I'll post a patch with the first idea, please take a profile if you wish, because mine was kinda busted because of the refcount traffic that was incorrectly reported.

I can also try to poke at (2), I guess...
Reporter

Comment 6

2 years ago
stylo vs non-stylo isn't too interesting question here. We are spending too much time in a _very_ hot code path (and doesn't matter whether it is stylo or non-stylo, but in this case it happens to be stylo).
nsIMutationObserver implementations should be as fast as possible.
Can we do something to optimize the method, or call it less often, or call it asynchronously?
Flags: needinfo?(bugs)
Assignee

Comment 7

2 years ago
Realistically, assuming we aren't taking a regression with the switch to stylo, I probably don't have time to think about this until after the 57 merge.
Reporter

Comment 8

2 years ago
Oh, sure, I don't think I indicated anywhere this _must_ go into FF57.
It is just that we have a slow method in a very hot code path.

I may take a look at this.
Reporter

Comment 9

2 years ago
Let me profile without stylo
Reporter

Comment 10

2 years ago
The stuff under nsCSSFrameConstructor::ContentAppended seems to take more time when Stylo is enabled, 
but doesn't affect bug 1347525 too badly.
Note, this seems to be a case where Gecko profiler may not give low-level enough information, so some native profiler may be better.

Updated

2 years ago
Whiteboard: [qf] → [qf:p2]
Priority: -- → P2
Keywords: perf
Now that 57 is shipped, let's re-evaluate if we can do work here. I'm marking this qf:p1 but am willing to hear objections :)
Whiteboard: [qf:p2] → [qf:p1]

Updated

2 years ago
Whiteboard: [qf:p1] → [qf:i60][qf:p1]

Updated

2 years ago
Whiteboard: [qf:i60][qf:p1] → [qf:f60][qf:p1]
smaug said he'd profile this again to see where things are these days.
Flags: needinfo?(bugs)
Reporter

Comment 13

2 years ago
Ok, I was reprofiling this using Nightly, and I can't see any high numbers for NoteDirtyElement anymore.
Marking this WFM. Please reopen if needed.
(FWIW, I'm getting quite a bit faster innerHTML numbers for Nightly than for Chrome, on Linux.)
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(bugs)
Resolution: --- → WORKSFORME
Reporter

Comment 14

Last year
Reopening. Using a different testcase the method is there.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Reporter

Comment 16

Last year
https://perfht.ml/2D1L6os might be a bit better link
Assignee

Comment 17

Last year
I can have a look at this testcase.
Assignee: nobody → bobbyholley
Assignee

Comment 19

Last year
MozReview-Commit-ID: 9LBNm2h5Ct3
Attachment #8941992 - Flags: review?(emilio)
Assignee

Comment 20

Last year
This reduces time spent in NoteDirtyElement by about 15% in the testcase. Even
though the subsequent patch will cause us to call Servo_Element_IsDisplayNone less
for this particular testcase, it could still be hot on other testcases, and so
it's worth optimizing.

MozReview-Commit-ID: 3F3Zfp48dDW
Attachment #8941993 - Flags: review?(emilio)
Assignee

Comment 21

Last year
This eliminates ~90% of the time spent in NoteDirtyElement on this testcase.

MozReview-Commit-ID: Lm5hf7QRiOK
Attachment #8941994 - Flags: review?(emilio)
Assignee

Updated

Last year
Attachment #8906035 - Attachment is obsolete: true
Assignee

Comment 22

Last year
(Note that I bumped the iterations on the testcase to get better resolution of the effects of these patches)
Assignee

Comment 23

Last year
MozReview-Commit-ID: 9LBNm2h5Ct3
Attachment #8942007 - Flags: review?(emilio)
Assignee

Updated

Last year
Attachment #8941992 - Attachment is obsolete: true
Attachment #8941992 - Flags: review?(emilio)
Assignee

Comment 26

Last year
MozReview-Commit-ID: 9LBNm2h5Ct3
Attachment #8942022 - Flags: review?(emilio)
Assignee

Updated

Last year
Attachment #8942007 - Attachment is obsolete: true
Attachment #8942007 - Flags: review?(emilio)
Comment on attachment 8942022 [details] [diff] [review]
Part 1 - Add machinery to assert single-threadedness from geckolib. v3

Review of attachment 8942022 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with those allows removed.

::: servo/ports/geckolib/glue.rs
@@ +215,5 @@
>  unsafe fn dummy_url_data() -> &'static RefPtr<URLExtraData> {
>      RefPtr::from_ptr_ref(&DUMMY_URL_DATA)
>  }
>  
> +#[allow(dead_code)]

Why is this needed? This shouldn't warn even in release builds.
Attachment #8942022 - Flags: review?(emilio) → review+
Comment on attachment 8941993 [details] [diff] [review]
Part 2 - Avoid atomic overhead in Servo_Element_IsDisplayNone. v1

Review of attachment 8941993 [details] [diff] [review]:
-----------------------------------------------------------------

I'd rather not take it if it happens to not be needed with the next patch tbh. But looks good if it is indeed needed...
Attachment #8941993 - Flags: review?(emilio) → review+
Comment on attachment 8941994 [details] [diff] [review]
Part 3 - Rearrange NoteDirtyElement for faster bailouts on the same restyle root. v1

Review of attachment 8941994 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsIDocumentInlines.h
@@ +97,5 @@
> +nsIDocument::SetServoRestyleRootDirtyBits(uint32_t aDirtyBits)
> +{
> +  MOZ_ASSERT(aDirtyBits);
> +  MOZ_ASSERT((aDirtyBits & ~Element::kAllServoDescendantBits) == 0);
> +  MOZ_ASSERT((aDirtyBits & mServoRestyleRootDirtyBits) == mServoRestyleRootDirtyBits);

Let's assert mServoRestyleRoot here?
Attachment #8941994 - Flags: review?(emilio) → review+
Assignee

Comment 30

Last year
(In reply to Emilio Cobos Álvarez [:emilio] from comment #27)
> Comment on attachment 8942022 [details] [diff] [review]
> Part 1 - Add machinery to assert single-threadedness from geckolib. v3
> 
> Review of attachment 8942022 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with those allows removed.
> 
> ::: servo/ports/geckolib/glue.rs
> @@ +215,5 @@
> >  unsafe fn dummy_url_data() -> &'static RefPtr<URLExtraData> {
> >      RefPtr::from_ptr_ref(&DUMMY_URL_DATA)
> >  }
> >  
> > +#[allow(dead_code)]
> 
> Why is this needed? This shouldn't warn even in release builds.

It does: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd55825e8b0ab26262e2dbe59962dbb55d375caf&selectedJob=155779839

This is because is_in_servo_traversal is currently only used in a #[cfg(not(debug_asssertions))] block. I could remove it for is_main_thread, but that would be asymmetric, and also likely to bite people again in the future if code gets rearranged. The advantage of deny(dead_code) is that it makes sure that stuff gets cleaned up when it's not used, but I don't think that's a particularly high priority for these functions.

(In reply to Emilio Cobos Álvarez [:emilio] from comment #28)
> I'd rather not take it if it happens to not be needed with the next patch
> tbh. But looks good if it is indeed needed...

Per the commit message, it depends on the testcase. Part 3 further optimizes this particular testcase so that Servo_Element_IsDisplayNone will never be reached, but it will on other testcases.

In general I think it's fine to bypass locks in main-thread-only glue.rs FFI hooks because:
* The locks are designed to protect us from racing with rayon, but the assertions ensure that rayon can't be running at these callsites.
* We're one frame away from C++ code, so we don't get deep transitive benefits of type-safety. The only reason we're calling into Rust at all is that the data representation is opaque to C++, and we we wouldn't dream of paying this kind of runtime cost in C++.

That said, I also don't think we should sprinkle safety bypasses willy-nilly, even in glue.rs. So I think a good compromise is "we can bypass if we can measure a clear speedup in a profiler".

(In reply to Emilio Cobos Álvarez [:emilio] from comment #29)
> Let's assert mServoRestyleRoot here?

Good idea.

Thanks for the reviews!

Comment 32

Last year
Pushed by bholley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/af26f7cb29ca
Add machinery to assert single-threadedness from geckolib. r=emilio
https://hg.mozilla.org/integration/autoland/rev/dae099622091
Rearrange NoteDirtyElement for faster bailouts on the same restyle root. r=emilio

Comment 33

Last year
bugherder
https://hg.mozilla.org/mozilla-central/rev/af26f7cb29ca
https://hg.mozilla.org/mozilla-central/rev/dae099622091
Status: REOPENED → RESOLVED
Closed: 2 years agoLast year
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.