Closed Bug 1398119 Opened 7 years ago Closed 7 years ago

NoteDirtyElement is either called too often or is too slow or both

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla59

Project Flags:

Performance Impact

high

Tracking Flags:

Tracking

Status

firefox59

---

fixed

People

(Reporter: smaug, Assigned: bholley)

References

Details

(Keywords: perf)

Attachments

(4 files, 3 obsolete files)

Bug 1398119: Move the common case of NoteDirtyElement during innerHTML up. 7 years ago Emilio Cobos Álvarez (:emilio) 59 bytes, text/x-review-board-request		Details
testcase 7 years ago Olli Pettay [:smaug][bugs@pettay.fi] 509 bytes, text/html		Details
Part 1 - Add machinery to assert single-threadedness from geckolib. v1 7 years ago Bobby Holley (:bholley) 5.88 KB, patch		Details \| Diff \| Splinter Review
Part 2 - Avoid atomic overhead in Servo_Element_IsDisplayNone. v1 7 years ago Bobby Holley (:bholley) 2.03 KB, patch	emilio : review+	Details \| Diff \| Splinter Review
Part 3 - Rearrange NoteDirtyElement for faster bailouts on the same restyle root. v1 7 years ago Bobby Holley (:bholley) 6.08 KB, patch	emilio : review+	Details \| Diff \| Splinter Review
Part 1 - Add machinery to assert single-threadedness from geckolib. v2 7 years ago Bobby Holley (:bholley) 4.53 KB, patch		Details \| Diff \| Splinter Review
Part 1 - Add machinery to assert single-threadedness from geckolib. v3 7 years ago Bobby Holley (:bholley) 4.52 KB, patch	emilio : review+	Details \| Diff \| Splinter Review

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Description

•

7 years ago

NoteDirtyElement is 4.6% of innerHTML set time in a profile I'm looking at.

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 1

•

7 years ago

https://perfht.ml/2eLtOhP

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 2

•

7 years ago

innerHTML speed is rather important for Speedometer.

Depends on: 1383332

Whiteboard: [qf]

Bobby Holley (:bholley)

Assignee

Comment 3

•

7 years ago

So 3/4 samples are in MaybeConstructLazily, so the work involved here is basically propagating up the lazy frame construction bits, which Gecko does too. Stylo's code here is slightly more complex than Gecko's, but would be faster in a lot of cases (the first time we set the bit we don't need to walk up the tree, whereas Gecko always walks up to the root). How does the time under MaybeConstructLazily compare between stylo and non-stylo?

Bobby Holley (:bholley)

Assignee

Updated

•

7 years ago

Flags: needinfo?(bugs)

Emilio Cobos Álvarez (:emilio)

Comment 4

•

7 years ago

I took a look, and we're mostly taking the fast paths there, so unless the `parent` checks are hurting us (in which case we could at least move the aElement == existingRoot up), I'm not sure there's much we can do here. In my profile NoteDirtyElement was also incorrectly blamed with some refcount traffic, that really cam from BindToTree / UnbindFromTree, for some reason. The other case where we may pay more than Gecko is in propagating the HAS_DIRTY_DESCENDANTS bits, which we could try to coalesce in MaybeConstructLazily. I'll post a patch with the first idea, please take a profile if you wish, because mine was kinda busted because of the refcount traffic that was incorrectly reported. I can also try to poke at (2), I guess...

Comment hidden (mozreview-request)

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 6

•

7 years ago

stylo vs non-stylo isn't too interesting question here. We are spending too much time in a _very_ hot code path (and doesn't matter whether it is stylo or non-stylo, but in this case it happens to be stylo). nsIMutationObserver implementations should be as fast as possible. Can we do something to optimize the method, or call it less often, or call it asynchronously?

Flags: needinfo?(bugs)

Bobby Holley (:bholley)

Assignee

Comment 7

•

7 years ago

Realistically, assuming we aren't taking a regression with the switch to stylo, I probably don't have time to think about this until after the 57 merge.

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 8

•

7 years ago

Oh, sure, I don't think I indicated anywhere this _must_ go into FF57. It is just that we have a slow method in a very hot code path. I may take a look at this.

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 9

•

7 years ago

Let me profile without stylo

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 10

•

7 years ago

The stuff under nsCSSFrameConstructor::ContentAppended seems to take more time when Stylo is enabled, but doesn't affect bug 1347525 too badly. Note, this seems to be a case where Gecko profiler may not give low-level enough information, so some native profiler may be better.

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf] → [qf:p2]

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

7 years ago

Priority: -- → P2

Firefox Bug Husbandry Bot

Updated

•

7 years ago

Keywords: perf

Andrew Overholt [:overholt]

Comment 11

•

7 years ago

Now that 57 is shipped, let's re-evaluate if we can do work here. I'm marking this qf:p1 but am willing to hear objections :)

Whiteboard: [qf:p2] → [qf:p1]

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf:p1] → [qf:i60][qf:p1]

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf:i60][qf:p1] → [qf:f60][qf:p1]

Andrew Overholt [:overholt]

Comment 12

•

7 years ago

smaug said he'd profile this again to see where things are these days.

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 13

•

7 years ago

Ok, I was reprofiling this using Nightly, and I can't see any high numbers for NoteDirtyElement anymore. Marking this WFM. Please reopen if needed. (FWIW, I'm getting quite a bit faster innerHTML numbers for Nightly than for Chrome, on Linux.)

Status: NEW → RESOLVED

Closed: 7 years ago

Flags: needinfo?(bugs)

Resolution: --- → WORKSFORME

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 14

•

7 years ago

Reopening. Using a different testcase the method is there.

Status: RESOLVED → REOPENED

Resolution: WORKSFORME → ---

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 15

•

7 years ago

Attached file testcase — Details

https://perfht.ml/2D34Gkk

Olli Pettay [:smaug][bugs@pettay.fi]

Reporter

Comment 16

•

7 years ago

https://perfht.ml/2D1L6os might be a bit better link

Bobby Holley (:bholley)

Assignee

Comment 17

•

7 years ago

I can have a look at this testcase.

Assignee: nobody → bobbyholley

Bobby Holley (:bholley)

Assignee

Comment 18

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a3d408b40e63cd2710987a99e9e0fd49653ddcc3

Bobby Holley (:bholley)

Assignee

Comment 19

•

7 years ago

Attached patch Part 1 - Add machinery to assert single-threadedness from geckolib. v1 (obsolete) — Details — Splinter Review

MozReview-Commit-ID: 9LBNm2h5Ct3

Attachment #8941992 - Flags: review?(emilio)

Bobby Holley (:bholley)

Assignee

Comment 20

•

7 years ago

Attached patch Part 2 - Avoid atomic overhead in Servo_Element_IsDisplayNone. v1 — Details — Splinter Review

This reduces time spent in NoteDirtyElement by about 15% in the testcase. Even though the subsequent patch will cause us to call Servo_Element_IsDisplayNone less for this particular testcase, it could still be hot on other testcases, and so it's worth optimizing. MozReview-Commit-ID: 3F3Zfp48dDW

Attachment #8941993 - Flags: review?(emilio)

Bobby Holley (:bholley)

Assignee

Comment 21

•

7 years ago

Attached patch Part 3 - Rearrange NoteDirtyElement for faster bailouts on the same restyle root. v1 — Details — Splinter Review

This eliminates ~90% of the time spent in NoteDirtyElement on this testcase. MozReview-Commit-ID: Lm5hf7QRiOK

Attachment #8941994 - Flags: review?(emilio)

Bobby Holley (:bholley)

Assignee

Updated

•

7 years ago

Attachment #8906035 - Attachment is obsolete: true

Bobby Holley (:bholley)

Assignee

Comment 22

•

7 years ago

(Note that I bumped the iterations on the testcase to get better resolution of the effects of these patches)

Bobby Holley (:bholley)

Assignee

Comment 23

•

7 years ago

Attached patch Part 1 - Add machinery to assert single-threadedness from geckolib. v2 (obsolete) — Details — Splinter Review

MozReview-Commit-ID: 9LBNm2h5Ct3

Attachment #8942007 - Flags: review?(emilio)

Bobby Holley (:bholley)

Assignee

Updated

•

7 years ago

Attachment #8941992 - Attachment is obsolete: true

Attachment #8941992 - Flags: review?(emilio)

Bobby Holley (:bholley)

Assignee

Comment 24

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c674de9a8abb5351aacd6888356b7b474c58ba1

Bobby Holley (:bholley)

Assignee

Comment 25

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=321e423cf82decde6c249b8787d0fa179d3fc081

Bobby Holley (:bholley)

Assignee

Comment 26

•

7 years ago

Attached patch Part 1 - Add machinery to assert single-threadedness from geckolib. v3 — Details — Splinter Review

MozReview-Commit-ID: 9LBNm2h5Ct3

Attachment #8942022 - Flags: review?(emilio)

Bobby Holley (:bholley)

Assignee

Updated

•

7 years ago

Attachment #8942007 - Attachment is obsolete: true

Attachment #8942007 - Flags: review?(emilio)

Emilio Cobos Álvarez (:emilio)

Comment 27

•

7 years ago

Comment on attachment 8942022 [details] [diff] [review] Part 1 - Add machinery to assert single-threadedness from geckolib. v3 Review of attachment 8942022 [details] [diff] [review]: ----------------------------------------------------------------- r=me with those allows removed. ::: servo/ports/geckolib/glue.rs @@ +215,5 @@ > unsafe fn dummy_url_data() -> &'static RefPtr<URLExtraData> { > RefPtr::from_ptr_ref(&DUMMY_URL_DATA) > } > > +#[allow(dead_code)] Why is this needed? This shouldn't warn even in release builds.

Attachment #8942022 - Flags: review?(emilio) → review+

Emilio Cobos Álvarez (:emilio)

Comment 28

•

7 years ago

Comment on attachment 8941993 [details] [diff] [review] Part 2 - Avoid atomic overhead in Servo_Element_IsDisplayNone. v1 Review of attachment 8941993 [details] [diff] [review]: ----------------------------------------------------------------- I'd rather not take it if it happens to not be needed with the next patch tbh. But looks good if it is indeed needed...

Attachment #8941993 - Flags: review?(emilio) → review+

Emilio Cobos Álvarez (:emilio)

Comment 29

•

7 years ago

Comment on attachment 8941994 [details] [diff] [review] Part 3 - Rearrange NoteDirtyElement for faster bailouts on the same restyle root. v1 Review of attachment 8941994 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/base/nsIDocumentInlines.h @@ +97,5 @@ > +nsIDocument::SetServoRestyleRootDirtyBits(uint32_t aDirtyBits) > +{ > + MOZ_ASSERT(aDirtyBits); > + MOZ_ASSERT((aDirtyBits & ~Element::kAllServoDescendantBits) == 0); > + MOZ_ASSERT((aDirtyBits & mServoRestyleRootDirtyBits) == mServoRestyleRootDirtyBits); Let's assert mServoRestyleRoot here?

Attachment #8941994 - Flags: review?(emilio) → review+

Bobby Holley (:bholley)

Assignee

Comment 30

•

7 years ago

(In reply to Emilio Cobos Álvarez [:emilio] from comment #27) > Comment on attachment 8942022 [details] [diff] [review] > Part 1 - Add machinery to assert single-threadedness from geckolib. v3 > > Review of attachment 8942022 [details] [diff] [review]: > ----------------------------------------------------------------- > > r=me with those allows removed. > > ::: servo/ports/geckolib/glue.rs > @@ +215,5 @@ > > unsafe fn dummy_url_data() -> &'static RefPtr<URLExtraData> { > > RefPtr::from_ptr_ref(&DUMMY_URL_DATA) > > } > > > > +#[allow(dead_code)] > > Why is this needed? This shouldn't warn even in release builds. It does: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd55825e8b0ab26262e2dbe59962dbb55d375caf&selectedJob=155779839 This is because is_in_servo_traversal is currently only used in a #[cfg(not(debug_asssertions))] block. I could remove it for is_main_thread, but that would be asymmetric, and also likely to bite people again in the future if code gets rearranged. The advantage of deny(dead_code) is that it makes sure that stuff gets cleaned up when it's not used, but I don't think that's a particularly high priority for these functions. (In reply to Emilio Cobos Álvarez [:emilio] from comment #28) > I'd rather not take it if it happens to not be needed with the next patch > tbh. But looks good if it is indeed needed... Per the commit message, it depends on the testcase. Part 3 further optimizes this particular testcase so that Servo_Element_IsDisplayNone will never be reached, but it will on other testcases. In general I think it's fine to bypass locks in main-thread-only glue.rs FFI hooks because: * The locks are designed to protect us from racing with rayon, but the assertions ensure that rayon can't be running at these callsites. * We're one frame away from C++ code, so we don't get deep transitive benefits of type-safety. The only reason we're calling into Rust at all is that the data representation is opaque to C++, and we we wouldn't dream of paying this kind of runtime cost in C++. That said, I also don't think we should sprinkle safety bypasses willy-nilly, even in glue.rs. So I think a good compromise is "we can bypass if we can measure a clear speedup in a profiler". (In reply to Emilio Cobos Álvarez [:emilio] from comment #29) > Let's assert mServoRestyleRoot here? Good idea. Thanks for the reviews!

Bobby Holley (:bholley)

Assignee

Comment 31

•

7 years ago

https://github.com/servo/servo/pull/19753

Pulsebot

Comment 32

•

7 years ago

Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/af26f7cb29ca Add machinery to assert single-threadedness from geckolib. r=emilio https://hg.mozilla.org/integration/autoland/rev/dae099622091 Rearrange NoteDirtyElement for faster bailouts on the same restyle root. r=emilio

Cristina Coroiu [:ccoroiu]

Comment 33

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/af26f7cb29ca https://hg.mozilla.org/mozilla-central/rev/dae099622091

Status: REOPENED → RESOLVED

Closed: 7 years ago → 7 years ago

status-firefox59: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla59

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

3 years ago

Performance Impact: --- → P1

Whiteboard: [qf:f60][qf:p1]

You need to log in before you can comment on or make changes to this bug.