1387956 - stylo: Need to measure ComputedValues together during memory reporting

Reporter

Description

•

7 years ago

Continuing discussion from bug 1367854 comment 6.

My understanding of memory reports is patchy, so let me know if I'm getting this wrong.

The observables I would like are the following two per-window measurements:
(I) Memory overhead of all ComputedValues hanging off elements in that DOM.
(II) Memory overhead of all ComputedValues hanging off the frame tree that were not counted in (I).

To achieve this, it seems like we want to do the following, in order:
(1) Stop measuring mServoData during the normal DOM traversal (so we stop billing it to elements).
(2) During memory reporting, traverse the DOM (using a StyleChildrenIterator), measuring the mServoData of every element. The result of this gives us (I).
(3) Next, with the |seen| table already primed from (2), traverse the frame tree, measuring StyleContext() on every frame (style contexts and ComputedValues are the same data structure). This gives us (II).
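
A minimal sketch of the two-pass scheme in steps (2) and (3), assuming flat lists of elements and frames in place of the real tree traversals. Every type and function name here (StyleSketch, ElementSketch, FrameSketch, MeasureOnce, MeasureWindow) is hypothetical and is not part of Gecko or Servo; the only point is how a shared |seen| table keeps (I) and (II) disjoint.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a ComputedValues / style context allocation.
struct StyleSketch {
  std::size_t heapSize;  // bytes attributed to this object
};

// Hypothetical stand-ins for a DOM element and a frame. Each points at a
// style object that may be shared with other elements or frames.
struct ElementSketch { const StyleSketch* servoData; };
struct FrameSketch { const StyleSketch* styleContext; };

struct ComputedValuesSizes {
  std::size_t dom = 0;     // measurement (I)
  std::size_t nonDom = 0;  // measurement (II)
};

// Returns the object's size the first time it is seen and 0 afterwards, so a
// shared object is never counted twice.
inline std::size_t MeasureOnce(const StyleSketch* style,
                               std::unordered_set<const void*>& seen) {
  if (!style || !seen.insert(style).second) {
    return 0;
  }
  return style->heapSize;
}

inline ComputedValuesSizes MeasureWindow(
    const std::vector<ElementSketch>& elements,
    const std::vector<FrameSketch>& frames) {
  std::unordered_set<const void*> seen;
  ComputedValuesSizes sizes;
  // Pass 1: the DOM pass primes |seen| and yields (I).
  for (const ElementSketch& e : elements) {
    sizes.dom += MeasureOnce(e.servoData, seen);
  }
  // Pass 2: the frame pass adds only what pass 1 did not reach, yielding (II).
  for (const FrameSketch& f : frames) {
    sizes.nonDom += MeasureOnce(f.styleContext, seen);
  }
  return sizes;
}
```

The order of the passes matters in this sketch: a style object reachable from both an element and a frame is billed to the DOM side because the DOM pass runs first.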

Bobby Holley (:bholley)

Reporter

Comment 1

•

7 years ago

Nick, do I have this right? If so, is this something you can take?

Flags: needinfo?(n.nethercote)

Bobby Holley (:bholley)

Reporter

Updated

•

7 years ago

Summary: stylo: Need to measure ComputedValues together → stylo: Need to measure ComputedValues together during memory reporting

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

7 years ago

I'm happy to take it. I'm not sure yet if the algorithm in comment 0 will work or not. I'll investigate once I've finished bug 1387958, which is easier than this one.

Assignee: nobody → n.nethercote

Flags: needinfo?(n.nethercote)

Bobby Holley (:bholley)

Reporter

Comment 3

•

7 years ago

Great, thanks!

Bobby Holley (:bholley)

Reporter

Comment 4

•

7 years ago

bz points out that it would be really nice to measure the weight of the style structs separately from the ComputedValues, which would e.g. be useful for diagnosing issues like the hypothesis in bug 1367854 comment 19. Itemization of each style struct would be even nicer, but that might make the reporting code unwieldy. We could potentially break ComputedValues down into:
* inherited structs
* non-inherited structs
* rule nodes
* Everything else (including self)

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

7 years ago

> Itemization of each style struct would be even nicer

I'm doing that, it's not too bad and we have similar stuff already.

Bobby Holley (:bholley)

Reporter

Comment 6

•

7 years ago

(In reply to Nicholas Nethercote [:njn] from comment #5)
> > Itemization of each style struct would be even nicer
> 
> I'm doing that, it's not too bad and we have similar stuff already.

Great! I realize now that the STYLE_STRUCT macros should actually make this relatively manageable.
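
To illustrate why a list macro keeps per-struct itemization manageable, here is a small sketch of the general X-macro technique. FOR_EACH_STYLE_STRUCT_SKETCH and StyleStructSizesSketch are made-up names with a shortened list of three kinds; this is not the actual STYLE_STRUCT machinery in the tree, only the pattern it enables.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical list macro in the spirit of the STYLE_STRUCT macros; the real
// list is longer and lives in the Gecko tree.
#define FOR_EACH_STYLE_STRUCT_SKETCH(macro) \
  macro(Font)                               \
  macro(Display)                            \
  macro(Position)

struct StyleStructSizesSketch {
  // One counter per struct kind, generated from the list.
#define DECLARE_FIELD(name_) std::size_t m##name_ = 0;
  FOR_EACH_STYLE_STRUCT_SKETCH(DECLARE_FIELD)
#undef DECLARE_FIELD

  // Sum of all per-kind counters, also generated from the list.
  std::size_t Total() const {
    std::size_t total = 0;
#define ADD_FIELD(name_) total += m##name_;
    FOR_EACH_STYLE_STRUCT_SKETCH(ADD_FIELD)
#undef ADD_FIELD
    return total;
  }
};
```

Adding a struct kind to the list adds its counter and its term in Total() without touching the reporting code by hand.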

Nicholas Nethercote [inactive]

Assignee

Updated

•

7 years ago

Depends on: 1388975

Comment hidden (mozreview-request)

This patch moves measurement of ComputedValues objects from Rust to C++.
Measurement now happens (a) via DOM elements and (b), for the ComputedValues
not reached that way, via the frame tree. Likewise for the style structs
hanging off ComputedValues objects.

Here is an example of the output.

> ├──27,600,448 B (26.49%) -- active/window(https://en.wikipedia.org/wiki/Barack_Obama)
> │  ├──12,772,544 B (12.26%) -- layout
> │  │  ├───4,483,744 B (04.30%) -- frames
> │  │  │   ├──1,653,552 B (01.59%) ── nsInlineFrame
> │  │  │   ├──1,415,760 B (01.36%) ── nsTextFrame
> │  │  │   ├────431,376 B (00.41%) ── nsBlockFrame
> │  │  │   ├────340,560 B (00.33%) ── nsHTMLScrollFrame
> │  │  │   ├────302,544 B (00.29%) ── nsContinuingTextFrame
> │  │  │   ├────156,408 B (00.15%) ── nsBulletFrame
> │  │  │   ├─────73,024 B (00.07%) ── nsPlaceholderFrame
> │  │  │   ├─────27,656 B (00.03%) ── sundries
> │  │  │   ├─────23,520 B (00.02%) ── nsTableCellFrame
> │  │  │   ├─────16,704 B (00.02%) ── nsImageFrame
> │  │  │   ├─────15,488 B (00.01%) ── nsTableRowFrame
> │  │  │   ├─────13,776 B (00.01%) ── nsTableColFrame
> │  │  │   └─────13,376 B (00.01%) ── nsTableFrame
> │  │  ├───3,412,192 B (03.28%) -- servo-style-structs
> │  │  │   ├──1,288,224 B (01.24%) ── Display
> │  │  │   ├────742,400 B (00.71%) ── Position
> │  │  │   ├────308,736 B (00.30%) ── Font
> │  │  │   ├────226,512 B (00.22%) ── Background
> │  │  │   ├────218,304 B (00.21%) ── TextReset
> │  │  │   ├────214,896 B (00.21%) ── Text
> │  │  │   ├────130,560 B (00.13%) ── Border
> │  │  │   ├─────81,408 B (00.08%) ── UIReset
> │  │  │   ├─────61,440 B (00.06%) ── Padding
> │  │  │   ├─────38,176 B (00.04%) ── UserInterface
> │  │  │   ├─────29,232 B (00.03%) ── Margin
> │  │  │   ├─────21,824 B (00.02%) ── sundries
> │  │  │   ├─────20,080 B (00.02%) ── Color
> │  │  │   ├─────20,080 B (00.02%) ── Column
> │  │  │   └─────10,320 B (00.01%) ── Effects
> │  │  ├───2,227,680 B (02.14%) -- computed-values
> │  │  │   ├──1,182,928 B (01.14%) ── non-dom
> │  │  │   └──1,044,752 B (01.00%) ── dom
> │  │  ├───1,500,016 B (01.44%) ── text-runs
> │  │  ├─────492,640 B (00.47%) ── line-boxes
> │  │  ├─────326,688 B (00.31%) ── frame-properties
> │  │  ├─────301,760 B (00.29%) ── pres-shell
> │  │  ├──────27,648 B (00.03%) ── pres-contexts
> │  │  └─────────176 B (00.00%) ── style-sets

The 'servo-style-structs' and 'computed-values' sub-trees are new. (Prior to
this patch, ComputedValues under DOM elements were tallied under the
'dom/element-nodes' sub-tree, and ComputedValues not under DOM elements were
ignored.) 'servo-style-structs/sundries' aggregates all the style structs that
are smaller than 8 KiB.

Other notable things done by the patch are as follows.

- It significantly changes the signatures of the methods measuring nsINode and
  its subclasses, in order to handle the tallying of style structs separately
  from element-nodes. Likewise for nsIFrame.

- It renames the 'layout/style-structs' sub-tree as
  'layout/gecko-style-structs', to clearly distinguish it from the new
  'layout/servo-style-structs' sub-tree.

- It adds some FFI functions to access various Rust-side data structures from
  C++ code.

- There is a nasty hack used twice to measure Arcs, by stepping backwards from
  an interior pointer to a base pointer. It works, but I want to replace it
  with something better eventually. The "XXX WARNING" comments have details.

- It makes DMD print a line to the console if it sees a pointer it doesn't
  recognise. This is useful for detecting when we are measuring an interior
  pointer instead of a base pointer, which is bad but easy to do when Arcs are
  involved.

- It removes the Rust code for measuring CVs, because it's now all done on the
  C++ side.
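
For readers unfamiliar with the interior-pointer problem mentioned in the list above, the following is a simplified sketch, not the code in the patch. It assumes a made-up ArcInnerSketch layout (a count followed by the payload); the real Rust Arc layout is an implementation detail and may differ, which is part of why this kind of pointer arithmetic is fragile.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical model of an Arc allocation: a reference count
// followed by the payload. The real Rust layout is not guaranteed to match.
template <typename T>
struct ArcInnerSketch {
  std::size_t count;  // stand-in for the atomic refcount
  T data;             // holders of the Arc usually see a pointer to this field
};

// Steps back from the payload (interior) pointer to the start of the
// allocation (base pointer), which is what a heap measurement function needs.
template <typename T>
const void* BasePointerFromData(const T* data) {
  const char* p = reinterpret_cast<const char*>(data);
  return p - offsetof(ArcInnerSketch<T>, data);
}
```

Passing the interior pointer straight to a malloc-size-of style function would measure nothing useful, since the allocator only knows about the base pointer.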

Review commit: https://reviewboard.mozilla.org/r/167444/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/167444/

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

7 years ago

This is in good shape except for one thing: the step-backwards-over-the-Arc-refcount hack is busted for ServoStyleContext on Win32 and some Android configs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a2aa15c4e6453479717253723774682842b08eb

Comment hidden (mozreview-request)

Comment on attachment 8896177 [details]
Bug 1387956 - Overhaul ComputedValues measurement, and add style structs measurement. .

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/167444/diff/1-2/

Nicholas Nethercote [inactive]

Assignee

Comment 10

•

7 years ago

Ok, I've fixed the ServoStyleContext problem from the previous version.

Bobby Holley (:bholley)

Reporter

Comment 11

•

7 years ago

mozreview-review

Comment on attachment 8896177 [details]
Bug 1387956 - Overhaul ComputedValues measurement, and add style structs measurement. .

https://reviewboard.mozilla.org/r/167444/#review172946

This is really great work. I confess to skimming the details a little bit because I don't know the memory reporting code and I'm sleepy, but I wanted to unblock this. Feel free to flag someone else if you want a more careful review of that part.

::: commit-message-f50b8:55
(Diff revision 2)
> +ignored.) 'servo-style-structs/sundries' aggregates all the style structs that
> +are smaller than 8 KiB.

Does that mean the style structs that collectively consume less than 8KiB?

::: dom/base/nsINode.h:335
(Diff revision 2)
> -  virtual size_t SizeOfExcludingThis(mozilla::SizeOfState& aState) const;
> +  virtual void AddSizeOfExcludingThis(mozilla::SizeOfState& aState,
> +                                      nsStyleSizes& aSizes,
> +                                      size_t* aNodeSize) const;

Should this use the new macro?

::: js/xpconnect/src/XPCJSRuntime.cpp:2161
(Diff revision 2)
> +        // We combine the node size with nsStyleSizes here. It's not ideal, but
> +        // it's hard to get the style structs measurements out to
> +        // nsWindowMemoryReporter, and the number of orphan DOM nodes is
> +        // usually small.

It shouldn't matter. We drop mServoData in UnbindFromTree, so any non-in-tree element can't have any style data to measure.

::: layout/style/nsCSSPseudoElements.h:95
(Diff revision 2)
> +  // This must match EAGER_PSEUDO_COUNT in Rust code.
> +  static const size_t kEagerPseudoCount = 4;

Maybe put this right above IsEagerlyCascadedInServo, with no newline, so that it's harder to miss?

::: servo/components/style/data.rs:270
(Diff revision 2)
> +        // XXX: measure the EagerPseudoArray itself, but not the ComputedValues
> +        // within it.
> +
> +        0

Seems like the code to do this is missing?

::: servo/ports/geckolib/glue.rs:811
(Diff revision 2)
> +pub extern "C" fn Servo_Element_GetPseudoComputedValues(element: RawGeckoElementBorrowed,
> +                                                        index: usize) -> ServoStyleContextStrong
> +{
> +    let element = GeckoElement(element);
> +    let data = element.borrow_data().expect("Getting CVs on unstyled element");
> +    data.styles.pseudos.as_array()[index].as_ref().expect("Getting CVs on unstyled element")

This expect message is wrong.

Attachment #8896177 - Flags: review?(bobbyholley) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 12

•

7 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/0823bc83b06b7e5a5e732c368157c2a8a848e0c7
Bug 1387956 (part 1) - Change |nsWindowSizes*| arguments to |nsWindowSizes&|. r=mccr8.

https://hg.mozilla.org/integration/mozilla-inbound/rev/a8aa2a5c2870498d9c21b05fed673d786591f817
Bug 1387956 (part 2) - Overhaul handling of nsWindowSizes. r=mccr8.

Nicholas Nethercote [inactive]

Assignee

Updated

•

7 years ago

No longer depends on: 1388975

Nicholas Nethercote [inactive]

Assignee

Comment 14

•

7 years ago

(In reply to Nicholas Nethercote [:njn] from comment #12)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/
> 0823bc83b06b7e5a5e732c368157c2a8a848e0c7
> Bug 1387956 (part 1) - Change |nsWindowSizes*| arguments to
> |nsWindowSizes&|. r=mccr8.
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/
> a8aa2a5c2870498d9c21b05fed673d786591f817
> Bug 1387956 (part 2) - Overhaul handling of nsWindowSizes. r=mccr8.

These patches were r+'d in bug 1388975 but I accidentally landed them under this bug, so I have dup'd that bug to this one.

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 15

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/0823bc83b06b
https://hg.mozilla.org/mozilla-central/rev/a8aa2a5c2870

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox57: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla57

Nicholas Nethercote [inactive]

Assignee

Comment 16

•

7 years ago

Still more stuff to land here.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Nicholas Nethercote [inactive]

Assignee

Comment 17

•

7 years ago

> > +ignored.) 'servo-style-structs/sundries' aggregates all the style structs that
> > +are smaller than 8 KiB.
> 
> Does that mean the style structs that collectively consume less than 8KiB?

Depends what you mean by "collectively". For each style struct kind (Position, Font, etc.) if the total for the window is less than 8 KiB, it'll get lumped into sundries. So sundries can exceed 8 KiB. It's an approach already in use in a number of places in memory reporting code.
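
A small sketch of that rule, assuming per-kind totals have already been summed for one window. FoldIntoSundries and kSundriesThresholdSketch are hypothetical names, not the reporter's actual code; only the 8 KiB threshold comes from the comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Threshold taken from the thread: kinds whose window-wide total is below
// 8 KiB are folded into "sundries".
constexpr std::size_t kSundriesThresholdSketch = 8 * 1024;

// Takes per-kind totals for one window and returns the entries that would be
// reported. The function and its types are hypothetical.
inline std::map<std::string, std::size_t> FoldIntoSundries(
    const std::map<std::string, std::size_t>& totals) {
  std::map<std::string, std::size_t> reported;
  std::size_t sundries = 0;
  for (const auto& entry : totals) {
    if (entry.second < kSundriesThresholdSketch) {
      sundries += entry.second;  // too small to list on its own
    } else {
      reported[entry.first] = entry.second;
    }
  }
  if (sundries > 0) {
    reported["sundries"] = sundries;
  }
  return reported;
}
```

With made-up totals of 6,000 B, 5,000 B and 4,000 B for three small kinds, each falls under the threshold on its own, yet the resulting sundries entry is 15,000 B, which is how sundries can exceed 8 KiB.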

Nicholas Nethercote [inactive]

Assignee

Comment 18

•

7 years ago

I have opened https://github.com/servo/servo/pull/18065 for the Servo side of these changes.

Comment hidden (mozreview-request)

Comment on attachment 8896177 [details]
Bug 1387956 - Overhaul ComputedValues measurement, and add style structs measurement. .

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/167444/diff/2-3/

Nicholas Nethercote [inactive]

Assignee

Comment 20

•

7 years ago

mozreview-review-reply

Comment on attachment 8896177 [details]
Bug 1387956 - Overhaul ComputedValues measurement, and add style structs measurement. .

https://reviewboard.mozilla.org/r/167444/#review172946

> Should this use the new macro?

No, because it lacks the |override|.

> It shouldn't matter. We drop mServoData in UnbindFromTree, so any non-in-tree element can't have any style data to measure.

Ok, I updated the comment.

> Seems like the code to do this is missing?

The "XXX" communicates that it's a todo.

Pulsebot

Comment 21

•

7 years ago

Pushed by nnethercote@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5f2f00d59868
Overhaul ComputedValues measurement, and add style structs measurement. r=bholley.

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 22

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/5f2f00d59868

Status: REOPENED → RESOLVED

Closed: 7 years ago → 7 years ago

Resolution: --- → FIXED

Joel Maher ( :jmaher ) (UTC -8)

Comment 23

•

7 years ago

and we see some memory improvements from this:
== Change summary for alert #8758 (as of August 14 2017 03:16 UTC) ==

Improvements:

  4%  Heap Unclassified summary linux64-stylo opt stylo                        78,743,881.56 -> 75,415,126.79
  4%  Heap Unclassified summary macosx64-stylo opt stylo                       97,335,159.26 -> 93,461,847.52
  4%  Heap Unclassified summary linux64-stylo-sequential opt stylo-sequential  78,116,199.50 -> 75,255,957.13

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8758

Nicholas Nethercote [inactive]

Assignee

Comment 24

•

7 years ago

(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #23)
> and we see some memory improvements from this:

That is surprising! This bug just added some code to measure Stylo memory usage. It should not have changed Stylo memory usage.

Bobby Holley (:bholley)

Reporter

Comment 25

•

7 years ago

(In reply to Nicholas Nethercote [:njn] from comment #24)
> (In reply to Joel Maher ( :jmaher) (UTC-5) from comment #23)
> > and we see some memory improvements from this:
> 
> That is surprising! This bug just added some code to measure Stylo memory
> usage. It should not have changed Stylo memory usage.

It seems like Talos is measuring heap-unclassified as a separate metric (where lower is better). So this is just saying that the patch reduced heap-unclassified, as expected, by measuring more things.
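
The arithmetic behind that can be sketched as follows. HeapUnclassified is a hypothetical helper, not Talos or about:memory code, and the byte counts in the note after it are invented; it only shows that adding a reporter lowers the remainder without changing the heap itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// heap-unclassified is what remains of the allocated heap after every
// reporter's measurement has been subtracted. Names here are hypothetical.
inline std::size_t HeapUnclassified(std::size_t heapAllocated,
                                    const std::vector<std::size_t>& reported) {
  std::size_t classified = 0;
  for (std::size_t bytes : reported) {
    classified += bytes;
  }
  assert(classified <= heapAllocated);
  return heapAllocated - classified;
}
```

For a heap of 1000 bytes with 600 already reported, the remainder is 400; once a new reporter accounts for another 150, it drops to 250 while the heap is still 1000 bytes.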

Nicholas Nethercote [inactive]

Assignee

Comment 26

•

7 years ago

Ah, yes, I misread comment 23. Makes sense now.

Nicholas Nethercote [inactive]

Assignee

Updated

•

7 years ago

Blocks: 1281964