Closed Bug 1421195 Opened 7 years ago Closed 7 years ago

Add telemetry probe for parallel Stylo restyles

Categories

(Core :: CSS Parsing and Computation, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: manishearth, Assigned: manishearth)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

We'd like to see how often restyles are parallel.

I'm thinking of doing this as a per-page ping, with each page figuring out the percentage of restyles that are parallel, and then accumulating these in a histogram.
So currently this patch tracks every restyle that happens per-page and record the fraction of them that were parallel (on page unload), provided that the page had enough (currently, 30) restyles.

We ignore the full-tree stylings, but I can add a separate probe for that as well.



This doesn't really help weight styles; for example a page may have a lot of sequential traversals for tiny inconsequential DOM mutations, and only one parallel traversal for a huge mutation that causes visible jank. I'm interested in ideas for what we could measure here that can help. Restyle times are one possible thing to measure. Ideally a 3D histogram for tree size vs restyle time vs occurrence rate (one for parallel, one for sequential) would be ideal but that's a lot of data and I don't think we support 3D histograms.

Will need sign-off from a telemetry peer to land, but also would like to just figure out what we should be measuring.
Comment on attachment 8932426 [details]
Bug 1421195 - Add telemetry probe for fraction of restyles that are parallel ;

https://reviewboard.mozilla.org/r/203474/#review208906

::: servo/components/style/gecko/data.rs:209
(Diff revision 1)
>          self.stylist.add_size_of(ops, sizes);
>      }
> +
> +    /// Record that a traversal happened for later collection as telemetry
> +    pub fn record_traversal(&self, was_parallel: bool) {
> +        self.total_traversal_count.fetch_add(1, Ordering::Release);



::: servo/components/style/gecko/data.rs:211
(Diff revision 3)
> +
> +    /// Record that a traversal happened for later collection as telemetry
> +    pub fn record_traversal(&self, was_parallel: bool) {
> +        self.total_traversal_count.fetch_add(1, Ordering::Release);
> +        if was_parallel {
> +            self.parallel_traversal_count.fetch_add(1, Ordering::Release);

I'm 100% sure relaxed is enough everywhere here.

::: servo/ports/geckolib/glue.rs:269
(Diff revision 3)
>      };
>  
>      let traversal = RecalcStyleOnly::new(shared_style_context);
> -    driver::traverse_dom(&traversal, token, thread_pool);
> +    let used_parallel = driver::traverse_dom(&traversal, token, thread_pool);
> +
> +    if traversal_flags.contains(TraversalFlags::ParallelTraversal) &&

The element will always have data _after_ the traversal, you need to check this before.

Also, this function can be called multiple times per each Servo_TraverseSubtree call, so I'm not sure recording twice is what you want.
Attachment #8932426 - Flags: review?(emilio)
The result of clicking around a bit on reddit, wikipedia, Twitter, and Github. You have to close the tab or navigate away for it to be recorded.

STYLO_PARALLEL_RESTYLE_FRACTION
6 samples, average = 7.5, sum = 45

 0 |  0  0%
 1 |#############  1  17%
 5 |#############  1  17%
 7 |#############  1  17%
 9 |#########################  2  33%
11 |#############  1  17%
13 |  0  0%
Comment on attachment 8932426 [details]
Bug 1421195 - Add telemetry probe for fraction of restyles that are parallel ;

https://reviewboard.mozilla.org/r/203474/#review209062

Code-wise looks ok. How much do we want this to stick around? I'd prefer if we don't do telemetry probes every restyle, since people complained when they were done on every scroll event, and this is hotter.
Attachment #8932426 - Flags: review?(emilio) → review+
> How much do we want this to stick around? 

At least one nightly? Probably not forever.

> I'd prefer if we don't do telemetry probes every restyle,

Yeah that's my concern too.

This specific patch does a probe every page unload, not restyle, so that's ok, but getting better restyle data does mean doing it per restyle.
Attachment #8932426 - Flags: review?(chutten)
Attachment #8932636 - Flags: review?(chutten)
Attachment #8932426 - Flags: review?(chutten)
Attachment #8932636 - Flags: review?(chutten)
ni? for data review.

This adds three probes. When a page unloads, they calculate:

 - The fraction of restyles that were parallel
 - The fraction of restyled elements that were styled in parallel
 - The fraction of restyle-traversed elements that were traversed in parallel

A restyle happens whenever the DOM is manipulated (elements moved around, ids/classes set) or if the CSS changes. In that process, we make a decision whether to do the restyle in parallel or not.

The first probe helps understand what the absolute fraction of restyles is, and the other two are weighted by the number of elements so that we get an idea of how much styling is done in parallel.
Flags: needinfo?(liuche)
Hi Manish, we're using a new data review process where requesting data-review involves copy-pasting this form into the bug and answering the questions. If all the questions are answered, the data steward will give an r+ on data review.

https://github.com/mozilla/data-review/blob/master/request.md
Flags: needinfo?(liuche) → needinfo?(manishearth)
Comment on attachment 8932426 [details]
Bug 1421195 - Add telemetry probe for fraction of restyles that are parallel ;

https://reviewboard.mozilla.org/r/203474/#review209952

::: toolkit/components/telemetry/Histograms.json:13878
(Diff revision 4)
> +    "alert_emails": ["manish@mozilla.com"],
> +    "expires_in_version": "60",
> +    "kind": "linear",
> +    "high": 100,
> +    "n_buckets": 50,
> +    "description": "Fraction of restyles on a single page that were parallel",

Description implies this number is less than 1.
> What questions will you answer with this data?

We want to figure out the effectiveness of Stylo's current parallelization heuristics, and in general get a better idea of what to optimize next.


> Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

Helps the servo project better evaluate its impact; and determine where to focus additional resources for further optimization.

> What alternative methods did you consider to answer these questions? Why were they not sufficient?

Manual profiling of specific pages. This doesn't tell us what gets used, ultimately.

(We will be doing this too)

> Can current instrumentation answer these questions?

No.

> List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories on the found on the Mozilla wiki.

They are all  Category 3 “Web activity data”, tracked on this bug.

The proposed measurements are, on a per-page basis,


 - The fraction of restyles that were parallel
 - The fraction of restyled elements that were styled in parallel
 - The fraction of restyle-traversed elements that were traversed in parallel

This is accumulated into a histogram for each.


> How long will this data be collected? Choose one of the following:

I want this data to be collected for 6 months initially (potentially renewable).

This will probably be collected for less than 6 months, but I don't have a specific date.


> What populations will you measure?

Nightly.

> Please provide a general description of how you will analyze this data.

Mostly seeing what kind of distribution we have. I expect a bimodal distribution for the second two, with small sites having zero parallelism and medium-large sites having a moderate number of parallelism. Knowing how much parallelism gets used is what we're going for.

> Where do you intend to share the results of your analysis?

On this bug, probably
Flags: needinfo?(manishearth) → needinfo?(liuche)
Comment on attachment 8932636 [details]
Bug 1421195 - Add weighted telemetry probes for parallel restyles ;

https://reviewboard.mozilla.org/r/203688/#review209562

::: servo/components/style/context.rs:129
(Diff revision 2)
> +    }
> +
> +    #[cfg(feature = "gecko")]
> +    /// On Gecko's nightly build?
> +    pub fn is_nightly(&self) -> bool {
> +        unsafe { bindings::Gecko_IsNightly() }

This should be at build-time I think.

Can this just be a constant generated with `ifdef` that bindgen can read? Alternatively, a cargo feature would do, but it's more annoying to add I guess.

::: layout/style/ServoBindings.cpp:150
(Diff revision 3)
> +  }
> +#endif
> +}
> +
> +bool
> +Gecko_IsNightly()

Can we just use a constant instead? Bindgen should be able to make this at compile time on the rust side too.

::: servo/components/style/driver.rs:138
(Diff revision 3)
> -        aggregate.finish(traversal, parallel, start_time.unwrap());
> +            aggregate.finish(traversal, parallel, start_time.unwrap());
> -        if aggregate.is_large_traversal() {
> +            if aggregate.is_large_traversal() {
> -            println!("{}", aggregate);
> +                println!("{}", aggregate);
> -        }
> +            }
> -    }
> +        }
> +        maybe_stats = Some(aggregate);

You really need to call `aggregate.finish()`, otherwise you're measuring only main-thread work.

::: servo/components/style/gecko/data.rs:122
(Diff revision 3)
> +    /// Number of events which were parallel
> +    pub parallel_count: AtomicUsize
> +}
> +
> +impl TraversalCount {
> +    fn record(&self, parallel: bool, count: u32) {

Given we only record with `1`, is the `count` argument really needed?

::: servo/components/style/gecko/data.rs:129
(Diff revision 3)
> +        if parallel {
> +            self.parallel_count.fetch_add(count as usize, Ordering::Relaxed);
> +        }
> +    }
> +
> +    fn get(&self) -> (u32, u32) {

I don't think this is particularly useful, but...
Attachment #8932636 - Flags: review?(emilio)
> Given we only record with `1`, is the `count` argument really needed?

No? The traversal/parallel record() calls use non-1 counts

> I don't think this is particularly useful, but...

We do it three times and it's clunky, enough to warrant a function
# Data Review Form (to be filled by Data Stewards)

> 1) Is there or will there be **documentation** that describes the schema for the ultimate data set available publicly, complete and accurate? (see [here](https://github.com/mozilla/activity-stream/blob/master/docs/v2-system-addon/data_dictionary.md), [here](https://github.com/mozilla-mobile/focus/wiki/Install-and-event-tracking-with-the-Adjust-SDK), and [here](https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/index.html) for examples).  Refer to the appendix for "documentation" if more detail about documentation standards is needed.

Yes, in Histograms.json

> 2) Is there a control mechanism that allows the user to turn the data collection on and off? (Note, for data collection not needed for security purposes, Mozilla provides such a control mechanism) Provide details as to the control mechanism available.

Yes, Histograms is controlled by Firefox Telemetry settings

> 3) If the request is for permanent data collection, is there someone who will monitor the data over time?**

Non-permanent, 6mo with potential renewal

4) Using the **[category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection)** on the Mozilla wiki, what collection type of data do the requested measurements fall under?  **

Type 1 (not Type 3, there is no personally identifiable user information)

5) Is the data collection request for default-on or default-off?

Nightly, default on

6) Does the instrumentation include the addition of **any *new* identifiers** (whether anonymous or otherwise; e.g., username, random IDs, etc.  See the appendix for more details)?

No

7) Is the data collection covered by the existing Firefox privacy notice? **If unsure: escalate to legal if:**

Yes

8) Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)**

Probe will expire in 60, can be renewed.

datareview+
Flags: needinfo?(liuche)
Comment on attachment 8932636 [details]
Bug 1421195 - Add weighted telemetry probes for parallel restyles ;

https://reviewboard.mozilla.org/r/203688/#review210366

::: servo/components/style/driver.rs:34
(Diff revision 4)
>  /// Returns true if the traversal was parallel
>  pub fn traverse_dom<E, D>(
>      traversal: &D,
>      token: PreTraverseToken<E>,
>      pool: Option<&rayon::ThreadPool>
> -) -> bool
> +) -> (bool, Option<TraversalStatistics>)

Can we give this a name, or at least update the doc comment?

::: servo/components/style/driver.rs:118
(Diff revision 4)
>              nodes_remaining_at_current_depth = discovered.len();
>          }
>      }
> -
> -    // Dump statistics to stdout if requested.
> -    if dump_stats {
> +    let mut maybe_stats = None;
> +    // Accumulate statistics
> +    if dump_stats || is_nightly {

How can this work?

```
let start_time = if dump_stats { Some(..) } else { None };

if dump_stats || is_nightly {
    finish(start_time.unwrap());
}
```

Though I had overlooked what `finish` does and it's not relevant to you, so you can skip calling it, I guess, as long as it's properly documented that the object may only be partially initialized?

Though that's really unfortunate...
Attachment #8932636 - Flags: review?(emilio)
Comment on attachment 8932636 [details]
Bug 1421195 - Add weighted telemetry probes for parallel restyles ;

https://reviewboard.mozilla.org/r/203688/#review210396

::: layout/style/ServoBindings.cpp:131
(Diff revision 4)
>  void
> -Gecko_RecordTraversalStatistics(uint32_t total, uint32_t parallel)
> +Gecko_RecordTraversalStatistics(uint32_t total, uint32_t parallel,
> +                                uint32_t total_t, uint32_t parallel_t,
> +                                uint32_t total_s, uint32_t parallel_s)
>  {
> +

Can we just not call this if not in nightly, and assert here?

::: commit-message-0d911:1
(Diff revision 5)
> +Bug 1421195 - Add weighted telemetry probes for parallel restyels ; r?emilio

s/restyels/restyles

::: layout/style/ServoBindings.cpp:129
(Diff revision 5)
>  
>  
>  void
> -Gecko_RecordTraversalStatistics(uint32_t total, uint32_t parallel)
> +Gecko_RecordTraversalStatistics(uint32_t total, uint32_t parallel,
> +                                uint32_t total_t, uint32_t parallel_t,
> +                                uint32_t total_s, uint32_t parallel_s)

Let's just call it on nightly and assert if you want?

::: servo/components/style/driver.rs:36
(Diff revision 5)
> +/// all of its fields will be initialized since we don't call finish().
>  pub fn traverse_dom<E, D>(
>      traversal: &D,
>      token: PreTraverseToken<E>,
>      pool: Option<&rayon::ThreadPool>
> -) -> bool
> +) -> (bool, Option<TraversalStatistics>)

I still think a struct would make sense here.
Attachment #8932636 - Flags: review?(emilio) → review+
Component: Telemetry → DOM: CSS Object Model
Product: Toolkit → Core
(I am assuming this wanted to be closer to other Stylo bugs.)
Component: DOM: CSS Object Model → CSS Parsing and Computation
Pushed by manishearth@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/b3e6c0395ec7
Add telemetry probe for fraction of restyles that are parallel ; r=emilio
https://hg.mozilla.org/integration/autoland/rev/78940acd651b
Add weighted telemetry probes for parallel restyles ; r=emilio
Manish, don't you need data-r? for landing this?
Flags: needinfo?(manishearth)
Oh, right, comment 19, but should've been in the commit message, probably...
Flags: needinfo?(manishearth)
https://hg.mozilla.org/mozilla-central/rev/b3e6c0395ec7
https://hg.mozilla.org/mozilla-central/rev/78940acd651b
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: