Closed Bug 1349808 Opened 7 years ago Closed 7 years ago

Add telemetry for cases when we can't run async animations due to layer size being too large

Categories: Core :: DOM: Animation, enhancement, P1
Status: RESOLVED FIXED in mozilla55
Tracking Status: firefox55 --- fixed
People: (Reporter: birtles, Assigned: boris)
Attachments: (1 file)

See bug 1100357 comment 23. We would like to collect data to see how much difference that bug would make.

This probably amounts to recording instances of AnimationPerformanceWarning::ContentTooLarge but presumably limited to one instance per page load? I guess it would be good to know the actual layer size too?

I presume this is useful data for Quantum Flow and so getting this probe in should be high priority.
Assignee: nobody → boris.chiou
Looks like we need to declare a histogram for ContentTooLarge, and add a C++ probe to record when ContentTooLarge happens.
Status: NEW → ASSIGNED
A histogram stores a one-dimensional value, so there are a few possible ways to record this:

a) count: Just record the number of ContentTooLarge warnings.
b) exponential:
  1) Record the frame size. One possibility is to store |width * height|, but I'm not sure that value is meaningful.
  2) Two histograms: one for width, another for height. However, I don't think this makes sense because
     width and height should be considered together.
c) categorical: two categories, one for width and another for height. (Just like two counters, but put together
   in the same histogram.)

The simplest approach is (a), and I can implement it first. However, another question: how do we know we're still in the same page load, so that we don't record a duplicate count?
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review125794

BTW, I have no idea how to test this.
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review126136

Clearing review request for now since it's not clear to me that a count is necessarily the most useful thing here. I'd like to check with Botond what he thinks would be best here. Also, I (now) think we probably should record this once per effect instead of once per page.

::: dom/animation/EffectCompositor.cpp:960
(Diff revision 1)
> +  static nsIDocument* prevDoc = nullptr;
> +  nsIDocument* currentDoc = aFrame->GetContent()
> +                            ? aFrame->GetContent()->GetOwnerDocument()
> +                            : nullptr;
> +  if (aWarning.mType == AnimationPerformanceWarning::Type::ContentTooLarge &&
> +      prevDoc != currentDoc) {
> +    Telemetry::AutoCounter<Telemetry::ASYNC_ANIMATION_CONTENT_TOO_LARGE>
> +      contentTooLargeCounter;
> +    ++contentTooLargeCounter;
> +
> +    prevDoc = currentDoc;

How is this prevDoc / currentDoc supposed to work? If we sample two different documents alternately, won't the count jump dramatically?

Perhaps we should allow reporting this once per effect.

::: toolkit/components/telemetry/Histograms.json:142
(Diff revision 1)
>      "bug_numbers": [1331139],
>      "description": "For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the allow list returned a match (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"
>    },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE": {
> +    "alert_emails": ["birtles@mozilla.com"],
> +    "expires_in_version": "never",

I think we plan to fix this in the next few versions, so perhaps version 58 is enough here (i.e. hopefully we'll fix this for 57, but just in case that slips and in order to verify that it has worked keep it around for another version).

::: toolkit/components/telemetry/Histograms.json:143
(Diff revision 1)
>      "description": "For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the allow list returned a match (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"
>    },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE": {
> +    "alert_emails": ["birtles@mozilla.com"],
> +    "expires_in_version": "never",
> +    "kind": "count",

I wonder if instead of a count, it would be more useful to record the ratio by which the layer was greater than the viewport? e.g. if the layer was 10% larger than the viewport, store either '110' or '1.1' whichever is more natural for telemetry.

Then we'd get a distribution where we can see by how much we're going beyond the viewport which is probably what we want here?

One issue is there are two limits: the viewport size plus an absolute limit. I guess we could have cases where the layer was made inactive because it was larger than the absolute limit but still less than the relative limit (viewport size). In that case we'd get a ratio of, e.g., 0.8. So that argues for making the low value correspond not to 100% but to 0%.
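To make the two-limit behaviour concrete, here's a minimal Python sketch (the helper name is hypothetical, and the areas are illustrative):

```python
def content_too_large_percentage(frame_area, limit_area):
    """Frame area as an integer percentage of the (relative-limit)
    viewport area. Values below 100 can occur when the layer was
    rejected by the absolute pixel limit rather than the relative one."""
    return int(frame_area / limit_area * 100)

# Layer area 10% over the relative limit -> 110
assert content_too_large_percentage(1100 * 1000, 1000 * 1000) == 110
# Rejected by the absolute limit while under the relative one -> e.g. 80
assert content_too_large_percentage(800 * 1000, 1000 * 1000) == 80
```

This is why starting the histogram range at 0% rather than 100% keeps the sub-100 samples visible.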

::: toolkit/components/telemetry/Histograms.json:144
(Diff revision 1)
>    },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE": {
> +    "alert_emails": ["birtles@mozilla.com"],
> +    "expires_in_version": "never",
> +    "kind": "count",
> +    "bug_numbers": [1349808],

Let's add 1100357 to this since that's the bug where we expect this to be fixed.
Attachment #8850870 - Flags: review?(bbirtles)
Botond, what data do you think would be most useful to collect here? I'm wondering if recording the percentage by which the layer exceeds the viewport would be most useful. That would let us know how often we're just missing the fast path (e.g. by being 5% over the limit) and what sort of layer sizes we'll get if we force these layers to be active (e.g. how often will we get a layer 10 times the viewport?). What do you think?
Flags: needinfo?(botond)
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review126136

> How is this prevDoc / currentDoc supposed to work? If we sample two different documents alternately, won't the count jump dramatically?
> 
> Perhaps we should allow reporting this once per effect.

OK, the doc check is not correct, so I will change this to record once per effect.

> I think we plan to fix this in the next few versions, so perhaps version 58 is enough here (i.e. hopefully we'll fix this for 57, but just in case that slips and in order to verify that it has worked keep it around for another version).

I see. I will change this to 58.
Attachment #8850870 - Flags: review?(benjamin)
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review126136

> I wonder if instead of a count, it would be more useful to record the ratio by which the layer was greater than the viewport? e.g. if the layer was 10% larger than the viewport, store either '110' or '1.1' whichever is more natural for telemetry.
> 
> Then we'd get a distribution where we can see by how much we're going beyond the viewport which is probably what we want here?
> 
> One issue is there are two limits: the viewport size plus an absolute limit. I guess we could have cases where the layer was made inactive because it was larger than the absolute limit but still less than the relative limit (viewport size). In that case we'd get a ratio of, e.g., 0.8. So that argues for making the low value correspond not to 100% but to 0%.

I see. I am waiting for Botond's comment. BTW, I will adopt the integer version (exponential kind) to represent the ratio since it seems we always use integers as labels in other exponential histograms.
(In reply to Brian Birtles (:birtles) from comment #6)
> Botond, what data do you think would be most useful to collect here? I'm
> wondering if recording the percentage by which the layer exceeds the
> viewport would be most useful. That would let us know how often we're just
> missing the fast path (e.g. by being 5% over the limit) and what sort of
> layer sizes we'll get if we force these layers to be active (e.g. how often
> will we get a layer 10 times the viewport?). What do you think?

I think ideally we would collect both the percentage and the absolute value. Since we have limits on both, this data could help us decide how to set those limits.
Flags: needinfo?(botond)
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review126602

Thanks for doing this. Can you make the following changes and then ask Botond to review since he understands what data we need much better than I do.

::: dom/animation/KeyframeEffect.cpp:133
(Diff revision 2)
> +  // Reset this because we change the target, so we may need to re-record
> +  // anything to Telemetry.

I wonder if we need this? All it will mean is that if we have an animation that targets one oversized frame and then is retargeted to try to animate another oversized frame we will record both occurrences instead of just the first one. Maybe that's worth doing. If so, let's make the comment:

  // If the new target frame is also oversized we should probably record that
  // too so we have a more complete picture of the type of frame sizes we
  // encounter, hence we reset the telemetry flag here.

::: dom/animation/KeyframeEffectReadOnly.h:453
(Diff revision 2)
> +  // We record data to Telemetry per effect, so use this flag to trace it.
> +  bool mIsRecordedToTelemetry = false;

Let's call this mRecordedContentTooLarge and change the comment to something like:

  // We only want to record telemetry data for "ContentTooLarge" warnings once
  // per effect:target pair so we use this member to record if we have already
  // reported a "ContentTooLarge" warning for the current target.

::: dom/animation/KeyframeEffectReadOnly.cpp:1595
(Diff revision 2)
> +    MOZ_ASSERT(aWarning.mParams,
> +               "Should have warning parameters for ContentTooLarge");
> +    const nsTArray<int32_t>& params = aWarning.mParams.ref();
> +    MOZ_ASSERT(params.Length() >= 4,
> +               "Warning parameters should contain frame size and viewport");

Nit: This might be a little simpler/clearer as:

  // ContentTooLarge stores: frameSize (w x h),
  //                         relativeLimit (w x h), i.e. =~ viewport size
  //                         absoluteLimit (w x h)
  MOZ_ASSERT(aWarning.mParams && aWarning.mParams->Length() >= 4,
             "ContentToLarge warning should have at least 4 parameters");
  const nsTArray<int32_t>& params = aWarning.mParams.ref();

::: dom/animation/KeyframeEffectReadOnly.cpp:1600
(Diff revision 2)
> +    uint32_t frameSize = params[0] * params[1];
> +    uint32_t viewport = params[2] * params[3];
> +    uint32_t ratio = (uint32_t)((double)frameSize / (double)viewport * 100.0);

So |params| are all int32_t. As I understand it, you're storing the result in a uint32_t in order to benefit from its greater range. I believe you want to make sure the calculation takes place in unsigned int space, however.

If either argument is unsigned, the other will be implicitly converted to unsigned. So we could write this (using function-style casts) as:

  uint32_t frameArea       = uint32_t(params[0]) * params[1];
  uint32_t viewportArea    = uint32_t(params[2]) * params[3];
  uint32_t frameToViewport = double(frameArea) / viewportArea * 100.0;

(In the last line, / and * have the same operator precedence and the associativity of the operators means the / will be evaluated first. As a result viewportArea will be converted to a double, then the result of that will be a double, so we will do the multiplication in double space, which is not as good as unsigned integer arithmetic, but I don't think it matters here. Then the result will be implicitly converted to an unsigned integer by truncation.)

::: toolkit/components/telemetry/Histograms.json:141
(Diff revision 2)
>      "n_values": 4,
>      "bug_numbers": [1331139],
>      "description": "For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the allow list returned a match (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"
>    },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE_FRAME_SIZE": {
> +    "alert_emails": ["birtles@mozilla.com"],

bbirtles@mozilla.com

::: toolkit/components/telemetry/Histograms.json:147
(Diff revision 2)
> +    "expires_in_version": "58",
> +    "kind": "exponential",
> +    "high": 40000000,
> +    "n_buckets": 50,
> +    "bug_numbers": [1100357, 1349808],
> +    "description": "If the layer size is too large, we can't run async animations. This histogram stores the actual frame size (i.e. the number of pixels)."

Botond can probably make these descriptions more precise, but here's my attempt:

"The number of pixels of the frame for each time we encountered a layer that was so large we decided not to run its animations on the compositor."

::: toolkit/components/telemetry/Histograms.json:149
(Diff revision 2)
> +    "high": 40000000,
> +    "n_buckets": 50,
> +    "bug_numbers": [1100357, 1349808],
> +    "description": "If the layer size is too large, we can't run async animations. This histogram stores the actual frame size (i.e. the number of pixels)."
> +  },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE_RATIO": {

(I wonder if RATIO should be PERCENTAGE? I'm not really sure. Ratio just sounds like it should be a fractional value.)

::: toolkit/components/telemetry/Histograms.json:150
(Diff revision 2)
> +    "n_buckets": 50,
> +    "bug_numbers": [1100357, 1349808],
> +    "description": "If the layer size is too large, we can't run async animations. This histogram stores the actual frame size (i.e. the number of pixels)."
> +  },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE_RATIO": {
> +    "alert_emails": ["birtles@mozilla.com"],

bbirtles@mozilla.com

::: toolkit/components/telemetry/Histograms.json:152
(Diff revision 2)
> +    "kind": "exponential",
> +    "high": 10000,
> +    "n_buckets": 50,

Given these numbers, how large will the bucket be just around the 100 mark? I hope we don't end up with a bucket that goes from, for example, 95 to 170. That would probably not give us the information we're looking for.

I guess answering this question requires looking up the code to see exactly how it divides up the buckets. Would you mind doing that to just confirm we have sufficiently fine-grained buckets near the 100 mark?

::: toolkit/components/telemetry/Histograms.json:156
(Diff revision 2)
> +    "expires_in_version": "58",
> +    "kind": "exponential",
> +    "high": 10000,
> +    "n_buckets": 50,
> +    "bug_numbers": [1100357, 1349808],
> +    "description": "If the layer size is too large, we can't run async animations. This histogram stores the ratio of the actual frame size to the viewport size."

The ratio of the frame size (in total number of pixels) to the root reference frame (~viewport) size for each time we encountered a layer that was so large we decided not to run its animations on the compositor, expressed as a percentage (i.e. 110 = frame area was 10% larger than the viewport).
Attachment #8850870 - Flags: review?(bbirtles) → review+
(In reply to Brian Birtles (:birtles) from comment #11)
> ::: dom/animation/KeyframeEffect.cpp:133
> (Diff revision 2)
> > +  // Reset this because we change the target, so we may need to re-record
> > +  // anything to Telemetry.
> 
> I wonder if we need this? All it will mean is that if we have an animation
> that targets one oversized frame and then is retargeted to try to animate
> another oversized frame we will record both occurrences instead of just the
> first one. Maybe that's worth doing. If so, let's make the comment:

I am guessing we should reset the flag in UpdateProperties(). Say, when changing the width style of an element that has a transform animation.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> (In reply to Brian Birtles (:birtles) from comment #11)
> > ::: dom/animation/KeyframeEffect.cpp:133
> > (Diff revision 2)
> > > +  // Reset this because we change the target, so we may need to re-record
> > > +  // anything to Telemetry.
> > 
> > I wonder if we need this? All it will mean is that if we have an animation
> > that targets one oversized frame and then is retargeted to try to animate
> > another oversized frame we will record both occurrences instead of just the
> > first one. Maybe that's worth doing. If so, let's make the comment:
> 
> I am guessing we should reset the flag in UpdateProperties().  Say, changing
> width style of an element which has a transform animation.

I don't understand why we would want to do that. The only thing resetting it in UpdateProperties would do is mean we report more than once per effect. I think once per effect is probably enough, but once per effect:target pair (like this patch does) is also probably reasonable.
Hmm, interesting. In my understanding, one of the purposes of this telemetry is to collect size data for transform animations that we don't run on the compositor. For example:

div.style.width = '2000px';  // the div has a transform animation.
// record one telemetry entry
// later..

div.style.width = '3000px'; // do we omit this data?
I am not really sure how often such cases happen. So I won't insist that we should collect them.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #14)
> Hmm interesting.  In my understandings one of the purpose of this telemetry
> is to collect size data of transform animations that we don't run on the
> compositor.  For example:
> 
> div.style.width = '2000px';  // the div has a transform animation.
> // record one telemetry entry
> // later..
> 
> div.style.width = '3000px'; // do we omit this data?

Yes, I think it's fine to omit the change there. As I understand it, more than anything we want to gauge how often users hit this problem. Then, after that, we'd like to understand the conditions where they are hitting it (e.g. is it content 10 times bigger than the viewport or layers that are only fractionally larger? Are we hitting it more on smaller screens? etc.).

For that purpose I think it's enough to record the frame size/ratio the first time we encounter the problem (if we record multiple events for each effect it becomes harder to gauge how frequently this is encountered, and the representative frame sizes are probably less useful too).
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review126602

> I wonder if we need this? All it will mean is that if we have an animation that targets one oversized frame and then is retargeted to try to animate another oversized frame we will record both occurrences instead of just the first one. Maybe that's worth doing. If so, let's make the comment:
> 
>   // If the new target frame is also oversized we should probably record that
>   // too so we have a more complete picture of the type of frame sizes we
>   // encounter, hence we reset the telemetry flag here.

Yes, that's what I was thinking. This covers the special case where someone uses the Web Animations API to retarget an animation and the oversized frame happens again on a different target.

> (I wonder if RATIO should be PERCENTAGE? I'm not really sure. Ratio just sounds like it should be a fractional value.)

Let's change it to PERCENTAGE, and let Botond check this.

> Given these numbers, how large will the bucket be just around the 100 mark? I hope we don't end up with a bucket that goes from, for example, 95 to 170. That would probably not give us the information we're looking for.
> 
> I guess answering this question requires looking up the code to see exactly how it divides up the buckets. Would you mind doing that to just confirm we have sufficiently fine-grained buckets near the 100 mark?

The exponential histogram uses math.log() to compute the range of each bucket [1]. In other words, an exponential histogram fits this requirement since it has "narrow" buckets near the minimum value and significantly "wider" buckets near the maximum value. Here are some examples:

1) low: 1, high: 10000, n_buckets: 100:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 23, 25, 27, 29, 31, 34, 37, 40, 43, 46, 50, 54, 58, 63, 68, 74, 80, 86, 93, 101, 109, 118, 128, 138, 149, 161, 174, 188, 203, 219, 237, 256, 277, 299, 323, 349, 377, 408, 441, 477, 516, 558, 603, 652, 705, 762, 824, 891, 963, 1041, 1125, 1216, 1315, 1422, 1537, 1662, 1797, 1943, 2101, 2271, 2455, 2654, 2869, 3102, 3354, 3626, 3920, 4238, 4582, 4954, 5356, 5791, 6261, 6769, 7318, 7912, 8554, 9249, 10000]

2) low: 80, high: 10000, n_buckets: 100:
[0, 80, 84, 88, 92, 97, 102, 107, 112, 118, 124, 130, 137, 144, 151, 159, 167, 175, 184, 193, 203, 213, 224, 235, 247, 259, 272, 286, 300, 315, 331, 348, 366, 385, 404, 424, 445, 468, 492, 517, 543, 570, 599, 629, 661, 694, 729, 766, 805, 846, 889, 934, 981, 1031, 1083, 1138, 1196, 1257, 1321, 1388, 1458, 1532, 1610, 1691, 1777, 1867, 1961, 2060, 2164, 2274, 2389, 2510, 2637, 2770, 2910, 3057, 3212, 3375, 3546, 3725, 3914, 4112, 4320, 4539, 4769, 5010, 5264, 5530, 5810, 6104, 6413, 6738, 7079, 7437, 7813, 8208, 8623, 9060, 9518, 10000]

3) low: 100, high: 1000, n_buckets: 100:
[0, 100, 102, 104, 106, 109, 112, 115, 118, 121, 124, 127, 130, 133, 136, 139, 142, 145, 148, 152, 156, 160, 164, 168, 172, 176, 180, 184, 188, 192, 197, 202, 207, 212, 217, 222, 227, 232, 238, 244, 250, 256, 262, 268, 274, 281, 288, 295, 302, 309, 316, 324, 332, 340, 348, 356, 364, 373, 382, 391, 400, 410, 420, 430, 440, 450, 461, 472, 483, 494, 506, 518, 530, 543, 556, 569, 583, 597, 611, 626, 641, 656, 672, 688, 704, 721, 738, 755, 773, 791, 810, 829, 849, 869, 890, 911, 932, 954, 977, 1000]

So if we want fine-grained buckets near the 100 mark, the best approach is to set the low value to 80~100 and keep the high value from being too high, e.g. 500. Any recorded values greater than this maximum will be counted in the last bucket, and all histograms automatically get a bucket labeled "0" for counting values below the "low" value.

For the frame size histogram, we should do the same thing: set a reasonable low value and a high value.

[1] http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/toolkit/components/telemetry/histogram_tools.py#59-75


What do you think about the low and high values for both histograms? How about this?
1. ASYNC_ANIMATION_CONTENT_TOO_LARGE_FRAME_SIZE: low: 1 (default), high: 40000000, n_buckets: 100. I think this could be linear?
2. ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE: low: 100, high: 1000, n_buckets: 100
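For reference, the log-spaced bucketing described above can be sketched in Python along the lines of the algorithm in histogram_tools.py (a reimplementation from the description at [1], so treat the details as approximate):

```python
import math

def exponential_buckets(low, high, n_buckets):
    """Log-spaced bucket boundaries: bucket 0 catches values below
    |low|, and the remaining buckets grow (by at least 1) toward
    |high|, so buckets are narrow near |low| and wide near |high|."""
    buckets = [0] * n_buckets
    buckets[1] = low
    current = low
    log_max = math.log(high)
    for i in range(2, n_buckets):
        log_current = math.log(current)
        # Spread the remaining log-range evenly over the remaining buckets.
        log_next = log_current + (log_max - log_current) / (n_buckets - i)
        next_value = int(math.floor(math.exp(log_next) + 0.5))
        # Guarantee strictly increasing bucket boundaries.
        current = next_value if next_value > current else current + 1
        buckets[i] = current
    return buckets

# Example (3) above: low 100, high 1000, 100 buckets ->
# [0, 100, 102, 104, 106, 109, ...] with the last boundary at 1000.
buckets = exponential_buckets(100, 1000, 100)
```

This reproduces the narrow buckets near the low end shown in the examples above.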
Attachment #8850870 - Flags: review?(benjamin)
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review126784

data-r=me - you can use https://telemetry.mozilla.org/histogram-simulator/#low=1&high=500&n_buckets=20&kind=linear&generate=normal to try out various high/low linear/exponential combinations.
Attachment #8850870 - Flags: review?(benjamin) → review+
(In reply to Benjamin Smedberg [:bsmedberg] from comment #19)
> Comment on attachment 8850870 [details]
> Bug 1349808 - Add telemetry for cases when we can't run async animations due
> to layer size being too large.
> 
> https://reviewboard.mozilla.org/r/123390/#review126784
> 
> data-r=me - you can use
> https://telemetry.mozilla.org/histogram-simulator/
> #low=1&high=500&n_buckets=20&kind=linear&generate=normal to try out various
> high/low linear/exponential combinations.

Great. Thanks for the information.
(In reply to Boris Chiou [:boris] from comment #17)
> What do you think about the low and high value for both histograms? How
> about this?
> 1. ASYNC_ANIMATION_CONTENT_TOO_LARGE_FRAME_SIZE: low: 1 (default), high:
> 40000000, n_buckets: 100. I think this could be linear?
> 2. ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE: low: 100, high: 1000,
> n_buckets: 100

I agree, that tool is great!

For (1), I wonder if |high| should be a little higher? In (2) we set |high| to 1000, i.e. 10 times the viewport. If you have a 4K display with resolution 3840 x 2160, that gives you 8,294,400 pixels. If the browser is full-screen and we have a layer 10 times the viewport, that would give us a frame size of ~82,944,000. So perhaps we should set |high| to 80,000,000 instead of 40,000,000?

I think we want to stick with exponential too. If we use linear and |high| is 80,000,000 then I think the first bucket is from 1 to 816,330 which would be a frame size of about 900x900. I think that's a pretty large bucket? Exponential seems to give more useful buckets to me.

For (2), perhaps n_buckets: 50 is enough?
(In reply to Brian Birtles (:birtles) from comment #21)
> For (1), I wonder if |high| should be a little higher? In (2) we set |high|
> to 1000, i.e. 10 times the viewport. If you have a 4K display with
> resolution 3840 x 2160, that gives you 8,294,400 pixels. If the browser is
> full-screen and we have a layer 10 times the viewport, that would give us a
> frame size of ~82,944,000. So perhaps we should set |high| to 80,000,000
> instead of 40,000,000?
> 
> I think we want to stick with exponential too. If we use linear and |high|
> is 80,000,000 then I think the first bucket is from 1 to 816,330 which would
> be a frame size of about 900x900. I think that's a pretty large bucket?
> Exponential seems to give more useful buckets to me.
> 
> For (2), perhaps n_buckets: 50 is enough?

Agreed. Let's set them as:
1. ASYNC_ANIMATION_CONTENT_TOO_LARGE_FRAME_SIZE: low: 1 (default), high: 80,000,000, n_buckets: 100. exponential
2. ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE: low: 100, high: 1000, n_buckets: 50. exponential
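Putting those agreed values together, the Histograms.json entries at this point would look roughly like the following (descriptions taken from the earlier review comments; the expiry version is revisited later in the review):

```json
"ASYNC_ANIMATION_CONTENT_TOO_LARGE_FRAME_SIZE": {
  "alert_emails": ["bbirtles@mozilla.com"],
  "expires_in_version": "58",
  "kind": "exponential",
  "high": 80000000,
  "n_buckets": 100,
  "bug_numbers": [1100357, 1349808],
  "description": "The number of pixels of the frame for each time we encountered a layer that was so large we decided not to run its animations on the compositor."
},
"ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE": {
  "alert_emails": ["bbirtles@mozilla.com"],
  "expires_in_version": "58",
  "kind": "exponential",
  "low": 100,
  "high": 1000,
  "n_buckets": 50,
  "bug_numbers": [1100357, 1349808],
  "description": "The ratio of the frame size (in total number of pixels) to the root reference frame (~viewport) size for each time we encountered a layer that was so large we decided not to run its animations on the compositor, expressed as a percentage (i.e. 110 = frame area was 10% larger than the viewport)."
}
```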
Attachment #8850870 - Flags: review?(botond)
MozReview doesn't seem to accept the data-review keyword, so I will push this to inbound after review.
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review128214

I have a few thoughts about the values we're collecting here:

  - The thing we're computing as the "viewport size", from the third
    and fourth parameters of the animation warning, is actually the
    viewport size multiplied by the value of the "relative limit"
    pref, currently 1.125. This in and of itself is fine (if we're
    interested in the percentages as fractions of the real viewport
    size, we can divide them by 1.125), but if we fiddle with the
    value of this pref, the recorded data will fluctuate.
    
  - As soon as we turn on partial pre-rendering, we will stop
    collecting this data, because the warning is no longer fired in
    that case, even if the content is too large.
    
Brian, do you think these will be a problem? If so, we could address them with a bit of extra code by recording the data in nsDisplayTransform::ShouldPrerenderTransformedContent() instead, where we have the freedom to record the real viewport size, and to record data even if we don't fire the warning, if we want.

Otherwise, I'm happy to land this as-is (with just the nits below addressed).

::: dom/animation/KeyframeEffectReadOnly.cpp:1599
(Diff revision 4)
> +      !mRecordedContentTooLarge) {
> +    // ContentTooLarge stores: frameSize (w x h),
> +    //                         relativeLimit (w x h), i.e. =~ viewport size
> +    //                         absoluteLimit (w x h)
> +    MOZ_ASSERT(aWarning.mParams && aWarning.mParams->Length() >= 4,
> +               "ContentToLarge warning should have at least 4 parameters");

typo: "To" -> "Too"

::: dom/animation/KeyframeEffectReadOnly.cpp:1602
(Diff revision 4)
> +    //                         absoluteLimit (w x h)
> +    MOZ_ASSERT(aWarning.mParams && aWarning.mParams->Length() >= 4,
> +               "ContentToLarge warning should have at least 4 parameters");
> +    const nsTArray<int32_t>& params = aWarning.mParams.ref();
> +    uint32_t frameArea = uint32_t(params[0]) * params[1];
> +    uint32_t viewportArea = uint32_t(params[2]) * params[3];

I would call these variables 'frameSize' and 'viewportSize'.

::: toolkit/components/telemetry/Histograms.json:142
(Diff revision 4)
>      "bug_numbers": [1331139],
>      "description": "For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the allow list returned a match (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"
>    },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE_FRAME_SIZE": {
> +    "alert_emails": ["bbirtles@mozilla.com"],
> +    "expires_in_version": "58",

This is the first version in which the data will _not_ be collected.

Based on Brian's comment in comment 5:

> I think we plan to fix this in the next few
> versions, so perhaps version 58 is enough here
> (i.e. hopefully we'll fix this for 57, but just
> in case that slips and in order to verify that
> it has worked keep it around for another
> version).

I think we do still want to collect the data in 58, and therefore this should be 59?

::: toolkit/components/telemetry/Histograms.json:151
(Diff revision 4)
> +    "bug_numbers": [1100357, 1349808],
> +    "description": "The number of pixels of the frame for each time we encountered a layer that was so large we decided not to run its animations on the compositor."
> +  },
> +  "ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE": {
> +    "alert_emails": ["bbirtles@mozilla.com"],
> +    "expires_in_version": "58",

Likewise.
Attachment #8850870 - Flags: review?(botond) → review+
(In reply to Botond Ballo [:botond] from comment #25)
> I have a few thoughts about the values we're collecting here:
> 
>   - The thing we're computing as the "viewport size", from the third
>     and fourth parameters of the animation warning, is actually the
>     viewport size multiplied by the value of the "relative limit"
>     pref, currently 1.125. This in and of itself is fine (if we're
>     interested in the percentages a fractions of the real viewport
>     size, we can divide them by 1.125), but if we fiddle with the
>     value of this pref, the recorded data will fluctuate.
>     
>   - As soon as we turn on partial pre-rendering, we will stop
>     collecting this data, because the warning is no longer fired in
>     that case, even if the content is too large.
>     
> Brian, do you think these will be a problem? If so, we could address them
> with a bit of extra code by recording the data in
> nsDisplayTransform::ShouldPrerenderTransformedContent() instead, where we
> have the freedom to record the real viewport size, and to record data even
> if we don't fire the warning, if we want.
> 
> Otherwise, I'm happy to land this as-is (with just the nits below addressed).

ni?birtles for the above questions.
Flags: needinfo?(bbirtles)
(In reply to Botond Ballo [:botond] from comment #26)
> (In reply to Botond Ballo [:botond] from comment #25)
> > I have a few thoughts about the values we're collecting here:
> > 
> >   - The thing we're computing as the "viewport size", from the third
> >     and fourth parameters of the animation warning, is actually the
> >     viewport size multipled by the values of the "relative limit"
> >     pref, currently 1.125. This in and of itself is fine (if we're
> >     interested in the percentages as fractions of the real viewport
> >     size, we can divide them by 1.125), but if we fiddle with the
> >     value of this pref, the recorded data will fluctuate.

Ah, interesting. I didn't know that. So, in effect, we're saying that (by default) content up to ~27% larger than the viewport (i.e. ~127% of its area) is ok.

Perhaps we should divide the viewportSize by layout.animation.prerender.viewport-ratio-limit-{x,y}. That way we'll typically only get telemetry values >= 127.

- If a user has tweaked that pref to something less than 1.125, we'll see telemetry values of, say, 105, which we can
  probably ignore as the user tweaked their system in a way that's suboptimal.
- If a user has tweaked that pref to something greater than 1.125, we simply won't see any values until the new threshold
  is passed (telling us that we exceed even larger values) but leaving us in the dark about frame sizes that would have
  exceeded the limit if the user hadn't tweaked the setting.
- Most importantly, though, we'll have easy-to-read numbers so that, for example, if we get a lot of samples in the 127
  to 133 bucket we'll know that 1.125 is probably not enough for a lot of content. On the other hand, if we get a lot of
  samples outside that range we'll know that something more aggressive is needed (e.g. partial pre-rendering).

> >   - As soon as we turn on partial pre-rendering, we will stop
> >     collecting this data, because the warning is no longer fired in
> >     that case, even if the content is too large.

I think that's fine. It will help us confirm that pre-rendering is working as expected (i.e. eliminating the condition that causes us to stop running animations on the compositor).

> > Brian, do you think these will be a problem? If so, we could address them
> > with a bit of extra code by recording the data in
> > nsDisplayTransform::ShouldPrerenderTransformedContent() instead, where we
> > have the freedom to record the real viewport size, and to record data even
> > if we don't fire the warning, if we want.

For the first point, could we just calculate the viewport in KeyframeEffectReadOnly::SetPerformanceWarning as:

  uint32_t viewportWidth  = params[2] / gfxPrefs::AnimationPrerenderViewportRatioLimitX();
  uint32_t viewportHeight = params[3] / gfxPrefs::AnimationPrerenderViewportRatioLimitY();
  uint32_t viewportSize   = viewportWidth * viewportHeight;

For the second point, I think the purpose of this data collection is just to work out when users are getting suboptimal animation performance so we can fix it. If they're not getting the warning then presumably they're getting async animations (or something else is getting in the way), so we don't need to record the data.

However, if the graphics team needs this data for some other optimizations then it might be better to do it in nsDisplayTransform::ShouldPrerenderTransformedContent.

My suggestion is just to tweak the calculation in KeyframeEffectReadOnly::SetPerformanceWarning. What do you think?
Flags: needinfo?(bbirtles) → needinfo?(botond)
(In reply to Brian Birtles (:birtles) from comment #27)
> For the first point, could we just calculate the viewport in
> KeyframeEffectReadOnly::SetPerformanceWarning as:
> 
>   uint32_t viewportWidth  = params[2] /
> gfxPrefs::AnimationPrerenderViewportRatioLimitX();
>   uint32_t viewportHeight = params[3] /
> gfxPrefs::AnimationPrerenderViewportRatioLimitY();
>   uint32_t viewportSize   = viewportWidth * viewportHeight;

Yeah, that would work.

> For the second point, I think the purpose of this data collection is just to
> work out when users are getting suboptimal animation performance so we can
> fix it. If they're not getting the warning the presumably they're getting
> async animation (or something else is getting in the way) so we don't need
> to record the data.

That sounds reasonable to me.
Flags: needinfo?(botond)
Great s Botond!

Boris, would you mind updating the patch to include the change suggested in comment 27?
(In reply to Brian Birtles (:birtles) from comment #29)
> Great s Botond!
> 
> Boris, would you mind updating the patch to include the change suggested in
> comment 27?

Sure!
Attachment #8850870 - Flags: review+ → review?(bbirtles)
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review128356

::: dom/animation/KeyframeEffectReadOnly.cpp:1612
(Diff revision 5)
> +    uint32_t viewPortWidth =
> +      params[2] / gfxPrefs::AnimationPrerenderViewportRatioLimitX();
> +    uint32_t viewPortHeight =
> +      params[3] / gfxPrefs::AnimationPrerenderViewportRatioLimitY();
> +    uint32_t viewportSize = viewPortWidth * viewPortHeight;

We are mixing 'viewport' and 'viewPort' here. We should be consistent (probably 'viewport').

::: dom/animation/KeyframeEffectReadOnly.cpp:1612
(Diff revision 5)
> +    uint32_t viewPortWidth =
> +      params[2] / gfxPrefs::AnimationPrerenderViewportRatioLimitX();
> +    uint32_t viewPortHeight =
> +      params[3] / gfxPrefs::AnimationPrerenderViewportRatioLimitY();

Maybe we should add in a:

MOZ_ASSERT(gfxPrefs::AnimationPrerenderViewportRatioLimitX() != 0.0 &&
           gfxPrefs::AnimationPrerenderViewportRatioLimitY() != 0.0);

?

Although I guess it's still possible a user could set that?

So perhaps just make this:

  float viewportRatioX = gfxPrefs::AnimationPrerenderViewportRatioLimitX();
  float viewportRatioY = gfxPrefs::AnimationPrerenderViewportRatioLimitY();
  uint32_t viewportWidth =
    viewportRatioX
    ? params[2] / gfxPrefs::AnimationPrerenderViewportRatioLimitX()
    : params[2];
  uint32_t viewportHeight =
    viewportRatioY
    ? params[3] / gfxPrefs::AnimationPrerenderViewportRatioLimitY()
    : params[3];

::: dom/animation/KeyframeEffectReadOnly.cpp:1612
(Diff revision 5)
> +    uint32_t viewPortWidth =
> +      params[2] / gfxPrefs::AnimationPrerenderViewportRatioLimitX();
> +    uint32_t viewPortHeight =
> +      params[3] / gfxPrefs::AnimationPrerenderViewportRatioLimitY();
> +    uint32_t viewportSize = viewPortWidth * viewPortHeight;

I suppose these three: viewportWidth, viewportHeight, viewportSize could all just be doubles. Then we won't need the cast to double on the following line, either.

::: toolkit/components/telemetry/Histograms.json:157
(Diff revision 5)
> +    "kind": "exponential",
> +    "low": 100,
> +    "high": 1000,
> +    "n_buckets": 50,
> +    "bug_numbers": [1100357, 1349808],
> +    "description": "The ratio of the frame size (in total number of pixels) to the root reference frame (~viewport) size for each time we encountered a layer that was so large we decided not to run its animations on the compositor expressed as a percentage (i.e. 110 = frame area was 10% larger than the viewport)"

Let's update this comment to:

"The ratio of the frame size (in total number of pixels) to the relative limit (~viewport size plus some tolerance factor, typically 12.5% in each dimension, i.e. ~27% tolerance in total area) for each time we encountered a layer that was so large we decided not to run its animations on the compositor expressed as a percentage (e.g. 130 = frame area was 30% larger than the relative limit)"
Attachment #8850870 - Flags: review?(bbirtles) → review+
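Taken together, the revised Histograms.json entry would presumably end up reading something like this sketch (field values copied from the revision 5 diff with the suggested description swapped in; the landed entry may differ):

```json
"ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE": {
  "alert_emails": ["bbirtles@mozilla.com"],
  "expires_in_version": "58",
  "kind": "exponential",
  "low": 100,
  "high": 1000,
  "n_buckets": 50,
  "bug_numbers": [1100357, 1349808],
  "description": "The ratio of the frame size (in total number of pixels) to the relative limit (~viewport size plus some tolerance factor, typically 12.5% in each dimension, i.e. ~27% tolerance in total area) for each time we encountered a layer that was so large we decided not to run its animations on the compositor expressed as a percentage (e.g. 130 = frame area was 30% larger than the relative limit)"
}
```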
(In reply to Brian Birtles (:birtles) from comment #29)
> Great s Botond!

Sorry, I don't know what happened there. That should have been "thanks" Botond.
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review128356

> We are mixing 'viewport' and 'viewPort' here. We should be consistent (probably 'viewport').

Oops. My typo.

> I suppose these three: viewportWidth, viewportHeight, viewportSize could all just be doubles. Then we won't need the cast to double on the following line, either.

OK, let's make viewportWidth, viewportHeight, and viewportSize all doubles, and convert viewportSize to uint32_t when calling Telemetry::Accumulate()
Comment on attachment 8850870 [details]
Bug 1349808 - Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg,

https://reviewboard.mozilla.org/r/123390/#review128356

> OK, let's make viewportWidth, viewportHeight, and viewportSize all doubles, and convert viewportSize to uint32_t when calling Telemetry::Accumulate()

Oops, viewportSize isn't the value we pass to Accumulate(), so please ignore my last comment.
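Putting the agreed changes together, the computation in KeyframeEffectReadOnly::SetPerformanceWarning would presumably look something like the self-contained sketch below. The pref values are hard-coded stand-ins for gfxPrefs::AnimationPrerenderViewportRatioLimit{X,Y}, and the function name is made up for illustration; in the real patch the result feeds into Telemetry::Accumulate.

```cpp
#include <cstdint>

// Hypothetical stand-ins for gfxPrefs::AnimationPrerenderViewportRatioLimit{X,Y}.
static const float kRatioLimitX = 1.125f;
static const float kRatioLimitY = 1.125f;

// Sketch of the value accumulated into
// ASYNC_ANIMATION_CONTENT_TOO_LARGE_PERCENTAGE: params[2]/params[3] carry the
// relative limit (viewport size times the ratio-limit prefs), so divide the
// prefs back out to recover the real viewport, guarding against a zero pref.
uint32_t ContentTooLargePercentage(double aFrameWidth, double aFrameHeight,
                                   double aParam2, double aParam3) {
  double viewportWidth  = kRatioLimitX ? aParam2 / kRatioLimitX : aParam2;
  double viewportHeight = kRatioLimitY ? aParam3 / kRatioLimitY : aParam3;
  double viewportSize   = viewportWidth * viewportHeight;
  double frameSize      = aFrameWidth * aFrameHeight;
  // 100 => frame area equals the real viewport area.
  return static_cast<uint32_t>(100.0 * frameSize / viewportSize);
}
```

For example, a 1500x1200 frame against a 1125x900 relative limit (i.e. a 1000x800 real viewport) yields 225, i.e. 2.25x the viewport area.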
Pushed by bchiou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e9336ce58b4e
Add telemetry for cases when we can't run async animations due to layer size being too large. data-review=bsmedberg, r=birtles,botond,bsmedberg
https://hg.mozilla.org/mozilla-central/rev/e9336ce58b4e
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
See Also: → 1361915