The default bug view has changed. See this FAQ.

Improve Start/Stop FrameTimeRecording for telemetry usage

RESOLVED FIXED in mozilla21

Status

()

Core
Graphics: Layers
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: avih, Assigned: avih)

Tracking

unspecified
mozilla21
x86_64
Windows 7
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 4 obsolete attachments)

(Assignee)

Description

4 years ago
Currently, LayerManager::StartFrameTimeRecording() can only service a single consumer at a time. More than one (i.e. a 2nd Start before Stop) would invalidate the readings for all consumers, without them knowing about it.

Furthermore, if a Start is issued without a matching Stop (due to elements closing before stopping, bugs, etc), the recording array will grow forever, leading to memory inflation.

These two issues should be fixed before this facility can be used globally for telemetry.

My current approach is to:
1. Replace the growing array implementation with a cyclic array.
2. Modify the API to return a handle on Start, and use this handle on Stop.
3. Recording will pause if the data since last Start is about to get overwritten.
4. Return empty result if Stop was issued too late.
5. Use default buffer size of 3600 (60s @60fps), with a pref to change it.


- Dependency on bug 820167 is due to its improvement of StartFrameTimeRecording to also record paint times.
(Assignee)

Comment 1

4 years ago
Created attachment 698786 [details] [diff] [review]
Frames recording: cyclic buffer capable of serving concurrent consumers

This patch satisfies the approach from comment #1.

I decided to change the default (and minimum) buffer length is 5 seconds, which is enough for tab animation. If other consumers need more in the future, we can set a pref.

Implication is that after 5 seconds without any consumer starting recording, recording will pause.
(Assignee)

Comment 2

4 years ago
Created attachment 698979 [details] [diff] [review]
Rebased and fixed members initialization order
Attachment #698786 - Attachment is obsolete: true
Comment on attachment 698979 [details] [diff] [review]
Rebased and fixed members initialization order

Looks good, I've only made a few style comments below. 
Btw, in practice, how likely is the Telemetry data from tab animations to be skewed by paints unrelated to the animation being timed?

> NS_IMETHODIMP
>-nsDOMWindowUtils::StopFrameTimeRecording(float** paintTimes, uint32_t *frameCount, float **frameIntervals)
>+nsDOMWindowUtils::StopFrameTimeRecording(uint32_t startIndex, float** paintTimes, uint32_t *frameCount, float **frameIntervals)

Split this into one indented parameter per line

>-  if (*frameCount != 0) {
>-    *frameIntervals = (float*)nsMemory::Alloc(*frameCount * sizeof(float*));
>-    if (!*frameIntervals)
>-      return NS_ERROR_OUT_OF_MEMORY;

I think this allocator is infallible

>-  void startFrameTimeRecording();
>-  void stopFrameTimeRecording([optional, array, size_is(frameCount)] out float paintTimes,
>+  void startFrameTimeRecording([retval] out unsigned long startIndex);
>+  void stopFrameTimeRecording(in unsigned long startIndex,
>+                              [optional, array, size_is(frameCount)] out float paintTimes,
>                               [optional] out unsigned long frameCount,
>                               [retval, array, size_is(frameCount)] out float frameIntervals);

Add a comment to these

>+    if (!mRecordingSize) { // Initialize recording buffers
>+      uint32_t kRecordingMinLength = 60 * 5; // 5 seconds @60 fps.
>+      uint32_t kRecordingMaxLength = 60 * 60 * 60; // One hour
>+      mRecordingSize = (uint32_t)Preferences::GetInt("toolkit.framesRecording.bufferElements", kRecordingMinLength);

I'd make the minimum recording length a bit longer, maybe 10 seconds. You could also just call Length() instead of storing mRecordingSize.
Nit: const uint32_t + toolkit.frame_recording.bufferSize

>+   *     might have been presented some time before calling StartFrameTimeRecording.
>+   */
>+  // Returns a handle which represents current recording start position.
>+  uint32_t StartFrameTimeRecording();
>+
>+  // Returns number of recorded frames since aStartIndex was issued,
>+  //   and allocates+populates 2 arraye with the recorded data.
>+  // - Allocation errors are indicated by *aFrameIntervals==0.
>+  // - If no allocation errors, caller is responsible to release both arrays
>+  //   with nsMemory::Free (even if size is 0).
>+  uint32_t StopFrameTimeRecording(uint32_t aStartIndex, float **aFrameIntervals, float **aPaintTimes);
>+  
>   void SetPaintStartTime(TimeStamp& aTime);
>-  void StopFrameTimeRecording(nsTArray<float>& aFrameTimes, nsTArray<float>& aProcessingTimes);

Nit: Keep same comment style
(Assignee)

Comment 4

4 years ago
(In reply to Vladan Djeric (:vladan) from comment #3)
> Comment on attachment 698979 [details] [diff] [review]
> Rebased and fixed members initialization order
> 
> Looks good, I've only made a few style comments below. 

Thanks, good points. I'll handle them.

> Btw, in practice, how likely is the Telemetry data from tab animations to be
> skewed by paints unrelated to the animation being timed?

Other paints (and general processing) do affect animation and this is reflected at the measurements (e.g. longer intervals and paint times if opening the new tab page with thumbnails preview - which is displayed during the animation). I don't think we can isolate tab animation paints from other processing other, than making sure they are minimal when testing (e.g. by setting browser.newtab.url to about:blank).

If you're asking if there could be recorded frames which are not part of the tab animation (which would result in more recorded frames than animation frames and hence skewed result), then as far as I understand, there shouldn't be such. But since I'm currently (and also before this patch) seeing some possible anomaly at the first 2-3 frames of tab animation (sometimes shorter-than-expected intervals), then possibly there are unrelated paints within the recording. I'm suspecting it's a hiccup of the refresh driver timing, but I'll give it another look anyway.


> >-  if (*frameCount != 0) {
> >-    *frameIntervals = (float*)nsMemory::Alloc(*frameCount * sizeof(float*));
> >-    if (!*frameIntervals)
> >-      return NS_ERROR_OUT_OF_MEMORY;
> 
> I think this allocator is infallible

This is the old code which I replaced, but my code uses the same test when allocating the result arrays. Docs say it can fail: https://developer.mozilla.org/en-US/docs/nsMemory/Alloc
(Assignee)

Updated

4 years ago
Blocks: 828097
(In reply to Avi Halachmi (:avih) from comment #4)
> This is the old code which I replaced, but my code uses the same test when
> allocating the result arrays. Docs say it can fail:
> https://developer.mozilla.org/en-US/docs/nsMemory/Alloc

That doc was last updated in 2006 :P 
nsMemory::Alloc points to NS_Alloc and NS_Alloc was made infallible in bug 680556
(Assignee)

Comment 6

4 years ago
Created attachment 699994 [details] [diff] [review]
V3 - Alloc is infallible, style changes, fixed compilation warnings

(In reply to Vladan Djeric (:vladan) from comment #5)
> That doc was last updated in 2006 :P 
> nsMemory::Alloc points to NS_Alloc and NS_Alloc was made infallible in bug
> 680556

Now it's last updated in 2013 ;)
Attachment #698979 - Attachment is obsolete: true
(Assignee)

Updated

4 years ago
Attachment #699994 - Flags: review?(khuey)
Attachment #699994 - Flags: review?(jmuizelaar)
Comment on attachment 699994 [details] [diff] [review]
V3 - Alloc is infallible, style changes, fixed compilation warnings

Review of attachment 699994 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on the dom bits.
Attachment #699994 - Flags: review?(khuey) → review+
(Assignee)

Updated

4 years ago
Blocks: 828126
No longer blocks: 828126
Depends on: 828126
Comment on attachment 699994 [details] [diff] [review]
V3 - Alloc is infallible, style changes, fixed compilation warnings

Review of attachment 699994 [details] [diff] [review]:
-----------------------------------------------------------------

The added complexity is unfortunate but I don't have any obvious suggestions for simplification.

::: gfx/layers/Layers.cpp
@@ +918,5 @@
> +    if (!mFrameIntervals.Length()) { // Initialize recording buffers
> +      const uint32_t kRecordingMinSize = 60 * 10; // 10 seconds @60 fps.
> +      const uint32_t kRecordingMaxSize = 60 * 60 * 60; // One hour
> +      uint32_t bufferSize = (uint32_t)Preferences::GetInt("toolkit.framesRecording.bufferSize",
> +                                                          kRecordingMinSize);

Why the cast?

@@ +957,3 @@
>      TimeStamp now = TimeStamp::Now();
> +    uint32_t i = mRecordingNextIndex % mFrameIntervals.Length();
> +    mFrameIntervals[i] = (float)(now - mLastFrameTime).ToMilliseconds();

I wonder if we should have ToMillisecondsf(). We should also probably use a c++ style cast.

@@ +963,5 @@
>      mLastFrameTime = now;
> +
> +    if (mRecordingNextIndex > (mRecordingLatestStartIndex + mFrameIntervals.Length())) {
> +      // We've just overwritten the most recent recording start -> pause.
> +      mIsRecording = false;

I wonder if it would be better to just store the last n times in the buffer instead of dropping everything when we get to the end.

@@ +979,1 @@
>  {

It's probably worth adding a bit of implementation documentation about how this all works including motivation for the index to not wrap around Length() etc.

@@ +988,5 @@
> +  }
> +
> +  // Allocate result arrays and copy frame intervals and paint times into them
> +  *aFrameIntervals = (float*)nsMemory::Alloc(length * sizeof(float));
> +  *aPaintTimes     = (float*)nsMemory::Alloc(length * sizeof(float));

I don't like moving this ugly XPCOM style array business into the LayerManager. I intentionally wrote the copy in the nsDOMWindowUtils too keep this uglyness in that area. I'd prefer if it stayed that way, because that will get fixed when we switch DOMWindowUtils to the new bindings.

::: gfx/layers/Layers.h
@@ +548,5 @@
>    nsTArray<float> mPaintTimes;
> +  bool mIsRecording;
> +  uint32_t mRecordingLatestStartIndex; // Triggers pause if this gets overwritten.
> +  uint32_t mRecordingCurrentRunStartIndex; // Sequence is valid only if it started not before this value (otherwise the sequence contains recording pauses).
> +  uint32_t mRecordingNextIndex;

It's probably worth grouping this stuff into a struct of it's own so it's clear what all belongs to the recording.
Attachment #699994 - Flags: review?(jmuizelaar) → review+
(Assignee)

Comment 9

4 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #8)
> The added complexity is unfortunate but I don't have any obvious suggestions
> for simplification.
Agreed on both. I can simplify this by removing the pause mechanism - behavior for consumers or memory usage would stay identical, but once (if) we start, we'll keep recording forever. If the performance tradeoff is acceptable, I'll make the change.

> Why the cast?
Suppress compiler warning.

> I wonder if we should have ToMillisecondsf(). We should also probably use a
> c++ style cast.
OK c++ cast. Not sure why would we want also ToMillisecondsf. Cast is perfectly valid, and we don't need more than float here. Am I missing something? Another function just to avoid a cast from double to float on our usage?

> I wonder if it would be better to just store the last n times in the buffer
> instead of dropping everything when we get to the end.
We're pausing - nothing gets dropped. But indeed, the recording will no longer be available to consumers. Otherwise, consumers will unknowingly get partial recording data.

The design of this system is that the buffer length is able to accommodate any usage, and if it doesn't, it should be enlarged via the preference. Right now it's used only for tab animation, and 10s (or more) is way more than enough. Also, I can't currently think of animations we'll want to measure that last more than 10s, or even more than 1 second.

> It's probably worth adding a bit of implementation documentation...
OK.

> > +  // Allocate result arrays and copy frame intervals and paint times into them
> > +  *aFrameIntervals = (float*)nsMemory::Alloc(length * sizeof(float));
> > +  *aPaintTimes     = (float*)nsMemory::Alloc(length * sizeof(float));
> 
> I don't like moving this ugly XPCOM style array business into the
> LayerManager. I intentionally wrote the copy in the nsDOMWindowUtils too
> keep this uglyness in that area. I'd prefer if it stayed that way, because
> that will get fixed when we switch DOMWindowUtils to the new bindings.
Agreed on the ugliness. The reason is performance impact of an extra array copy (to 2 overall). Since you can live with that, I'll change it.


> It's probably worth grouping this stuff into a struct of it's own so it's
> clear what all belongs to the recording.
OK.
(Assignee)

Comment 10

4 years ago
Created attachment 703720 [details] [diff] [review]
v4 - c++ casts, documentation, moved XPCOM style arrays, members grouping.

Try https://tbpl.mozilla.org/?tree=Try&rev=db52f4024570
Attachment #699994 - Attachment is obsolete: true
Attachment #703720 - Flags: review?(jmuizelaar)
Comment on attachment 703720 [details] [diff] [review]
v4 - c++ casts, documentation, moved XPCOM style arrays, members grouping.

Review of attachment 703720 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry this review took so long.
Attachment #703720 - Flags: review?(jmuizelaar) → review+
(Assignee)

Comment 12

4 years ago
Created attachment 705444 [details] [diff] [review]
v5 - rebase
Attachment #703720 - Attachment is obsolete: true
https://hg.mozilla.org/integration/mozilla-inbound/rev/6cbe99a8f3a7
https://hg.mozilla.org/mozilla-central/rev/6cbe99a8f3a7
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
(Assignee)

Comment 15

4 years ago
It should be noted that the recording (regardless of the improvement from this bug) is broken with OMTC, and since right now OMTC is on by default on OSX, we don't get tab animation telemetry there.

Since OMTC will be turned on on other systems as well over time, if we wish to keep this intervals recording facility, then we should probably make it work with OMTC as well.
Do we want to keep this facility?
(Assignee)

Comment 17

4 years ago
I think we do, though in retrospective, recording the "paint times" (from the refresh driver tick till PostPresent - using SetPaintStartTime) hasn't been too useful.

Frame intervals are still useful IMO for animation instrumentation.

The issue right now is that LayerManager::PostPresent which is used to record the data isn't reached when using OMTC.
(Assignee)

Updated

4 years ago
Blocks: 920123
(Assignee)

Updated

4 years ago
No longer blocks: 920123
Depends on: 920123
You need to log in before you can comment on or make changes to this bug.