Closed Bug 1369417 Opened 7 years ago Closed 5 years ago

modify ts_paint to report additional information

Categories

(Testing :: Talos, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: jmaher, Assigned: zbraniecki)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxperf:p5])

We need to measure more than what we currently are in ts_paint:
- first paint (that's the most important point). This is when we
receive the widget-first-paint notification
- when we are done displaying the browser window (on video recordings
this is when nothing moves on the display anymore... but what the
current ts_paint does is probably a good enough approximation).
- when we are ready to handle user events. I would measure this as the
first time a Services.tm.mainThread.idleDispatch callback gets called
after sessionstore-windows-restored


we sort of cover the 2nd point, but we can do better to ensure we cover the 1st and 3rd points.

One hiccup here is that ts_paint only measures a single data point so the framework will need some adjusting.  It shouldn't be too hard, just takes a few days to tweak and make sure we don't mess things up on specific platforms.
Summary: modify ts_paint to report → modify ts_paint to report additional information
Whiteboard: [PI:June]
:rwood, is this something you can look into after finishing up some of the mitmproxy work?
Assignee: nobody → rwood
Status: NEW → ASSIGNED
So question is, do we want the new values to be subtests of ts_paint and just aggregated in, so that ts_paint still just reports one (aggregated) value?

Or do we want the results reported as three different values, i.e the 'other' suite would have something like:

Perfherder:
    a11yr summary opt e10s: 640.99
    sessionrestore opt e10s: 700
    sessionrestore_no_auto_restore opt e10s: 739
    tabpaint summary opt e10s: 72.38
    tpaint opt e10s: 279.13
    ts_paint opt e10s: 884
    ts_paint_first opt e10s: xxx
    ts_paint_user opt e10s: xxx

The test itself would run only once but we could 'hack' the perferder data output to look like individual tests above. Problem I see with that is it would be confusing if someone wants to run "ts_paint_user" they couldn't, they'd have to run the 'other' suite or 'ts_paint'.

Or we could just add two completely new startup tests that measure the new events and report as above but actually be separate tests.

What do you think :jmaher?
Flags: needinfo?(jmaher)
I think hacking perfherder would be ideal, so not subtests
Flags: needinfo?(jmaher)
Having separate values instead of a synthetic aggregate score would help implementors understand what regressions mean, IMO.
Not sure if my opinion adds much here, but I've been burned at least once (bug 1372261) by a test that buried a real regression inside subtests, and I had to go through some perfherder hoops to visualize where the actual regression was. I think the less of that sort of thing we can do, the better.
(In reply to Joel Maher ( :jmaher) from comment #0)
> We need to measure more than what we currently are in ts_paint:
> - first paint (that's the most important point). This is when we
> receive the widget-first-paint notification
> - when we are done displaying the browser window (on video recordings
> this is when nothing moves on the display anymore... but what the
> current ts_paint does is probably a good enough approximation).

From digging around the code, it looks like the current ts_paint test is reporting first paint already (the first item in your list above)? Looks to me like it does unless I'm following the code wrong.

Even though we are triggering on the 'MozAfterPaint' event, once that fires we are returning startupInfo['firstPaint'] - startupInfo['process'].

https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/startup_test/tspaint_test.html#24

And startupInfo['firstPaint'] is actually 'widget-first-paint':

https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/talos-powers/components/TalosPowersService.js#258

I'm not sure what the difference is between that (what ts_paint currently reports) and the first item in the list. If they are the same, then any idea how to capture the second item: (?)

> - when we are done displaying the browser window (on video recordings
> this is when nothing moves on the display anymore... 

Thanks :)
Flags: needinfo?(jmaher)
(In reply to Robert Wood [:rwood] from comment #6)
> 
> And startupInfo['firstPaint'] is actually 'widget-first-paint':

Kinda. It looks like we use the widget-first-paint observer notification to know for sure that we can retrieve the firstPaint measurement from nsIAppStartup.getStartupInfo().

Here's where widget-first-paint is fired:

http://searchfox.org/mozilla-central/rev/c49a70b53f67dd5550eec8a08793805f2aca8d42/layout/base/PresShell.cpp#8892

And here's where the firstPaint measurement is taken:

http://searchfox.org/mozilla-central/rev/c49a70b53f67dd5550eec8a08793805f2aca8d42/view/nsViewManager.cpp#362

So they're not exactly the same, I don't think - but I do think they're fired very soon after one another. Maybe tn knows the exact definition here?
Flags: needinfo?(tnikkel)
these are topics I don't really know.  :mconley did some work on tspaint a little while ago.  :florian, can you help out here as you had requested this change.  We need a subject matter expert on this topic.
Flags: needinfo?(jmaher)
Flags: needinfo?(florian)
(In reply to Mike Conley (:mconley) from comment #7)
> Here's where widget-first-paint is fired:
> 
> http://searchfox.org/mozilla-central/rev/
> c49a70b53f67dd5550eec8a08793805f2aca8d42/layout/base/PresShell.cpp#8892
> 
> And here's where the firstPaint measurement is taken:
> 
> http://searchfox.org/mozilla-central/rev/
> c49a70b53f67dd5550eec8a08793805f2aca8d42/view/nsViewManager.cpp#362
> 
> So they're not exactly the same, I don't think - but I do think they're
> fired very soon after one another. Maybe tn knows the exact definition here?

Looks like they would both fire just after doing a composite to the window. The second one perhaps slightly earlier. I'm not sure how this works when we are doing the compositing on the compositor thread. I'd guess these main thread composites would be no-ops so that we generate the mozafterpaint events on the main thread.
Flags: needinfo?(tnikkel)
Depends on: 1375073
Updates from IRC:

<mconley> florian / jmaher: if I understand the pattern coming out of Platform Integrity properly, we should probably measure:
<mconley> 1) Time to first paint the browser window
<mconley> 2) Time to first paint the content area
<mconley> 3) Time to "hero element" visible in the content area
<mconley> 4) Time until browser is accepting UI events
<mconley> and maybe
<florian> would that be with about:home in the content area?
<mconley> florian: yes, until Activity Stream replaces it.
<florian> ok
<jmaher> mconley: can we measure 1/2/3/4 now?  I think #3 is yet to be defined
<mconley> I'm pretty sure we're already measuring 1. I think it's reasonably straight-forward to measure 2 and maybe 3. Not sure about 4.
<jmaher> mconley: if you know how to measure other things, please speak up
<mconley> Well, this is mostly speculation
<florian> mconley: 4 is either Services.tm.dispatchToMainThread, or Services.tm.idleDispatchToMainThread
<mconley> we'd need to re-write the ts_paint test to load about:home, and inject a script somehow to wait for a non-zero-rect MozAfterPaint in there
<rwood> thanks, yeah that's exactly what I need the names of the actual events to listen for
<mconley> rwood: we'll want to look at the paint timestamp on the non-zero-rect MozAfterPaint event
<mconley> that'll tell us how long it took to go from loading about:home to actually presenting it on screen.
<mconley> Whether we want to calculate how long it's been since process start until about:home is presented... not sure.

<mconley> Part of my worry here is about how we're going to inject script so that we know when about:home has presented
<mconley> Because up until now, we've been loading the privileged startup_test/tspaint_test.html
<mconley> which does all of the work

<mconley> we might need Platform to expose this value somehow: http://searchfox.org/mozilla-central/source/layout/base/nsPresContext.h#1417
<mconley> rwood: because that'd be a lot easier than trying to ram a MozAfterPaint event handler into the page that's loading about:home
<mconley> like
<mconley> somehow Talos knows we're running ts_paint
<mconley> and so
<mconley> it waits until about:home has finished loading
<mconley> and then injects a script, which grabs that value from the content
<mconley> and then reports that
<mconley> rwood: so I suspect getting non-blank-paint exposed is pretty important for this work
<mconley> because that'll simplify the Talos stuff
<mconley> and make it so that we don't have to rush to get a framescript jammed in before a paint occurs
moving ts_paint to load about:home or about:blank will require us to have an addon (actually a good excuse for a web extension) that effectively is the test- right now we use addons to query information and the state of the test is managed by loading the tspaint.html webpage which queries for metrics and quits the browser.

fixing up mozafterpaint would be nice- will that work properly when loading about:home?

Please do chime in if there are other ideas on how to do this reliably- maybe we can hack around with this after we fix up talos to report multiple values properly.
Update: Bug 1375073 adds support to talos for mutli-value results in a single startup test iteration. Note to self, when updating the ts_paint talos test, ensure the test output is in this format:

__start_report{"event_1": 1026, "event_2": 1126, "event_3": 1226}__end_report

And update the ts_paint and ts_paint_webext test definitions in [1] and [2] to include the 'firstpaint', 'userready', and 'testeventmap' options, i.e.:

    firstpaint = True
    userready = True
    testeventmap = [{'name': 'ts_paint',
                     'label': 'event_1'},
                    {'name': 'ts_paint_first_paint',
                     'label': 'event_2'},
                    {'name': 'ts_paint_user_ready',
                     'label': 'event_3'}]

[1] http://searchfox.org/mozilla-central/rev/1b7b1ec949497e9fc2a9b9dfaccbe894ee2ad5ef/testing/talos/talos/test.py#123
[2] http://searchfox.org/mozilla-central/rev/1b7b1ec949497e9fc2a9b9dfaccbe894ee2ad5ef/testing/talos/talos/test.py#146
Notes from the SF meet-up:

We want ts_paint to record the time to finish showing about:home. This is a pretty interesting challenge because, at least from florian screen recording of a recent build, a number of stuff paints in about:home before it can really be considered "loaded" (example: the logo, the snippet, etc).

Here's our planned scheme for determining when about:home has "settled":

1) When about:home is loading, add a MozAfterPaint event listener
2) For each MozAfterPaint, look at the size of the invalidated rect.
3) When we see a steady stream of MozAfterPaints showing the invalidated rect as being the size of the text caret, we consider ourselves "settled". Look at the last non-zero sized MozAfterPaint, and consider that the time of the last meaningful paint for about:home, and it was presented.
For the browser UI, the caret in about:home won't be a problem, so we can use a slightly adjusted plan:

1) When browser.xul is loading, add a MozAfterPaint event listener
2) When we see a number of seconds where MozAfterPaint is firing with an empty rect (a.k.a, we're compositing the flashing caret, but nothing in the parent is painting), we consider the last non-zero rect as the final paint, and that the browser was settled.


So, comment 13 and comment 14 give us:

1) Time to about:home final paint and composite
2) Time to browser final paint and composite

ts_paint currently measures time to browser first paint. I _don't_ think it's measuring time to composite, but we should double-check, and perhaps adjust this.

Getting first non-zero paint for about:home is easy.

Finally, we want responsiveness to user events...

What we want is some way of knowing if there's been a sufficiently large window of time where we've been waiting for user input events.

The Gecko Profiler back-end knows when jank has occurred, and it also knows when the event queue is being properly serviced and is not janking... we wonder if we could potentially ask it if a sufficiently large window of non-consecutive-jank has occurred?

Can we re-use any of this logic? Or are there other parts of platform that might know when our event queue is being consumed regularly that we can detect?
I was just talking to farre from the DOM team, and for input events, he suggested the following:

0) Set some variable CONSECUTIVE_IDLE_CALLBACKS = 0
1) Early on in the browser's lifetime, queue a requestIdleCallback
2) Have the callback measure the time between the queueing and when it was serviced
3) If the delta is greater than X, set CONSECUTIVE_IDLE_CALLBACKS to 0, queue another idle callback, and go back to 2
4) If the delta is less than or equal to X, increment CONSECUTIVE_IDLE_CALLBACKS
5) If CONSECUTIVE_IDLE_CALLBACKS reaches some threshold, consider the event queue as being regularly serviced, and record the queuing time of the first in the series of the consecutive idle callbacks as the time to responsiveness.
farre also tells me that smaug might have a better idea for getting this measurement.
:smaug, would you have some other ideas than what :mconley proposes in comments 13,14,15 for measuring a useful metric of when the browser has started up and about:home is "done" ?
Flags: needinfo?(bugs)
requestIdleCallback doesn't really work well with input events: first, it has some headroom or what to call it to try to ensure processing idle callback doesn't overlap the next refreshdriver tick and also input events can easily be processed while there are timers pending. idle runnables are bound by timers.

And once bug 1351148 lands, input event processing will be quite different.

It isn't clear to what we want to measure.
Flags: needinfo?(bugs)
Flags: needinfo?(florian)
Whiteboard: [PI:June] → [PI:July]
Whiteboard: [PI:July] → [PI:August]
Hi all! Hailing here from bug 1381988 comment 21 where I was trying to understand ts_paint and modify it to use firstNonBlankPaint.

I'd be happy to take over this bug if it's stalled. I believe that instead of using MozAfterPaint and idleCallbacks, we can do what I did in bug 1381988 for tpaint: use requestAnimationFrame and collect timeToNonBlankPaint, I believe.

In that case, I'd like to collect:

1) browser.xul's timeToNonBlankPaint
2) browser.xul's onload (all sync JS is done, everything else is async / lazy)
3) about:newtab timeToNonBlankPaint
4) about:newtab onload

Does it sound like a good start?

My challenge here is that as I said in bug 1381988 comment 21 I'm questioning the current timestamp of StartupInfo.process.

I'm also not sure of the different between the current ViewManager's recorded firstPaint and timeToNonBlankPaint which I'd like to switch us to.

Rob, Mike, any feedback on this idea?

:smaug, does the proposal sound sane to you?

> It isn't clear to what we want to measure.

I believe we want to measure the time between when the new Firefox process is kicked off, and when particular stages of the startup end for main window (browser.xul) and the content of the newtab.
Flags: needinfo?(bugs)
Hmm, why you'd use requestAnimationFrame? That forces earlier layout flush and what not, and possibly also earlier (too early) paint.
Flags: needinfo?(bugs)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> Hi all! Hailing here from bug 1381988 comment 21 where I was trying to
> understand ts_paint and modify it to use firstNonBlankPaint.
> 
> I'd be happy to take over this bug if it's stalled.

Thanks :gandalf, I appreciate that as you (and other devs) have way more expertise in this area.
Assignee: rwood → gandalf
Hi :gandalf, just a ping to see where this is at, is this still a priority and something you want to look into? Is there anything I can do to help? Thanks!
Flags: needinfo?(gandalf)
Whiteboard: [PI:August] → [PI:September]
Robert - I'm sorry for radio silence. I started looking through the current status of paint information, and scaled it up into a proposal for a unified bootstrap stages/metrics in https://groups.google.com/d/msg/mozilla.dev.platform/z7MQVwrJayk/SvJt-dnkAAAJ

I believe that given proximity to 57, it's not worth shifting the metrics for talos, even if they're knowingly imperfect.

I'd like to start crafting a plan for post-57 unification of the metrics that will probably start with introducing the new timing stamps that will be used by ts_paint and tpaint.

In result I'd like to suggest waiting with this bug until we either get a consensus and add the new metrics, or decide not to and then we can revisit this bug and try to improve it within the timestmaps we have already.
Flags: needinfo?(gandalf)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #23)
...

Hi :gandalf, thank you for the detailed update! I agree - sounds like a good plan. When the time comes if there's something you need from me to help, just let me know.
Whiteboard: [PI:September]
Whiteboard: [fxperf]
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #23)
> decide not to and then we can revisit
> this bug and try to improve it within the timestmaps we have already.

What would that look like in practice? comment #20 here suggests that requestAnimationFrame will introduce other issues. I'm also confused because talos already seems to have firstNonBlankPaint flag which suggests that it supports what you want to do...
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(gandalf)
Until we hear more here, I'm P5'ing this.
Priority: -- → P5
Priority: P5 → --
Whiteboard: [fxperf] → [fxperf:p5]

No comments for over a year. Cleaning up open bugs. Please reopen or start a new bug if there's still interest in this enhancement.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(gandalf)
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.