Talos tpaint test should measure performance.timing.timeToNonBlankPaint instead of onload time

ASSIGNED
Assigned to
(Needinfo from 2 people)

Status

enhancement
ASSIGNED
2 years ago
Last year

People

(Reporter: cpeterson, Assigned: zbraniecki, NeedInfo)

Tracking

(Blocks 2 bugs)

Version 3
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [PI:June])

Attachments

(1 attachment)

Bug 1377251 added a proprietary performance.timing.timeToNonBlankPaint property. We should change the Talos tpaint and ts_paint tests to measure timeToNonBlankPaint instead of onload time.

To expose the timeToNonBlankPaint property to web content, you must set the dom.performance.time_to_non_blank_paint.enabled pref = true. There is no timeToNonBlankPaint event that a test can listen for. In bug 1377251 comment 24, bsmedberg suggests listening for the 'onload' event and then reading the timeToNonBlankPaint property, which should (!) have been set by onload time.

Note that timeToNonBlankPaint is the relative time since performance.timing.navigationStart, whereas other performance.timing.* properties are the absolute timestamps (since the UNIX epoch).

See related bug 1363845 for changing the tp6 page load tests to use timeToNonBlankPaint.
Assignee: nobody → gandalf
Status: NEW → ASSIGNED
So, it seems that both tpaint and ts_paint listen to some form of MozAfterPaint:

 - http://searchfox.org/mozilla-central/rev/88180977d79654af025158d8ebeb8c2aa11940eb/testing/talos/talos/tests/tpaint/tpaint.html#71
 - http://searchfox.org/mozilla-central/rev/88180977d79654af025158d8ebeb8c2aa11940eb/testing/talos/talos/startup_test/tspaint_test.html#16

With the difference being that ts_paint is calculating using startupInto["firstPaint"] which, according to docs awaits both MozAfterPaint and onload.

I assume that it makes sense to switch to not calculate using onload for ts_paint, but I'm wondering what is the difference between listening to MozAfterPaint and using timeToNonBlankPaint.

NI on Markus for the difference between the two, and on Mike to confirm that the switch is reasonable.
Flags: needinfo?(mstange)
Flags: needinfo?(mconley)
I suspect we can actually get away with not waiting for MozAfterPaint in ts_paint. I suspect the old MozAfterPaint event handler is a holdover from when ts_paint used to use it as the "end of stopwatch".
Flags: needinfo?(mconley)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #1)
> NI on Markus for the difference between the two

Mike would know better, but I think MozAfterPaint might fire a little later than timeToNonBlankPaint, because timeToNonBlankPaint measures the time of display list construction, and MozAfterPaint additionally waits for rasterization and compositing.
Flags: needinfo?(mstange)
Comment on attachment 8892210 [details]
Bug 1381988 - Switch talos tpaint test to measure time until timeToNonBlankPaint, rather than until MozAfterPaint.

https://reviewboard.mozilla.org/r/163156/#review169474

thanks for putting a patch together- this is for tpaint only, is there related work for ts_paint as well?

::: testing/talos/talos/test.py:262
(Diff revision 1)
>      lower_is_better = True
>      fnbpaint = True
>  
>  
>  @register_test()
> -class tpaint(PageloaderTest):
> +class tpaint(QuantumPageloadTest):

quantumPageloadTest requires a pageset that was recorded and designed for replay.  We need to keep this as a PageloaderTest.

::: testing/talos/talos/tests/tpaint/tpaint.html:65
(Diff revision 1)
> +/**
> + * We're going to calculate the difference between when the done() function
> + * is called, and when timeToNonBlankPaint happened.
> + *
> + * This difference, called `delay` will be transferred back to the childIsOpen
> + * and substracted from the measurement.

nit: subtracted
Yeah, sorry, will get back to this today.

Thanks for the insight with the pageset. I wasn't sure why the tests were failing. :)
Comment on attachment 8892210 [details]
Bug 1381988 - Switch talos tpaint test to measure time until timeToNonBlankPaint, rather than until MozAfterPaint.

https://reviewboard.mozilla.org/r/163156/#review169488

::: testing/talos/talos/test.py:276
(Diff revision 2)
>      tppagecycles = 20
>      timeout = 300
>      gecko_profile_interval = 1
>      gecko_profile_entries = 2000000
>      tpmozafterpaint = True
> +    fnbpaint = True

as a note, you have both fnbpaint and tpmozafterpaint here, is that intentional?
> as a note, you have both fnbpaint and tpmozafterpaint here, is that intentional?

Not sure if it'll stay this way. Unfortunately, fnbpaint doesn't have an associated event, so we have to wait for something to know it's done. I could use window.onload for that instead.

Overall, since there's no correlation between onload, mozafterpaint and fnbpaint, I'd really like to do something like [0] to make this non-racy, but for now, I would like to verify that the code works on try first.


[0] http://searchfox.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/pageloader.js#273
ask for help as needed, I am around this week and half of next week, then rwood and I should be around as normal the last 2 weeks of the month!
you should see something like:
17:46:05     INFO -  PID 9252 | __start_tp_report
17:46:05     INFO -  PID 9252 | _x_x_mozilla_page_load
17:46:05     INFO -  PID 9252 | _x_x_mozilla_page_load_details
17:46:05     INFO -  PID 9252 | |i|pagename|runs|
17:46:05     INFO -  PID 9252 | |0;/tpaint.html?auto=1;308.8950000000002;308.345;273.8249999999998;302.345;279.03999999999996;284.69499999999994;299.02499999999986;298.7199999999998;279.19000000000005;279.02;273.2049999999999;299.125;278.105;296.3800000000001;267.33500000000004;273.22500000000014;302.15999999999985;270.125;296.6199999999999;295.28
17:46:05     INFO -  PID 9252 | __end_tp_report
17:46:05     INFO -  PID 9252 | __start_cc_report
17:46:05     INFO -  PID 9252 | _x_x_mozilla_cycle_collect,1410
17:46:05     INFO -  PID 9252 | __end_cc_report
17:46:05     INFO -  PID 9252 | __startTimestamp1501721165449__endTimestamp
17:46:05     INFO -  PID 9252 |

but instead I see:
17:57:28     INFO -  TEST-INFO | started process 29979 (/builds/slave/test/build/application/firefox/firefox  http://localhost:58463/getInfo.html)
17:57:33     INFO -  TEST-INFO | 29979: exit 0
17:57:34     INFO -  Browser initialized.
17:57:34     INFO -  Running cycle 1/1 for tpaint test...
17:57:34     INFO -  TEST-INFO | started process 30188 (/builds/slave/test/build/application/firefox/firefox -profile /tmp/tmpfUXSsz/profile -tp file:/builds/slave/test/build/tests/talos/talos/tests/tpaint/tpaint.manifest.develop -tpchrome -tpmozafterpaint -tpnoisy -fnbpaint -tpcycles 1 -tppagecycles 20)
17:57:35     INFO -  PID 30188 |
17:57:35     INFO -  PID 30188 | (/builds/slave/test/build/application/firefox/firefox:30237): Pango-WARNING **: error opening config file '/home/cltbld/.pangorc': Permission denied
17:57:35     INFO -  PID 30188 |
17:57:45     INFO -  PID 30188 | RSS: Main: 177786880
17:57:45     INFO -  PID 30188 |
17:57:45     INFO -  PID 30188 | fnbpaint is not yet available (0), retry number 1...
17:57:45     INFO -  PID 30188 | Abort: firstNonBlankPaint value is not available after loading the page
17:57:45     INFO -  PID 30188 | RSS: Main: 173584384
17:57:45     INFO -  PID 30188 |
17:57:45     INFO -  PID 30188 | fnbpaint is not yet available (0), retry number 2...
17:57:46     INFO -  PID 30188 | *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping
17:57:46     INFO -  Terminating psutil.Process(pid=30188, name='firefox')
17:57:46     INFO -  TEST-INFO | 30188: exit 0
17:57:46     INFO -  TEST-UNEXPECTED-ERROR | tpaint | Could not find report in browser output: [('tsformat', ('__start_report', '__end_report')), ('tpformat', ('__start_tp_report', '__end_tp_report'))]


I suspect we are not getting values and because of that we are failing to print data out.  I think you have a slight typo in your previous comment, to run the test locally it is |./mach talos-test -a tpaint|

Can you try waiting in a loop for firstnonblankpaint to have a value, or create code to work around a blank value?
Flags: needinfo?(jmaher)
> Can you try waiting in a loop for firstnonblankpaint to have a value, or create code to work around a blank value?

I created a loop that is supposed to do this, and it works locally again, but doesn't on try :(

Any ideas why?
Flags: needinfo?(jmaher)
oh, I think I see what is going on- in test.py fnbpaint=true, so pageloader is measuring fnbpaint and in this case it is failing to find it (I suspect pageloader doesn't know about the child window created).  I verified I could run this just fine locally, so I am confused why this fails on try- possibly try removing the fnbpaint=True, and just keep the modifications to tpaint.html
Flags: needinfo?(jmaher)
Zibi, is the test script setting the "dom.performance.time_to_non_blank_paint.enabled" pref = true on the test machine? The performance.timing.timeToNonBlankPaint property is only exposed to web content if the "dom.performance.time_to_non_blank_paint.enabled" pref is set.
Flags: needinfo?(gandalf)
Chris: yeah, fnbpaint=True is setting it, but I think Joel is right and the fnbpaint=True is also expecting something more from the test.

I switched to manually set the pref for this test and it still passes locally with flying colors. Let's see what try says.
Flags: needinfo?(gandalf)
Ok. This is not insignificant change: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=ca68f88a175d&newProject=try&newRevision=fd808d25734e&framework=1&showOnlyImportant=0

I'll try with more runs once I have both tpaint and ts_paint because I want to understand if this change affects the stdev and/or distribution, but this will require some explanation.
7-10% drop on tpaint is surprising to me since we really only change the MozAfterPaint -> timetoFirstNonBlankPaint and that should often be (almost) the same?
I looked into ts_paint. It's pretty interesting what is going on there.

Currently, the test measures the time between two timestamps:

* process - http://searchfox.org/mozilla-central/source/toolkit/components/startup/nsAppStartup.cpp#772
* firstPaint - http://searchfox.org/mozilla-central/source/view/nsViewManager.cpp#361

I don't know if I understand it. The `process` event happens in GetStartupInfo which is fired in a couple places like nsBrowserGlue.js#__trackSlowStartup and browser.js#CombinedStopReload.init.
Neither of those seems to be the right place to record the correct process startup timestamp.

firstPaint on the other hand is recorded in nsViewManager.cpp#PaintWindow, so it doesn't take into account the content at all.

My question here is, if timeToNonBlankPaint is the right measure for this test. If it is, it should be captured from browser.xul context, not from content context.
If that's true, I'm wondering if we should switch the firstPaint StartupInfo event to use it, and if that's true, then maybe it shouldn't be recorded in nsViewManager?

Separate question is if I'm correct that the `process` timestamp is not recorded at the right time.

Joel, what do you think?
Flags: needinfo?(jmaher)
great questions.  Keep in mind there is discussion in bug 1369417 about adding additional measurements for ts_paint.

Sadly I don't know the answer to either of your two questions :(

I am fine switching whatever needs to be switched, as long as it produces reliable numbers!

In the old days when I understood ts_paint, we measured the timestamp prior to launching the process from the python script and passed that into the webpage we loaded, this would give us a starting time to subtract when we received an onload event.
Flags: needinfo?(jmaher)
> Keep in mind there is discussion in bug 1369417 about adding additional measurements for ts_paint.


Oooh, I would love then to move this conversation there and limit this bug to tpaint.

If you're ok with that, then I believe the patch is ready. The only question is - why do we see such a significant perf improvement.

All I'm doing is instead of taking MozAfterPaint, I'm now taking the timeToNonBlankPaint, which should be the same in cases where the first paint is the non-blank-paint, and if anything, it should be a regression if the first paint was blank.

Instead, we see 17-22ms improvement.

:smaug - can you take a look at the code that fires MozAfterPaint vs timeToNonBlankPaint and how I'm modifying the test to make sure that the new value is the correct one?

tpaint is supposed to measure the time between parent's "window.open" and child window's first-paint (non blank).
Flags: needinfo?(bugs)
I think keeping this bug to tpaint is a good idea.


As for the improvements, we saw this on tp6 where we saw large improvements in time when switching to ttfnbp from map:
https://bugzilla.mozilla.org/show_bug.cgi?id=1363845#c23
Summary: Talos tpaint/ts_paint tests should measure performance.timing.timeToNonBlankPaint instead of onload time → Talos tpaint test should measure performance.timing.timeToNonBlankPaint instead of onload time
Ok, so my plan is:

1) Wait for :smaug to confirm that the number change makes sense from the DOM code perspective
2) Land this
3) Work on ts_paint
4) Come back to tpaint and consider also extending it to handle both parent window startup and content window startup. (currently tpaint only measures content window timestamps)
So use of requestAnimationFrame looks wrong to me. Why do you want to force early layout flush etc. when measuring?
Flags: needinfo?(bugs)
So glad I asked you! :)

I definitely don't want to pressure early layout flush. I only want to measure and since there's no promise or event associated with timetononblankpaint I thought of using the rAF as a way to cyclicly check if the fnbp is ready.

What should I use instead to not affect the system I'm measuring? RequestIdleCallback? setTimeout? SetInterval?
Flags: needinfo?(bugs)
rIC may be called before any painting has happened, or may not be called at all.
I don't know why you get different timing from MozAfterPaint and timeToNonBlankPaint
(though, those are updated at different times) but perhaps you could use MozAfterPaint listener to read timeToNonBlankPaint ?
Flags: needinfo?(bugs)
I was hoping to limit our use of mozAfterPaint since it seems to be not on any path to be standardized (fnbp is), but if you think it's the best way to get the number, let's use it.

I'm running a try run with mozAfterPaint with higher number of replicants. If the performance characteristic will stay within the numbers we see in previous trys, I'll land.

Thanks Olli, Joel, Chris!
Here's the 100 replications try run: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1ff17e717191&newProject=try&newRevision=547f1c7f332653d5b63b803acfc854d5f5a7fa6b&framework=1&showOnlyImportant=0

It so far shows -20ms change on linux and -40ms change on macos.

Olli said that he'll take a look again at the code to get any understanding on why MozAfterPaint vs. timeToNonBlankPaint is so different and help us decide if this patch should land.
Flags: needinfo?(bugs)
So, hmm, isn't MozAfterPaint fired after composite, and timeToNonBlankPaint is updated during paint.
If so, MozAfterPaint is probably what we want to measure.
I'm looking at code like http://searchfox.org/mozilla-central/rev/b52285fffc13f36eca6b47de735d4e4403b3859e/layout/base/PresShell.cpp#6303-6304
Or maybe timeToNonBlankPaint should be changed to be update after composite too.

Markus might have an opinion.
Flags: needinfo?(bugs) → needinfo?(mstange)
Thanks Olli!

I think I'm surprised that the difference between mozafterpaint and fnbp is so large. 11% on opening a trivial document seems like a lot.

Secondly, since fnbp is what we're standardizing in Performance Timing API and it is what tp6 is already using, I'd say we should move it to whatever is the right place to measure when the user sees the first non blank frame.
It may be worth also checking with the spec proposal for the first paint timing - https://github.com/WICG/paint-timing - to make sure that we standardize around the correct moment.
if we use mozAfterPaint, we should do so for nonEmptyRect.  It would be nice if we could have a more standard measurement that we determine is the right thing- I think that is why timeToNonBlankPaint was so important.
Also, per spec proposal:

"First Contentful Paint entry contains a DOMHighResTimeStamp reporting the time when the browser first rendered any text, image (including background images), non-white canvas or SVG. "

Not sure if that's where timeToNonBlankPaint happens, but I assume that this timestamp is going to be merged into the First Contentful Paint once the proposal advances, so I'd say we should aim to fire it as close to when the spec expects it as possible.
phew! I spent some time on IRC getting a bit more feedback from Olli.

It seems that:

a) Without the patch, tpaint is measuring a wrong thing, because MozAfterPaint is fired asynchronously and too late.
b) With the patch, we're measuring a bit too learly, because timeToNonBlankPaint is measured too early (before composite step).

I filed bug 1388157 to revisit when we're measuring timeToNonBlankPaint and potentially align it more with the Painting spec proposal.

I'll keep this bug open until then, and I opened bug 1388159 to land a temporary change that will make tpaint more realistic in the meantime.
Blocks: 1254628
Flags: needinfo?(gijskruitbosch+bugs)
So based on comment #38, my understanding is that we don't want to move forward with this outright because it would no longer include the 'composite' step in its measurements, and as a result will make it impossible to see the impact of changes to compositing.

However, I believe that it's possible for tests to output multiple measures. :jmaher / :gandalf, is that right? If so, could we adapt tpaint and ts_paint so they include the other number as well? That would provide people working on frontend things with a more stable number, AIUI.

I should imagine we'd want something like that anyway to avoid sudden jumps in the numbers - we could keep both measures side by side for a while and see if we're missing anything about what those measures are, uh, measuring.
Flags: needinfo?(jmaher)
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(gandalf)
technically we don't have that ability- but we can add it.  We are working on adding it for tp6 right now and once that is done we could look into adding it for other tests.  It would make sense to have multiple metrics and see how they line up over time.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher - limited bugzilla access until May 1st) (UTC+2) from comment #40)
> technically we don't have that ability- but we can add it.  We are working
> on adding it for tp6 right now and once that is done we could look into
> adding it for other tests.  It would make sense to have multiple metrics and
> see how they line up over time.

Hi Joel, how is this coming along? :-)
Flags: needinfo?(jmaher)
:gijs, we are standing up a new framework "raptor" (Bug 1449180) which is designed to support multiple measurements.  I am not sure if it will work smoothly with tpaint, but that is a test we can look into porting in the short term (once tp6 and speedometer are running)
Flags: needinfo?(jmaher)
Whiteboard: [PI:May]
Great! The boot stage process I'm proposing for us here has been implemented in FxOS with raptor being the front end CI for it - https://developer.mozilla.org/en-US/docs/Archive/B2G_OS/Developing_Gaia/Raptor/Responsiveness_guidelines#App_startup_stages_and_recommended_time_goals

I can't wait for raptor to work in Firefox! :)
Raptor doesn't support startup tests, just pageload (and standard benchmarks like 'speedometer'). I don't see startup tests being added as that is a significant design shift and difficult since the main Raptor driver is a webextension inside the browser itself. So any changes here for tpaint and ts_paint would need to be done in talos.
Whiteboard: [PI:May] → [PI:June]
(In reply to Robert Wood [:rwood] from comment #44)
> Raptor doesn't support startup tests, just pageload (and standard benchmarks
> like 'speedometer'). I don't see startup tests being added as that is a
> significant design shift and difficult since the main Raptor driver is a
> webextension inside the browser itself. So any changes here for tpaint and
> ts_paint would need to be done in talos.

That being said, I do think we should investigate what it would take to have Raptor support start-up tests. However they would be a more 'simplified' version of start up tests, capturing only the main events that are most valuable. I don't see startup tests like ts_paint or tpaint themselves being ported to raptor specifically as they are very talos specific.
You need to log in before you can comment on or make changes to this bug.