Closed Bug 1174792 Opened 9 years ago Closed 7 years ago

Confirm e10s causes a 65%-72% tsvgr_opacity win on Linux

Categories

(Testing :: Talos, defect, P5)

Unspecified
Windows
defect

Tracking

(e10s+)

RESOLVED WONTFIX
Tracking Status
e10s + ---

People

(Reporter: mconley, Unassigned)

References

(Blocks 1 open bug)

Details

Initial measurements on mozilla-central changeset 0093691d3715 comparing e10s to non-e10s shows the following data for Linux after several retriggers:

Linux 32:

:) tsvgr_opacity        264.9 +/-  1% (6#)  [ -65.9%]      90.3 +/-  1% (6#)


Linux 64:

:) tsvgr_opacity        281.4 +/-  4% (6#)  [ -72.3%]      77.9 +/-  1% (6#)


These numbers are claiming that we've more than doubled our performance on tsvgr_opacity.

That's a pretty bold claim. We should try to confirm this, because if it's true, it's a really epic perf win.
Summary: Confirm e10s causes a tsvgr_opacity win on Linux → Confirm e10s causes a 65%-72% tsvgr_opacity win on Linux
The reason I filed this was because the win in comment 0 is truly epic, and I want to make sure it's real, and that e10s is not somehow cheating this test.

jwatt, do you happen to know if the tsvgr_opacity test should work properly with e10s?
Flags: needinfo?(jwatt)
Priority: -- → P5
I don't see any reason that e10s should cause much (if any) of a win.

(In reply to Mike Conley (:mconley) - Needinfo me! from comment #2)
> jwatt, do you happen to know if the tsvgr_opacity test should work properly
> with e10s?

tsvgr_opacity should work properly with e10s as long as we're measuring the time taken to paint properly, but as discussed with avih and jmaher on IRC, we're not measuring it properly. We're actually measuring how long it takes for the load event to fire. I guess the load event may fire earlier in e10s if the process is then unburdened from some work that is now happening in the chrome process, but I'm surprised that that would be so significant. This is supposed to be a very expensive-to-paint test, so I'd then have to ask what on earth is happening on Linux at the same time that is up to three times more expensive?

Anyway, Avi suggested the test could be declared as "one which reports its own timing" and have it use script atomically insert into the DOM the content that we wish to record the paint time of (time the time from just after the DOM insertion to the subsequent requestAnimationFrame callback). Apparently the pageloader addon injects a tpRecordTime function into the page for the page to allow it to record its own time. It would seem easiest to me to write the content that is to be painted with style="display:none;" on it and then use script to remove that attribute to cause it to paint. I don't foresee implementations ever speculatively pre-rendering |display:none| content since |display| changes can change layout, not just rendering, so that should work well.
Flags: needinfo?(jwatt)
On windows (7 and xp) we have massive regression on the rest of the platforms we perform better. This seems to me quite random...

linux64          -10.24% 	
osx-10-10        -11.81% 	
windows7-32      67.32% 	
windows8-64      -36.54% 	
windowsxp        66.64% 	

jwatt, do you see any reason why this test might perform so differently on win7-32 than on win8-64?
Flags: needinfo?(jwatt)
I have no idea. Maybe Bas could hazard a guess?

Bas, the tests basically fill, stroke or fill-and-stroke a lot of rectangles with some opacity on the fill/stroke. Its intent is to ensure that we optimize the opacity into the color of the fill/stroke where possible, and only use group opacity in the case that the rects have both fill and stroke simultaneously.
Flags: needinfo?(jwatt) → needinfo?(bas)
Actually, let's take the Windows discussion to bug 1250350 and keep this about Linux.
Flags: needinfo?(bas)
Joel, is there anyone working on fixing the tsvgr_opacity tests as described in comment 3?
Flags: needinfo?(jmaher)
right now svgr_opacity is running with mozafterpaint (not just onload).  We still need to confirm if that is measuring valid things.  We have confirmed this is the right event for e10s mode in ts_paint and other startup tests.  For this we need to look at the pageloader implementation.

Right now there are only 2 tests in svg_opacity (and have been for a long time):
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/svg_opacity/svg_opacity.manifest

Both of these tests are measuring mozafterpaint.  You can see mozafterpaint works in pageloader:
https://dxr.mozilla.org/mozilla-central/search?q=path%3Atesting%2Ftalos%2Ftalos%2Fpageloader+mozafterpaint&redirect=true&case=false

In this case we have listeners in content as well as the main process, depending if we are in e10s mode or not.

reading up on comment 3, there is a desire to rewrite both of those files and make them measure their own time.  This is as simple as editing the manifest and adding a '%' in front of the page:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/svgx/svgx.manifest (example)

then modify both source files to have something like this:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/svgx/hixie-002.xml#288

let me know if that helps.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #8)
> right now svgr_opacity is running with mozafterpaint (not just onload).  We
> still need to confirm if that is measuring valid things.  We have confirmed
> this is the right event for e10s mode in ts_paint and other startup tests.

For any test that is intended to test how long it takes to load and fully render a page, I don't think mozafterpaint is the right thing. Specifically I think the way it is currently being used is to wait until both a 'load' and 'mozafterpaint' event have been received, then stop the timer. Given the way that gecko incrementally paints as a page loads that can go wrong. Consider this scenario:

 1. the page partially loads and paints - mozafterpaint
 2. the page finishes loading, but hasn't yet rendered - load
 3. the page finishes rendering

If the timer stops after step 2 then the work done in step 3 is incorrectly ignored.

> reading up on comment 3, there is a desire to rewrite both of those files
> and make them measure their own time.

Yes, to make the tests valid, as per the above. What I was wondering is if someone on your team who manages these tests is planing on doing this, or if that needs to be someone from the layout team?
Flags: needinfo?(jmaher)
you are correct that we stop measuring at #2 and if we continue painting/rendering we miss out on that timing.

There is nobody working on this- if there is an event that exists, I would be happy to listen for that or replace mozafterpaint for that and ensure it works on all tests, etc.

if there isn't an event, then we would need somebody from the layout team to look into this- maybe there is a way to wait for mozafterpaint event not received in 5 seconds or something.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #10)
> There is nobody working on this- if there is an event that exists, I would
> be happy to listen for that or replace mozafterpaint for that and ensure it
> works on all tests, etc.

I don't think there's any such event right now.

> if there isn't an event, then we would need somebody from the layout team to
> look into this- maybe there is a way to wait for mozafterpaint event not
> received in 5 seconds or something.

In principal you could wait for the 'load' event, record its time, wait some specified time and if another mozreftest event is not received use the time of the 'load' event. That would be more correct than what we're currently doing. But obviously that's still prone to error if the process is starved of CPU time for some reason, and it slows down test runs in the typical case where everything runs smoothly.

Pageloader could instead do something along the lines of what:

https://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js

does to decide when to dispatch MozReftestInvalidate, but maybe better would be if gecko implemented a new event (disabled behind a pref) that gives you the information that you need.
(In reply to Jonathan Watt [:jwatt] from comment #9)
> I don't think mozafterpaint is the right thing. Specifically
> I think the way it is currently being used is to wait until both a 'load'
> and 'mozafterpaint' event have been received, then stop the timer.

So I don't think this is correct. After thinking about how best to fix this issue for a bit it seemed like what was needed is a way to tell if there is a mozafterpaint pending after the load event is fired...and it turns out we already have that - nsIDOMWindowUtils::isMozAfterPaintPending. Furthermore after looking at the code it seems that we're using this already (at least some of the time):

http://hg.mozilla.org/build/talos/file/tip/talos/pageloader/chrome/pageloader.js

So it seems the description of how talos works that I was given and recounted in comment 9 is inaccurate, what we're currently doing is actually fine, and we don't need to mess with the tests to make them report their own timing.
See also bug 1254628 (Assess MozAfterPaint usage in talos).
I don't think there is anything to do here.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.