
Talos svg_opacity needs mozAfterPaint, possibly other tests too

Status: RESOLVED INVALID
Product: Testing
Component: Talos
Opened: 3 years ago
Last modified: a year ago

Reporter: avih
Assignee: avih
Firefox Tracking Flags: (Not tracked)
Attachments: 2 attachments, 1 obsolete attachment

Description (Assignee, 3 years ago)

Matt alerted me to the fact that svg_opacity doesn't seem to use mozAfterPaint (MAP), and that it's pretty much meaningless without it.

I checked talos/test.py, and indeed it doesn't set the MAP flag for this suite.

He says that svg_opacity renders lots of overlapping semi-transparent SVG layers to screen, and should measure how fast we can optimize/render them, but that without MAP it pretty much measures only (svg?) parsing speed.

On top of parsing speed, we already know that svg_opacity is "great" at exposing network stack issues (it detected the http cache v2 problems, for example). This is probably because its two files are over 100k each, and they're almost the only big files whose tests measure nothing beyond loading the files into the browser from the network and parsing them - so they're the first to show big changes when the network stack changes.

The other big files we have are some files at tsvgx (all but one do animation with TEST_HAS_OWN_TIMING, which is unaffected by MAP) and gearflowers.svg, which might behave the same way - tsvgx also doesn't set the MAP flag in test.py. However, gearflowers.svg might be ignored when reporting regressions, because we throw away the highest value, and I didn't check whether it ends up as the highest value of tsvgx. But I think we did notice that tsvgx also responds to network issues, which could be because all of its page-load tests that are not animations measure just that without MAP.

So, what we need here is to set MAP for tsvg_opacity.

However, setting MAP will change the numbers considerably, to the point that they won't be comparable to the old numbers anymore; and the current MAP-less tsvg_opacity does, in a twisted way, serve a purpose - measuring network issues. So I think what we need here is a new test which copies tsvg_opacity but with MAP.

I'd suggest tsvg_opacity_ttfr (this-time-for-real), and we can use this bug to handle it.

Phase 2 will be to consider what to do with tsvgx MAP-less page-load tests which are not animations, and in general review all tests and rectify this aspect.

As a side note, I'm happy to finally understand what was going on with the http cache v2 issues and the svg tests, and that we're finally getting to handle tsvg_opacity - something I've wanted to do for a long while but never had the courage to actually do ;)

I'm taking this bug.

Comment 1 (Assignee, 3 years ago)

(In reply to Avi Halachmi (:avih) from comment #0)
> ... tsvgx also doesn't set the
> MAP flag at test.py. However, it might be ignored when reporting regressions
> because we throw away the highest value, and I didn't check if
> gearflowers.svg ends up as the highest value of tsvgx.

Actually, the reason it doesn't "pop" is that tsvgx has other "real" tests - the animation ones - so any regression in the big files which don't measure rendering is either dropped as the highest value, as I suggested above, or just gets swallowed into the average, and network issues manifest as much smaller overall changes.

Comment 2 (Assignee, 3 years ago)

Joel, do we need this for tsvg_opacity? tsvgr_opacity? both?

Do we actually need/use both of those currently?
Flags: needinfo?(jmaher)

Comment 3 (Joel Maher [:jmaher], 3 years ago)

We only run tsvgr_opacity now, so just that. The test name will become tsvgr_opacity_paint. As this is one of the last "untouched" tests, I doubt we will have much luck finding other tests to change the MAP settings on.
Flags: needinfo?(jmaher)

Updated (Assignee, 3 years ago)

Blocks: 908700

Comment 4 (Assignee, 3 years ago)

Joel and I just discussed this.

We concluded that we'll keep the old test as-is for now (it will still show as tsvgr_opacity), knowing that it measures mostly network and parsing performance - I'll add a clearer wiki note on this (the wiki already has such a note, but now it's more conclusive).

And we'll add a new test - tsvgrr_opacity_paint (note the double 'r') - which will use MAP and will hopefully measure svg opacity rendering/optimization performance properly.

Comment 5 (Assignee, 3 years ago)

Created attachment 8443097 [details] [diff] [review]
bug1027481.patch

This is unexpected. I don't think there's a difference between the old and the new versions in the numbers they report. Joel, how can I make sure locally that mozAfterPaint is indeed used with the new version? Does the patch miss anything?

Here are the numbers I'm getting (10 tppagecycles), they look pretty much the same to me:

svgr_opacity (old):

|0;big-optimizable-group-opacity-2500.svg;733;395;369;375;374;382;355;350;374;365
|1;small-group-opacity-2500.svg;789;718;714;690;729;729;734;701;724;728


svgrr_opacity (new, hopefully with MAP):

|0;big-optimizable-group-opacity-2500.svg;707;335;369;373;359;354;368;371;352;386
|1;small-group-opacity-2500.svg;870;735;717;732;723;718;724;722;735;714
Attachment #8443097 - Flags: review?(jmaher)

Comment 6 (Assignee, 3 years ago)

Created attachment 8443100 [details] [diff] [review]
bug1027481.v2.patch - typo only - doesn't affect results
Attachment #8443097 - Attachment is obsolete: true
Attachment #8443097 - Flags: review?(jmaher)
Attachment #8443100 - Flags: review?(jmaher)

Comment 7 (Joel Maher [:jmaher], 3 years ago)

Comment on attachment 8443100 [details] [diff] [review]
bug1027481.v2.patch - typo only - doesn't affect results

Review of attachment 8443100 [details] [diff] [review]:
-----------------------------------------------------------------

this is nice!

::: talos/compare.py
@@ +39,5 @@
>  test_map['ts_paint'] = {'id': 83, 'tbplname': 'ts_paint'}
>  test_map['tpaint'] = {'id': 82, 'tbplname': 'tpaint'}
>  test_map['tsvgr'] = {'id': 224, 'tbplname': 'tsvgr'}
>  test_map['tsvgr_opacity'] = {'id': 225, 'tbplname': 'tsvgr_opacity'}
> +test_map['tsvgrr_opacity'] = {'id': 324, 'tbplname': 'tsvgrr_opacity'} # TODO: Verify that 324 is correct

we will need to add this to the db, but a safe assumption.
Attachment #8443100 - Flags: review?(jmaher) → review+

Comment 8 (Assignee, 3 years ago)

See comment 5 - I don't think this change is working as expected. Any idea how to verify?
Flags: needinfo?(jmaher)

Comment 9 (Joel Maher [:jmaher], 3 years ago)

I bet this is measuring to the first mozafterpaint event we see (which is how all the other tests do it), I was afraid of this.

I ran it locally and ensured we got the mozafterpaint event with some dump statements, and my results ended up similar.
Flags: needinfo?(jmaher)

Comment 10 (Assignee, 3 years ago)

(In reply to Joel Maher (:jmaher) from comment #9)
> I bet this is measuring to the first mozafterpaint event we see (which is
> how all the other tests do it), I was afraid of this.  
> 
> I ran it locally and ensured we got the mozafterpaint event with some dump
> statements and my results ended up similar.

Hmmm... Matt, could you please describe clearly how you reached the conclusion that MAP is not used in this test? Joel seems to be suggesting that, despite the lack of an explicit MAP flag for this test, it somehow still uses it anyway.
Flags: needinfo?(matt.woodrow)

Comment 11 (3 years ago)

I might have implied something accidentally -

Running the original test, we wait for onload; adding a listener for mozafterpaint yields almost identical results, which is similar to what we see with other tests. This is because mozafterpaint fires multiple times and we pick up the first instance. In fact, there are cases where mozafterpaint can fire before onload, so adding it to svgr_opacity might not make a difference.

Comment 12 (Assignee, 3 years ago)

If I understand you correctly, you're saying it's possible that the new tsvgrr_opacity test does use MAP (and the old test probably doesn't), but that we might be capturing a very early MAP event - maybe even one before the load event, and very likely before the MAP event we want here (the one fired after the SVG layers are rendered to screen)?

If that's the case, then IMO we have a few options:

1. If the mozAfterPaint event is reliable and usable for our case, then fix the talos pageloader addon to capture the correct MAP event.

2. If the MAP event is not deterministic enough for our case, then maybe fix it first, and then make sure that the pageloader addon is aligned with the MAP behavior.

3. Regardless of the above, and as a complete solution, modify the test a bit so that it measures its own timing. In this case we'll need to:

- Make the whole document body display:none initially.
- On load, set display to block and register a requestAnimationFrame callback.
- Once rAF calls back, measure the duration since the load event and report it as the result.

Assuming approach 3 is valid (Matt?), I think it would be the fastest to deliver, even if it's not much fun to leave the pageloader addon broken like this (we can do a followup on that, however).

Comment 13 (3 years ago)

So when using MAP, we ensure we receive both the onload and mozafterpaint events before recording the time. On autophone (an on-device Android Fennec testing system) we found that we had to wait for the 3rd (I could be wrong) MAP event to get more realistic numbers.

The problem is that we have no idea when we have rendered completely - there isn't a mozafterpaint_final event. Maybe there is another event, or something we could change in Firefox to generate the desired event.

Comment 14 (Assignee, 3 years ago)

Markus, is the mozAfterPaint event reliable enough to use for detecting the "first content rendering after the load event"? Is "first content rendering" deterministic enough for us (e.g. maybe the first rendering is only a partial one, and we need the full rendering here)? Or should we instead aim at capturing some later or different event?

Also, ehsan suggested that you might be able to answer the question about option 3's validity from comment 12, so help there would be appreciated as well. FWIW, the file in question is this: http://hg.mozilla.org/build/talos/file/tip/talos/page_load_test/svg_opacity/small-group-opacity-2500.svg (and another similar, but a bit smaller, one).
Flags: needinfo?(mstange)

Comment 15 (Assignee, 3 years ago)

Something doesn't add up. I implemented option 3 from comment 12 and modified the SVG file from comment 14 as follows (added the style, id and script):

> <svg xmlns="http://www.w3.org/2000/svg" width="600" height="600"
>      style="display:none" id="main">
>   <script>
>     window.addEventListener("load", function() {
>       var start = performance.now();
>       document.getElementById("main").style.display = "block";
>       requestAnimationFrame(function() {
>         alert(performance.now() - start);
>       });
>     });
>   </script>
> 
>   <title>Test that implementations minimize offscreen surface size for group opacity</title>
>   <rect x="0" y="0" width="12" height="12" stroke="white" opacity=".5"/>
>   <rect x="12" y="0" width="12" height="12" stroke="white" opacity=".5"/>
...

This gives me the following results, consistent to within +/- 15ms (the number in parentheses is roughly the delta from the old results):

big-optimizable-group-opacity-2500.svg: ~295ms (~+70ms)
small-group-opacity-2500.svg:           ~670ms (~+50ms)

Considering that the old test results include network load time (http://localhost/..) and are therefore a bit higher, the new results (with rAF) look very similar to the old ones.

Also, the new results do look meaningful - e.g. if I don't set the style to display:block, then rAF returns within a relatively normal frame time (< 20ms).

From the numbers I've seen so far, I can only conclude that the old test did measure the correct thing (with some bias from the network load time). I don't know if it uses mozAfterPaint implicitly somehow, or maybe the load event handler is invoked after the rendering, but it doesn't seem like I can come up with numbers that are meaningfully different from the old test's (neither by using the old test with mozAfterPaint, nor with the rAF method).

Matt, I'll need your help to reproduce the scenario from which you concluded that the old test doesn't measure the correct thing.

Comment 16 (Matt Woodrow [:mattwoodrow], 3 years ago)

(In reply to Avi Halachmi (:avih) from comment #10)
> (In reply to Joel Maher (:jmaher) from comment #9)
> > I bet this is measuring to the first mozafterpaint event we see (which is
> > how all the other tests do it), I was afraid of this.  
> > 
> > I ran it locally and ensured we got the mozafterpaint event with some dump
> > statements and my results ended up similar.
> 
> Hmmm... Matt, could you please describe clearly how you've reached the
> conclusion that MAP is not used at this test? It appears that Joel suggests
> that despite the lack of explicit MAP flag for this test, it somehow still
> uses it anyway.

I only looked for the MAP flag and didn't find it. If we are recording the paint duration (as it appears we are) then this isn't really an issue.
Flags: needinfo?(matt.woodrow)

Comment 17 (Assignee, 3 years ago)

(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> I only looked for the MAP flag and didn't find it. If we are recording the
> paint duration (as it appears we are) then this isn't really an issue.

Quoting from IRC just now:

<mattwoodrow> avih: Indeed, but I think that means my patch just didn’t work
<avih> mattwoodrow: either that, or that the test didn't work. did you verify that your patch didn't work?
<mattwoodrow> avih:  Talos profiling on try looks like it’s recording paint time
<avih> mattwoodrow: so you're more convinced than not that your patch didn't work and that the test seems good in this?
<mattwoodrow> avih: yes

^ This, together with my independent conclusion from comment 15, suggests that this was a case of misidentifying a non-existent issue.

Closing as INVALID, please reopen if needed.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → INVALID

Comment 18 (Assignee, 3 years ago)

Adding another bit of IRC chat:

<avih> mattwoodrow: re "Talos profiling on try looks like it’s recording paint time", how did you conclude that the paint time happens inside the measurement and not outside it?
<mattwoodrow> avih: Doesn’t it stop recording when the test ends?
<avih> mattwoodrow: it should on some tests, but that alone is not an indication enough. it should set some markers on some (possibly other) tests, so that's a better indication, but i think even this is not 100% reliable for this conclusion
<avih> mattwoodrow: the only reliable conclusion would come from you logically understanding why your patch didn't work, and that a patch which does work moves the needle on this test
<mattwoodrow> avih: Enabling tiling did change it
<avih> because right now, i don't understand how this test works, because i don't think it's waiting for mozafterpaint
<avih> mattwoodrow: hmm.. interesting. ok, i'll take it :) thx
Flags: needinfo?(mstange)

Comment 19 (Assignee, 3 years ago)

Created attachment 8446401 [details] [diff] [review]
bug1027481.v3-rAF.patch

Just adding the rAF version for future reference.
See Also: → bug 1254628