Ensure Talos Performance Tests still work with silk enabled

RESOLVED FIXED in Firefox 39

Status

()

Core
Graphics
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: mchang, Assigned: mchang)

Tracking

37 Branch
mozilla39
x86
Mac OS X
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox39 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

2 years ago
From https://bugzilla.mozilla.org/show_bug.cgi?id=1127151#c16. Ensure talos tests still run correctly with layout.frame_rate = -1 and silk enabled.
(Assignee)

Updated

2 years ago
Depends on: 1128691
(Assignee)

Updated

2 years ago
Blocks: 1071275
(Assignee)

Comment 1

2 years ago
Created attachment 8568105 [details] [diff] [review]
WIP - Disable silk if in ASAP mode

Disables silk if we are in ASAP mode. For now, this will be good enough to use the old refresh driver / compositor scheduling mechanisms. Once we have bug 	1133527, which uses software vsync everywhere, I plan to delete all the old scheduling mechanisms and replace it with a simpler ASAP mode for talos only.

WIP until I verify this actually works during talos tests.
(Assignee)

Comment 2

2 years ago
Two pushes

One with silk disabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=52eae9770cfe

One with silk enabled:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=56fffabb4bd5

Retriggered on what I think looked off. If you could please take a look Avih to see if everything looks in order to you. Thanks!

Both commits were based off the same parent commit.
Flags: needinfo?(avihpit)
(Assignee)

Comment 3

2 years ago
From irc: try using the compare talos tool - http://compare-talos.mattn.ca/?oldRevs=600f44fd317c&newRev=52eae9770cfe&server=graphs.mozilla.org&submit=true

Need to have a clean push because it integrates results from m-c, which may or may not have PGO. Base push w/o any patches based on m-c parent c7968255c1ea

base push - https://treeherder.mozilla.org/#/jobs?repo=try&revision=742011f28b13
base + disable silk in talos mode, silk still disabled - https://treeherder.mozilla.org/#/jobs?repo=try&revision=a18f27aa5662
base + disable silk in talos mode + silk enabled - https://treeherder.mozilla.org/#/jobs?repo=try&revision=0ba98f618617
Flags: needinfo?(avihpit)
(Assignee)

Comment 4

2 years ago
Comparisons:

base vs patch - http://compare-talos.mattn.ca/?oldRevs=742011f28b13&newRev=a18f27aa5662&server=graphs.mozilla.org&submit=true

base vs silk - http://compare-talos.mattn.ca/?oldRevs=742011f28b13&newRev=0ba98f618617&server=graphs.mozilla.org&submit=true

With the patch, it's confusing that Windows XP Tscroll went down so much. 

With silk, it's confusing that ubuntu tpaint went down since nothing should actually affect ubuntu here. The other thing I see is that in many tests, we don't actually set layout.frame_rate = 0, most of them run at 0. In this case, it makes sense that some of the numbers would change quite a bit. Do we know which tests actually set layout.frame_rate = 0?

Thanks!
Flags: needinfo?(avihpit)
(Assignee)

Updated

2 years ago
Blocks: 1137905

Comment 5

2 years ago
The tests which use ASAP are all the animation tests. Those are: TART/CART/tsvg[x]/tscrollx/tp5o-scroll/glterrain.

Please explain how those two try pushes affect Firefox in both ASAP mode and in "normal" mode (so overall 4 cases).
Flags: needinfo?(avihpit)

Comment 6

2 years ago
I'm summarizing what the patches do:

The "patch" adds code which only affects firefox when the silk is enabled (via prefs). What it does is disable silk in ASAP mode. But since the silk prefs are disabled by default anyway right now, this patch is literally a no-op and probably also unreachable code.

The "silk" one takes the "patch" and enables the silk prefs. This should result in ASAP mode supposedly identical to now (since the "patch" disables silk on that case), but for non ASAP tests - it's using silk instead of the current timing mechanism.

Still missing here is "full silk" patch which also tests ASAP mode with silk, which mchang says will be added at a later stage.
(Assignee)

Comment 7

2 years ago
Compare try with nightly versus this bug's patch + silk enabled - http://compare-talos.mattn.ca/dev/?oldBranch=Try&oldRevs=742011f28b13&newBranch=Try&newRev=0ba98f618617&submit=true

Comment 8

2 years ago
(In reply to Mason Chang [:mchang] from comment #7)
> Compare try with nightly versus this bug's patch + silk enabled ...

So this looks good IMO. All the ASAP tests are within the noise level of the base results (indicated on compare-talos by the fact that it doesn't color the diff neither green nor red).

This means that the two patches work well together - they revert to current ASAP mode even when silk is enabled.

There are few cases where silk is active (on the non-ASAP tests) and where the results differ. Probably most notably:

- sessionrestore_no_auto_restore win7 32 (~8% improvement)

- Dormaeo-DOM win8.1 64 (~16% regression)
- tpaint on all platforms except osx and XP (~7-10% regression)
- tresize on windows and osx (~4-8% regression)

If we only care about the regressions, then overall these are meaningful but few and not very big regressions.

I'd suggest to first try and explain them, and then probably accept them unless someone think they can be avoided somehow.
(Assignee)

Comment 9

2 years ago
Created attachment 8571421 [details] [diff] [review]
Disable silk in talos ASAP mode

Disables Silk in ASAP mode. It does this by checking the layout.frame_rate preference in gfxPlatform and will only enable the vsync compositor and refresh driver if layout is not in ASAP mode. Since we also have a Compositor ASAP mode independent of layout.frame_rate, we only enable the vsync compositor if both the refresh driver and compositor are not in ASAP mode. 

The talos results from comment 8 are with Silk enabled. This patch does not enable silk, so the talos results should be net 0.
Attachment #8568105 - Attachment is obsolete: true
Attachment #8571421 - Flags: review?(mstange)
Comment on attachment 8571421 [details] [diff] [review]
Disable silk in talos ASAP mode

This is adding a little too much duplicated logic for my taste.

How about this:
Change CalculateCompositionFrameRate to:

// Returns false if the default frame rate should be used.
// If the frame rate is overriden, returns true and sets
// *aOutOverrideFrameRate to the override frame rate.
static bool UseCustomCompositionFrameRate(int32_t* aOutOverrideFrameRate)
{
  int32_t compositionFrameRatePref = gfxPrefs::LayersCompositionFrameRate();
  if (compositionFrameRatePref >= 0) {
    *aOutOverrideFrameRate = compositionFrameRatePref;
    return true;
  }
  // Use the same frame rate for composition as for layout.
  int32_t layoutFrameRatePref = gfxPrefs::LayoutFrameRate();
  if (layoutFrameRatePref >= 0) {
    *aOutOverrideFrameRate = layoutFrameRatePref;
    return true;
  }

  // Use the default frame rate, don't override anything.
  return false;
}

ScheduleSoftwareTimerComposition():

  int32_t rate = kDefaultFrameRate;
  UseCustomCompositionFrameRate(&rate);

and then, in UseVsyncComposition(), do this:
  int32_t overrideFrameRate;
  return gfxPrefs::UseVsyncAlignedCompositor() &&
         gfxPrefs::HardwareVsyncEnabled() &&
         !UseCustomCompositionFrameRate(&overrideFrameRate);


In CreateVsyncRefreshTimer() I'd instead do this:

  if (!gfxPrefs::VsyncAlignedRefreshDriver() ||
      !gfxPrefs::HardwareVsyncEnabled() ||
      gfxPrefs::LayoutFrameRate() != -1) {
    return;
  }

so that we won't use Vsync either if the layout frame rate has been set to something custom (non-asap). Or did you intentionally want to ignore non-zero override values if vsync is enabled?
(Assignee)

Comment 11

2 years ago
(In reply to Markus Stange [:mstange] from comment #10)
> Comment on attachment 8571421 [details] [diff] [review]
> Disable silk in talos ASAP mode
> 
> This is adding a little too much duplicated logic for my taste.
> 
> How about this:
> Change CalculateCompositionFrameRate to:
> 
> // Returns false if the default frame rate should be used.
> // If the frame rate is overriden, returns true and sets
> // *aOutOverrideFrameRate to the override frame rate.
> static bool UseCustomCompositionFrameRate(int32_t* aOutOverrideFrameRate)
> {
>   int32_t compositionFrameRatePref = gfxPrefs::LayersCompositionFrameRate();
>   if (compositionFrameRatePref >= 0) {
>     *aOutOverrideFrameRate = compositionFrameRatePref;
>     return true;
>   }
>   // Use the same frame rate for composition as for layout.
>   int32_t layoutFrameRatePref = gfxPrefs::LayoutFrameRate();
>   if (layoutFrameRatePref >= 0) {
>     *aOutOverrideFrameRate = layoutFrameRatePref;
>     return true;
>   }
> 
>   // Use the default frame rate, don't override anything.
>   return false;
> }
> 
> ScheduleSoftwareTimerComposition():
> 
>   int32_t rate = kDefaultFrameRate;
>   UseCustomCompositionFrameRate(&rate);
> 
> and then, in UseVsyncComposition(), do this:
>   int32_t overrideFrameRate;
>   return gfxPrefs::UseVsyncAlignedCompositor() &&
>          gfxPrefs::HardwareVsyncEnabled() &&
>          !UseCustomCompositionFrameRate(&overrideFrameRate);
> 
> 

Yeah I don't really like that we check these pref values explicitly in both the Compositor and refresh driver. Instead I'd like to make it explicit that we're in ASAP mode and keep track of the pref in one place. I was going to delete CalculateCompositionRate once we have silk everywhere. So yes, there is some code duplication at the moment, but I want to delete it when I delete all the software scheduling mechanisms in general.

> In CreateVsyncRefreshTimer() I'd instead do this:
> 
>   if (!gfxPrefs::VsyncAlignedRefreshDriver() ||
>       !gfxPrefs::HardwareVsyncEnabled() ||
>       gfxPrefs::LayoutFrameRate() != -1) {
>     return;
>   }
> 
> so that we won't use Vsync either if the layout frame rate has been set to
> something custom (non-asap). Or did you intentionally want to ignore
> non-zero override values if vsync is enabled?

I think ignoring non-zero override values if vsync is enabled is the way to go. Do we ever use custom compositor rates? If not, I think we should really only have two routes: vsync or ASAP. What do you think?
Flags: needinfo?(mstange)
Comment on attachment 8571421 [details] [diff] [review]
Disable silk in talos ASAP mode

Ok, I agree with that. r+ under the promise that, as soon as vsync is enabled everywhere, we rename the prefs to ....asap-mode (or something similar), make them booleans, and remove all code that handles override frame rates.
Flags: needinfo?(mstange)
Attachment #8571421 - Flags: review?(mstange) → review+
(Assignee)

Comment 13

2 years ago
(In reply to Markus Stange [:mstange] from comment #12)
> Comment on attachment 8571421 [details] [diff] [review]
> Disable silk in talos ASAP mode
> 
> Ok, I agree with that. r+ under the promise that, as soon as vsync is
> enabled everywhere, we rename the prefs to ....asap-mode (or something
> similar), make them booleans, and remove all code that handles override
> frame rates.

Filed bug 1138627.
(Assignee)

Comment 14

2 years ago
(In reply to Avi Halachmi (:avih) from comment #8)
> (In reply to Mason Chang [:mchang] from comment #7)
> > Compare try with nightly versus this bug's patch + silk enabled ...
> 
> So this looks good IMO. All the ASAP tests are within the noise level of the
> base results (indicated on compare-talos by the fact that it doesn't color
> the diff neither green nor red).
> 
> This means that the two patches work well together - they revert to current
> ASAP mode even when silk is enabled.
> 
> There are few cases where silk is active (on the non-ASAP tests) and where
> the results differ. Probably most notably:
> 
> - sessionrestore_no_auto_restore win7 32 (~8% improvement)
> 
> - Dormaeo-DOM win8.1 64 (~16% regression)
> - tpaint on all platforms except osx and XP (~7-10% regression)
> - tresize on windows and osx (~4-8% regression)
> 
> If we only care about the regressions, then overall these are meaningful but
> few and not very big regressions.
> 
> I'd suggest to first try and explain them, and then probably accept them
> unless someone think they can be avoided somehow.

So I did a few more retriggers, Linux is now no longer affected, which makes tons more sense. So it looks like tpaint and tresize are the big ones on OSX and Windows.

Then Windows 8.1 has a dromaeo regression.
(Assignee)

Comment 15

2 years ago
 https://hg.mozilla.org/integration/mozilla-inbound/rev/e7ed44d59353
https://hg.mozilla.org/mozilla-central/rev/e7ed44d59353
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox39: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Comment on attachment 8571421 [details] [diff] [review]
Disable silk in talos ASAP mode

>diff --git a/layout/base/nsRefreshDriver.cpp b/layout/base/nsRefreshDriver.cpp
>--- a/layout/base/nsRefreshDriver.cpp
>+++ b/layout/base/nsRefreshDriver.cpp
>+  NS_WARNING("Enabling vsync refresh driver\n");


For future reference: our assertion/warning macros shouldn't have a newline at the end -- they print out additional information (a colon and the file path) at the end of the warning-message, and the newline makes that look odd (particularly now that all of our logging lines start with [Parent 12345] or [Child 23456] -- the newline throws that off).

I'm cleaning some of these up in bug 1161731.
You need to log in before you can comment on or make changes to this bug.