Closed Bug 1266181 Opened 4 years ago Closed 3 years ago

pageloader should focus on content before starting test

Categories

(Testing :: Talos, defect)

defect
Not set

Tracking

(firefox49 fixed)

RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: jmaher, Assigned: avih)

References

(Blocks 1 open bug)

Details

(Whiteboard: [talos_wishlist])

Attachments

(1 file, 1 obsolete file)

it appears that pageloader might not focus on the content page when starting a test:
https://bugzilla.mozilla.org/show_bug.cgi?id=1254898#c45

we should implement it and see what it looks like on try, etc.!
Whiteboard: [talos_wishlist]
Blocks: 1254898
Comment on attachment 8750464 [details] [diff] [review]
bug1266181.patch

Review of attachment 8750464 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/talos/talos/pageloader/chrome/pageloader.js
@@ +253,5 @@
>                       browserWindow.moveTo(0, 0);
>                       browserWindow.focus();
>  
>                       content = browserWindow.getBrowser();
> +                     content.focus && content.focus();

I really don't like this- what if content.focus doesn't exist, do we not focus? I would rather see a simple log message explaining that.  I appreciate the safeguard, but would rather know why it won't exist sometimes and use that to help us investigate if those cases end up being problematic.
Attachment #8750464 - Flags: review?(jmaher) → review-
reply to Joel Maher (:jmaher) from comment #2)
> I really don't like this- what if content.focus doesn't exist, do we not
> focus? I would rather see a simple log message explaining that.  I
> appreciate the safeguard, but would rather know why it won't exist sometimes
> and use that to help us investigate if those cases end up being problematic.

Fair enough. But I have other issues on my mind. See next.

So results came in, and they look great (pretty much identical) on the vast majority of the tests, however, some tests show some diff, and surprisingly, some of those are not even using the pageloader so they should definitely not be affected. This is weird.

Joel, any idea what might be going on? is it plain noise? some of those are 10-40% (though none with "high" confidence). sessionrestore_no_auto_restore opt linux 64 is 10% with "medium" confidence, but afaik it should not be using the pageloader at all.. so where does the diff come from?

Suggestion how to proceed?

We can make the startup focus conditional and enable it only for the video test, but that would mean another config option for talos etc... which would have been fine if I knew we need it, but these results seem a bit weird to be considered conclusive on this IMO...(In
So I decided to play it safe and only enable it for the video test (bug 1254898).

I'm having seriously seriously hard time adding a talos option tpstartupfocus which can be set both from test.py and from the command line. So far I've managed to make it work only from tesy.py.

I've modeled it after tpscrolltest, but it too, apparently, only works at test.py but not from the command line. I'm pretty sure it used to work from the command line when I added it, but it doesn't anymore.

Quite hard to tell when and how it got broken because the entire talos commit history was apparently thrown away when talos moved to m-c (judging by "history" for tp-cmdline.js - https://github.com/mozilla/gecko-dev/commits/master/testing/talos/talos/pageloader/components/tp-cmdline.js ).

Still, even if we do have the history, I doubt it would help much in understanding what's the correct way to do it.

Looking at other talos boolean options to model tpstartupfocus after, I just can't seem to find any two which are handled the same consistently throughout the code. Every boolean option is handled at different places and files, appear (different) multiple times at the code, and in different variations and initializations.

Please advise, the main options being IMO:
- Add as option which only works from test.py (like tpscrolltest).
- Add as option however you tell me to.
- Don't add an option and enable it for all tests (like the patch I posted does, with the weird perf diffs).
Flags: needinfo?(jmaher)
to solve the issues of the differences, I want at least 10 data points per job before calling it a problem.  Unfortunately our infrastructure is "down" and we might need to wait a few hours before collecting more data.  sessionrestore/ts_paint are noisy tests, they need more data points or manually looking at history/trends.

I would prefer not to add this as a flag, instead add it as a default part of pageloader- since I can see a lot of value here.
Flags: needinfo?(jmaher)
Thanks, Joel.

When I'm reconsidering the global effect, if there is one, I now think that having the content focused before the test starts is actually more representative of actual behavior with users. That's because we typically use a link to get to a page, and on such cases the focus is at the content and not at the navigation bar.

So, with the retriggers it looks better. Now really the vast majority of tests/platforms are pretty much identical as before, but there are few exceptions, all of them with "low" confidence:

- There are few non-pageloader tests which show a diff. I'm ignoring them since this is pure noise, as the patch only touches the pageloader.

The following look like noise to me, judging after looking at the graphs:

- tp5o responsiveness e10s on win7 is 9% worse.

- tp5n main_normal_fileio opt e10s win7 is 50% worse.

- tabpaint e10s: linux is 8% worse, win7 is 7% better, win8 is 4% worse.


So the only ones which I think is not noise are only on linux 64:
- glterrain/tp5o_scroll/tscrollx: all those tests show ~15% better on non-e10s and ~10% worse on e10s. I should ping some linux dev to get an opinion on this.

Overall, we need an opinion on the linux changes but otherwise I think this is OK.
@nical

Talos is currently having the focus at the URL bar when the browser window opens, and we want to change it to focus the content once on startup (before the tests start).

On all platforms the results seems unaffected, but on Linux64: glterrain/tp5o_scroll/tscrollx all show ~15% better on non-e10s and ~10% worse on e10s.

See https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=304a0af9742e&newProject=try&newRevision=a12e582a037b&framework=1&showOnlyImportant=0

Any idea what might be going on? should we be concerned? Do you think it's OK to focus the content on startup and take the modified numbers as the new baseline?
Flags: needinfo?(nical.bugzilla)
(In reply to Avi Halachmi (:avih) from comment #8)
> Any idea what might be going on? should we be concerned? Do you think it's
> OK to focus the content on startup and take the modified numbers as the new
> baseline?

I have no clue. forwarding the question to Jeff and Lee.
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(lsalzman)
Flags: needinfo?(jmuizelaar)
I noticed yesterday that linux64 glterrain became bi-modal on may 9th:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,485c0fb96b42e7472f3cafbfea5ce0c3058b1bd1,1%5D

looking in detail at glterrain, I see that the raw numbers are bi-modal and not useful, likewise for tp5o_scroll and tscrollx.

these are all within the normal data ranges on inbound:
- tp5o responsiveness e10s on win7 is 9% worse.
- tp5n main_normal_fileio opt e10s win7 is 50% worse.
- tabpaint e10s: linux is 8% worse, win7 is 7% better, win8 is 4% worse.

I vote for landing this as is!
Flags: needinfo?(lsalzman)
Flags: needinfo?(jmuizelaar)
> On all platforms the results seems unaffected, but on Linux64: glterrain/tp5o_scroll
> /tscrollx all show ~15% better on non-e10s and ~10% worse on e10s.

Joel, this looks like a real pattern to me, but you're saying you're convinced it's due to some freak accident/infrastructure/whatever and that we should go ahead and land the patch (after the improvement you wanted) ?

If you're not sure, please return the ni? which you removed. Otherwise, let me know and I'll post the updated patch.
Flags: needinfo?(jmaher)
please post the updated patch- the graphs became bi-modal this week for linux64 and your data happens to be getting split up among those modes unfavorably random- collecting 10 more data points for each of those jobs would probably reduce your worries.
Flags: needinfo?(jmaher)
Huh, this is weird. I was about to land it but it seems the patch does not work anymore...

The content does not get focused. Investigating.
Mike, few days ago using content.focus() worked, but it seems it does not anymore. Any idea why?

I've changed that to use content.selectedBrowser.focus() instead, which seems to work for both e10s and non e10s. Reasonable?
Attachment #8752082 - Flags: review?(mconley)
Attachment #8750464 - Attachment is obsolete: true
(the patch at comment 14 has an indentation issue which I fixed locally. Apparently the entire file is indented with non-even number of indents which confused my editor).
Comment on attachment 8752082 [details] [diff] [review]
bug1266181.v2.patch

Review of attachment 8752082 [details] [diff] [review]:
-----------------------------------------------------------------

General gripe - "content" is really not a great name for this thing anyway - it's the gBrowser of a window, which is definitely not the "content" in the browser sense of the word.

Anyhow, I'm unsure why content.focus() would have stopped working. Even if we _did_ focus the gBrowser, I think it makes way more sense to do what you're doing here, and focus the selectedBrowser instead.

::: testing/talos/talos/pageloader/chrome/pageloader.js
@@ +253,5 @@
>                       browserWindow.moveTo(0, 0);
>                       browserWindow.focus();
>  
>                       content = browserWindow.getBrowser();
> +                     if (content.selectedBrowser)

One-liners without braces always freak me out. Mind bracing these? And makesure that the content.selectedBrowser.focus(); is indented 2 spaces.
Attachment #8752082 - Flags: review?(mconley) → review+
(In reply to Mike Conley (:mconley) - (catching up on reviews) from comment #16)
> General gripe - "content" is really not a great name for this thing anywa

Yeah, but it was there before and I wanted a minimal patch so I left it as is. Also, supposedly the pageloader addon is due for a rewrite at some stage.


> I think it makes way more sense to do what you're doing here,

Glad to hear.

> One-liners without braces always freak me out. Mind bracing these?

Sure.

> makesure that the content.selectedBrowser.focus(); is indented 2 spaces.

Yeah, see comment 15.

Thanks.
Just adding info that I had to update the pageloader addon version from 1.0.5 to 1.0.6, since otherwise the AMO(?) server would refuse to sign it. jmaher r+ it on IRC.

The tree is closed now, I'll push it when it opens.
https://hg.mozilla.org/mozilla-central/rev/5d6dac688ec8
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
So, bug 1274536 suggests that the patch here does affect the talos results of some tests, most notably ~30% worse numbers at tabpaint e10s on most platforms.

Assuming the "regression" (quoted - because it can only affect talos numbers - it can't affect actual Firefox performance) really is caused the patch here, I think we should revisit my suggestion at comment 4 and enable this focus only for the video test.

However, I would need some help from Joel on that.
Flags: needinfo?(jmaher)
It's also possible that the "issue" is with the tabpaint test.

Options:

1. Maybe tabpaint does not expect the content area to be focused and so ends up measuring unintended case (at which case the tabpaint test itself should make sure it initializes the browser the way it wants it before the test).

2. Maybe focused content area really is a better state for the tabpaint test, but before this patch it was not focused, and hence the previous numbers which tabpaint generated were not representative enough, and now they're more representative.

However, I was not involved with the tabpaint test and so cannot comment on any of these options.

Mike maybe?
Flags: needinfo?(mconley)
as discussed on irc, here are the 3 options:
1) make content.focus() for the video test only (I can help with the pageloader plumbing here)
2) accept this as a new baseline for the numbers/tests
3) leave this code alone and investigate why this is worse for tabpaint/tsvgx in e10s land- possibly fix tests to be better, or firefox to work better in those cases.

lets see what mconley says about tabpaint and determine which route to take.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #23)
> 3) leave this code alone and investigate why this is worse for
> tabpaint/tsvgx in e10s land- possibly fix tests to be better, or firefox to
> work better in those cases.

If we keep the patch as is, for tsvgx we should definitely just take it as a new baseline with no reservations (same also with 2% tp5o linux64 e10s).

The only real thing to understand is the ~30% worse numbers on tabpaint e10s.
We'd need to take a profile to be absolutely sure, but I ran into some issues with focusing and tpaint a while back with e10s. Specifically, when the parent process requests that the browser element focus, it sends a message down to the child, which then causes some sync messages to be sent to the parent (sometimes several), and if the parent is busy doing something and is not available to respond, then the content process is blocked until the parent can answer. This can be pretty bad if the child is in the middle of a performance test.

For tpaint, the solution (in order to optimize painting content) was to have the parent wait for content to paint before focusing it.

If that can't be done, it might be worth experimenting with waiting for a few seconds after the focus to begin the test.
Flags: needinfo?(mconley)
(In reply to Mike Conley (:mconley) - (needinfo me!) from comment #25)
> ... Specifically, when
> the parent process requests that the browser element focus, it sends a
> message down to the child, which then causes some sync messages to be sent
> to the parent (sometimes several), and if the parent is busy doing something
> and is not available to respond, then the content process is blocked until
> the parent can answer. This can be pretty bad if the child is in the middle
> of a performance test.

However, it is not at the middle of a performance test.

The patch focuses selectedBrowser once, on startup of Firefox, and that's it. Even more, after the focus happens - Firefox still idles for about 10 seconds before the tests actually start.

So I don't think this hypothesis is relevant as a possible cause for the different numbers.
Alright, fair enough.

In that case, I think a profile is in order. ~30% for e10s-only is no bueno, and we need to understand where that time is going.
Just to focus our effort:

1.What we originally wanted and still want is focusing the content area before the video test starts.

2. The fact that a patch which focused all pageloader tests shows unexpected numbers with tabpaint is important to understand, but outside the scope of this bug. I'll file a separate bug for it.

3. For our original goal - and to create less waves, I think we should focus the content only for the video test (so a followup for the patch which landed or a backout and pushing something else instead).

4. We can do #3 using some config options as I tried at comment 4 without great success, or as Joel suggested on IRC - using preferences (which introduces yet another system to pass config from test.py to the pageloader addon, and which I'm not greatly fond of at the context of this bug), or, a new idea:

- Add a "focusContent" function to talos-powers and let the test itself focus the content via talos-powers. This means also backing out the patch which landed and resolving this bug as WONTFIX.

This way it's not a global thing, and the test has more control over the setup which it needs.

Joel, what do you think?
Flags: needinfo?(jmaher)
Also, something I just thought of.

Without the focus, the try pushes of the video test were failing intermittently. With the focus they deterministically never hang.

This seems to suggest that the focus is not deterministic between talos runs. If that is indeed the case, and if, as it appears, it affects the results of some test, then this could easily be the cause for some bimodal results.
yes, I think we should go forward with your proposed plan- ideally this will be like mozafterpaint where we apply it to as many tests as it makes sense to.

To reiterate the ideal solution for toggling pageloader, it would be setting a preference in the test, and having pageloader read that preference.  This would be much more straightforward rather than setting command line flags that the pageloader addon needs to understand.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #30)
> yes, I think we should go forward with your proposed plan

OK - I assume this means not an unconditional global thing (i.e. backout the patch which landed here), right?


> ideally this will
> be like mozafterpaint where we apply it to as many tests as it makes sense
> to.

I don't know about "many tests". For now there's only one test which awaits this feature - the video test.

Whether or not "many tests" would benefit from this feature is a question which we didn't really try to answer, and considering the sensitivity of talos results, I'm not sure I want to.


> To reiterate the ideal solution for toggling pageloader, it would be setting
> a preference in the test...

There's still this option too:

(In reply to Avi Halachmi (:avih) from comment #28)
> - Add a "focusContent" function to talos-powers and let the test itself
> focus the content via talos-powers. This means also backing out the patch
> which landed and resolving this bug as WONTFIX.
> 
> This way it's not a global thing, and the test has more control over the
> setup which it needs.

Incidentally or not, I already have a small patch for talos-powers which enables this features, and also a small patch for the video test which uses talos-powers for this, and I also verified that it works (i.e. I undid the global-focus patch which landed here, applied the talos-powers focus patch and its usage at the video test - and it works as expected).

Unless you still have reservations, I suggest to take this approach (talos powers), as it's much more focused, pardon the pun.

What say you?
Flags: needinfo?(jmaher)
discussed in irc- we will go forward with existing code that avih has to add a new cli option.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #32)
> discussed in irc- we will go forward with existing code that avih has to add
> a new cli option.

There was a misunderstanding on IRC which got clarified since: The patch I have uses talos powers and does not add a new cli option.

I'll file a new bug to backout this global focus and instead add talos-powers feature where the test itself can request to focus the content.
Blocks: 1274992
You need to log in before you can comment on or make changes to this bug.