Add a Talos regression test for session restore times

RESOLVED FIXED in Firefox 32

Status

()

defect
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: vladan, Assigned: Yoric)

Tracking

(Blocks 3 bugs, {ateam-talos-task, dev-doc-complete})

unspecified
Firefox 32
Points:
---
Dependency tree / graph
Bug Flags:
firefox-backlog +
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [leave open] p=0)

Attachments

(3 attachments, 10 obsolete attachments)

4.79 KB, patch
froydnj
: review+
ttaubert
: checkin+
Details | Diff | Splinter Review
691.83 KB, patch
Details | Diff | Splinter Review
2.27 KB, patch
Details | Diff | Splinter Review
Session restore times tend to show up in benchmarks:

http://lifehacker.com/5976082/browser-speed-tests-chrome-24-firefox-18-internet-explorer-10-and-opera-1212

We should have a comparable test in our Talos suite. See also bug 936617
Blocks: WBGP
Blocks: start-faster
No longer blocks: WBGP
any idea how to test this?  would we create a profile that we use that is seeded with restore stuff?  would we want to restore or see the hit that finding data to restore causes?
Component: General → Session Restore
Product: Core → Firefox
Assignee: nobody → dteller
Posted patch Experimental patch, WIP (obsolete) — Splinter Review
Here's a first draft.
For some reason, though, I cannot call tpRecordTime. Any idea?
Attachment #8369474 - Flags: feedback?(jmaher)
Attachment #8369474 - Flags: feedback?(avihpit)
Posted patch Experimental patch, WIP v2 (obsolete) — Splinter Review
Ok, here is a patch that actually does something.
Is there something major I need to change before proceeding?
Attachment #8369474 - Attachment is obsolete: true
Attachment #8369474 - Flags: feedback?(jmaher)
Attachment #8369474 - Flags: feedback?(avihpit)
Attachment #8369486 - Flags: feedback?(jmaher)
Attachment #8369486 - Flags: feedback?(avihpit)
Could you please summarize what works and what doesn't with the latest patch?
Comment on attachment 8369486 [details] [diff] [review]
Experimental patch, WIP v2

Review of attachment 8369486 [details] [diff] [review]:
-----------------------------------------------------------------

this looks like a good start!

::: talos/page_load_test/sessionrestore/addon/content/index.html
@@ +13,5 @@
> +
> +<div>
> +  <strong>Time to sessionRestored</strong> <span id="sessionRestored">(in progress)</span>
> +</div>
> +  

nit: blank line with whitespace

::: talos/page_load_test/sessionrestore/profiles/prefs.js
@@ +1,2 @@
> +// Restore sessionstore.js
> +pref("browser.startup.page", 3);

be aware that we will have all the other preferences from PerfConfigurator.py in here as well.

::: talos/test.py
@@ +210,5 @@
>              'xperf_providers', 'xperf_user_providers', 'xperf_stackwalk', 'filters', 'preferences', 'extensions',
>              'setup', 'cleanup','fennecIDs'
>              ]
>  
> +class sessionrestore_base(PageloaderTest):

I still think this should be a ts based test.  I will keep an open mind for now.

@@ +215,5 @@
> +    """
> +    TODO
> +    """
> +    tpcycles = 1
> +    tppagecycles = 1

we will need more cycles here

@@ +216,5 @@
> +    TODO
> +    """
> +    tpcycles = 1
> +    tppagecycles = 1
> +    tploadaboutblank = True

this was changed recently to tpnocache, I don't think this is necessary for your test.
Attachment #8369486 - Flags: feedback?(jmaher) → feedback+
Comment on attachment 8369486 [details] [diff] [review]
Experimental patch, WIP v2

Review of attachment 8369486 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

We should discuss if we need page load test or ts. I'm still not fully aware of the implications and modifications required to change to ts. Also currently there's only one profile and I had the impression this test will have more than one.
Attachment #8369486 - Flags: feedback?(avihpit) → feedback+
Yes, I wanted to collect feedback with one profile first.
Here's a summary of an IRC discussion with jmaher, yoric and myself:

1. The test will probably use more than one profile.
2. We still don't know if we'll be measuring 1 or N keys per profile.
3. Restarting the browser is required before collecting the next set (or 1 if one key) of measurements.

4. talos page load tests support reporting N different keys per load.
5. We don't know if talos page load test supports restarting the browser between pages (apparently no current test needs/does it).
6. ts tests currently support a single key per test.
7. ts tests can and do restart the browser between data collection.

If we don't want to touch the talos infrastructure, then we'll have to use ts tests as follows:

Each profile X key will be a different ts test. E.g. if we need 3 profiles and 2 keys per profile, we'll end up with 6 different test names.

While this is reasonably OK with 1 key per profile, using more than one will restart the browser more than technically required, and consequentially, will take longer to run.
Also, IMO, ideally, we should include all the measurements (each key x profile) as subtests of a single big test. Currently talos doesn't support this, but if we can make pageload test restart between measurements, or make ts test aggregate data better, then it would be better than having each subtest as a test of its own.
Ok, here we go.
r?jmaher for everything Talos-related
r?ttaubert for everything Session Restore-related, including sessionstore.js
Attachment #8369486 - Attachment is obsolete: true
Attachment #8390469 - Flags: review?(ttaubert)
Attachment #8390469 - Flags: review?(jmaher)
Comment on attachment 8390469 [details] [diff] [review]
Startup Test for session restore

Review of attachment 8390469 [details] [diff] [review]:
-----------------------------------------------------------------

I like this in general.  Does the sessionrestore.js try to hit the network?  If so, we should find a way to use local resources.  Keep in mind there are a bunch of little bugs to get this turned as well as testing on all platforms.  Once we have r+ for the sessionstore stuff, and confirmation on no network access, I will start the process of turning this on.

::: talos/startup_test/sessionrestore/profile/prefs.js
@@ +1,2 @@
> +// Restore sessionstore.js
> +user_pref("browser.startup.page", 3);

this could be served inside the test definition: http://hg.mozilla.org/build/talos/file/tip/talos/test.py#l241, that would save an extra file.
Attachment #8390469 - Flags: review?(jmaher) → review+
(In reply to Joel Maher (:jmaher) from comment #11)
> Comment on attachment 8390469 [details] [diff] [review]
> Startup Test for session restore
> 
> Review of attachment 8390469 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I like this in general.  Does the sessionrestore.js try to hit the network? 
> If so, we should find a way to use local resources.  Keep in mind there are
> a bunch of little bugs to get this turned as well as testing on all
> platforms.  Once we have r+ for the sessionstore stuff, and confirmation on
> no network access, I will start the process of turning this on.

Barring any mistake from my part, this should not hit the network, as we only restore tabs on demand.
At the moment, we are measuring the time to sessionRestored, which seems to be roughly the first "domwindowopened". Tim, can you confirm that this is the case? This looks ok for measures.

I'm not so sure about our origin of time, though (firstPaint). I wonder if we shouldn't rather start measuring time from nsSessionStartup.init().
Flags: needinfo?(ttaubert)
I have applied Jmaher's feedback.
Also, I forked this test into two, to test both the time to auto-restore the session and the time to load-but-not-restore it.
Attachment #8390469 - Attachment is obsolete: true
Attachment #8390469 - Flags: review?(ttaubert)
Attachment #8394940 - Flags: review?(ttaubert)
Attachment #8394940 - Flags: feedback?(jmaher)
Comment on attachment 8394940 [details] [diff] [review]
Startup Test for session restore, v2

Review of attachment 8394940 [details] [diff] [review]:
-----------------------------------------------------------------

thanks for putting this together.
Attachment #8394940 - Flags: feedback?(jmaher) → feedback+
Attachment #8395089 - Flags: review?(ttaubert)
Comment on attachment 8394940 [details] [diff] [review]
Startup Test for session restore, v2

Review of attachment 8394940 [details] [diff] [review]:
-----------------------------------------------------------------

I can't review the python parts but the approach looks good to me!
Attachment #8394940 - Flags: review?(ttaubert) → review+
Comment on attachment 8395089 [details] [diff] [review]
Timeline event sessionRestoreInit

Review of attachment 8395089 [details] [diff] [review]:
-----------------------------------------------------------------

Can't review the platform parts but I think there is value in maybe having both measurements? sessionRestored-firstPaint and sessionRestored-sessionRestoreInit? Not sure if that really makes sense though as improvements will probably affect both similarly. In that case sessionRestored is I think the better starting point as that would cover more of the sessionstore code.
Attachment #8395089 - Flags: review?(ttaubert) → review+
Flags: needinfo?(ttaubert)
Comment on attachment 8395089 [details] [diff] [review]
Timeline event sessionRestoreInit

Review of attachment 8395089 [details] [diff] [review]:
-----------------------------------------------------------------

r? froydnj for the platform bits
Attachment #8395089 - Flags: review?(nfroyd)
Let's start with sessionRestoredInit, in that case.
Comment on attachment 8395089 [details] [diff] [review]
Timeline event sessionRestoreInit

Review of attachment 8395089 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/sessionstore/src/nsSessionStartup.js
@@ +80,5 @@
>    /**
>     * Initialize the component
>     */
>    init: function sss_init() {
> +    Services.obs.notifyObservers(null, "sessionstore-initializing", null);

So this...

::: toolkit/components/startup/nsAppStartup.cpp
@@ +707,5 @@
>      if (mXPCOMShutdownProbe) {
>        mXPCOMShutdownProbe->Trigger();
>      }
>  #endif //defined(XP_WIN)
> +  } else if (!strcmp(aTopic, "sessionstore-init-started")) {

...doesn't match this, nor is there any place where you might have registered either of those topics for observation.

Either you need to straighten out your topic bits, or you need to convince me that having different topics in this patch is indeed the correct thing.
Attachment #8395089 - Flags: review?(nfroyd) → review-
Oops, forgot to qref before asking for a review.
Sorry about that.
Attachment #8395089 - Attachment is obsolete: true
Attachment #8401527 - Flags: review?(nfroyd)
Comment on attachment 8401527 [details] [diff] [review]
Timeline event sessionRestoreInit, v2

Review of attachment 8401527 [details] [diff] [review]:
-----------------------------------------------------------------

Please verify that you can see the new timeline event in about:telemetry.

::: toolkit/components/startup/nsAppStartup.cpp
@@ +705,5 @@
>        mPlacesInitCompleteProbe->Trigger();
>      }
>  #endif //defined(XP_WIN)
> +  } else if (!strcmp(aTopic, "sessionstore-init-started")) {
> +    StartupTimeline::Record(StartupTimeline::SESSION_RESTORE_INIT);

You are missing an AddObserver call in Init().
Attachment #8401527 - Flags: review?(nfroyd) → review-
Sorry, it has been a difficult week.
Attachment #8401527 - Attachment is obsolete: true
Attachment #8401568 - Flags: review?(nfroyd)
Comment on attachment 8401568 [details] [diff] [review]
1. Timeline event sessionRestoreInit, v3

Review of attachment 8401568 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #24)
> Sorry, it has been a difficult week.

Apologies, I should have just r+'d your last patch and told you to add the necessary call.  Agreed on the hard week.

You added a test to this version!  r++.
Attachment #8401568 - Flags: review?(nfroyd) → review+
Attachment #8401568 - Attachment description: Timeline event sessionRestoreInit, v3 → 1. Timeline event sessionRestoreInit, v3
Keywords: checkin-needed
Whiteboard: [land part 1.]
https://hg.mozilla.org/integration/mozilla-inbound/rev/a8e24b3061e6
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [land part 1.] → [leave open]
After an interesting sci-fi debugging session, here is v3. This version should be final.

Jmaher, this patch applies to the Talos repo. How do we proceed from here?
Attachment #8394940 - Attachment is obsolete: true
Attachment #8406758 - Flags: review+
Flags: needinfo?(jmaher)
Attachment #8406758 - Attachment is obsolete: true
Attachment #8406837 - Flags: feedback?(jmaher)
Attachment #8406837 - Attachment is obsolete: true
Attachment #8406837 - Flags: feedback?(jmaher)
Apparent bug was just a snafu of mine with command-line arguments.
I'm ready.
Attachment #8406839 - Attachment is obsolete: true
Attachment #8406844 - Flags: review+
Same one, but without the insanely long timeout.
Attachment #8406844 - Attachment is obsolete: true
next steps are to see if this runs on the slaves fine- once there, we can add entries to the graph server database, and start adjusting the talos.json job definitions.

to clarify, this only runs on desktop, not android?
Indeed, just desktop, no android.
Depends on: 996747
Depends on: 996760
Graph server updates are in place!  With my hacks on try server jobs are running and green.

We still have:
* documentation (comment 35)
* verification of stability
* land the code on talos
* deploy a new talos
* add the test to an existing suite in talos.json http://dxr.mozilla.org/mozilla-central/source/testing/talos/talos.json (can be done at the same time we deploy a new talos)

What does a regression look like?  If the numbers are higher is that a regression?
Flags: needinfo?(jmaher)
I looked at the numbers on try server:
https://tbpl.mozilla.org/?tree=Try&rev=6692d82bce0c

for the most part we are stable, we have a +-2% window on most numbers, linux32 was the noisiest, and win7 was second noisiest only due to a single data point.

The concerning thing is windows xp, it fails and even the screenshot bits fail.  In general, I would like to get windows xp fixed, then we can land the code on talos.

Right now I will need to get more information as to why this is failing on xp :(
On XP, I see a message "07:59:20     INFO -  'setterm' is not recognized as an internal or external command,". Is that something new?
Ah, there is the same message on Windows 7, without consequence.
I don't seem to have a WinXP install at hand, but I can add some logging that might tell us a bit more.
Flags: needinfo?(dteller)
Attachment #8408198 - Flags: feedback?(jmaher)
Flags: needinfo?(dteller)
it appears that all windows xp tests are not working on that push- I repushed and will see if the chamber is full or empty for winXP
Attachment #8408198 - Flags: feedback?(jmaher)
Flags: firefox-backlog?
ok, I had something wrong in my talos staging repo- I reverted other changes, and we have green windows xp- the numbers are stable.

remaining items:
* documentation (comment 35)
* land the code on talos
* deploy a new talos
* add the test to an existing suite in talos.json http://dxr.mozilla.org/mozilla-central/source/testing/talos/talos.json (can be done at the same time we deploy a new talos)

What does a regression look like?  If the numbers are higher is that a regression?
landed on talos:
https://hg.mozilla.org/build/talos/rev/35127ce89efd

We can deploy this when we have some basic documentation
Flags: firefox-backlog? → firefox-backlog+
Whiteboard: [leave open] → [leave open] p=0
I'll try and handle documentation by tomorrow.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 32
You need to log in before you can comment on or make changes to this bug.