Closed Bug 1336389 Opened 3 years ago Closed 2 years ago

[e10s-multi] Add Talos test to measure content process startup time

Categories

(Testing :: Talos, defect, P1)

Version 3
defect

Tracking

(firefox58 fixed)

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: gkrizsanits, Assigned: gkrizsanits)

References

Details

(Whiteboard: [e10s-multi:+])

Attachments

(1 file)

No description provided.
Whiteboard: [e10s-multi:+]
I'm trying to add a new Talos test for which I need to sign an xpi file which does not work any longer.

"Starting with Firefox 53, new extensions on this site can only be WebExtensions."

What's the strategy here for new Talos test and changing existing Talos tests (both requires signing an old style add-on)? Using WebExtensions for these Talos add-ons is simply not an option right now, what other options do I have? Sorry if I missed the internal communication about this but could not find any info/announcement about how to do this.

Note: I'm trying to use the key that we use for in tree add-ons [1], is there a reason to not make an exception for this key on AMO?

[1]: https://mana.mozilla.org/wiki/display/ateam/Signing+Extensions+in+Tree
Flags: needinfo?(amckay)
talos still has many legacy addons, they are all signed (and we continue to sign them).  We have some web extensions used already and there is a similar signing process although we have a few issues to resolve around that specifically on beta.

possibly new addons cannot be signed and only older existing ones.
Internal extensions that pose no security risk can be signed by an internal signing tool. :aswan can help you out with that.
Flags: needinfo?(amckay) → needinfo?(aswan)
We're trying to get away from having to sign these extensions.  You should be able to use the extension unsigned on Nightly right now.  The danger is that things will break on beta where signing can't be disabled, we're currently racing against the clock to fix that over in bug 1361002.  I would advocate just landing the extension unsigned and if we're getting close to the point where 56 uplifts to beta and 1361002 isn't resolved, we can sign the new extension then, but I hope it doesn't come to that...
Flags: needinfo?(aswan)
(In reply to Andrew Swan [:aswan] from comment #4)
> We're trying to get away from having to sign these extensions.  You should
> be able to use the extension unsigned on Nightly right now.  The danger is
> that things will break on beta where signing can't be disabled, we're
> currently racing against the clock to fix that over in bug 1361002.  I would
> advocate just landing the extension unsigned and if we're getting close to
> the point where 56 uplifts to beta and 1361002 isn't resolved, we can sign
> the new extension then, but I hope it doesn't come to that...

Thanks, this sounds perfect to me.
Priority: -- → P1
The test is a modified/simplified tab paint test. It measures a bit more than the content process startup: from the time the parent requests a new tab with some URL to the moment when the new child process is able to process the URL. So it does measure a few things pre process creation (browser element and tab creation on parent side) and does not process the part where we actually parse and render the page on the content side, only process startup + process / frame script loading until the child is creating the listener that is needed for processing the url of the page.

On my machine if a new process is created the time is more than 200ms if e10s-multi is off than it's around 20ms. I don't think there is a big difference between the child and the parent initiated process creation (if any).

Finally the probe starts the measurement on the parent side and finishes it on the content side.

I will flag :mconley for the review when his review queue won't be blocked any more.
Assignee: nobody → gkrizsanits
Attachment #8904152 - Flags: feedback?(jmaher)
Flags: needinfo?(mconley)
Comment on attachment 8904152 [details] [diff] [review]
Measure web content process startup time. v1

Review of attachment 8904152 [details] [diff] [review]:
-----------------------------------------------------------------

looking from a talos integration point of view this is looking good.  Prior to official landing we will need to sign this addon:
https://wiki.mozilla.org/EngineeringProductivity/HowTo/SignExtensions#Signing_an_Addon

I don't mind doing that if there are problems getting it to work

::: testing/talos/talos.json
@@ -14,5 @@
>              "talos_options": ["--enable-stylo"],
>              "tests": ["dromaeo_css", "kraken"]
>          },
>          "other-e10s": {
> -            "tests": ["a11yr", "ts_paint", "tpaint", "sessionrestore", "sessionrestore_many_windows", "sessionrestore_no_auto_restore", "tabpaint"]

we should add this to the stylo version as well.

::: testing/talos/talos/tests/cpstartup/cpstartup.manifest.develop
@@ +1,1 @@
> +% chrome://cpstartup/content/cpstartup.html#auto

the .develop file is generated when running locally, we don't need to add this to the tree.
Attachment #8904152 - Flags: feedback?(jmaher) → feedback+
Flags: needinfo?(mconley)
Attachment #8904152 - Flags: review?(mconley)
Comment on attachment 8904152 [details] [diff] [review]
Measure web content process startup time. v1

Review of attachment 8904152 [details] [diff] [review]:
-----------------------------------------------------------------

Seems okay. I'm a little worried that we have to send a message for every browser-child.js's WebNavigation.init() even when we're not testing... but I guess the overhead shouldn't be too bad, and I understand that we want the measurement to be as accurate as possible.

Worst case scenario, I suppose we could record the Services.telemetry.msSystemNow(), and then only send the message with that time in an idleCallback, but I doubt it'll come to that.

Thanks!

::: testing/talos/talos/tests/cpstartup/bootstrap.js
@@ +5,5 @@
> +var { classes: Cc, interfaces: Ci, utils: Cu } = Components;
> +
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/Task.jsm");
> +Cu.import("resource:///modules/RecentWindow.jsm");

Unused

@@ +11,5 @@
> +const PREALLOCATED_PREF = "dom.ipc.processPrelaunch.enabled";
> +
> +const TARGET_URI = "chrome://cpstartup/content/target.html";
> +
> +var CPStartup = {

Can you please add a docstring above here describing what's being measured?

@@ +81,5 @@
> +      }
> +    }
> +  },
> +
> +  go: Task.async(function*(gBrowser) {

We can probably just use straight-up async JS now:

async go() {
  await this.openTab(gBrowser);
}

Then we don't need to import Task.jsm either.

@@ +103,5 @@
> +      });
> +    });
> +  },
> +
> +  whenTabShown() {

Nit: whenTabShown is maybe not the right name here... maybe whenTabReady ?

::: toolkit/content/browser-child.js
@@ +256,5 @@
>      addMessageListener("WebNavigation:LoadURI", this);
>      addMessageListener("WebNavigation:SetOriginAttributes", this);
>      addMessageListener("WebNavigation:Reload", this);
>      addMessageListener("WebNavigation:Stop", this);
> +    sendAsyncMessage("Content:BrowserChildReady", { time: Services.telemetry.msSystemNow() });

Can you add a comment about how this is used for performance testing only?
Attachment #8904152 - Flags: review?(mconley) → review+
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #7)
> looking from a talos integration point of view this is looking good.  Prior
> to official landing we will need to sign this addon:
> https://wiki.mozilla.org/EngineeringProductivity/HowTo/
> SignExtensions#Signing_an_Addon

Actually it's not possible to sign any non-webextension addon any more. The only way I could make this work is to use the same addon id as all the other talos addons. 

> 
> I don't mind doing that if there are problems getting it to work
> 

I'll just push it as it is, but if you still think it's necessary to sign the add-on let me know and we'll find a way to make it work.
my understanding is we need this to be signed in order to install and use the addons, specifically on mozilla-beta.  IIRC there was a secondary server setup for signing old addons- I had specifically asked this question from the addons/webextensions team- we have planned to migrate 1-2 talos extensions/quarter, but we might have an emergency on our hands if this is the case.

:kmag, do you know who can discuss legacy addon signing?
Flags: needinfo?(kmaglione+bmo)
Comment 4 is relevant. In worst case scenario we won't uplift this test to beta...
Backed out for failing new talos test cpstartup:

https://hg.mozilla.org/integration/mozilla-inbound/rev/b2e90060a137b10b38ad7a394dab802841094861

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=5cd2ba3bc6c4dbdba69c295791d20d1c804d8f11&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=131950610&repo=mozilla-inbound

04:30:05     INFO -  TEST-START | cpstartup
04:30:05     INFO -  Initialising browser for cpstartup test...
04:30:05     INFO -  Application command: /builds/slave/test/build/application/firefox/firefox  http://localhost:45949/getInfo.html -profile /tmp/tmpD0T0sO/profile
04:30:05     INFO -  TEST-INFO | started process 20591 (/builds/slave/test/build/application/firefox/firefox  http://localhost:45949/getInfo.html)
04:30:10     INFO -  TEST-INFO | 20591: exit 0
04:30:10     INFO -  Browser initialized.
04:30:10     INFO -  Running cycle 1/1 for cpstartup test...
04:30:10     INFO -  Using env: {'DISPLAY': ':0',
04:30:10     INFO -   'HOME': '/home/cltbld',
04:30:10     INFO -   'JSGC_DISABLE_POISONING': '1',
04:30:10     INFO -   'LANG': 'en_US.UTF-8',
04:30:10     INFO -   'LANGUAGE': 'en_US:en',
04:30:10     INFO -   'LD_LIBRARY_PATH': '/builds/slave/test/build/application/firefox',
04:30:10     INFO -   'LOGNAME': 'cltbld',
04:30:10     INFO -   'MAIL': '/var/mail/cltbld',
04:30:10     INFO -   'MINIDUMP_SAVE_PATH': '/builds/slave/test/build/blobber_upload_dir',
04:30:10     INFO -   'MINIDUMP_STACKWALK': '/builds/slave/test/build/linux64-minidump_stackwalk',
04:30:10     INFO -   'MOZ_CRASHREPORTER': '1',
04:30:10     INFO -   'MOZ_CRASHREPORTER_NO_REPORT': '1',
04:30:10     INFO -   'MOZ_DISABLE_NONLOCAL_CONNECTIONS': '1',
04:30:10     INFO -   'MOZ_NO_REMOTE': '1',
04:30:10     INFO -   'MOZ_UPLOAD_DIR': '/builds/slave/test/build/blobber_upload_dir',
04:30:10     INFO -   'NODE_PATH': '/usr/lib/nodejs:/usr/lib/node_modules:/usr/share/javascript',
04:30:10     INFO -   'NO_EM_RESTART': '1',
04:30:10     INFO -   'PATH': '/builds/slave/test/build/venv/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games',
04:30:10     INFO -   'PROPERTIES_FILE': '/builds/slave/test/buildprops.json',
04:30:10     INFO -   'PWD': '/builds/slave/test',
04:30:10     INFO -   'PYTHONPATH': '/builds/slave/test/build/tests/talos',
04:30:10     INFO -   'RUST_BACKTRACE': 'full',
04:30:10     INFO -   'SCRIPTSPATH': '/builds/slave/test/scripts',
04:30:10     INFO -   'SHELL': '/bin/bash',
04:30:10     INFO -   'SHLVL': '1',
04:30:10     INFO -   'TERM': 'linux',
04:30:10     INFO -   'TMOUT': '86400',
04:30:10     INFO -   'USER': 'cltbld',
04:30:10     INFO -   'XDG_SESSION_COOKIE': '41bf3e9aed84707956354da8000001c5-1505818748.616143-766203380',
04:30:10     INFO -   'XPCOM_DEBUG_BREAK': 'warn',
04:30:10     INFO -   '_': '/tools/buildbot/bin/python'}
04:30:10     INFO -  TEST-INFO | started process 20829 (/builds/slave/test/build/application/firefox/firefox -profile /tmp/tmpD0T0sO/profile -tp file:/builds/slave/test/build/tests/talos/talos/tests/cpstartup/cpstartup.manifest.develop -tpchrome -tpnoisy -tploadnocache -tpcycles 1 -tppagecycles 20)
04:30:11     INFO -  PID 20829 |
04:30:11     INFO -  PID 20829 | (/builds/slave/test/build/application/firefox/firefox:20879): Pango-WARNING **: error opening config file '/home/cltbld/.pangorc': Permission denied
04:30:11     INFO -  PID 20829 |
04:30:22     INFO -  PID 20829 | RSS: Main: 163012608
04:30:22     INFO -  PID 20829 |
05:30:10     INFO -  Timeout waiting for test completion; killing browser...
05:30:10     INFO -  Terminating psutil.Process(pid=20829, name='firefox')
05:30:10     INFO -  PID 20829 | ExceptionHandler::GenerateDump cloned child 21000
05:30:10     INFO -  PID 20829 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
05:30:10     INFO -  PID 20829 | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
05:30:10     INFO -  TEST-UNEXPECTED-ERROR | cpstartup | timeout
05:30:10    ERROR -  Traceback (most recent call last):
05:30:10     INFO -    File "/builds/slave/test/build/tests/talos/talos/run_tests.py", line 281, in run_tests
05:30:10     INFO -      talos_results.add(mytest.runTest(browser_config, test))
05:30:10     INFO -    File "/builds/slave/test/build/tests/talos/talos/ttest.py", line 61, in runTest
05:30:10     INFO -      return self._runTest(browser_config, test_config, setup)
05:30:10     INFO -    File "/builds/slave/test/build/tests/talos/talos/ttest.py", line 209, in _runTest
05:30:10     INFO -      if counter_management else None),
05:30:10     INFO -    File "/builds/slave/test/build/tests/talos/talos/talos_process.py", line 130, in run_browser
05:30:10     INFO -      raise TalosError("timeout")
05:30:10     INFO -  TalosError: timeout
Flags: needinfo?(gkrizsanits)
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #11)
> my understanding is we need this to be signed in order to install and use
> the addons, specifically on mozilla-beta.  IIRC there was a secondary server
> setup for signing old addons- I had specifically asked this question from
> the addons/webextensions team- we have planned to migrate 1-2 talos
> extensions/quarter, but we might have an emergency on our hands if this is
> the case.
> 
> :kmag, do you know who can discuss legacy addon signing?

Andy is probably the best person.
Flags: needinfo?(kmaglione+bmo) → needinfo?(amckay)
So it seems like it's not working on nightly either. Locally I can run the test, but for some reason on try it's not working, based on the log my bet is that the add-on is not loaded (since the test does not even start).
Flags: needinfo?(gkrizsanits) → needinfo?(aswan)
:gabor, do you have a try run?  I would like to look at a log for any other errors, etc.
If we really have to sign these legacy extensions for talos then we'll need someone on that team to take ownership of signing those, we are really trying to limit access to a few people. Details are here:

https://mana.mozilla.org/wiki/display/FIREFOX/Internal+Extension+Signing

Please note, they will break and break regularly and its the team writing these extensions to maintain them.
Flags: needinfo?(amckay)
Signing should not be required, the fact that this works locally but not in automation (see comment 15) suggests that this is just a configuration issue in automation.  Gabor, do you have a link to a failed try run or can I reproduce this by just doing a simple try run with the attached patch?
Flags: needinfo?(aswan) → needinfo?(gkrizsanits)
(In reply to Andrew Swan [:aswan] from comment #18)
> Signing should not be required, the fact that this works locally but not in
> automation (see comment 15) suggests that this is just a configuration issue
> in automation.  Gabor, do you have a link to a failed try run or can I
> reproduce this by just doing a simple try run with the attached patch?

Eh... sorry I mean on inbound it's not working, on try it seems to be just fine: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2887998a60c47361a4b614c46c444c88f1e9a87e&selectedJob=132665915
Flags: needinfo?(gkrizsanits)
I pushed to try with that patch and forcing addon signing as we do for inbound:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6eae1ac3c92524fa904a0ede2ad55b81e61c2fa5

still waiting on results.
:andym, so this is a new test that :gabor has written that has a new addon- he is the author and maintainer of the addon.  We require signed addons to run as this is a requirement for tests to run on mozilla-beta; making our integration branches as close to release is important.

we have been able to sign existing addons recently, is there an issue with new addons or did things change in the last week?  We signed an addon for pageloader on September XX.

It is fine to go through another process for signing, we just need to know what that is and help remove roadblocks.

:gabor- is there a chance you could write this cpstartup as a webextension instead of a legacy addon?
Flags: needinfo?(gkrizsanits)
Flags: needinfo?(amckay)
September XX == Tuesday September 19th (3 days ago)
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #22)
> :gabor- is there a chance you could write this cpstartup as a webextension
> instead of a legacy addon?

Conceptually the problem is that web extension code lives in its own process which makes it quite challenging to write this test (if possible). But even if I, let's say, force the test to run web extensions in the parent process (which might set up a whole different level of problems for the future) I'm not sure we have a single talos test based on web extensions, so I'd imagine getting the first one to work might be a challenge.
Flags: needinfo?(gkrizsanits)
we have some talos tests with webextensions included- I suspect it wouldn't be too crazy, but there could be oddities in dumping output from the test or communicating with other tools we have available via legacy addons.
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #22)
> :andym, so this is a new test that :gabor has written that has a new addon-
> he is the author and maintainer of the addon.  We require signed addons to
> run as this is a requirement for tests to run on mozilla-beta; making our
> integration branches as close to release is important.

Joel, I thought that as of bug 1361002 getting fixed, Talos doesn't use any signed extensions?  Something isn't adding up here, your comment about mozilla-beta seems like a different issue than what Gabor reports (working in local builds and on try, but breaking on inbound)

Here is the log from the failure on inbound:
https://treeherder.mozilla.org/logviewer.html#?job_id=131954080

I don't see any browser logs there though.  Does the talos framework put them somewhere else?  I have a vague recollection that maybe it just swallows them completely?  :(
Flags: needinfo?(amckay) → needinfo?(jmaher)
:aswan, talos has all of the extensions signed and only run them as signed on integration and release branches.  As for bug 1361002 that was to solve getting the web extensions to work, not legacy addons.  Why I mentioned mozilla-beta is that if we require signed addons there, then when we uplift changes from central->beta they will break and it can be a large fire drill every merge day.

Is there a way to run legacy addons on mozilla-central and mozilla-beta where we don't need signed addons- when testing on try and locally, we disable this pref:
http://searchfox.org/mozilla-central/source/testing/talos/talos/run_tests.py#121

possibly we can disable that by default and have talos runs on central/beta with no problem?
Flags: needinfo?(jmaher)
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/be61a3dc120a
Talos test for content process startup time. r=mconley
https://hg.mozilla.org/mozilla-central/rev/be61a3dc120a
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
:mconley, I see you reviewed this, it looks as though :gabor is inactive- can you add a description about cpstartup to the wiki page:
https://wiki.mozilla.org/Buildbot/Talos/Tests

If not, can you help find an owner for this?
Flags: needinfo?(mconley)
I've added cpstartup here: https://wiki.mozilla.org/Buildbot/Talos/Tests#cpstartup
Flags: needinfo?(mconley)
You need to log in before you can comment on or make changes to this bug.