Closed Bug 1357382 Opened 3 years ago Closed 2 years ago

Add mitmproxy support to mozharness and talos, and add first rev of new quantum-pageload talos test

Categories

(Testing :: Talos, enhancement)

Version 3
enhancement
Not set

Tracking

(firefox55 fixed)

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: rwood, Assigned: rwood)

References

Details

Attachments

(1 file, 1 obsolete file)

Modify Talos so it can use web-page-relay to load test pages:

https://github.com/chromium/web-page-replay
Summary: Modify Talos so it can use web-page-relay → Modify Talos so it can use web-page-replay
Question, do we need https/SSL support in web-page-replay for Talos? Or is it sufficient to run WPR with the --no-ssl option? Thanks.
Flags: needinfo?(benjamin)
HTTPS is required since we're loading the sites over HTTPS and not rewriting the pages.
Flags: needinfo?(benjamin)
this requires a lot more hacking:
https://github.com/chromium/web-page-replay/blob/master/documentation/GettingStarted.md

it is doable, but we need to create a cert_override.txt file:
https://developer.mozilla.org/en-US/docs/Cert_override.txt

that ^ doc is 9 years old, I assume that still works and won't cause other odd problems.

Possibly there are other ways to validate a cert, please chime in if there are better methods.
I assume others at Mozilla have used web-page-replay successfully using https- if we had the cert_override.txt file and related .wpr (recorded pages), this could speed up our work.
The instructions require running wpr with sudo. I assume this will give us problems on automation.

I read that sudo is required for using ports 80/53. I'm asking in the issue tracker if we can change this [1]

[1] https://github.com/chromium/web-page-replay/issues/93


Pasting information from McManus about cert override:
-----------------------------------------------------
You don't want an override - there are functional differences to the code in the presence of overridden exceptions. Instead you want to load the cert from your fake CA into the firefox trust store for the test . I'm sure david will have better references.

https://wiki.mozilla.org/CA:AddRootToFirefox

it looks like the fake CA root used by WPR for MITM is checked into github as wpr_cert.pem (I'm just guessing from a quick look)

Just so you know, I forsee a few other problems with the web page replay approach
* iirc it does not speak h2 and lots of the targetted sites speak h2 (including google and fb) with firefox.
* on replay it resolves everything to localhost, which has unrealistic implications if you are running h2 regarding connection management

big picture: The shaping approach is ok for a first cut - but its only going to be a rough guide to reality. It makes a huge assumption about the only bottleneck being the last mile and also assumes that all the origins involved share that bottleneck fairly (in real life they don't). Plus packet loss considerations. Like I said, I'm ok with the first cut but its worth documenting why we know it will be different than live data. The H2 thing though, that you really ought to sort out now. 2/3 of all our https traffic is over h2 and honestly the server implementations can differ pretty widely.. google, facebook, and nginx do a much better job that the node based servers and it would be a shame if we tuned for something other than what our users are running (~20% of all http transactions are likely with google)
Summary: Modify Talos so it can use web-page-replay → Modify Talos so it can use mitmproxy
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

./mach talos-test --suite quantum-pageload --no-download

WIP - assumes mitmproxy is already installed and certificate added to Firefox

- mozharness starts mitmproxy and replays archive
- talos test browses to the URL; page loads from mitmproxy
- talos finishes the test (using old/existing perf event/markers)
- mozharness kills mitmproxy
Attachment #8863913 - Flags: feedback?(jmaher)
Attachment #8863913 - Flags: feedback?(armenzg)
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review138660

::: testing/mozharness/mozharness/mozilla/testing/talos.py:289
(Diff revision 1)
> +
> +    def start_mitmproxy_playback(self):
> +        """ Startup mitmproxy and replay the specified flow file """
> +        self.info("Starting mitmproxy playback of %s" % self.mitmproxy_replay_src)
> +        python = self.query_python_path()
> +        self.mitmproxy_p = subprocess.Popen(["mitmdump", "-S", self.mitmproxy_replay_src, "-k"],

I haven't looked at bsmedberg's Google doc to review if there are other parameters receommended.
Have a look there if you haven't done so.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:292
(Diff revision 1)
> +        self.info("Starting mitmproxy playback of %s" % self.mitmproxy_replay_src)
> +        python = self.query_python_path()
> +        self.mitmproxy_p = subprocess.Popen(["mitmdump", "-S", self.mitmproxy_replay_src, "-k"],
> +                                            stdout=subprocess.PIPE,
> +                                            stderr=subprocess.PIPE)
> +        time.sleep(10)

5 seconds is probably fine. No issue if you prefer 10.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:298
(Diff revision 1)
> +        if self.mitmproxy_p.poll() is None:
> +            self.info("Mitmproxy playback started in background as pid %d" %self.mitmproxy_p.pid)
> +        else:
> +            # cannot continue as we won't be able to playback the pages
> +            self.critical('Mitmproxy playback process failed to start')
> +            os.sys.exit()

I think you can use self.fatal() which aborts Mozharness. I'm not entirely sure. I'm going by memory.
I think it also allows for some teardown if the MH script needs it.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:302
(Diff revision 1)
> +            self.critical('Mitmproxy playback process failed to start')
> +            os.sys.exit()
> +
> +    def add_mitmproxy_cert(self):
> +        """ Install the certificate generated by mitmproxy into Firefox """
> +        self.info("TODO: Add mitproxy cert to Firfox here")

You would need the Talos class to inherit from the MitmproxyFactory.
We probably want to change my "run_and_exit" method to something meaningful like "import_cert_and_setup_proxy".
You would call it here like "self.new_method_name()".

::: testing/mozharness/mozharness/mozilla/testing/talos.py:304
(Diff revision 1)
> +
> +    def add_mitmproxy_cert(self):
> +        """ Install the certificate generated by mitmproxy into Firefox """
> +        self.info("TODO: Add mitproxy cert to Firfox here")
> +
> +    def stop_mitmproxy_playback(self):

I believe Mozharness has some decorators to indicate which methods to call when the scripts ends suddenly.
Could you please add one here?

::: testing/mozharness/mozharness/mozilla/testing/talos.py:402
(Diff revision 1)
>                  unzip_cmd = [unzip, '-q', '-o', archive, '-d', src_talos_pageset]
>                  self.run_command(unzip_cmd, halt_on_failure=True)
>              else:
>                  self.info("Not downloading pageset because the no-download option was specified")
>  
> +        # now that have the suite name, check if mitmproxy replay archive is required,

IMHO we're putting too much under populate_webroot.
I would consider moving fetching and setting mitmproxy to its own function.

In other words, separating it from the pageset related logic.
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review138692

This looks great! Just some nits.
Attachment #8863913 - Flags: feedback?(armenzg) → feedback+
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review138914

I would f+ this, but overall I have questions.  Do we need to do mitmproxy inside of mozharness?  Since we use mozharness to run talos locally via mach that is good.  Where do we get the pageset from mitmproxy to replay?

Inside of talos we have xperf and there is a xtalos/ directory that has the python processes and tools we use- would it make sense to put the logic there to start/stop mitmproxy?  I want to think about cleanup as best as possible and maybe mozharness is better.

::: testing/talos/talos/run_tests.py:193
(Diff revision 1)
>  
>      if config['gecko_profile']:
>          talos_results.add_extra_option('geckoProfile')
>  
> +    if config['mitmproxy']:
> +        browser_config['mitmproxy'] = True

how do we use this mitmproxy in browser_config?

::: testing/talos/talos/run_tests.py:196
(Diff revision 1)
>  
> +    if config['mitmproxy']:
> +        browser_config['mitmproxy'] = True
> +
> +    if config['first_non_blank_paint']:
> +        browser_config['firstNonBlankPaint'] = True

do we pass this into pageloader somehow?  how do we measure this?  Possibly this is future use?

::: testing/talos/talos/test.py:784
(Diff revision 1)
>      unit = 'ms'
>      lower_is_better = True
>      alert_threshold = 5.0
> +
> +@register_test()
> +class qpage_1(PageloaderTest):

I am not a fan of qpage_1 as a name, but for now that will work.  I would like Quantum_1 better.

::: testing/talos/talos/ttest.py:110
(Diff revision 1)
>              global_counters['responsiveness'] = []
>  
> +        # if using mitmproxy we must allow access to 'external' sites
> +        if browser_config.get('mitmproxy', False):
> +            LOG.info("Using mitmproxy so setting MOZ_DISABLE_NONLOCAL_CONNECTIONS to 0")
> +            setup.env['MOZ_DISABLE_NONLOCAL_CONNECTIONS'] = '0'

is this temporary?  I thought we would record the website and then play it back.
Attachment #8863913 - Flags: review-
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review139186

::: testing/mozharness/mozharness/mozilla/testing/talos.py:402
(Diff revision 1)
>                  unzip_cmd = [unzip, '-q', '-o', archive, '-d', src_talos_pageset]
>                  self.run_command(unzip_cmd, halt_on_failure=True)
>              else:
>                  self.info("Not downloading pageset because the no-download option was specified")
>  
> +        # now that have the suite name, check if mitmproxy replay archive is required,

What do you think of putting this into a new action?
I was thinking we could call it "setup_mitmproxy".

Inside of it we could call this code plus these steps (roughly):
```python
python3_path = self.install_python3() # To be implemented
self.configure_python3_venv(
    python_path=python3_path,
    venv_path='venv'
)
self.create_python3_venv()
self.install_python3_modules(modules=['mitmproxy'])
mitmdump = os.path.join(self.path_to_python3_executables(), 'mitmdump')
```
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review138660

> 5 seconds is probably fine. No issue if you prefer 10.

Yeah changed to 5 good call

> I think you can use self.fatal() which aborts Mozharness. I'm not entirely sure. I'm going by memory.
> I think it also allows for some teardown if the MH script needs it.

Done, thanks

> You would need the Talos class to inherit from the MitmproxyFactory.
> We probably want to change my "run_and_exit" method to something meaningful like "import_cert_and_setup_proxy".
> You would call it here like "self.new_method_name()".

Looks like this may be moved inside talos itself

> I believe Mozharness has some decorators to indicate which methods to call when the scripts ends suddenly.
> Could you please add one here?

"when the script ends suddenly" do you mean when the mitmproxy process is killed? I do log that the process is killed. Mozlog does have a LOG.process_start and LOG.process_exit but we don't use mozlog in Talos. IMO woudln't make sense to use that just here without converting over all of talos to use mozlog which would be a separate job on it's own...
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review138914

The mitmproxy pageset will be on tooltool; I just added a placeholder for that code for now as no point in uploading anything to tooltool until it's the 'official' playback file that we will use in production.

No we don't need mitmproxy inside mozharness. I just put it there because that's where the tp5 pageset is currenlty downloaded, and we need to download the mitmproxy archive(s).

I can move it all inside talos if you prefer no problem (except downloading the mitmproxy archive). That would also mean having the code to install the mitmproxy certificate inside talos also, because we need to startup mitmproxy first in order to have it generate a certificate to be added to Firefox.

Is that what we want then? Only use mozharness to download the mitmproxy archive via tooltool but put everything else inside talos itself?

> how do we use this mitmproxy in browser_config?

I had to pass the flag into talos run_tests so that inside talos (ttest) it knows that we need to set MOZ_DISABLE_NONLOCAL_CONNECTIONS to 0.

> do we pass this into pageloader somehow?  how do we measure this?  Possibly this is future use?

Yep that's a placeholder for the future as I know the new quantum pageloader test will need to use that new load event/marker.

> I am not a fan of qpage_1 as a name, but for now that will work.  I would like Quantum_1 better.

Ok will use 'Quantum_1" thanks

> is this temporary?  I thought we would record the website and then play it back.

Not sure what you mean, no we will just be playing back the site from the mitmproxy archive (recording done previously). Without this set I can't browse to the URL that is inside the mitmproxy playback archive.
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

https://reviewboard.mozilla.org/r/135644/#review139186

> What do you think of putting this into a new action?
> I was thinking we could call it "setup_mitmproxy".
> 
> Inside of it we could call this code plus these steps (roughly):
> ```python
> python3_path = self.install_python3() # To be implemented
> self.configure_python3_venv(
>     python_path=python3_path,
>     venv_path='venv'
> )
> self.create_python3_venv()
> self.install_python3_modules(modules=['mitmproxy'])
> mitmdump = os.path.join(self.path_to_python3_executables(), 'mitmdump')
> ```

Yeah that's a good idea, and also in there we can have the code to download the mitmproxy archive using tootool.
Just an update...

Mozharness will:
- download/install mitmproxy itself
- download the mitmproxy playback pages/source (via tooltool)
- setup a python 3.x environment for mitmproxy to use

Talos will:
- start up mitmproxy (mitmdump); this starts the page playback, and mitmproxy will generate the CA certificate
- call :armenzg's code to install the CA cert into Firefox
- run the quantum pageload test
- stop mitmproxy
Summary: Modify Talos so it can use mitmproxy → Add mitmproxy support to mozharness and talos
> - call :armenzg's code to install the CA cert into Firefox

You will probably need to add something like this to talos:
> sys.path.insert(1, os.path.join(os.path.dirname(sys.path[0]), 'mozharness')
>
> from mozharness.mozilla.mitmproxy import configure_mitmproxy

I can also refactor the code to create a Python package and turn it as a requirement.
However, if the above works let's go with it.
Blocks: 1362407
Comment on attachment 8863913 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos;

I think we are close, there is some bitrot here and slightly changes needed.  For example, I get this close to running by doing:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=83165473b133ceb251ebd84a1f8c945421d5165d&filter-searchStr=win+chromez

this doesn't run in automation, but running on a loaner it does the same thing.  Then running the command manually (just copy/paste from the log) it works:
C:\slave\test\py3venv\Scripts\mitmdump -s "C:/slave/test/build/tests/talos/talos/mitmdump-alternate-server-replay/alternate-server-replay.py C:/slave/test/build/tests/talos/talos/mitmproxy-recording-1.mp" -k

this indicates that python3, mitmdump, the .md file, the git clone, etc... are all done properly.

I even created a simple python script to subprocess.popen mitmdump and it had a pid, but returned none from subprocess.poll().  possibly that is problematic with mitmdump, or maybe on windows specifically?
Attachment #8863913 - Flags: review?(jmaher)
'ascii' codec can't encode character u'\u2014' in position 44: ordinal not in range(128)
Attachment #8863913 - Attachment is obsolete: true
Attachment #8863913 - Flags: review?(jmaher)
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review143586

::: testing/mozharness/configs/talos/windows_config.py:36
(Diff revision 1)
>      "default_actions": [
>          "clobber",
>          "read-buildbot-config",
>          "download-and-extract",
>          "populate-webroot",
> +        "setup-mitmproxy",

We need to be careful about the timing here. On Windows, "setup-mitmproxy" must happen after the "install" step, because python3 needs the C runtime DLL which is found in the firefox installation.

The try run failure, "AttributeError: 'NoneType' object has no attribute 'split'" is because of this.
Summary: Add mitmproxy support to mozharness and talos → Add mitmproxy support to mozharness and talos, and add first rev of new quantum-pageload talos test
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

::: testing/mozharness/mozharness/mozilla/mitmproxy.py:12
(Diff revision 10)
>  
> +try:
> -DEFAULT_CERT_PATH = os.path.join(os.getenv('HOME'),
> +    DEFAULT_CERT_PATH = os.path.join(os.getenv('HOME'),
> -                                 '.mitmproxy', 'mitmproxy-ca-cert.cer')
> +                                     '.mitmproxy', 'mitmproxy-ca-cert.cer')
> +except:
> +    DEFAULT_CERT_PATH = os.path.join(os.getenv('HOMEDRIVE'), os.getenv('HOMEPATH'),

please add a comment here to indicate the problem- we might have other scenarios where this fails and a comment provides some guidelines for future debugging.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:361
(Diff revision 10)
>                  self.run_command(unzip_cmd, halt_on_failure=True)
>              else:
>                  self.info("Not downloading pageset because the no-download option was specified")
>  
> +    def setup_mitmproxy(self):
> +        """Some talos tests require the use of mitmproxy to playback the pages; set it up here"""

nit: split this comment into 2 lines

::: testing/mozharness/mozharness/mozilla/testing/talos.py:405
(Diff revision 10)
> +        """Mitmproxy requires external playback archives to be downloaded and extracted"""
> +        if self.mitmproxy_recording_set:
> +            return self.mitmproxy_recording_set
> +        if self.query_talos_json_config() and self.suite is not None:
> +            self.mitmproxy_recording_set = self.talos_json_config['suites'][self.suite].get('mitmproxy_recording_set')
> +            return self.mitmproxy_recording_set

should we throw an error if we are running mitmproxy tests but have no mitmproxy_recording_set defined?

::: testing/mozharness/mozharness/mozilla/testing/talos.py:413
(Diff revision 10)
> +        """Retrieve mitmdump alternative playback repo name"""
> +        if self.mitmdump_alt_playback_repo:
> +            return self.mitmdump_alt_playback_repo
> +        if self.query_talos_json_config() and self.suite is not None:
> +            self.mitmdump_alt_playback_repo = self.talos_json_config['suites'][self.suite].get('mitmdump_alt_playback_repo')
> +            return self.mitmdump_alt_playback_repo

same here, do we need to throw an error, or is this optional?  If it is optional it would be nice to document that somewhere.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:428
(Diff revision 10)
> +            if os.path.exists(alt_replay_script):
> +                self.info("Replay script already exists locally: %s" % alt_replay_script)
> +                return
> +            self.info("Getting alternate mitmproxy playback script from %s" % str(self.mitmdump_alt_playback_repo))
> +            try:
> +                self.run_command(['C:/mozilla-build/Git/bin/git',

this is not a good idea to have this path hardcoded, can we find other prior art in mozharness to find the path to git for the current environment/os?

::: testing/mozharness/mozharness/mozilla/testing/talos.py:435
(Diff revision 10)
> +                                 dest],
> +                                 env=self.query_env())
> +            except:
> +                self.fatal("Aborting: Unable to retrieve mitmdump alternate playback script")
> +        else:
> +            self.fatal("Aborting: mitmproxy_alt_playback_repo not specified in talos.json")

I would rather have this cloned into m-c so this is all in-tree; Doing this with an external repo means that if a bad commit is checked into the defined github repo, the quantum test could start failing on all branches and we would have a hard time tracking it down.

::: testing/talos/talos.json:11
(Diff revision 10)
> -        "chromez": {
> -            "tests": ["tresize", "tcanvasmark"],
> -            "talos_options": ["--disable-e10s"]
> -        },
>          "chromez-e10s": {
> -            "tests": ["tresize", "tcanvasmark"]
> +            "tests": ["Quantum_1"],

lets not replace chromez; please add this to the 'g5' suite if possible, otherwise we can create a 'g6' suite.

::: testing/talos/talos.json:123
(Diff revision 10)
>              "talos_options": [
>                  "--xperf_path",
>                  "\"c:/Program Files/Microsoft Windows Performance Toolkit/xperf.exe\""
>              ]
> +        },
> +        "quantum-pageload": {

we should make this -e10s for consistency sake.

::: testing/talos/talos/cmdline.py:124
(Diff revision 10)
> +    add_arg('--mitmproxy',
> +            help='Test uses mitmproxy to serve the pages, specify the '
> +                 'path and name of the mitmdump file to playback')
> +    add_arg('--mitmdumpPath',
> +            help="Path to mitmproxy's mitmdump playback tool")
> +    add_arg("--firstNonBlankPaint", action='store_true', dest="first_non_blank_paint",

what code uses first_non_blank_paint?  Also if --mitmdumpPath is defined can we assume --mitmproxy==True?

::: testing/talos/talos/run_tests.py:100
(Diff revision 10)
> +
> +    # cmd line to start mitmproxy playback using custom playback script is as follows:
> +    # <path>/mitmdump -s "<path>mitmdump-alternate-server-replay/alternate-server-replay.py
> +    #  <path>recording-1.mp <path>recording-2.mp..."
> +    param = os.path.join(here, 'mitmdump-alternate-server-replay', 'alternate-server-replay.py')
> +    param2 = '""' + param.replace('\\', '\\\\\\') + ' ' + \

this is a windows specific hack, lets find a method to do this on windows only.

::: testing/talos/talos/run_tests.py:106
(Diff revision 10)
> +             ' '.join(mitmproxy_recordings).replace('\\', '\\\\\\') + '""'
> +
> +    command = [mitmdump_path, '-s', param2, '-k']
> +
> +    env = os.environ.copy()
> +    env["PATH"] = "C:\\slave\\test\\build\\application\\firefox;" + env["PATH"]

again, windows specific hack, lets strive for platform independence.

::: testing/talos/talos/run_tests.py:108
(Diff revision 10)
> +    command = [mitmdump_path, '-s', param2, '-k']
> +
> +    env = os.environ.copy()
> +    env["PATH"] = "C:\\slave\\test\\build\\application\\firefox;" + env["PATH"]
> +
> +    LOG.info("Starting mitmproxy playback using env path: %s" % env["PATH"])

I added this env path log statement for debugging to get this working, not sure if it is still useful.

::: testing/talos/talos/run_tests.py:113
(Diff revision 10)
> +    LOG.info("Starting mitmproxy playback using env path: %s" % env["PATH"])
> +    LOG.info("Starting mitmproxy playback using command: %s" % ' '.join(command))
> +    # to turn off mitmproxy log output, use these params for Popen:
> +    # Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)
> +    mitmproxy_proc = subprocess.Popen(command, env=env)
> +    time.sleep(5)

is the time.sleep required?

::: testing/talos/talos/run_tests.py:140
(Diff revision 10)
> +        LOG.error("Failed to kill the mitmproxy playback process")
> +
> +
> +def is_mitmproxy_cert_installed():
> +    """ Verify mitmxproy CA cert was added to Firefox """
> +    # TODO

can you at least file a bug for this if it is not in this changeset?

::: testing/talos/talos/run_tests.py:149
(Diff revision 10)
> +def install_mitmproxy_cert(mitmproxy_proc, browser_path, config):
> +    # install the CA certificate generated by mitmproxy, into Firefox
> +    LOG.info("Installing mitmxproxy CA certficate into Firefox")
> +    # need mozharness mitmproxy module
> +    LOG.info("sys.path = %s" % sys.path[1])
> +    LOG.info("inserting sys.path = %s" % os.path.join(os.path.dirname(sys.path[1]), 'mozharness'))

is this logging helpful, or overkill?

::: testing/talos/talos/run_tests.py:151
(Diff revision 10)
> +    LOG.info("Installing mitmxproxy CA certficate into Firefox")
> +    # need mozharness mitmproxy module
> +    LOG.info("sys.path = %s" % sys.path[1])
> +    LOG.info("inserting sys.path = %s" % os.path.join(os.path.dirname(sys.path[1]), 'mozharness'))
> +    sys.path.insert(1, os.path.join(os.path.dirname(sys.path[1]), 'mozharness'))
> +    sys.path.insert(1, "c:\\slave\\test\\scripts\\")

we are inserting a hardcoded path here, I don't like this as it is windows specific and buildbot configuration specific.

::: testing/talos/talos/run_tests.py:155
(Diff revision 10)
> +    sys.path.insert(1, os.path.join(os.path.dirname(sys.path[1]), 'mozharness'))
> +    sys.path.insert(1, "c:\\slave\\test\\scripts\\")
> +    # browser_path is exe, we want install dir
> +    browser_install = os.path.dirname(browser_path)
> +    LOG.info('Calling configure_mitmproxy with browser folder: %s' % browser_install)
> +    from mozharness.mozilla.mitmproxy import configure_mitmproxy

is there a reason to have this import here vs at the top?

::: testing/talos/talos/run_tests.py:274
(Diff revision 10)
>      if config['gecko_profile']:
>          talos_results.add_extra_option('geckoProfile')
>  
> +    # some tests use mitmproxy to playback pages
> +    mitmproxy_recordings_list = config.get('mitmproxy', False)
> +    if mitmproxy_recordings_list is not False:

why is the recordings list in the 'mitmproxy' variable?  is this required?  we should do something like:
if not mitmproxy_recordins_list and not mitmdumpPath:
   sys.exit()
   
# do code below

::: testing/talos/talos/run_tests.py:294
(Diff revision 10)
> +
> +        # install the generated CA certificate into Firefox
> +        install_mitmproxy_cert(mitmproxy_proc, browser_config['browser_path'], config)
> +
> +    if config.get('first_non_blank_paint', False):
> +        browser_config['firstNonBlankPaint'] = True

how do we use this in the test?

::: testing/talos/talos/test.py:796
(Diff revision 10)
> +    tppagecycles = 25
> +    gecko_profile_interval = 1
> +    gecko_profile_entries = 2000000
> +    filters = filter.ignore_first.prepare(5) + filter.median.prepare()
> +    unit = 'ms'
> +    lower_is_better = True

do we need mitmproxy variables in here?
Attachment #8868638 - Flags: review?(jmaher) → review-
Attachment #8868638 - Flags: review?(jmaher)
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> I would rather have this cloned into m-c so this is all in-tree; Doing this with an external repo means that if a bad commit is checked into the defined github repo, the quantum test could start failing on all branches and we would have a hard time tracking it down.

Yeah good idea. I will copy the latest and put it here, if that's ok:

/testing/talos/talos/tests/quantum_pageload/mitmdump-alternate-server-replay/alternate-server-replay.py
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> lets not replace chromez; please add this to the 'g5' suite if possible, otherwise we can create a 'g6' suite.

Yeah that's just temporary for testing on try, I will take that out on the final update before landing. I added a new 'quantum-pageload' suite, is that ok?
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> what code uses first_non_blank_paint?  Also if --mitmdumpPath is defined can we assume --mitmproxy==True?

first_non_blank_paint is just a placeholder for run_tests.py to pass into ttest, for when the new test starts using the new perf marker. When we get to that point, if it's not necessary I can remove it. I was mirroring similar behaviour to the --mozAfterPaint option.

We need both --mitmproxy and mitmdumpPath. --mitmproxy takes an argument, the archive to playback. mitmdumpPath is the path to mitmdump itself.

> I added this env path log statement for debugging to get this working, not sure if it is still useful.

I think we should leave it for now as it may be helpful when trying to get this to run on other platforms

> is the time.sleep required?

I wanted to give time for mitmdump to spin up, I think it's safer to have a sleep here as it does take a second or two to startup - at least until it prints out that it's proxy is running.

> is this logging helpful, or overkill?

I think it may be helpful when we try getting the other platforms running

> is there a reason to have this import here vs at the top?

Yes I had it at the top and it broke a bunch of other mozharness stuff, I don't know why... for some reason moving it here fixed that.

> how do we use this in the test?

Placeholder

> do we need mitmproxy variables in here?

It works without so I don't believe so...
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> why is the recordings list in the 'mitmproxy' variable?  is this required?  we should do something like:
> if not mitmproxy_recordins_list and not mitmdumpPath:
>    sys.exit()
>    
> # do code below

This is how it works currently:

--mitmproxy /path/and/name-of-playback-file.mp --mitmdumpPath path/to/mitdump/tool/
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> This is how it works currently:
> 
> --mitmproxy /path/and/name-of-playback-file.mp --mitmdumpPath path/to/mitdump/tool/

--mitmproxy is a user-defined talos option i.e. in talos.json; --mitmdumpPath is set auto during the py3venv creation but it has to be passed into talos itself from mozharness, so that run_tests.py can find it
Attachment #8868638 - Flags: review?(jmaher)
Blocks: 1366071
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> can you at least file a bug for this if it is not in this changeset?

Good idea, thanks, filed Bug 1366071
Attachment #8868638 - Flags: review?(jmaher)
Attachment #8868638 - Flags: review?(jmaher)
Blocks: 1366355
Attachment #8868638 - Flags: review?(jmaher)
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review144136

> Yes I had it at the top and it broke a bunch of other mozharness stuff, I don't know why... for some reason moving it here fixed that.

Yeah so the reason is we need to add mozharness to the path in order to import what we need:

sys.path.insert(1, os.path.join(os.path.dirname(sys.path[1]), 'mozharness'))

not really any point in doing that unless we need mitmproxy, so I moved that and the mozharness imports into a 'import_mozharness_for_mitmproxy' function that is only called if mitmproxy is used.
Attachment #8868638 - Flags: review?(jmaher)
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review145708

getting closer!

outside of the mentioned comments, I would ideally like to see this as we do xperf- in a directory with minimal code in run_test.py, etc.  I am fine with a refactor in the future though, but I would like a bug on file for that.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:411
(Diff revision 28)
> +
> +    def download_mitmproxy_recording_set(self):
> +        """Download the set of mitmproxy recording files that will be played back"""
> +        self.info("Downloading the mitmproxy recording set using tooltool")
> +        dest = os.path.join(self.talos_path, 'talos')
> +        manifest_file = os.path.join(self.talos_path, 'mitmproxy-playback-set.manifest')

would it be possible to put this in a different directory than the root?  maybe self.talos_path/talos/tests/quantum/mitmproxy-playback-set.manifest ?

alternatively, maybe a directory:
talos/talos/artifacts/

::: testing/talos/talos.json:11
(Diff revision 28)
> -        "chromez": {
> -            "tests": ["tresize", "tcanvasmark"],
> -            "talos_options": ["--disable-e10s"]
> -        },
>          "chromez-e10s": {
> -            "tests": ["tresize", "tcanvasmark"]
> +            "tests": ["Quantum_1"],

we need to leave chromez-e10s alone and stick this in a different job

::: testing/talos/talos.json:126
(Diff revision 28)
>              ]
> +        },
> +        "quantum-pageload-e10s": {
> +            "tests": ["Quantum_1"],
> +            "mitmproxy_recording_set": "mitmproxy-recording-set.zip",
> +            "mitmdump_alt_playback_repo": "https://github.com/bsmedberg/mitmdump-alternate-server-replay",

this still points to github.com, I thought we were moving this locally?

::: testing/talos/talos/alternate-server-replay.py:37
(Diff revision 28)
> +
> +    def clear(self):
> +        self.flowmap = {}
> +
> +    # def count(self):
> +    #    return sum([len(i) for i in self.flowmap.values()])

can we remove this if it isn't used?

::: testing/talos/talos/run_tests.py:108
(Diff revision 28)
> +    # this part is platform-specific
> +    if mozinfo.os == 'win':
> +        param2 = '""' + param.replace('\\', '\\\\\\') + ' ' + \
> +                 ' '.join(mitmproxy_recordings).replace('\\', '\\\\\\') + '""'
> +        env = os.environ.copy()
> +        sys.path.insert(1, "c:\\slave\\test\\scripts\\")

this will not work in taskcluster; What do we need from the scripts/ directory?  Maybe we can pass in a path from mozharness or use something relative to what we are running?

::: testing/talos/talos/run_tests.py:110
(Diff revision 28)
> +        param2 = '""' + param.replace('\\', '\\\\\\') + ' ' + \
> +                 ' '.join(mitmproxy_recordings).replace('\\', '\\\\\\') + '""'
> +        env = os.environ.copy()
> +        sys.path.insert(1, "c:\\slave\\test\\scripts\\")
> +        sys.path.insert(1, mitmdump_path)
> +        env["PATH"] = "C:\\slave\\test\\build\\application\\firefox;" + env["PATH"]

I really don't like this hardcoded path, I don't think this will work in taskcluster; see previous comment about path in sys.path.insert.
Attachment #8868638 - Flags: review?(jmaher) → review-
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review146876

a few nits, but this is really good!

::: testing/talos/talos/mitmproxy.py:111
(Diff revisions 28 - 47)
> +    param = os.path.join(here, 'alternate-server-replay.py')
> +
> +    # this part is platform-specific
> +    if mozinfo.os == 'win':
> +        param2 = '""' + param.replace('\\', '\\\\\\') + ' ' + \
> +                 ' '.join(mitmproxy_recordings).replace('\\', '\\\\\\') + '""'

one nit here- is the '""' needed for other platforms outside of windows?  I think it might be.

::: testing/mozharness/mozharness/mozilla/testing/talos.py:38
(Diff revisions 28 - 47)
>  external_tools_path = os.path.join(
>      os.path.abspath(os.path.dirname(os.path.dirname(mozharness.__file__))),
>      'external_tools',
>  )
>  
> +scripts_path = os.path.abspath(os.path.dirname(os.path.dirname(mozharness.__file__)))

can we make the lines above this now say:
external_tools = os.path.join(scripts_path, 'external_tools')

::: testing/talos/talos/run_tests.py:217
(Diff revisions 28 - 47)
>  
>          # install the generated CA certificate into Firefox
> -        install_mitmproxy_cert(mitmproxy_proc, browser_config['browser_path'], config)
> +        # mitmproxy cert setup needs path to mozharness install; mozharness has set this
> +        # value in the SCRIPTSPATH env var for us in mozharness/mozilla/testing/talos.py
> +        scripts_path = os.environ.get('SCRIPTSPATH')
> +        LOG.info('RW: scripts_path:')

can we make this a better log message :)  I think RW: is good for debugging, but not for official code.
Attachment #8868638 - Flags: review?(jmaher) → review+
Comment on attachment 8868638 [details]
Bug 1357382 - Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test;

https://reviewboard.mozilla.org/r/140220/#review146876

> one nit here- is the '""' needed for other platforms outside of windows?  I think it might be.

Right, well this won't work on any platforms except windows for lots of reasons, so I'll leave that until we work on porting this to the other platforms. In the code below that I have:

        # TODO: support other platforms, Bug 1366355
        LOG.error('Aborting: talos mitmproxy is currently only supported on Windows')
        sys.exit(

> can we make this a better log message :)  I think RW: is good for debugging, but not for official code.

oops hah
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d3e552f71698
Add mitmproxy support to mozharness and talos, add first rev of quantum-pageload talos test; r=jmaher
awesome!  we need a buildbot-config patch for the Q1 job- after that this will be running.
https://hg.mozilla.org/mozilla-central/rev/d3e552f71698
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.