Closed Bug 1517323 Opened 1 year ago Closed 1 year ago

mitmproxy playback is making live connections to upstream servers (aka upstream certificate sniffing)

Categories: Testing :: Raptor, defect, P2
Status: RESOLVED FIXED in mozilla69
Tracking: firefox69 fixed
People: Reporter: acreskey, Assigned: tarek
References: Blocks 1 open bug
Attachments: 1 file, 1 obsolete file

mitmproxy by default makes live connections to upstream servers (e.g. https://www.amazon.com).

The reason is a feature called "upstream certificate sniffing"
https://mitmproxy.readthedocs.io/en/v2.0.2/features/upstreamcerts.html

mitmproxy pauses the recorded playback and makes a brief connection to the upstream server to inspect its SSL certificate.
This is done for cases where the client connects by direct IP instead of hostname (and two other subtle certificate-naming fallbacks).

Naturally these connections to live servers introduce significant variability (particularly locally over wifi) due to network latency.

As far as I've been able to tell, this is not needed in the raptor suite.

This behavior can be disabled by providing the "--no-upstream-cert" option when launching mitmdump.

This allows the proxy to play back entirely from disk, without a network connection.
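For illustration, an invocation could look like this (a sketch only: the recording path and replay script name are placeholders, and this uses the mitmproxy 2.x option spelling quoted above; mitmproxy 4.x spells it `--set upstream_cert=false`):

```shell
# Replay from disk only, without contacting the upstream servers
mitmdump --no-upstream-cert -S recordings/site.mp -s alternate-server-replay.py
```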
:bebe, do you want to take this one? Thanks!
Flags: needinfo?(fstrugariu)
I did a test of this and it's showing a lot of performance regressions:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=afe7833a88bf&framework=10&showOnlyComparable=1&selectedTimeRange=172800

:bebe - I expect that you'll see the same. If so, I think we'll need to figure out why this is the case.
Priority: -- → P2
Tested this and found something unexpected :D

When running tp6 tests you previously needed a live network connection to avoid a 502 error when first opening the page.
With "--no-upstream-cert" we no longer get this error when testing without any network connection.

Also, I compared the two runs and can't find any visual difference between the websites.

Let's run more tests to check if it's more stable with the option on.
Flags: needinfo?(fstrugariu)
Flags: needinfo?(fstrugariu)

To better test whether this change adds any improvements I would need a noisy test.

:igoldan :rwood :jmaher can you help with a high-noise test from our suites? Or how can one identify such a test?

My plan is to run that test multiple times on different platforms and see whether this change makes the suite more stable and reduces noise. That would be the only argument for accepting the regressions this change introduces.

Flags: needinfo?(rwood)
Flags: needinfo?(jmaher)
Flags: needinfo?(igoldan)

If you use the perfherder compare view, you can see the stddev (noise) of a set of results from a given revision. In fact, there is a noise metric which is the sum of the stddev of all the values it has access to.

Flags: needinfo?(jmaher)

(In reply to Florin Strugariu [:Bebe] from comment #7)

:igoldan :rwood :jmaher can you help with a high noise test from our suites? Or how can one identify that test?

Some noisy tests are tscrollx [1], tp5o responsiveness [2], tps [3]. If you want more noisy tests, you can check the Talos Tests Ownership Google doc. Joel provided some noise scores to many of our perf tests. I just shared that doc with you.

[1] https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=autoland,1653775,1,1
[2] https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=autoland,1646436,1,1&series=autoland,1683708,0,1&series=autoland,1651412,1,1&series=autoland,1650707,0,1&series=autoland,1683462,0,1
[3] https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=autoland,1648727,1,1&series=autoland,1655080,1,1&series=autoland,1685720,1,1&series=autoland,1654139,1,1&series=autoland,1685771,1,1&zoom=1546428817851.5781,1547011064000,6.44943820224719,15.03370786516854

Flags: needinfo?(igoldan)

I was curious about the impact of this change on raptor tp6 numbers so I made a try push to compare it before/after:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=fb2db3662df0&newProject=try&newRevision=9894488e2033&framework=10

The results are all over the place, some tests improve and some regress but it seems to cause more regressions across the board than improvements, not sure why...

Yes, that is the interesting piece to the puzzle.

When I profile with this change I see that the long network requests complete much sooner (particularly over home wifi).
This much is expected.

But even with the shorter network transfers, there is a tendency for the measured page load metrics to take longer.
e.g. the hero element timing or the load event.

I haven't been able to figure out why this is the case.
Hypothesizing that scheduling is affected in a way that reduces performance?

Perhaps related to fetch scenarios as described here:
https://bugzilla.mozilla.org/show_bug.cgi?id=1501354

Without the upstream connection fetches could complete sooner (before page load) thus consuming cycles on the parent and content processes.
(Just a guess!)

One of the most regressed tests is raptor-tp6-bing-firefox (part of tp6-4).

In that test, both locally and on try, I see that the first load (the cold load) is noticeably faster without the upstream cert sniffing. But the subsequent loads tend to be significantly slower.
(The first load is ignored by the scoring because these are reload tests.)

I'm still investigating why the reloads are slower.
In general it does look like there are fewer network requests completing in a few milliseconds and more taking 30-40ms.

I'm running mitmproxy in verbose mode (--verbose) and comparing the results with and without this change.

On the bing reload when replaying without the upstream cert sniffing, there is no HTTP2 traffic whereas otherwise it looks like roughly everything comes through via HTTP2.

Yes, with "--no-upstream-cert" an http 1.1 connection is established, otherwise it's http2.

Even without "--no-upstream-cert", if I force the playback to use http 1.1 the performance degrades massively for bing.com.

I'll try to find a way to replay http2 without the live connection through mitmproxy.

(In reply to Andrew Creskey from comment #15)

Yes, with "--no-upstream-cert" an http 1.1 connection is established, otherwise it's http2.

Wow, what a huge difference! It might be worth filing an issue upstream to document this.

I've asked on the mitmproxy forums to see if what we're attempting to do (http2 without live connections) is possible:
https://discourse.mitmproxy.org/t/possible-to-playback-http-2-with-no-upstream-cert/1394

Apparently WebPageReplay does this, so I'm hopeful.

(In reply to :Ehsan Akhgari from comment #16)

Yes, with "--no-upstream-cert" an http 1.1 connection is established, otherwise it's http2.

Wow, what a huge difference! It might be worth filing an issue upstream to document this.

:Ehsan -- do you mean filing a separate bugzilla issue or raising an issue with mitmproxy?

I've tried re-recording HTTP sessions with both mitmproxy 2.0.2 and mitmproxy 4.0.4, but even with options like "http2_priority" they always drop back to HTTP/1.1 without the upstream server cert sniffing.

This is likely because the protocol negotiation occurs during the TLS handshake.
I'm hoping there's a way to record the negotiated protocol so that it can be used in offline playback.

(In reply to Andrew Creskey from comment #17)

Yes, with "--no-upstream-cert" an http 1.1 connection is established, otherwise it's http2.

Wow, what a huge difference! It might be worth filing an issue upstream to document this.

:Ehsan -- do you mean filing a separate bugzilla issue or raising an issue with mitmproxy?

I meant the latter, for the benefit of other users of that project. :-)

Flags: needinfo?(rwood)

We were able to get a response from the mitmproxy folks regarding http/2 playback without (brief) live connections.

Unfortunately it is indeed not possible at the moment.

https://discourse.mitmproxy.org/t/possible-to-playback-http-2-with-upstream-cert-sniffing-disabled-upstream-cert-false/1394

Maximilian did provide a possible workaround:

If you really really do need HTTP/2, one viable workaround might be to put mitmproxy into upstream/reverse proxy mode and point to some dummy server running on the same machine that advertises h2. This will trick mitmproxy into thinking that upstream does support it.

So this would be an extra complication -- but is this actually possible for us?
We want to advertise not only h2 but rather whatever was negotiated during the recorded session.
Can someone more familiar with proxying replays weigh in here? :Bebe? :davehunt - would you know?

Flags: needinfo?(dave.hunt)

Maybe we could save what protocol was negotiated (with each server?) during playback and feed this into our dummy server during playback?

there might be some options for running a dummy http/2 server via python:
https://medium.com/python-pandemonium/how-to-serve-http-2-using-python-5e5bbd1e7ff1
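For the ALPN part specifically, Python's stdlib can already build a server-side TLS context that advertises only h2, which is the piece that would trick a probing proxy. A minimal sketch (the function name is mine, cert/key paths are placeholders, and a real dummy server would still need a socket loop plus a certificate mitmproxy trusts):

```python
import ssl

def make_h2_server_context(certfile=None, keyfile=None):
    """TLS server context that advertises only h2 via ALPN, so a proxy
    probing it would conclude the 'upstream' supports HTTP/2."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if certfile:  # placeholder paths; a real server must load a cert chain
        ctx.load_cert_chain(certfile, keyfile)
    ctx.set_alpn_protocols(["h2"])
    return ctx
```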

in our CI and general automation scripts it isn't too hard to add a server that runs -- we already have two in raptor (mitmproxy and the control server, a SimpleHTTPServer)

(In reply to Andrew Creskey from comment #21)

Maybe we could save what protocol was negotiated (with each server?) during playback and feed this into our dummy server during playback?

Do you mean save protocol during recording rather than playback? I think the first thing we should do is see if we can trick mitmdump to use HTTP/2. We could then see about configuring which sites should use HTTP/2, or if we should run all tests using HTTP/2. Ideally we could encapsulate this in the recordings, so we don't need to maintain a list of HTTP/2 enabled sites.

Bebe: Could you look into the suggested workaround for running mitmproxy entirely offline?

Flags: needinfo?(dave.hunt)

Not an area that I have a deep understanding of, but most of the raptor tp6 sites do seem to support HTTP/2.
Some don't e.g. https://www.allrecipes.com
Testing with https://tools.keycdn.com/http2-test
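The same check can be done locally with an ALPN probe using Python's ssl module (a sketch; `negotiated_protocol` is a hypothetical helper and needs live network access to the host being checked):

```python
import socket
import ssl

def make_client_context():
    """Client context offering both h2 and http/1.1 via ALPN."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx

def negotiated_protocol(host, port=443, timeout=5):
    """Return the protocol the server picks: 'h2', 'http/1.1', or None."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with make_client_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```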

The dummy web server would need to handle the TLS handshake with mitmproxy.
mitmproxy tries to make a "CONNECT" call to the web server.

I don't have enough experience to do this on my own... maybe someone with more experience can help

Flags: needinfo?(fstrugariu)

:Bebe if I understand correctly, the idea is that we would have mitmdump negotiate the TLS handshake with our dummy webserver, where we could advertise h2 as desired.
But it's not clear to me how feasible this is. A given page load replay could touch numerous origins, so would we need to fake TLS handshakes for them all?
It's not an area that I have much experience in.

Nick, you might be able to help us here.
We're trying to disable mitmproxy's "upstream cert sniffing" whereby it makes a live connection to the upstream server.
But the problem is that the ALPN is done during the TLS handshake, and that's what allows it to decide whether or not to replay in HTTP/2 vs HTTP/1.

The mitmproxy folks provided us with a possible workaround in Comment 20.
I believe this would mean putting up a dummy server to respond to the tls connections and pointing mitmdump to it: https://mitmproxy.readthedocs.io/en/v2.0.2/features/upstreamproxy.html

But this is beyond my experience with http session playback. What do you think?

Flags: needinfo?(nalexander)

(In reply to Andrew Creskey from comment #27)

Nick, you might be able to help us here.
We're trying to disable mitmproxy's "upstream cert sniffing" whereby it makes a live connection to the upstream server.
But the problem is that the ALPN is done during the TLS handshake, and that's what allows it to decide whether or not to replay in HTTP/2 vs HTTP/1.

The mitmproxy folks provided us with a possible workaround in Comment 20.

The solution in Comment 20 appears to be exactly what I stumbled into in my mach perftest harness (in development) and describe here.

Perhaps my logs are sufficient to clearly show that the suggestion works? Or perhaps I can do some additional tests locally that show we definitely get HTTP/2 responses in my WPR configuration, confirming that the suggestion works?

I believe this would mean putting up a dummy server to respond to the tls connections and pointing mitmdump to it: https://mitmproxy.readthedocs.io/en/v2.0.2/features/upstreamproxy.html

But this is beyond my experience with http session playback. What do you think?

I think that all things are possible and that this is probably not particularly difficult. It's the inverse of the problem I see with WPR: I need mitmproxy to "front" WPR; you need mitmproxy to front... itself? I.e., you almost want to turn mitmproxy into a pass-through just like WPR, and then have it front itself like in my configuration.

I have no specific mitmproxy expertise, but I expect that to not be particularly difficult, based on my experience reading the mitmproxy code while developing mitmproxy_portforward.py.

Flags: needinfo?(nalexander)
Assignee: nobody → tarek

I am exploring both solutions (a local h2 server and patching the code); neither is really straightforward.

you can follow my work at https://github.com/mitmproxy/mitmproxy/issues/3536

I hope to have a solution this week

This patch will block mitmproxy from making upstream requests to
negotiate ALPN. We force ALPN locally and intercept requests made
upstream.

Work in progress: this patch does it only for bing.com
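The per-domain decision in this approach could be sketched like this (hypothetical helper and constant names; the real logic lives in the alternate-server-replay script attached to this bug):

```python
# Sketch only: force ALPN locally from a known list instead of asking
# the upstream server. Work in progress: bing.com only, per the patch.
H2_DOMAINS = {"bing.com"}

def alpn_for_host(host):
    """Pick the protocol to advertise for `host` without any live probe."""
    if any(host == d or host.endswith("." + d) for d in H2_DOMAINS):
        return b"h2"
    return b"http/1.1"
```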

Andrew, I pushed a patch on try here :

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5c29ad471273c0c2980c9817c218cf1843b92be6

where all tp6-4 tests against bing.com should be fully offline (no upstream cert) and not degrade to HTTP/1.x

Looking at the logs it seems to work. In terms of performance, it's hard to tell with such a small sample.
Would you like to do more tests on your side to validate the solution?

If it works well, we can deploy it in two ways:

1- with a whitelist of domains that we know we want h2 for. That's the simplest solution
2- with a new option in the raptor ini files, where we explicitly activate h2 for a test

for 2 it could look like this:

[raptor-tp6m-booking-geckoview]
apps = geckoview
test_url = https://www.booking.com/
playback_pageset_manifest = mitmproxy-recordings-raptor-tp6m-booking.manifest
playback_recordings = android-booking.mp
measure = fnbpaint, fcp, dcf, loadtime
force_h2 = true

if force_h2 is true, h2 will be activated on the test_url domain

Flags: needinfo?(acreskey)
Attachment #9034981 - Attachment is obsolete: true

That's very promising :tarek!

I've pushed a subset of those tp6-4 jobs from a commit before yours so that we can compare results on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dffa1ba06e65812b6d8c7f003fb524c80cd3df79

I'll also pull down your change and run it locally to get a better look.

One question I'm thinking about:
What happens in scenarios where there is content on a site from different origins and one origin supports HTTP/1 and the other HTTP/2?
As it is now, with the upstream cert sniffing, is mitmproxy handling this?
I should be able to see how this handled both before and after your change.

What happens in scenarios where there is content on a site from different origins and one origin supports HTTP/1 and the other HTTP/2?
As it is now, with the upstream cert sniffing, is mitmproxy handling this?

I haven't tried, but I suspect it should work ok as long as we have control over what protocol the proxy uses when
starting a new connection on behalf of a client. If you have an example of such a scenario in the raptor suite,
I'll be happy to use it alongside bing's tp6-4 to make sure everything works as we want.

Flags: needinfo?(acreskey)

Tarek, your patch is working very well in local testing:

Looking at tp-6-bing-firefox I can see from the mitmproxy logs that it's correctly negotiating h/2 with bing.com and also maintaining h/1 when connecting to login.live.com

Performance-wise, on my MacBook with your patch, tp-6-bing-firefox no longer regresses when we add the --no-upstream-cert option.

Once we have more hosts added, I would like to see a performance compare of just the changes to alternate-server-replay-2.0.2.py -- theoretically the results shouldn't change at all, right?

As you mentioned, this strategy would require some work to setup and maintain a list of h2 servers.

facebook is a site that requires the client to connect to multiple servers which support h/2:

googleads.g.doubleclick.net,
www.facebook.com,
static.xx.fbcdn.net

I think that would make the option from Comment 31 more involved -- we would need to add the list of h/2 servers.

Out of curiosity I modified your alternate-server-replay-2.0.2.py to always force h2, and it does as expected -- makes h2 connections where previously they were h1. It looks like it unrealistically improves the performance results too.

One way to generate the list of whitelisted domains you mention in Comment 1 would be to run the full raptor suite and then extract them from the mitmproxy logs.

I do find this overall approach to be a little bit on the "crazy" side, but I also think it just might work.

I'll put up patches that test noise and performance on a few more sites.

Out of curiosity I modified your alternate-server-replay-2.0.2.py to always force h2, and it does as expected -- makes h2 connections where previously they were h1. It looks to unrealistically improve the performance results too.

Did you do those tests with existing playback files? If so, and if the playback file was recorded in h1, you might have called upstream during the test. I generally turn my network completely off to make sure the test is fully offline.

For switching everything to h2, we will need to ensure that all the recorded files were recorded with h2 and not h1.

Maybe we should also simply turn off the network on raptor slaves during those tests, just to make sure we're really replaying everything locally. The only network call we're doing is to update the benchmark repo on github, but everything else should be 100% offline.

I'll put up patches that test noise and performance on a few more sites.

Awesome thanks! Happy to land the solution you find the best once this round of tests is done.

(In reply to Tarek Ziadé (:tarek) from comment #36)

Out of curiosity I modified your alternate-server-replay-2.0.2.py to always force h2, and it does as expected -- makes h2 connections where previously they were h1. It looks to unrealistically improve the performance results too.

Did you do those tests with existing playback files ? If so, and if the playback file was recorded in h1, you might have called upstream during the test. I generally completely turn off my network to make sure it's fully off.

Yes, this was done with existing recordings. I re-verified by running them with no network (once raptor was initialized).
So changing the ALPN in the playback script seems to be able to force h2 protocol where it was recorded as h1.

Maybe we should also simply turn off the network on raptor slaves during those tests, just as to make sure we're really replaying everything locally. The only network call we're doing is to update the benchmark repo on github, but everything else should be 100% offline.

That would be ideal!

I'll put up patches that test noise and performance on a few more sites.

Awesome thanks! Happy to land the solution you find the best once this round of tests is done.

I've put up limited try jobs:
This one adds the full list of h2 servers used in tp6-1 and tp6-4 (["fbcdn.net", "facebook.com", "doubleclick.net", "google.com", "gstatic.com", "youtube.com", "googleapis.com", "ytimg.com", "ggpht.com", "bing.com"])

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4402224cc4723b6613b76d0ce5d53f812087e482

And, as a datapoint, this is tp6-1 and tp6-4 with just upstream cert sniffing disabled
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0b145762843cda97375ca5a343b5142147bef3c

So the costs of this approach:
• we have to maintain the list of servers that support h2. When a page is re-recorded, this needs to be updated.
• maybe we have something wrong in our understanding and are distorting the playback?

From what I've seen, I would prefer to use Web Page Replay (Bug 1522133) since it already handles this automatically.
Recently we have made some progress there in terms of integrating it into automation.

But I would argue that what we have in these patches is better than what we have now: mitmproxy making live connections during replay.

we have to maintain the list of servers that support h2

My understanding is that we are in a way already maintaining that list, since each test names the domain against which
we do the test in test_url. The extra domains (login endpoints etc.) are as far as I know easy to find, and once you have a recording the list doesn't need to be maintained unless we re-record (it's fixed).

So it seems to me that adding the list for each test is a one-shot thing.

From what I've seen, I would prefer to use Web Page Replay
But I would argue that what we have in these patches is better than what we have now: mitmproxy making live connections during replay.

I don't really know Web Page Replay so I can't weigh in on this decision; I'll defer to you guys :).

However, what I can say is:

  • It looks like we're very close to fixing the live connection issue with mitmproxy - we're mostly discussing our options now; the technical blocker is solved.
  • integrating Web Page Replay is going to take some extra work until it works as well as what we have today, so maybe it should be done progressively. But if it's a better tool, we should move to it.
  • we will probably have to reset our stats history if we change the underlying tool, because proxies add their own specific lag

So what I would suggest is to fix mitmproxy today and look at integrating Web Page Replay in a single test afterwards.

I agree with :tarek; let's fix this in mitmproxy now and also continue with the WPR research.
As WPR is a new tool it will add a new set of challenges that might take longer than we plan.

Also this bug is blocking other tasks.

Here are some performance comparisons involving this proposed solution.

1. For reference, simply disabling the upstream cert check --no-upstream-cert:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=dffa1ba06e65812b6d8c7f003fb524c80cd3df79&newProject=try&newRevision=d0b145762843cda97375ca5a343b5142147bef3c&framework=10
Leads to a ~38% regression on raptor-tp6-bing-firefox opt, ~17% on raptor-tp6-google-firefox opt, and regressions on other sites as well (due to playback dropping down to http/1).

2. Tarek's change to selectively choose H2 and not connect to upstream (which I've extended to include all H2 servers that I've found in tp6-1, tp6-4)
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=dffa1ba06e65812b6d8c7f003fb524c80cd3df79&newProject=try&newRevision=4402224cc4723b6613b76d0ce5d53f812087e482&framework=10
Probably best to ignore the mobile raptor tests in there -- I did not verify the server-selected protocol for them.
Although I just added more jobs to raptor-tp6m-bing-restaurants-geckoview opt as it should have the correct h2 list and may be showing a large change.
We would need to do a larger test once all the h2 servers are found. But tp6-1 and tp6-2 seem relatively unchanged in performance and noise on Windows.

3. Tarek's change to selectively choose H2 while still connecting to upstream servers (which I've extended to include all H2 servers that I've found in tp6-1, tp6-4)
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=dffa1ba06e65812b6d8c7f003fb524c80cd3df79&newProject=try&newRevision=8a2422db5174418a8fdc0ee3f4e49e1fbbcf1ffb&framework=10
On Windows I don't see a significant difference here. Which is what I'd expect.

I'm adding the raptor-tp6m-cold-3 jobs because they include raptor-tp6m-bing-geckoview-cold where we may see a particularly large impact from this change.

The upstream cert sniffing always occurs on the cold load, and will also occur during some of the warm loads.

Flags: needinfo?(acreskey)

Even though we haven't collected all the http2 servers for the TP6 mobile suite, it's fair to say that the baselines would need to be reset if this change were made, because we are already seeing big changes on some sites:
e.g.
21.94% improvement on raptor-tp6m-instagram-geckoview-cold opt
21.71% improvement on raptor-tp6m-youtube-geckoview opt
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=dffa1ba06e65812b6d8c7f003fb524c80cd3df79&newProject=try&newRevision=4402224cc4723b6613b76d0ce5d53f812087e482&framework=10

Flags: needinfo?(acreskey)

I think changing baselines for tooling or infrastructure shouldn't be a concern, as it is unrelated to the product; it would be interesting to see firefox vs chrome

:tarek This issue is blocking multiple recording tasks on raptor.

Thinking about the current state of this bug, do you think the new recordings would be affected by these changes?

We are waiting for this to see if it would change the way we record websites with mitmproxy.

Also keep in mind that we plan to move everything in raptor to mitm 4.0.4. I see your changes are designed and tested with mitm 2.0.2.

Flags: needinfo?(tarek)

It's unclear to me at this point what the team wants to do. I am happy to adapt the patch to both mitmproxy versions, but I need to know if you want the patch to force H2 all the time or to use a flag defined in the ini file for each test. If the latter, I will need help building the new ini file.

I'll wait on Andrew's take on this and then I guess we can move forward.

Flags: needinfo?(tarek) → needinfo?(acreskey)

< Bebe> Would the recording process change with your patch?
< tarek> Bebe: no, because when you record you are talking to the upstream server anyway, so if it negotiated H2 that's what you get in the dump.
< tarek> mitmproxy here downgrades to h1 when you forbid it from asking upstream whether it really supports h2, prior to replaying the dump
< tarek> so I don't see any case where the recording process would be impacted

(In reply to Tarek Ziadé (:tarek) from comment #45)

It's unclear to me at this point what the team wants to do. I am happy to adapt the patch to both mitmproxy version, but I guess I need to know of you want the patch to force H2 all the time or to use a flag defined in the ini in each test. If the latter, I will need help to build the new ini file.

I'll wait on Andrew's take on this and then I guess we can move forward.

I think this is a great breakthrough for mitmproxy playback -- we can now replay without live connections.

I still suggest WebPageReplay in the long term for pageload tests. (It was designed for this purpose, can replay offline http/2 without the need to categorize protocols for connections, and it readily integrates with the network condition simulating tool ts_proxy).

I don't know the PI roadmap so I couldn't say whether this is worth the cost of integrating now.
And we use mitmproxy in talos too, right? So perhaps we want to proceed with this fix for those tests but use WebPageReplay for pageload.

Really not my call :)
Dave likely has some ideas.

Flags: needinfo?(acreskey) → needinfo?(dave.hunt)

(In reply to Andrew Creskey from comment #47)

I think this is a great breakthrough for mitmproxy playback -- we can now replay without live connections.

This is great. From reading the comments, we'd need to compile a list of domains that serve http/2 so that mitmproxy serves these using the correct protocol? Is there any way we could modify the recordings so the protocol is stored, instead of maintaining a list, which sounds rather high-maintenance and error-prone?

I still suggest WebPageReplay in the long term for pageload tests. (It was designed for this purpose, can replay offline http/2 without the need to categorize protocols for connections, and it readily integrates with the network condition simulating tool ts_proxy).

Do we know how WebPageReplay distinguishes between http/1 and http/2 in the recordings?

I don't know the PI roadmap so I couldn't say whether this is worth the cost of integrating now.

Removing the live connections is a priority. WebPageReplay is also planned, but if we can resolve the mitmproxy issues in the short term that would be preferred. We don't yet know all of the implications of adopting WebPageReplay.

And we use mitmproxy in talos too right, so perhaps we want to proceed with this fix for those tests but use WebPageReplay for pageload.

I believe there's a single Talos suite that uses mitmproxy, which we plan to migrate to Raptor.

Regarding mitmproxy versions, we should support both mitm2 and mitm4 if possible, but if we can only support one then mitm4 is preferred as we intend to re-record all sites using this version.

Flags: needinfo?(dave.hunt) → needinfo?(tarek)

Is there any way we could modify the recordings so the protocol is stored, instead of maintaining a list, which sounds rather high-maintenance and error-prone?

I don't think we'll need any maintenance, as this is a one-shot decision per recording; it won't change once it's set.

This is how we could proceed when creating a recording:

  1. create a recording with mitmproxy, and look at the logs to see which domains use H2 (the logs display the protocol on every call)
  2. add the recording to the ini file with an h2 flag

for example:

[raptor-tp6-imdb-firefox]
apps = firefox
test_url = https://www.imdb.com/title/tt0084967/?ref_=nv_sr_2
playback_recordings = imdb.mp
measure = fnbpaint, fcp, dcf, loadtime
h2 = www.imdb.com, login.something.com, some.add.com

From there, the playback script can force h2 on those domains, and this won't change anymore (i.e. the h2 list is fixed for that .mp file)
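Parsing such a field is straightforward; a sketch with configparser, using the section and field names from the example above (the helper name is mine):

```python
import configparser

def h2_hosts(ini_text, section):
    """Return the list of hosts to force to h2 for one test section."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    raw = cfg.get(section, "h2", fallback="")
    return [h.strip() for h in raw.split(",") if h.strip()]

example = """
[raptor-tp6-imdb-firefox]
h2 = www.imdb.com, login.something.com, some.add.com
"""
```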

Flags: needinfo?(tarek)
Flags: needinfo?(dave.hunt)

Sounds good, I imagine it wouldn't be too difficult to parse this information from the logs, or to create a simple mitmproxy addon, to make it even easier to update whenever we make a new recording. Thanks!

Flags: needinfo?(dave.hunt)

Ok -- I'll rework the patch accordingly -- what I would need is an initial list of the tests (and their hosts) we know we want to switch to h2. I guess Bebe and Andrew could give me that? We can always start with a small list and grow it gradually by simply patching the ini file

I'll also activate the patch for mitmproxy 4.x

Flags: needinfo?(fstrugariu)
Flags: needinfo?(acreskey)

:tarek we can add the list of websites during the recording. If you need an example or something similar I can help

Also we could add a protocol extraction script to the deterministicjs addon

    # mitmproxy addon hook: append each request URL to a file named after
    # the HTTP version its response used
    def response(flow):
        if flow.response.data.http_version == b'HTTP/1.1':
            with open("http1.txt", "a") as log:
                log.write("\n" + flow.request.url)
        elif flow.response.data.http_version == b'HTTP/2.0':
            with open("http2.txt", "a") as log:
                log.write("\n" + flow.request.url)
Flags: needinfo?(fstrugariu)

(In reply to Dave Hunt [:davehunt] [he/him] ⌚️UTC from comment #48)
...

Do we know how WebPageReplay distinguishes between http/1 and http/2 in the recordings?

It looks like WebPageReplay stores the negotiated protocol:
https://github.com/catapult-project/catapult/blob/11513e359cd60e369bbbd1f4f2ef648c1bccabd0/web_page_replay_go/src/webpagereplay/archive.go#L77

And based on the client hello it finds the host TLS config, including the protocol:
https://github.com/catapult-project/catapult/blob/11513e359cd60e369bbbd1f4f2ef648c1bccabd0/web_page_replay_go/src/webpagereplay/certs.go#L136

:tarek, I think we'll have to have the protocols captured for all tp6 and tp6m sites before this can be landed, otherwise the performance results would be off, right?
But this can be done without re-recording.
I like :Bebe's idea of dumping this to a file during playback.

Flags: needinfo?(acreskey)

Ok, so it sounds like a first step could be to re-record tp6/tp6m along with a script that dumps the protocols, then use that dump in the playback script. Do we want to create a new bug for this, Florin?

Flags: needinfo?(fstrugariu)
Attachment #9064694 - Attachment description: Bug 1517323 - Make HTTP/2 playbacks fully offline → Bug 1517323 - Make HTTP/2 playbacks fully offliner r?Bebe

As discussed with Bebe, we're going to use a json file that contains host/protocol pairs. If present, the file is loaded and used to decide what the alpn negotiation returns. We will try it on Bebe's first recording and land the patch if it works well.
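A rough illustration of what such a host/protocol JSON lookup might look like (the file contents, function names, and empty-string fallback are assumptions for illustration, not the landed patch):

```python
# Hypothetical sketch of a host -> protocol JSON lookup for playback.
import json

# Example contents of a recorded protocol file (assumed format):
EXAMPLE = '{"www.imdb.com": "h2", "some.add.com": "http/1.1"}'

def load_protocols(text):
    """Load host -> protocol pairs from a JSON document."""
    return json.loads(text)

def alpn_for_host(protocols, host, default=""):
    """Return the ALPN protocol to negotiate for a host.

    Falls back to an empty string (no preference) when the host was not
    captured during recording, mirroring the previous fallback behavior.
    """
    return protocols.get(host, default)

protocols = load_protocols(EXAMPLE)
print(alpn_for_host(protocols, "www.imdb.com"))     # h2
print(alpn_for_host(protocols, "unknown.example"))  # prints an empty string
```

With a lookup like this, the proxy can answer ALPN negotiation entirely from the recording, with no upstream connection.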

Flags: needinfo?(fstrugariu)
Attachment #9064694 - Attachment description: Bug 1517323 - Make HTTP/2 playbacks fully offliner r?Bebe → Bug 1517323 - Make HTTP/2 playbacks fully offliner r?Bebem,acreskey
Attachment #9064694 - Attachment description: Bug 1517323 - Make HTTP/2 playbacks fully offliner r?Bebem,acreskey → Bug 1517323 - Make HTTP/2 playbacks fully offliner r?Bebe,acreskey
Blocks: 1555915

I am ready to land this, but beware that it will stop getting the upstream certs, so we will have a massive H1 downgrade until we have all the new recorded files that inform us about the protocol to use.

as described in https://bugzilla.mozilla.org/show_bug.cgi?id=1517323#c15

Maybe I can remove the option for now and land the patch, and we'd activate it in the patch that introduces the new recordings?

Flags: needinfo?(acreskey)

(In reply to Tarek Ziadé (:tarek) from comment #58)

Maybe I can remove the option for now and land the patch, and we'd activate it in the patch that introduces the new recordings?

That would be good, or can we make it a test configuration? We could default to no upstream connections, but explicitly enable it in all test configurations until we have the new recordings.

I'm not sure of the process for when the benchmarks need to be reset.

But if new recordings are coming then my preference (somewhat driven by curiosity) would be:
-land this change, but not activated (i.e. we still connect to upstream)
-a few days before the new recordings land we activate this patch so we can collect data on the current pageset without upstream connections
-land the new recordings.

This way we capture multiple days of results from the upstream connection fix and the new recordings.
Of course there are other ways of capturing the impact of the separate changes, but that's my preference.

Flags: needinfo?(acreskey)
Pushed by tziade@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3856c664251e
Make HTTP/2 playbacks fully offliner r=acreskey,Bebe
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69

Tarek, this commit doesn't stop the upstream cert sniffing yet, but it's still setting the protocol, is that right?
Do we have the protocols recorded for every server?

I'm wondering if this caused the big spikes on loadtime across Firefox and Chrome at the end of day, June 3?
https://health.graphics/quantum/tp6?bits=64&test=warm-loadtime

Flags: needinfo?(tarek)

Tarek, this commit doesn't stop the upstream cert sniffing yet, but it's still setting the protocol, is that right?

This commit should change nothing compared to before, because the alpn negotiation should still happen via the upstream cert sniffing,
and the alpn fallback returns an empty string like before.

Do we have the protocols recorded for every server?

No, this is going to be added with the new recordings by Bebe

I'm wondering if this caused the big spikes on loadtime across Firefox and Chrome at the end of day, June 3?

Weird... I guess if June 4th looks high too, we need to dig and see what's going on. I don't think anyone else changed anything in Raptor on that date,
so that spike could be this commit.

Flags: needinfo?(tarek) → needinfo?(acreskey)
Flags: needinfo?(acreskey)