Open Bug 1540886 Opened 5 years ago Updated 1 year ago

Intermittent awsy\test_memory_usage.py TestMemoryUsage.test_open_tabs | UnknownException: Reached error page: about:neterror?e=nssFailure2&u=[URL HERE]&c=UTF-8&f=regular&d=%20

Categories

(Testing :: AWSY, defect, P5)

Version 3
defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [stockwell unknown])

#[markdown(off)]
Filed by: nerli [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=237457130&repo=autoland

https://queue.taskcluster.net/v1/task/Wl_673h3S561D_p3phQgBQ/runs/0/artifacts/public/logs/live_backing.log

23:20:35 INFO - loaded!
23:20:50 INFO - switching to tab
23:20:50 INFO - switched to tab
23:20:50 INFO - loading https://en.wikipedia.org/wiki/Barack_Obama
23:20:56 INFO - loaded!
23:21:12 INFO - switching to tab
23:21:12 INFO - switched to tab
23:21:12 INFO - loading https://imgur.com/gallery/m5tYJL6
23:21:23 INFO - loaded!
23:21:39 INFO - switching to tab
23:21:39 INFO - switched to tab
23:21:39 INFO - loading https://mail.google.com/
23:21:40 INFO - TEST-UNEXPECTED-ERROR | awsy\test_memory_usage.py TestMemoryUsage.test_open_tabs | UnknownException: Reached error page: about:neterror?e=nssFailure2&u=https%3A//mail.google.com/&c=UTF-8&f=regular&d=%20
23:21:40 INFO - stacktrace:
23:21:40 INFO - WebDriverError@chrome://marionette/content/error.js:179:5
23:21:40 INFO - UnknownError@chrome://marionette/content/error.js:484:5
23:21:40 INFO - handleReadyState@chrome://marionette/content/listener.js:273:21
23:21:40 INFO - handleEvent@chrome://marionette/content/listener.js:242:14
23:21:40 INFO - Traceback (most recent call last):
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\marionette_harness\marionette_test\testcases.py", line 159, in run
23:21:40 INFO - testMethod()
23:21:40 INFO - File "Z:\task_1554160287\build\tests\awsy\awsy\test_memory_usage.py", line 224, in test_open_tabs
23:21:40 INFO - self.open_pages()
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\awsy\awsy_test_case.py", line 395, in open_pages
23:21:40 INFO - self.open_and_focus()
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\awsy\awsy_test_case.py", line 349, in open_and_focus
23:21:40 INFO - self.marionette.navigate(page_to_load)
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\marionette_driver\marionette.py", line 1467, in navigate
23:21:40 INFO - {"url": url})
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\marionette_driver\decorators.py", line 26, in _
23:21:40 INFO - return func(*args, **kwargs)
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\marionette_driver\marionette.py", line 598, in _send_message
23:21:40 INFO - self._handle_error(err)
23:21:40 INFO - File "z:\task_1554160287\build\venv\lib\site-packages\marionette_driver\marionette.py", line 618, in _handle_error
23:21:40 INFO - raise errors.lookup(error)(message, stacktrace=stacktrace)
23:21:40 INFO - TEST-INFO took 179086ms
23:21:40 INFO - tearing down!
23:21:40 INFO - tearing down webservers!
23:21:40 INFO - mozproxy Stopping mitmproxy playback, killing process 1356
23:21:40 INFO - mozproxy Mitmproxy exited with error code 572
23:21:40 INFO - mozproxy Turning off the browser proxy
23:21:40 INFO - mozproxy writing: Z:\task_1554160287\build\application\firefox\distribution\policies.json
23:21:40 INFO - processing data in Z:\task_1554160287\build\tests\results!

This appears to be from Bug 1532491, Eric, please take a look.

There are 65 total failures in the last 7 days on windows7-32-shippable opt, windows10-64-shippable opt, linux64-shippable osx-10-10-shippable opt.

Flags: needinfo?(erahm)
Whiteboard: [stockwell needswork:owner]

rwood, whimboo any ideas? The screenshot doesn't actually show the error but the page title is "Problem loading...", the page itself is https://mail.gooogle.com from the tp6 pageset.

Flags: needinfo?(rwood)
Flags: needinfo?(hskupin)
Flags: needinfo?(erahm)

We've got this in the mitmproxy log:

127.0.0.1:50895: clientconnect
127.0.0.1:50897: clientconnect
127.0.0.1:50894: CONNECT mail.google.com:443
 << Cannot establish TLS with client (sni: mail.google.com): TlsException("SSL handshake error: Error([('SSL routines', 'ssl3_read_bytes', 'sslv3 alert bad certificate')],)",)
127.0.0.1:50898: clientconnect
127.0.0.1:50894: clientdisconnect
For request https://tpc.googlesyndication.com/pagead/js/r20181031/r20110914/client/ext/m_window_focus_non_hydra.js best match https://tpc.googlesyndication.com/pagead/js/r20181031/r20110914/client/ext/m_window_focus_non_hydra.js with score==3
127.0.0.1:50875: GET https://tpc.googlesyndication.com/pagead/js/r20181031/r20110914/client/ext/m_window_focus_non_hydra.js
[replay]               << 200  (content missing)
127.0.0.1:50243: clientdisconnect
127.0.0.1:50875: CONNECT tpc.googlesyndication.com:443
 << Error in HTTP connection: HttpException('Cannot assemble flow with missing content',)
127.0.0.1:50875: clientdisconnect

The page load failed due to some networking issues. As your last comment states there has been a SSL handshake error with mail.google.com. Do you test the live site, or was that a recording? Note that we also see similar issues one and off for other Marionette tests which are using badssl.com, bad that doesn't happen that often as this one here. Is it always failing for the same site?

Flags: needinfo?(hskupin)

We have a google mail page running in Raptor desktop (tp6-9). :bebe did you see any issues when first landing that?

Flags: needinfo?(rwood) → needinfo?(fstrugariu)
Summary: Intermittent awsy\test_memory_usage.py TestMemoryUsage.test_open_tabs | UnknownException: Reached error page: about:neterror?e=nssFailure2&u=https%3A//mail.google.com/&c=UTF-8&f=regular&d=%20 → Intermittent awsy\test_memory_usage.py TestMemoryUsage.test_open_tabs | UnknownException: Reached error page: about:neterror?e=nssFailure2&u=[URL HERE]&c=UTF-8&f=regular&d=%20
Flags: needinfo?(erahm)

If I copy and paste the dumped error page in bug 1541731 it renders as:

Secure Connection Failed

An error occurred during a connection to www.google.com. You have received an invalid certificate. Please contact the server administrator or email correspondent and give them the following information: Your certificate contains the same serial number as another certificate issued by the certificate authority. Please get a new certificate containing a unique serial number. Error code: SEC_ERROR_REUSED_ISSUER_AND_SERIAL

    The page you are trying to view cannot be shown because the authenticity of the received data could not be verified.
    Please contact the website owners to inform them of this problem.

I'm not 100% sure that's the actual error (there are others listed in the html). I think we need to involve someone who understands that message and why it would intermittently happen with mitmproxy.

Flags: needinfo?(erahm)

:erahm what mitmproxy version are you using 4.0.4 has some issues with missing content flows

That's why we remove them when doing the replay:
https://searchfox.org/mozilla-central/source/testing/raptor/raptor/playback/alternate-server-replay-4.0.4.py#41

Flags: needinfo?(fstrugariu)
Flags: needinfo?(erahm)

(In reply to Florin Strugariu [:Bebe] from comment #13)

:erahm what mitmproxy version are you using 4.0.4 has some issues with missing content flows

That's why we remove them when doing the replay:
https://searchfox.org/mozilla-central/source/testing/raptor/raptor/playback/alternate-server-replay-4.0.4.py#41

I'm using the original version that raptor used.

Flags: needinfo?(erahm)

There are 26 total failures in the last 7 days, most of them being on windows10-64-shippable opt & windows7-32-shippable opt.

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=241694046&repo=mozilla-central&lineNumber=914

12:33:01 INFO - TEST-START | awsy\test_memory_usage.py TestMemoryUsage.test_open_tabs
12:38:39 INFO - TEST-UNEXPECTED-ERROR | awsy\test_memory_usage.py TestMemoryUsage.test_open_tabs | UnknownException: Reached error page: about:neterror?e=nssFailure2&u=https%3A//www.facebook.com/&c=UTF-8&f=regular&d=%20
12:38:39 INFO - stacktrace:
12:38:39 INFO - WebDriverError@chrome://marionette/content/error.js:179:5
12:38:39 INFO - UnknownError@chrome://marionette/content/error.js:484:5
12:38:39 INFO - handleReadyState@chrome://marionette/content/listener.js:273:21
12:38:39 INFO - handleEvent@chrome://marionette/content/listener.js:242:14
12:38:39 INFO - Traceback (most recent call last):
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\marionette_harness\marionette_test\testcases.py", line 159, in run
12:38:39 INFO - testMethod()
12:38:39 INFO - File "Z:\task_1555843357\build\tests\awsy\awsy\test_memory_usage.py", line 224, in test_open_tabs
12:38:39 INFO - self.open_pages()
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\awsy\awsy_test_case.py", line 395, in open_pages
12:38:39 INFO - self.open_and_focus()
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\awsy\awsy_test_case.py", line 349, in open_and_focus
12:38:39 INFO - self.marionette.navigate(page_to_load)
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\marionette_driver\marionette.py", line 1467, in navigate
12:38:39 INFO - {"url": url})
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\marionette_driver\decorators.py", line 26, in _
12:38:39 INFO - return func(*args, **kwargs)
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\marionette_driver\marionette.py", line 598, in _send_message
12:38:39 INFO - self._handle_error(err)
12:38:39 INFO - File "z:\task_1555843357\build\venv\lib\site-packages\marionette_driver\marionette.py", line 618, in _handle_error
12:38:39 INFO - raise errors.lookup(error)(message, stacktrace=stacktrace)
12:38:39 INFO - TEST-INFO took 338002ms
12:38:39 INFO - tearing down!
12:38:39 INFO - tearing down webservers!
12:38:39 INFO - mozproxy Stopping mitmproxy playback, killing process 4016
12:38:39 INFO - mozproxy Mitmproxy exited with error code 572
12:38:39 INFO - mozproxy Turning off the browser proxy
12:38:39 INFO - mozproxy writing: Z:\task_1555843357\build\application\firefox\distribution\policies.json
12:38:39 INFO - processing data in Z:\task_1555843357\build\tests\results!

Bob, could you please assign this to someone?

Flags: needinfo?(bob)

:erahm will be back tomorrow so at a minimum if we can't resolve it soon after we'll disable it. ssl issues... hmmm.

Regressed by: 1532491

(In reply to Bob Clary [:bc:] from comment #18)

:erahm will be back tomorrow so at a minimum if we can't resolve it soon after we'll disable it. ssl issues... hmmm.

We definitely shouldn't disable this test. Nhi, is there someone on the Necko team who could help us understand what might be causing these intermittent SSL errors (see comment 18 for an example)?

Flags: needinfo?(nhnguyen)

Kershaw, could you take a look please?

Flags: needinfo?(nhnguyen) → needinfo?(kershaw)
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]

I think :keeler should be better than me to take a look at the SSL errors.

I'll also try to get the HTTP log to see if there is any networking issue.

Flags: needinfo?(kershaw) → needinfo?(dkeeler)

(In reply to Eric Rahm [:erahm] from comment #11)

If I copy and paste the dumped error page in bug 1541731 it renders as:

Secure Connection Failed

An error occurred during a connection to www.google.com. You have received an invalid certificate. Please contact the server administrator or email correspondent and give them the following information: Your certificate contains the same serial number as another certificate issued by the certificate authority. Please get a new certificate containing a unique serial number. Error code: SEC_ERROR_REUSED_ISSUER_AND_SERIAL

    The page you are trying to view cannot be shown because the authenticity of the received data could not be verified.
    Please contact the website owners to inform them of this problem.

I'm not 100% sure that's the actual error (there are others listed in the html). I think we need to involve someone who understands that message and why it would intermittently happen with mitmproxy.

This is a bug in your replay mechanism. It should never generate two certificates that have the same serial number and issuer distinguished name but differ elsewhere. This is an easy bug to hit when writing a MITM system. Typically the setup is every time the system sees a server certificate, it replaces the issuer distinguished name with that of the intercepting CA and re-sign the certificate. Unfortunately, this will not work if the signature would ever change (e.g. if the issuer key changes or if the signature scheme isn't deterministic (e.g. ECDSA)). I recommend always replacing the serial number as well.

Flags: needinfo?(dkeeler)

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #22)

(In reply to Eric Rahm [:erahm] from comment #11)

If I copy and paste the dumped error page in bug 1541731 it renders as:

Secure Connection Failed

An error occurred during a connection to www.google.com. You have received an invalid certificate. Please contact the server administrator or email correspondent and give them the following information: Your certificate contains the same serial number as another certificate issued by the certificate authority. Please get a new certificate containing a unique serial number. Error code: SEC_ERROR_REUSED_ISSUER_AND_SERIAL

    The page you are trying to view cannot be shown because the authenticity of the received data could not be verified.
    Please contact the website owners to inform them of this problem.

I'm not 100% sure that's the actual error (there are others listed in the html). I think we need to involve someone who understands that message and why it would intermittently happen with mitmproxy.

This is a bug in your replay mechanism. It should never generate two certificates that have the same serial number and issuer distinguished name but differ elsewhere. This is an easy bug to hit when writing a MITM system. Typically the setup is every time the system sees a server certificate, it replaces the issuer distinguished name with that of the intercepting CA and re-sign the certificate. Unfortunately, this will not work if the signature would ever change (e.g. if the issuer key changes or if the signature scheme isn't deterministic (e.g. ECDSA)). I recommend always replacing the serial number as well.

Thanks for the thorough explanation. It sounds like this is a problem with mitmproxy; I'm not sure how quickly we can expect a fix, do you know if there are any prefs we could use to ignore this error?

Flags: needinfo?(dkeeler)

No - unfortunately this requirement was taken to heart early in the development of NSS and at this point it's essentially a fundamental law of the universe as far as NSS is concerned.

I'm assuming each of these tests has a fresh profile directory, but if that's not the case you could try deleting the files cert8.db, cert9.db, key3.db, key4.db, secmod.db, and pkcs11.txt in the profile - the duplicate issuer/serial number certificate might be cached there.

Flags: needinfo?(dkeeler)
Flags: needinfo?(bob)

There are 23 total failures in the last 7 days: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-04-29&endday=2019-05-06&tree=trunk&bug=1540886

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=244820465&repo=mozilla-central&lineNumber=1357

[task 2019-05-06T10:50:53.288Z] 10:50:53 INFO - TEST-START | awsy/test_memory_usage.py TestMemoryUsage.test_open_tabs
[task 2019-05-06T11:02:13.659Z] 11:02:13 INFO - TEST-UNEXPECTED-ERROR | awsy/test_memory_usage.py TestMemoryUsage.test_open_tabs | UnknownException: Reached error page: about:neterror?e=nssFailure2&u=https%3A//www.youtube.com/&c=UTF-8&f=regular&d=%20
[task 2019-05-06T11:02:13.661Z] 11:02:13 INFO - stacktrace:
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - WebDriverError@chrome://marionette/content/error.js:179:5
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - UnknownError@chrome://marionette/content/error.js:484:5
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - handleReadyState@chrome://marionette/content/listener.js:273:21
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - handleEvent@chrome://marionette/content/listener.js:242:14
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - Traceback (most recent call last):
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py", line 159, in run
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - testMethod()
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/awsy/awsy/test_memory_usage.py", line 224, in test_open_tabs
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - self.open_pages()
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/awsy/awsy_test_case.py", line 395, in open_pages
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - self.open_and_focus()
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/awsy/awsy_test_case.py", line 349, in open_and_focus
[task 2019-05-06T11:02:13.663Z] 11:02:13 INFO - self.marionette.navigate(page_to_load)
[task 2019-05-06T11:02:13.664Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1467, in navigate
[task 2019-05-06T11:02:13.664Z] 11:02:13 INFO - {"url": url})
[task 2019-05-06T11:02:13.664Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 26, in _
[task 2019-05-06T11:02:13.664Z] 11:02:13 INFO - return func(*args, **kwargs)
[task 2019-05-06T11:02:13.665Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 598, in _send_message
[task 2019-05-06T11:02:13.665Z] 11:02:13 INFO - self._handle_error(err)
[task 2019-05-06T11:02:13.665Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 618, in _handle_error
[task 2019-05-06T11:02:13.666Z] 11:02:13 INFO - raise errors.lookup(error)(message, stacktrace=stacktrace)

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE
Blocks: 1567138
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Has Regression Range: --- → yes
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.