Closed Bug 1816119 Opened 2 years ago Closed 7 months ago

Intermittent Wd <random tetst> - AssertionError: unknown error (500): Failed to decode response from marionette (ThreadSanitizer failed to allocate)

Categories

(Core :: Sanitizers, defect, P5)

defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: assertion, intermittent-failure)

Filed by: chorotan [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=405256633&repo=try
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/HHOQTdyiS-Cal8ZSBWCpPQ/runs/0/artifacts/public/logs/live_backing.log


task 2023-02-10T11:58:36.763Z] 11:58:36     INFO - TEST-START | /_mozilla/webdriver/take_full_screenshot/iframe.py
[task 2023-02-10T11:58:37.159Z] 11:58:37     INFO - STDOUT: ============================= test session starts ==============================
[task 2023-02-10T11:58:37.160Z] 11:58:37     INFO - STDOUT: platform linux -- Python 3.6.9, pytest-7.0.1, pluggy-unknown -- /builds/worker/workspace/build/venv/bin/python
[task 2023-02-10T11:58:37.161Z] 11:58:37     INFO - STDOUT: rootdir: /builds/worker/workspace/build
[task 2023-02-10T11:58:37.163Z] 11:58:37     INFO - STDOUT: plugins: asyncio-0.14.0
[task 2023-02-10T11:58:37.163Z] 11:58:37     INFO - STDOUT: collecting ... 
[task 2023-02-10T11:58:37.173Z] 11:58:37     INFO - STDOUT: collected 2 items
[task 2023-02-10T11:58:37.174Z] 11:58:37     INFO - STDOUT: tests/web-platform/mozilla/tests/webdriver/take_full_screenshot/iframe.py::test_source_origin[same_origin] 
<...>
task 2023-02-10T11:59:15.767Z] 11:59:15     INFO - PID 17860 | 1676030355766	geckodriver::browser	DEBUG	Browser process stopped: exit status: 0
[task 2023-02-10T11:59:15.768Z] 11:59:15     INFO - PID 17860 | 1676030355766	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2023-02-10T11:59:15.773Z] 11:59:15     INFO - STDOUT: =================================== FAILURES ===================================
[task 2023-02-10T11:59:15.773Z] 11:59:15     INFO - STDOUT: _______________________ test_source_origin[same_origin] ________________________
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT: session = <Session 54b64d63-e83b-4f45-9636-255a50118f06>
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT: url = <function url.<locals>.url at 0x7fa89a4b68c8>, domain = ''
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT: inline = <function inline.<locals>.inline at 0x7fa89a4b6b70>
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT: iframe = <function iframe.<locals>.iframe at 0x7fa89a4b6c80>
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT:     @pytest.mark.parametrize("domain", ["", "alt"], ids=["same_origin", "cross_origin"])
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT:     def test_source_origin(session, url, domain, inline, iframe):
[task 2023-02-10T11:59:15.777Z] 11:59:15     INFO - STDOUT:         session.url = inline("""{0}{1}""".format(DEFAULT_CSS_STYLE, DEFAULT_CONTENT))
[task 2023-02-10T11:59:15.778Z] 11:59:15     INFO - STDOUT:     
[task 2023-02-10T11:59:15.778Z] 11:59:15     INFO - STDOUT:         response = take_full_screenshot(session)
[task 2023-02-10T11:59:15.785Z] 11:59:15     INFO - STDOUT: >       reference_screenshot = assert_success(response)
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: domain     = ''
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: iframe     = <function iframe.<locals>.iframe at 0x7fa89a4b6c80>
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: inline     = <function inline.<locals>.inline at 0x7fa89a4b6b70>
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: response   = <Response status=500 error=<UnknownErrorException http_status=500>>
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: session    = <Session 54b64d63-e83b-4f45-9636-255a50118f06>
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: url        = <function url.<locals>.url at 0x7fa89a4b68c8>
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: tests/web-platform/mozilla/tests/webdriver/take_full_screenshot/iframe.py
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: :35: 
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: response = <Response status=500 error=<UnknownErrorException http_status=500>>
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT: value = None
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT:     def assert_success(response, value=None):
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT:         """
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT:         Verify that the provided webdriver.Response instance described
[task 2023-02-10T11:59:15.786Z] 11:59:15     INFO - STDOUT:         a valid success response as defined by `dfn-send-a-response` and
[task 2023-02-10T11:59:15.787Z] 11:59:15     INFO - STDOUT:         the provided response value.
[task 2023-02-10T11:59:15.787Z] 11:59:15     INFO - STDOUT:     
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - TEST-UNEXPECTED-FAIL | /_mozilla/webdriver/take_full_screenshot/iframe.py | test_source_origin[same_origin] - AssertionError: unknown error (500): Failed to decode response from marionette
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - session = <Session 54b64d63-e83b-4f45-9636-255a50118f06>
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - url = <function url.<locals>.url at 0x7fa89a4b68c8>, domain = ''
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - inline = <function inline.<locals>.inline at 0x7fa89a4b6b70>
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - iframe = <function iframe.<locals>.iframe at 0x7fa89a4b6c80>
[task 2023-02-10T11:59:15.790Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO -     @pytest.mark.parametrize("domain", ["", "alt"], ids=["same_origin", "cross_origin"])
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO -     def test_source_origin(session, url, domain, inline, iframe):
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO -         session.url = inline("""{0}{1}""".format(DEFAULT_CSS_STYLE, DEFAULT_CONTENT))
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO -     
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO -         response = take_full_screenshot(session)
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO - >       reference_screenshot = assert_success(response)
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO - domain     = ''
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO - iframe     = <function iframe.<locals>.iframe at 0x7fa89a4b6c80>
[task 2023-02-10T11:59:15.791Z] 11:59:15     INFO - inline     = <function inline.<locals>.inline at 0x7fa89a4b6b70>
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - response   = <Response status=500 error=<UnknownErrorException http_status=500>>
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - session    = <Session 54b64d63-e83b-4f45-9636-255a50118f06>
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - url        = <function url.<locals>.url at 0x7fa89a4b68c8>
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - tests/web-platform/mozilla/tests/webdriver/take_full_screenshot/iframe.py:35: 
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - response = <Response status=500 error=<UnknownErrorException http_status=500>>
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - value = None
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO -     def assert_success(response, value=None):
[task 2023-02-10T11:59:15.792Z] 11:59:15     INFO -         """
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -         Verify that the provided webdriver.Response instance described
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -         a valid success response as defined by `dfn-send-a-response` and
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -         the provided response value.
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -     
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -         :param response: ``webdriver.Response`` instance.
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -         :param value: Expected value of the response body, if any.
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO -         """
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO - >       assert response.status == 200, str(response.error)
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO - E       AssertionError: unknown error (500): Failed to decode response from marionette
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO - E         
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO - E       assert 500 == 200
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO - E         +500
[task 2023-02-10T11:59:15.793Z] 11:59:15     INFO - E         -200
[task 2023-02-10T11:59:15.794Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.794Z] 11:59:15     INFO - response   = <Response status=500 error=<UnknownErrorException http_status=500>>
[task 2023-02-10T11:59:15.794Z] 11:59:15     INFO - value      = None
[task 2023-02-10T11:59:15.794Z] 11:59:15     INFO - 
[task 2023-02-10T11:59:15.794Z] 11:59:15     INFO - tests/web-platform/tests/webdriver/tests/support/asserts.py:67: AssertionError
[task 2023-02-10T11:59:15.794Z] 11:59:15     INFO - .

There is a crash of Firefox when Marionette tried to create a screenshot:
https://treeherder.mozilla.org/logviewer?job_id=405256633&repo=try&lineNumber=34612-34624

[task 2023-02-10T11:58:50.038Z] 11:58:50     INFO - PID 17860 | 1676030330037	Marionette	DEBUG	0 -> [0,5,"WebDriver:TakeScreenshot",{"full":true,"highlights":[],"id":null}]
[task 2023-02-10T11:58:50.073Z] 11:58:50     INFO - PID 17860 | 1676030330071	RemoteAgent	TRACE	WebDriverProcessData actor created for PID 18055
[task 2023-02-10T11:58:50.086Z] 11:58:50     INFO - PID 17860 | 1676030330084	Marionette	TRACE	[11] MarionetteCommands actor created for window id 8589934593
[task 2023-02-10T11:58:50.273Z] 11:58:50     INFO - PID 17860 | ==17880==ERROR: ThreadSanitizer failed to allocate 0xf70000ede000 (271579387650048) bytes at address 80400001000 (errno: 12)
[task 2023-02-10T11:58:50.312Z] 11:58:50     INFO - PID 17860 | Exiting due to channel error.
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | Exiting due to channel error.
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | Exiting due to channel error.
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=1.80338) Exiting due to channel error.
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.97445) 1676030330311	webdriver::server	DEBUG	Teardown session
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | 1676030330311	geckodriver::marionette	ERROR	Failed to close browser connection: Socket not connected (os error 107)
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | 1676030330311	webdriver::server	DEBUG	<- 500 Internal Server Error {"value":{"error":"unknown error","message":"Failed to decode response from marionette","stacktrace":""}}
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=4.6695) Exiting due to channel error.
[task 2023-02-10T11:58:50.317Z] 11:58:50     INFO - PID 17860 | Exiting due to channel error.

Bob, this looks graphics related but not sure if the above information is helpful to actually know what happened.

Flags: needinfo?(bhood)

Seems more like a network error than graphics, but let me run this by my team during Triage this week, and we'll see if anybody has insights.

Blocks: gfx-triage
Flags: needinfo?(bhood)
Moving bug to Remote Protocol::Marionette component per bug 1815831.
Component: geckodriver → Marionette
Product: Testing → Remote Protocol
No longer blocks: gfx-triage

We need a stack from thread sanitizer for this to be actionable.

Component: Marionette → Sanitizers
Flags: needinfo?(gpascutto)
Product: Remote Protocol → Core

I have no idea why I'm needinfoed here.

Flags: needinfo?(gpascutto)

So far this was a one-time error and the failure didn't happen again. If no more failure is happening this week I'll go ahead and close the bug as incomplete.

Summary: Intermittent Wd <random tetst> - AssertionError: unknown error (500): Failed to decode response from marionette → Intermittent Wd <random tetst> - AssertionError: unknown error (500): Failed to decode response from marionette (ThreadSanitizer failed to allocate)

Looks like some sort of OOM in TSan (either in TSan itself or the application):

[task 2023-02-10T11:58:50.273Z] 11:58:50 INFO - PID 17860 | ==17880==ERROR: ThreadSanitizer failed to allocate 0xf70000ede000 (271579387650048) bytes at address 80400001000 (errno: 12)

It's tsan linux similar to bug 1770595. So I would say we should dupe?

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Duplicate of this bug: 1808059
Status: REOPENED → RESOLVED
Closed: 2 years ago1 year ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Status: REOPENED → RESOLVED
Closed: 1 year ago11 months ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

(In reply to Treeherder Bug Filer from comment #37)

New failure instance: https://treeherder.mozilla.org/logviewer?job_id=456059443&repo=mozilla-esr115

This failure is actually for bug 1851376. I've re-classified.

Status: REOPENED → RESOLVED
Closed: 11 months ago7 months ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

(In reply to Treeherder Bug Filer from comment #40)

New failure instance: https://treeherder.mozilla.org/logviewer?job_id=456393301&repo=mozilla-central

This is actually bug 1894213.

Status: REOPENED → RESOLVED
Closed: 7 months ago7 months ago
Resolution: --- → INCOMPLETE
Duplicate of this bug: 1904167
No longer duplicate of this bug: 1904167
You need to log in before you can comment on or make changes to this bug.