Closed Bug 1828464 Opened 2 years ago Closed 2 years ago

High freq Linux TSAN <WD> /webdriver/tests/<random.py> | <random> - setup error: webdriver.error.UnknownErrorException: unknown error (500): Failed to decode response from marionette when Gecko 114 merges to Beta on 2023-05-08

Categories

(Core :: Sanitizers, defect)

defect

Tracking

()

RESOLVED DUPLICATE of bug 1770595
Tracking Status
firefox-esr102 --- unaffected
firefox112 --- unaffected
firefox113 --- unaffected
firefox114 + fixed

People

(Reporter: NarcisB, Unassigned)

References

Details

[Tracking Requested - why for this release]:

Central-as-beta-simulation
How to run these simulations
Failure log
Log snippet:

[task 2023-04-17T12:49:57.468Z] 12:49:57     INFO - TEST-UNEXPECTED-ERROR | /webdriver/tests/bidi/script/call_function/exception_details.py | test_invalid_function - setup error: webdriver.error.UnknownErrorException: unknown error (500): Failed to decode response from marionette
[task 2023-04-17T12:49:57.468Z] 12:49:57     INFO - event_loop = <_UnixSelectorEventLoop running=False closed=False debug=False>
[task 2023-04-17T12:49:57.468Z] 12:49:57     INFO - request = <SubRequest 'bidi_session' for <Function test_invalid_function>>
[task 2023-04-17T12:49:57.468Z] 12:49:57     INFO - kwargs = {'capabilities': {}, 'configuration': {'capabilities': {'moz:firefoxOptions': {'args': ['--profile', '/tmp/tmpbnjp6ag8...rker/workspace/build/application/firefox/firefox'}}, 'host': '127.0.0.1', 'port': 60334, 'timeout_multiplier': 3, ...}}
[task 2023-04-17T12:49:57.468Z] 12:49:57     INFO - setup = <function _wrap_asyncgen.<locals>._asyncgen_fixture_wrapper.<locals>.setup at 0x7fc67022f320>
[task 2023-04-17T12:49:57.469Z] 12:49:57     INFO - finalizer = <function _wrap_asyncgen.<locals>._asyncgen_fixture_wrapper.<locals>.finalizer at 0x7fc670214440>
[task 2023-04-17T12:49:57.469Z] 12:49:57     INFO - 
[task 2023-04-17T12:49:57.469Z] 12:49:57     INFO -     @functools.wraps(func)
[task 2023-04-17T12:49:57.469Z] 12:49:57     INFO -     def _asyncgen_fixture_wrapper(
[task 2023-04-17T12:49:57.469Z] 12:49:57     INFO -         event_loop: asyncio.AbstractEventLoop, request: SubRequest, **kwargs: Any
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -     ) -> _R:
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -         gen_obj = func(**_add_kwargs(func, kwargs, event_loop, request))
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -     
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -         async def setup() -> _R:
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -             res = await gen_obj.__anext__()
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -             return res
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -     
[task 2023-04-17T12:49:57.470Z] 12:49:57     INFO -         def finalizer() -> None:
[task 2023-04-17T12:49:57.471Z] 12:49:57     INFO -             """Yield again, to finalize."""
[task 2023-04-17T12:49:57.471Z] 12:49:57     INFO -     
[task 2023-04-17T12:49:57.471Z] 12:49:57     INFO -             async def async_finalizer() -> None:
[task 2023-04-17T12:49:57.471Z] 12:49:57     INFO -                 try:
[task 2023-04-17T12:49:57.471Z] 12:49:57     INFO -                     await gen_obj.__anext__()
[task 2023-04-17T12:49:57.471Z] 12:49:57     INFO -                 except StopAsyncIteration:
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -                     pass
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -                 else:
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -                     msg = "Async generator fixture didn't stop."
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -                     msg += "Yield only once."
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -                     raise ValueError(msg)
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -     
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -             event_loop.run_until_complete(async_finalizer())
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO -     
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - >       result = event_loop.run_until_complete(setup())
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - 
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - event_loop = <_UnixSelectorEventLoop running=False closed=False debug=False>
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - finalizer  = <function _wrap_asyncgen.<locals>._asyncgen_fixture_wrapper.<locals>.finalizer at 0x7fc670214440>
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - func       = <function bidi_session at 0x7fc66d775b00>
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - gen_obj    = <async_generator object bidi_session at 0x7fc67022f7a0>
[task 2023-04-17T12:49:57.472Z] 12:49:57     INFO - kwargs     = {'capabilities': {},
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -  'configuration': {'capabilities': {'moz:firefoxOptions': {'args': ['--profile',
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                                                                     '/tmp/tmpbnjp6ag8.mozrunner'],
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                                                            'binary': '/builds/worker/workspace/build/application/firefox/firefox'}},
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                    'host': '127.0.0.1',
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                    'port': 60334,
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                    'timeout_multiplier': 3,
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                    'webdriver': {'args': ['-vv'],
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO -                                  'binary': '/builds/worker/fetches/geckodriver'}}}
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO - request    = <SubRequest 'bidi_session' for <Function test_invalid_function>>
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO - setup      = <function _wrap_asyncgen.<locals>._asyncgen_fixture_wrapper.<locals>.setup at 0x7fc67022f320>
[task 2023-04-17T12:49:57.473Z] 12:49:57     INFO - 
[task 2023-04-17T12:49:57.474Z] 12:49:57     INFO - tests/web-platform/tests/tools/third_party/pytest-asyncio/pytest_asyncio/plugin.py:293: 
Summary: Perma Linux <WD> /webdriver/tests/<random.py> | <random> - setup error: webdriver.error.UnknownErrorException: unknown error (500): Failed to decode response from marionette when Gecko 114 merges to Beta on 2023-05-08 → High freq Linux <WD> /webdriver/tests/<random.py> | <random> - setup error: webdriver.error.UnknownErrorException: unknown error (500): Failed to decode response from marionette when Gecko 114 merges to Beta on 2023-05-08

The last 9 runs on central had not a single failure. Maybe whimboo knows why this is so frequent for the beta simulation?

Flags: needinfo?(aryx.bugmail) → needinfo?(hskupin)

Yes, the problem can be seen a bit above the actual failure:

https://treeherder.mozilla.org/logviewer?job_id=412723150&repo=try&lineNumber=67578-67588

[task 2023-04-17T12:49:23.015Z] 12:49:23     INFO - PID 1020 | 1681735763013	Marionette	DEBUG	0 -> [0,1,"WebDriver:NewSession",{"webSocketUrl":true}]
[task 2023-04-17T12:49:23.029Z] 12:49:23     INFO - PID 1020 | 1681735763028	WebDriver BiDi	DEBUG	Waiting for initial application window
[task 2023-04-17T12:49:26.237Z] 12:49:26     INFO - PID 1020 | ==24465==ERROR: ThreadSanitizer failed to allocate 0xfee8a3e10000 (280275135299584) bytes at address 83a00001000 (errno: 12)
[task 2023-04-17T12:49:26.267Z] 12:49:26     INFO - PID 1020 | Exiting due to channel error.
[task 2023-04-17T12:49:26.268Z] 12:49:26     INFO - PID 1020 | 1681735766265	webdriver::server	DEBUG	Teardown session
[task 2023-04-17T12:49:26.268Z] 12:49:26     INFO - PID 1020 | 1681735766265	mozrunner::runner	DEBUG	Killing process 24465
[task 2023-04-17T12:49:26.276Z] 12:49:26     INFO - PID 1020 | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: CompositorBridgeChild receives IPC close with reason=AbnormalShutdown (t=1.86833) 1681735766273	webdriver::server	DEBUG	<- 500 Internal Server Error {"value":{"error":"unknown error","message":"Failed to decode response from marionette","stacktrace":""}}

Means that this is actually bug 1770595. Shall we dupe this bug or make it dependent on it?

Flags: needinfo?(hskupin) → needinfo?(aryx.bugmail)
Summary: High freq Linux <WD> /webdriver/tests/<random.py> | <random> - setup error: webdriver.error.UnknownErrorException: unknown error (500): Failed to decode response from marionette when Gecko 114 merges to Beta on 2023-05-08 → High freq Linux TSAN <WD> /webdriver/tests/<random.py> | <random> - setup error: webdriver.error.UnknownErrorException: unknown error (500): Failed to decode response from marionette when Gecko 114 merges to Beta on 2023-05-08
Component: web-platform-tests → Marionette
Product: Testing → Remote Protocol
Flags: needinfo?(nbeleuzu)

Lets set this bug as blocked by bug 1770595 - if the fix doesn't arrive downstream in time, we can still decide independently on the next step for this failure on beta and release.

Depends on: 1770595
Flags: needinfo?(aryx.bugmail)

The bug is marked as tracked for firefox114 (nightly). However, the bug still isn't assigned.

:Honza, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit auto_nag documentation.

Flags: needinfo?(odvarko)

There is nothing that we can do here. We have to wait for the fix as mentioned on the bug 1770595.

Component: Marionette → Sanitizers
Flags: needinfo?(odvarko)
Flags: needinfo?(nbeleuzu)
Product: Remote Protocol → Core
Version: Default → unspecified

Narcisb, can you please check if that high frequent intermittent is fixed now with the next beta simulation? If yes, then we can dupe. Thanks.

Flags: needinfo?(nbeleuzu)
You need to log in before you can comment on or make changes to this bug.